A showcase of ÉTS researchers’ publications and other contributions

Mitigating alert fatigue in cloud monitoring systems: A machine learning perspective

Voutsas, Fotios, Violos, John and Leivadeas, Aris. 2024. "Mitigating alert fatigue in cloud monitoring systems: A machine learning perspective". Computer Networks, vol. 250.
Citation count in Scopus: 1.

Leivadeas-A-2024-28822.pdf - Published Version
Use licence: Creative Commons CC BY-NC.

Abstract

Next-generation networks will be largely based on monitoring and telemetry tools that are essential for maintaining optimal performance, ensuring security, managing costs, and performing fault detection and resolution. An integral part of the overall monitoring strategy is alerting, which provides administrators with the necessary information to proactively or reactively manage and optimize network services. However, when monitoring systems generate an excessive number of alerts, many of which may not be actionable or may not represent critical issues, the phenomenon of alert fatigue occurs. Alert fatigue refers to a situation where the volume and speed of the continuous influx of alerts become so overwhelming that network administrators become desensitized and do not respond to them. To this end, and inspired by recent trends in network automation, where human intervention tends to be minimized, we introduce an alert fatigue mitigation mechanism for monitoring, focusing on cloud computing infrastructures. In particular, a composite machine learning methodology is proposed to select which alerts will be hidden and which ones will be presented to the administrators. Additionally, to personalize the results, the proposed approach considers the users' level of experience along with the alert features to further optimize the accuracy of the alert filtering mechanism. The research has been conducted in a realistic environment of a leading monitoring enterprise, Netdata, which provided two datasets for testing our approach. Furthermore, the results of the filtering mechanism were evaluated by expert engineers of the company, who verified the output of the proposed framework. Specifically, the outcomes confirm that our proposed methodology mitigates the alert fatigue problem with an accuracy that surpasses 90% in most cases.

Item Type: Peer reviewed article published in a journal
Professor: Leivadeas, Aris
Affiliation: Génie logiciel et des technologies de l'information
Date Deposited: 27 Jun 2024 13:36
Last Modified: 08 Jul 2024 18:28
URI: https://espace2.etsmtl.ca/id/eprint/28822
