
Mitigating alert fatigue in cloud monitoring systems: A machine learning perspective

Voutsas, Fotios, Violos, John and Leivadeas, Aris. 2024. « Mitigating alert fatigue in cloud monitoring systems: A machine learning perspective ». Computer Networks, vol. 250.
Scopus citation count: 1.

Leivadeas-A-2024-28822.pdf - Published version
License: Creative Commons CC BY-NC.


Abstract

Next-generation networks will be largely based on monitoring and telemetry tools that are essential for maintaining optimal performance, ensuring security, managing costs, and performing fault detection and resolution. An integral part of the overall monitoring strategy is alerting, which provides administrators with the necessary information to proactively or reactively manage and optimize network services. However, when monitoring systems generate an excessive number of alerts, many of which may not be actionable or may not represent critical issues, the phenomenon of alert fatigue occurs. Alert fatigue refers to a situation where the volume and speed of the continuous influx of alerts become so overwhelming that network administrators become desensitized and do not respond to them. To this end, and inspired by recent trends in network automation, where human intervention tends to be minimized, we introduce an alert fatigue mitigation mechanism for monitoring systems, focusing on cloud computing infrastructures. In particular, a composite machine learning methodology is proposed to select which alerts will be hidden and which ones will be presented to the administrators. Additionally, to personalize the results, the proposed approach considers the users' level of experience along with the alert features to further optimize the accuracy of the alert filtering mechanism. The research was conducted in a realistic environment of a leading monitoring enterprise, Netdata, which provided two datasets for testing our approach. Furthermore, the results of the filtering mechanism were evaluated by expert engineers of the company, who verified the output of the proposed framework. Specifically, the outcomes confirm that our proposed methodology mitigates the alert fatigue problem with an accuracy that surpasses 90% in most cases.

Document type: Peer-reviewed journal article
Professor: Leivadeas, Aris
Affiliation: Software and Information Technology Engineering
Date deposited: 27 June 2024 13:36
Last modified: 08 July 2024 18:28
URI: https://espace2.etsmtl.ca/id/eprint/28822
