
Unveiling hidden patterns in infant cry audio: A multi-feature vision transformer approach with explainable AI

Hasasneh, Ahmad, Masri, Sari and Tadj, Chakib. 2025. "Unveiling hidden patterns in infant cry audio: A multi-feature vision transformer approach with explainable AI". IEEE Access, vol. 13, pp. 161103-161115.

Tadj-C-2025-32551.pdf - Published version
License: Creative Commons CC BY.

Abstract

Early detection and diagnosis of neonatal problems are critical to ensuring that an infant receives timely medical attention, which greatly improves health outcomes. In this study, we propose a novel deep learning framework that analyzes an infant’s cry to distinguish six classes: healthy, plus five conditions (sepsis, respiratory distress syndrome, jaundice, hyperbilirubinemia, and vomiting). The study uses a dataset of infant cry recordings from which key acoustic features such as spectrograms, Mel-spectrograms, and Gammatone Frequency Cepstral Coefficients (GFCCs) are extracted. A Vision Transformer (ViT) model was developed and fine-tuned, reaching 99% classification accuracy under cross-validation. To make the model interpretable, explainable artificial intelligence (XAI) methods such as LRP, LIME, and attention imaging were applied to clarify the reasoning behind its outputs. Cross-validation tests were also used to assess the model’s trustworthiness and generalizability. The findings highlight the potential of combining transformer-based deep learning with multimodal acoustic features and explanation methods to improve infant cry analysis and its applications in pediatric medicine.
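
As a rough illustration of one of the acoustic features named in the abstract, the sketch below computes a log Mel-spectrogram from a raw signal using only the Python standard library. It is not the authors' pipeline: the frame length, hop size, number of Mel bands, Hann window, the HTK-style Mel formula, and all function names are assumptions chosen for the example, and the naive DFT is far slower than a real FFT library. GFCC extraction and the ViT model are not covered here.

```python
import cmath
import math


def hz_to_mel(freq_hz):
    """HTK-style Mel scale (an assumed convention, not taken from the paper)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive O(n^2) DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))) ** 2
            for k in range(n // 2 + 1)]


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale."""
    mel_max = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mel_spectrogram(signal, sample_rate, n_fft=256, hop=128, n_mels=20):
    """Return a list of frames, each a list of n_mels log band energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        power = power_spectrum(signal[start:start + n_fft])
        frames.append([math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                       for row in bank])
    return frames


# Usage: a synthetic 1 kHz tone stands in for a cry recording.
SAMPLE_RATE = 8000
signal = [math.sin(2.0 * math.pi * 1000.0 * t / SAMPLE_RATE) for t in range(1024)]
frames = mel_spectrogram(signal, SAMPLE_RATE)
peak_band = max(range(len(frames[0])), key=lambda m: frames[0][m])
print(len(frames), len(frames[0]), peak_band)
```

The resulting frames-by-bands grid is the kind of 2-D representation that can be rendered as an image and passed to an image model; in practice a dedicated audio library would normally replace the hand-written DFT and filterbank.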

Document type: Peer-reviewed journal article
Professor: Tadj, Chakib
Affiliation: Electrical Engineering
Deposit date: 16 Oct. 2025 14:17
Last modified: 13 Nov. 2025 20:37
URI: https://espace2.etsmtl.ca/id/eprint/32551
