Hasasneh, Ahmad, Masri, Sari and Tadj, Chakib.
2025.
« Unveiling hidden patterns in infant cry audio: A multi-feature vision transformer approach with explainable AI ».
IEEE Access, vol. 13.
pp. 161103-161115.
Tadj-C-2025-32551.pdf - Published version. License: Creative Commons CC BY. (6MB)
Abstract
The early detection and diagnosis of neonatal problems are critical to ensuring that an infant receives timely medical attention, which greatly enhances health outcomes. In this study, we propose a novel deep learning framework that listens to an infant’s cry to identify and diagnose six separate conditions: one being healthy and the other five comprising sepsis, respiratory distress syndrome, jaundice, hyperbilirubinemia, and vomiting. The study utilizes a rich dataset of infant cry recordings from which key acoustic features such as spectrograms, Mel-spectrograms, and Gammatone Frequency Cepstral Coefficients (GFCCs) are extracted. A sophisticated Vision Transformer (ViT) model was developed and meticulously fine-tuned to achieve an impressive 99% classification accuracy through cross-validation. To enhance the model’s interpretability, powerful explainable artificial intelligence (XAI) methods such as LRP, LIME, and attention imaging were implemented to clarify the reasoning behind the model’s outputs. Through cross-validation tests, the model’s trustworthiness and extensive generalizability were assessed. The findings underscore the promising capabilities of employing transformer-based deep learning frameworks along with multimodal acoustic features and explanatory methods to improve cry analysis in infants and their usable scopes in pediatric medicine.
| Document type: | Peer-reviewed journal article |
|---|---|
| Professor: | Professor Tadj, Chakib |
| Affiliation: | Electrical Engineering |
| Deposit date: | 16 Oct. 2025 14:17 |
| Last modified: | 13 Nov. 2025 20:37 |
| URI: | https://espace2.etsmtl.ca/id/eprint/32551 |

