Bouserhal, Rachel E., Chabot, Philippe, Sarria-Paja, Milton, Cardinal, Patrick and Voix, Jérémie.
2018.
"Classification of nonverbal human produced audio events: A pilot study".
In 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018) (Hyderabad, India, Sept. 02-06, 2018),
pp. 1512-1516.
International Speech Communication Association.
Scopus citation count: 8.
PDF: Voix-J-2019-17579.pdf - Published version. License: All rights reserved to the copyright holders. (411kB)
Abstract
The accurate classification of nonverbal human produced audio events opens the door to numerous applications beyond health monitoring. Voluntary events, such as tongue clicking and teeth chattering, may lead to a novel way of silent interface command. Involuntary events, such as coughing and clearing the throat, may advance the current state-of-the-art in hearing health research. The challenge of such applications is the balance between the processing capabilities of a small intra-aural device and the accuracy of classification. In this pilot study, 10 nonverbal audio events are captured inside the ear canal blocked by an intra-aural device. The performance of three classifiers is investigated: Gaussian Mixture Model (GMM), Support Vector Machine and Multi-Layer Perceptron. Each classifier is trained using three different feature vector structures constructed using the mel-frequency cepstral coefficients (MFCCs) and their derivatives. Fusion of the MFCCs with the auditory-inspired amplitude modulation features (AAMF) is also investigated. Classification is compared between binaural and monaural training sets as well as for noisy and clean conditions. The highest accuracy is achieved at 75.45% using the GMM classifier with the binaural MFCC+AAMF clean training set. Accuracy of 73.47% is achieved by training and testing the classifier with the binaural clean and noisy dataset.
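As an illustration of what "MFCCs and their derivatives" can mean in practice, the sketch below appends first- and second-order derivative (delta) coefficients to each frame of a feature sequence. The abstract does not say how the derivatives were computed or how the three feature vector structures were laid out, so the regression formula, the window width of 2, edge padding by frame repetition, and the function names `delta` and `stack_with_derivatives` are all assumptions made for this example, not the authors' method.

```python
def delta(frames, width=2):
    """Regression-based derivative of a sequence of feature vectors.

    frames: list of equal-length lists, one per analysis frame.
    width:  frames on each side used in the regression (assumed value).
    Frames beyond the sequence edges are replaced by the first/last frame.
    """
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_with_derivatives(frames, width=2):
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(frames, width)
    d2 = delta(d1, width)
    return [c + v + a for c, v, a in zip(frames, d1, d2)]


if __name__ == "__main__":
    # Placeholder values standing in for MFCC frames (not real data).
    mfcc = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
    for frame in stack_with_derivatives(mfcc):
        print(frame)
```

Each output frame is three times the length of the input frame; a classifier such as a GMM would then be trained on these stacked vectors.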
Document type: | Conference proceedings |
---|---|
ISBN: | 2308457X |
Professor: | Bouserhal, Rachel; Cardinal, Patrick; Voix, Jérémie |
Affiliation: | Other, Software and Information Technology Engineering, Mechanical Engineering |
Deposit date: | 19 Nov. 2018 21:50 |
Last modified: | 28 Apr. 2022 19:34 |
URI: | https://espace2.etsmtl.ca/id/eprint/17579 |