Classification of nonverbal human produced audio events: A pilot study

Bouserhal, Rachel E., Chabot, Philippe, Sarria-Paja, Milton, Cardinal, Patrick and Voix, Jérémie. 2018. "Classification of nonverbal human produced audio events: A pilot study". In 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018) (Hyderabad, India, Sept. 02-06, 2018), pp. 1512-1516. International Speech Communication Association.
Scopus citation count: 5.

PDF: Voix-J-2019-17579.pdf - Published version (411 kB)
License: All rights reserved to the copyright holders.

Abstract

The accurate classification of nonverbal human produced audio events opens the door to numerous applications beyond health monitoring. Voluntary events, such as tongue clicking and teeth chattering, may lead to a novel way of silent interface command. Involuntary events, such as coughing and clearing the throat, may advance the current state-of-the-art in hearing health research. The challenge of such applications is the balance between the processing capabilities of a small intra-aural device and the accuracy of classification. In this pilot study, 10 nonverbal audio events are captured inside the ear canal blocked by an intra-aural device. The performance of three classifiers is investigated: Gaussian Mixture Model (GMM), Support Vector Machine (SVM), and Multi-Layer Perceptron (MLP). Each classifier is trained using three different feature vector structures constructed from the mel-frequency cepstral coefficients (MFCC) and their derivatives. Fusion of the MFCCs with the auditory-inspired amplitude modulation features (AAMF) is also investigated. Classification is compared between binaural and monaural training sets as well as for noisy and clean conditions. The highest accuracy, 75.45%, is achieved using the GMM classifier with the binaural MFCC+AAMF clean training set. Accuracy of 73.47% is achieved by training and testing the classifier with the binaural clean and noisy dataset.
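To make the pipeline in the abstract concrete, the sketch below illustrates one plausible reading of it: MFCC features plus first- and second-order derivatives, and one GMM per audio-event class scored by average log-likelihood. The library choices (librosa, scikit-learn) and all parameter values (16 kHz sample rate, 13 MFCCs, 8 mixture components, diagonal covariances) are assumptions for illustration only, not the authors' actual implementation, and the AAMF fusion step is omitted.

```python
# Illustrative sketch of MFCC + derivatives feature extraction and per-class
# GMM classification, as loosely described in the abstract. All parameters
# and library choices are assumed, not taken from the paper.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_with_deltas(signal, sr=16000, n_mfcc=13):
    """Frame-level features: MFCCs stacked with their first and second
    derivatives, one of the feature-vector structures the abstract mentions."""
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    delta = librosa.feature.delta(mfcc)
    delta2 = librosa.feature.delta(mfcc, order=2)
    return np.vstack([mfcc, delta, delta2]).T  # shape: (frames, 3 * n_mfcc)

def train_gmms(features_by_class, n_components=8):
    """Fit one GMM per event class on the pooled frames of that class."""
    gmms = {}
    for label, feature_matrices in features_by_class.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag", random_state=0)
        gmm.fit(np.vstack(feature_matrices))
        gmms[label] = gmm
    return gmms

def classify(gmms, features):
    """Assign the class whose GMM gives the highest average log-likelihood
    over the frames of the test clip."""
    scores = {label: gmm.score(features) for label, gmm in gmms.items()}
    return max(scores, key=scores.get)
```

A binaural variant, as compared in the paper, could be approximated by concatenating the feature vectors from the left- and right-ear signals frame by frame before training; the SVM and MLP baselines would simply replace the per-class GMMs with a single discriminative model over the same features.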

Document type: Conference proceedings
ISSN: 2308-457X
Professors:
Bouserhal, Rachel
Cardinal, Patrick
Voix, Jérémie
Affiliation: Other, Software and Information Technology Engineering, Mechanical Engineering
Deposited: Nov. 19, 2018 21:50
Last modified: Apr. 28, 2022 19:34
URI: https://espace2.etsmtl.ca/id/eprint/17579
