A showcase of ÉTS researchers’ publications and other contributions

Classification of nonverbal human produced audio events: A pilot study



Bouserhal, Rachel E., Chabot, Philippe, Sarria-Paja, Milton, Cardinal, Patrick and Voix, Jérémie. 2018. "Classification of nonverbal human produced audio events: A pilot study". In 19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018) (Hyderabad, India, Sept. 02-06, 2018) pp. 1512-1516. International Speech Communication Association.
Scopus citation count: 6.

Voix-J-2019-17579.pdf - Published Version
Use licence: All rights reserved to copyright holder.



The accurate classification of nonverbal human produced audio events opens the door to numerous applications beyond health monitoring. Voluntary events, such as tongue clicking and teeth chattering, may lead to a novel way of issuing silent interface commands. Involuntary events, such as coughing and clearing the throat, may advance the current state of the art in hearing health research. The challenge in such applications is balancing the processing capabilities of a small intra-aural device against the accuracy of classification. In this pilot study, 10 nonverbal audio events are captured inside the ear canal blocked by an intra-aural device. The performance of three classifiers is investigated: Gaussian Mixture Model (GMM), Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP). Each classifier is trained using three different feature vector structures constructed from the mel-frequency cepstral coefficients (MFCCs) and their derivatives. Fusion of the MFCCs with auditory-inspired amplitude modulation features (AAMF) is also investigated. Classification is compared between binaural and monaural training sets, as well as between noisy and clean conditions. The highest accuracy, 75.45%, is achieved using the GMM classifier with the binaural MFCC+AAMF clean training set. An accuracy of 73.47% is achieved by training and testing the classifier with the binaural clean and noisy dataset.
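
As an illustration of how feature vectors can be built from MFCCs and their derivatives, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the three feature vector structures, so everything here is an assumption: the function names delta and stack_features are hypothetical, the derivative uses the common regression formula over a context of two frames, edge frames are padded by repetition, and the stacked variants (static only, static plus delta, static plus delta plus delta-delta) are one plausible reading.

```python
from typing import List

Frame = List[float]  # one frame of coefficients, e.g. the MFCCs of a short window


def delta(frames: List[Frame], n_ctx: int = 2) -> List[Frame]:
    """Regression-based time derivative of a feature sequence.

    d[t] = sum_{n=1..N} n * (c[t+n] - c[t-n]) / (2 * sum_{n=1..N} n^2)
    Frames beyond either end are replaced by the first or last frame
    (an assumed padding choice).
    """
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, n_ctx + 1))
    last = len(frames) - 1
    out: List[Frame] = []
    for t in range(len(frames)):
        row: Frame = []
        for k in range(len(frames[0])):
            acc = 0.0
            for n in range(1, n_ctx + 1):
                ahead = frames[min(t + n, last)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(static: List[Frame],
                   use_delta: bool = True,
                   use_delta_delta: bool = True) -> List[Frame]:
    """Concatenate static coefficients with their first and second derivatives."""
    d = delta(static)
    dd = delta(d)
    stacked: List[Frame] = []
    for t, frame in enumerate(static):
        vec = list(frame)
        if use_delta:
            vec += d[t]
        if use_delta_delta:
            vec += dd[t]
        stacked.append(vec)
    return stacked


if __name__ == "__main__":
    # Toy input: a single coefficient rising linearly over seven frames.
    ramp = [[float(t)] for t in range(7)]
    for vec in stack_features(ramp):
        print(vec)
```

Calling stack_features with both flags disabled returns the static coefficients alone, with only use_delta enabled it appends first derivatives, and with both enabled it appends first and second derivatives, giving three vector layouts of increasing dimension that a classifier could be trained on.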

Item Type: Conference proceeding
ISSN: 2308-457X
Bouserhal, Rachel
Cardinal, Patrick
Voix, Jérémie
Affiliation: Other, Software Engineering and Information Technology, Mechanical Engineering
Date Deposited: 19 Nov 2018 21:50
Last Modified: 28 Apr 2022 19:34
