
Robust watch-list screening using dynamic ensembles of SVMs based on multiple face representations

Bashbaghi, Saman and Granger, Eric and Sabourin, Robert and Bilodeau, Guillaume-Alexandre. 2017. "Robust watch-list screening using dynamic ensembles of SVMs based on multiple face representations". Machine Vision and Applications, vol. 28, no. 1/2. pp. 219-241.




Still-to-video face recognition (FR) is an important function in video surveillance, where faces captured over a network of video cameras are matched against reference stills of target individuals. Screening faces against a watch-list is a challenging video surveillance application because the appearance of faces varies due to changing capture conditions and operational domains. The facial models used for matching may not be representative of faces captured with video cameras because they are typically designed a priori with only one reference still. In this paper, a multi-classifier framework is proposed for robust still-to-video FR based on multiple and diverse face representations of a single reference face still. During enrollment of a target individual, the single reference face still is modeled using an ensemble of SVM classifiers based on different patches and face descriptors. Multiple feature extraction techniques are applied to patches isolated in the reference still to generate a diverse SVM pool that provides robustness to common nuisance factors (e.g., variations in illumination and pose). The estimation of discriminant feature subsets, classifier parameters, decision thresholds, and ensemble fusion functions is achieved using the high-quality reference still and a large number of faces captured in lower-quality video of non-target individuals in the scene. During operations, the most competent subset of SVMs is dynamically selected according to capture conditions. Finally, a head-face tracker gradually regroups faces captured from different people appearing in a scene, while each individual-specific ensemble performs face matching. The accumulation of matching scores per face track leads to robust spatio-temporal FR when accumulated ensemble scores surpass a detection threshold. Experimental results obtained with the Chokepoint and COX-S2V datasets show a significant improvement in performance with respect to reference systems, especially when individual-specific ensembles (1) are designed using exemplar-SVMs rather than one-class SVMs, and (2) exploit score-level fusion of local SVMs (trained using features extracted from each patch), rather than using either decision-level or feature-level fusion with a global SVM (trained by concatenating features extracted from patches).
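The score-level fusion and per-track score accumulation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the per-patch SVM scores have already been computed, uses a plain average as the fusion function (the abstract states that fusion functions are estimated during design, without specifying them here), and accumulates over a fixed sliding window. The names `fuse_scores`, `accumulate_track` and `detect`, and the `window` and `threshold` values, are hypothetical. Dynamic selection of the most competent SVMs is not modeled.

```python
import numpy as np


def fuse_scores(patch_scores):
    """Score-level fusion: average the scores of the local (per-patch) classifiers.

    Averaging is an assumption made for illustration only.
    """
    return float(np.mean(patch_scores))


def accumulate_track(frame_scores, window=3):
    """Sum fused scores over a sliding window of the most recent frames of one track."""
    scores = np.asarray(frame_scores, dtype=float)
    acc = np.empty_like(scores)
    for t in range(len(scores)):
        start = max(0, t - window + 1)
        acc[t] = scores[start:t + 1].sum()
    return acc


def detect(frame_patch_scores, window=3, threshold=1.2):
    """Return (detected, accumulated_scores) for one face track.

    frame_patch_scores: one list of per-patch scores for each frame in the track.
    A detection is raised when any accumulated score exceeds the threshold.
    """
    fused = [fuse_scores(s) for s in frame_patch_scores]
    acc = accumulate_track(fused, window)
    return bool(np.any(acc > threshold)), acc


# Hypothetical scores for a three-frame track with two patches per frame.
track = [[0.25, 0.75], [0.5, 0.5], [1.0, 0.0]]
detected, accumulated = detect(track)
```

In this sketch no single frame exceeds the threshold on its own; the detection only occurs once evidence from several frames of the same track has been accumulated, which is the intuition behind the spatio-temporal matching the abstract describes.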

Document type: Peer-reviewed journal article
Granger, Éric
Sabourin, Robert
Affiliation: Automated Production Engineering
Date deposited: 23 Jan. 2017 15:56
Last modified: 19 Apr. 2017 20:44
