Sanchez-Cortina, Isaias; Andrés-Ferrer, Jesús; Sanchis, Alberto; Juan, Alfons. Speaker-adapted confidence measures for speech recognition of video lectures. Journal Article. Computer Speech & Language, 37, pp. 11–23, 2016, ISSN: 0885-2308.
Tags: Confidence measures, Log-linear models, Online video lectures, Speaker adaptation, Speech Recognition
@article{SanchezCortina2016,
title = {Speaker-adapted confidence measures for speech recognition of video lectures},
author = {Isaias Sanchez-Cortina and Jesús Andrés-Ferrer and Alberto Sanchis and Alfons Juan},
url = {http://www.sciencedirect.com/science/article/pii/S0885230815000960
http://authors.elsevier.com/a/1SAsB39HpSHRc0},
issn = {0885-2308},
year = {2016},
date = {2016-01-01},
journal = {Computer Speech \& Language},
volume = {37},
pages = {11--23},
abstract = {Automatic Speech Recognition applications can benefit from a confidence measure (CM) to predict the reliability of the output. Previous works showed that a word-dependent naïve Bayes (NB) classifier outperforms the conventional word posterior probability as a CM. However, a discriminative formulation usually renders improved performance due to the available training techniques. Taking this into account, we propose a logistic regression (LR) classifier defined with simple input functions to approximate to the {NB} behaviour. Additionally, as a main contribution, we propose to adapt the {CM} to the speaker in cases in which it is possible to identify the speakers, such as online lecture repositories. The experiments have shown that speaker-adapted models outperform their non-adapted counterparts on two difficult tasks from English (videoLectures.net) and Spanish (poliMedia) educational lectures. They have also shown that the {NB} model is clearly superseded by the proposed {LR} classifier.},
keywords = {Confidence measures, Log-linear models, Online video lectures, Speaker adaptation, Speech Recognition},
pubstate = {published},
tppubtype = {article}
}