
The NIST evaluations will resume in 2008 and may be held in
alternate years thereafter. They will place increased emphasis on
cross-channel recognition. In 2005 and 2006 the core test involved
only telephone speech, with cross-channel recognition (train on
telephone, test on microphone) as an optional additional test. The
core test condition is now expected to require processing a mix of
training and test segments that include both telephone and
microphone speech, with some trials using different channels in
training and test. This will use at least the two types of data
found in Mixer 3 and Mixer 5. Evaluation performance will, however,
subsequently be analyzed separately for telephone, microphone, and
cross-channel trials. A number of different microphone types from
the Mixer 5 data will be included.
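
The per-condition analysis described above can be illustrated with a minimal sketch. It assumes each trial is recorded as a tuple of training channel, test channel, system score, and ground truth, and that a single decision threshold is applied; the function names, the tuple layout, and the use of miss and false-alarm rates as the reported figures are illustrative assumptions, not the NIST scoring software.

```python
from collections import defaultdict


def trial_condition(train_channel, test_channel):
    """Label a trial 'telephone', 'microphone', or 'cross-channel'."""
    if train_channel != test_channel:
        return "cross-channel"
    return train_channel


def error_rates_by_condition(trials, threshold):
    """Compute miss and false-alarm rates separately per channel condition.

    trials: iterable of (train_channel, test_channel, score, is_target).
    A trial is accepted when score >= threshold (assumed convention).
    """
    counts = defaultdict(
        lambda: {"miss": 0, "target": 0, "fa": 0, "nontarget": 0}
    )
    for train_ch, test_ch, score, is_target in trials:
        c = counts[trial_condition(train_ch, test_ch)]
        accepted = score >= threshold
        if is_target:
            c["target"] += 1
            c["miss"] += int(not accepted)  # target trial rejected
        else:
            c["nontarget"] += 1
            c["fa"] += int(accepted)  # non-target trial accepted
    return {
        cond: {
            # None marks a condition with no trials of that kind.
            "p_miss": c["miss"] / c["target"] if c["target"] else None,
            "p_fa": c["fa"] / c["nontarget"] if c["nontarget"] else None,
        }
        for cond, c in counts.items()
    }
```

Reporting the three conditions separately keeps a system's cross-channel errors from being masked by its results on same-channel telephone trials.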
Related Entries
▶ Performance Evaluation, Overview
▶ Speaker Recognition, Overview
Speaker Detection
Speaker detection is the task of determining whether a particular
speaker is present in an audio stream. Multispeaker detection is
the task of determining whether a particular known speaker is
speaking in an audio stream that contains speech from multiple
speakers.
▶ Speaker Segmentation
Speaker Diarization
This task consists of segmenting a conversation involving multiple
speakers into homogeneous parts, each containing the voice of only
one speaker, and grouping together all the segments that correspond
to the same speaker.
▶ Speaker Segmentation
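
As a rough illustration of the grouping step only, the sketch below assumes the conversation has already been cut into homogeneous segments and that each segment carries a speaker label from some upstream clustering stage, which is not shown. The function name and the `(start, end, label)` tuple layout are hypothetical.

```python
from collections import defaultdict


def group_segments(segments):
    """Group homogeneous segments by speaker label.

    segments: list of (start_sec, end_sec, speaker_label) tuples.
    Returns {speaker_label: [(start_sec, end_sec), ...]} in time order,
    with back-to-back segments of the same speaker merged into one turn.
    """
    grouped = defaultdict(list)
    for start, end, speaker in sorted(segments):
        turns = grouped[speaker]
        if turns and turns[-1][1] >= start:
            # Contiguous with this speaker's previous turn: extend it.
            turns[-1] = (turns[-1][0], max(turns[-1][1], end))
        else:
            turns.append((start, end))
    return dict(grouped)
```

For example, a labelled sequence A, B, A, A yields two turns for speaker A (the last two segments merged) and one turn for speaker B.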