Speaker Recognition Resources

Welcome! The use of common corpora for evaluation of speech and speaker recognition systems has proven invaluable in comparing different approaches, sharing results, and generally advancing the state of the art. Within the last five years the number of publicly available speech corpora has increased dramatically. Unfortunately, the information describing these corpora is not centralized and is sometimes difficult to obtain. The aim of this site is to act as a clearinghouse for cataloging and describing corpora suitable for the evaluation of speaker recognition systems. We encourage researchers in the field to use and report results on these standard corpora to help further advances and interactions.

The genesis of this project was a paper we published in the 1999 Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), which is provided below. To keep up with the evolving list of available corpora, however, we ask you to alert us to new and overlooked speaker recognition corpora. A form is provided below to submit information about any such corpus. Pointers to papers or published results on these corpora are also most welcome, and we have included a form for these too.

In addition, we invite feedback on this site and suggestions for any improvements.

Enjoy,
Joe Campbell & Doug Reynolds


ICASSP-99

Our ICASSP-99 paper, Corpora for the Evaluation of Speaker Recognition Systems, is available in a variety of formats: web, Microsoft Word97, Portable Document Format (PDF), and PostScript (send to a PostScript Level 2 or 3 printer).

Here are the slides we used in our ICASSP-99 poster session presentation, which have information not contained in our ICASSP-99 paper. The slides show some example results of systems using these corpora.

We inadvertently missed referencing the paper by Godfrey, J., D. Graff, and A. Martin, "Public Databases for Speaker Recognition and Verification," ESCA Workshop on Automatic Speaker Recognition, Identification and Verification, Martigny, Switzerland, April 1994, pp. 39-42. This paper is available only in PostScript format; you might want to use Acrobat Reader 4 or the RoPS viewer.

Corpora We Included

Corpora We Excluded

Below is a list of additional corpora of which we are aware. Due to page limit restrictions, we were unable to include all known corpora in our ICASSP paper. Next to each corpus title we indicate why we chose not to include it in the paper. Further information is available in "Excluded Corpora for the Evaluation of Speaker Recognition Systems" (Microsoft Word97 format). Please let us know if you disagree with these exclusions or if you know of any updates to these corpora.

New & Missed Corpora?

As promised, we are collecting lists and characteristics of publicly available speaker recognition corpora and evaluations. We plan to update this web site as new corpora become available and possibly write future papers on corpora. If you know of a new corpus or one we missed, please tell us via our Speaker Recognition Corpora Form.

New Results on Corpora?

We encourage researchers in the field to use and report results on standard corpora to help further advances and interactions. Pointers to papers or published results on these corpora are also most welcome. If you know of results on a publicly available standard speaker recognition corpus, please tell us via our Speaker Recognition Results Form.

NIST Evaluations

To understand and join the NIST Coordinated Speaker Recognition Evaluations, please visit the Speaker Recognition section of the NIST Spoken Language Technology Evaluations page.

We encourage participation in current and future NIST Evaluations. Some sites might prefer to begin by running a prior Evaluation, which is not a blind test in the strict sense but is still very useful. To run a prior Evaluation, you will need four items:


Make your own DET curves, etc.

Free software to create detection error tradeoff (DET) curves and related plots is available from NIST's Spoken Language Technology Evaluation and Utility Software page. Additional information on DET curves is available in the paper by Martin, A., G. Doddington, T. Kamm, M. Ordowski, and M. Przybocki, "The DET Curve in Assessment of Detection Task Performance," Proceedings of Eurospeech Conference, Rhodes, Greece, September 1997, pp. 1895-1898.
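For readers who want to see what goes into a DET curve before downloading any tools, here is a minimal Python sketch. It is not the NIST software and makes no claim to match its output. It assumes that higher scores mean "more likely the target speaker" and that a trial is accepted when its score is at or above the threshold. The function names det_points and probit are hypothetical, chosen only for this illustration. The sketch computes the miss and false alarm probabilities at each candidate threshold and maps them to the normal deviate scale conventionally used for DET axes.

```python
from statistics import NormalDist


def det_points(target_scores, nontarget_scores):
    """Return (threshold, p_false_alarm, p_miss) for each candidate threshold.

    Assumption for this sketch: a trial is accepted when score >= threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    points = []
    for t in thresholds:
        # A miss is a target trial that falls below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # A false alarm is a non-target trial at or above the threshold.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        points.append((t, p_fa, p_miss))
    return points


def probit(p, eps=1e-6):
    """Map a probability to the normal deviate scale.

    Probabilities are clipped to [eps, 1 - eps] because the inverse
    normal CDF is undefined at exactly 0 and 1.
    """
    p = min(max(p, eps), 1.0 - eps)
    return NormalDist().inv_cdf(p)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    targets = [2.0, 1.0, 0.5, 3.0]
    nontargets = [0.0, -1.0, 1.5, 0.2]
    for threshold, p_fa, p_miss in det_points(targets, nontargets):
        # Plotting probit(p_fa) against probit(p_miss) gives the DET curve.
        print(threshold, p_fa, p_miss, probit(p_fa), probit(p_miss))
```

Plotting the two probit columns against each other with any plotting package gives the curve. On these warped axes, a system whose target and non-target scores are roughly normally distributed traces a nearly straight line, which makes systems easier to compare than on a conventional ROC plot.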

Key Sources of Corpora


Eurospeech-99

Speaker and Language Recognition Using Speech Codec Parameters

Authors: T.F. Quatieri, E. Singer, R.B. Dunn, D.A. Reynolds, J.P. Campbell*

MIT Lincoln Laboratory, Lexington, MA, USA
quatieri@ll.mit.edu

* Department of Defense

ABSTRACT

In this paper, we investigate the effect of speech coding on speaker and language recognition tasks. Three coders were selected to cover a wide range of quality and bit rates: GSM at 12.2 kb/s, G.729 at 8 kb/s, and G.723.1 at 5.3 kb/s. Our objective is to measure recognition performance from either the synthesized speech or directly from the coder parameters themselves. We show that using speech synthesized from the three codecs, GMM-based speaker verification and phone-based language recognition performance generally degrades with coder bit rate, i.e., from GSM to G.729 to G.723.1, relative to an uncoded baseline. In addition, speaker verification for all codecs shows a performance decrease as the degree of mismatch between training and testing conditions increases, while language recognition exhibited no decrease in performance. We also present initial results in determining the relative importance of codec system components in their direct use for recognition tasks. For the G.729 codec, it is shown that removal of the postfilter in the decoder helps speaker verification performance under the mismatched condition. On the other hand, with use of G.729 LSF-based mel-cepstra, performance decreases under all conditions, indicating the need for a residual contribution to the feature representation.

Volume 2, Pages 787-790


Please send your additions and requests to Joe and Doug.
16 October 1999
This URL is: http://www.apl.jhu.edu/Classes/Notes/Campbell/SpkrRec/