SpeechDat(E)

 

The project focuses on Spoken Language Resources, namely speech databases for fixed telephone networks, including associated annotations and pronunciation lexica. These databases are useful both for training and testing typical present-day teleservices and as a phonetically rich set of material for training more advanced, vocabulary-independent speech recognition systems. They comprise mostly read speech, for ease of practical collection, but also some spontaneous speech representing common utterance types. The design is based on collections of 1000 to 2500 speakers, balanced for sex, age and dialect representation. The aim is to provide unique resources that did not previously exist, covering Russian with 2500 speakers and Czech, Slovak, Polish and Hungarian with 1000 speakers each. These databases will serve as an important resource for improving the performance of voice-driven teleservice systems in practical implementations.

Official site of SpeechDat(E)

 

Send mail to trnka@savba.sk with questions or comments about this web site.
Last modified: July 21, 2000