Development of Slovenian Broadcast News Speech Database
Janez Žibert, France Mihelič
Faculty of Electrical Engineering, University of Ljubljana, Trzaska 25, 1000 Ljubljana, Slovenia
The paper reviews the development of a new Slovenian broadcast news speech database. The database consists of audio and video recordings and annotated transcripts of about 34 hours of daily television news programs captured from the public TV station RTVSLO. The paper addresses issues concerning the transcription and annotation of the collected data, provides a content analysis and basic statistics of the collected material, and reports on a preliminary evaluation of automatic segmentation.
broadcast news, transcription, annotation, audio segmentation, speech processing