Title Japanese MULTEXT: A Prosodic Corpus
Author(s) Kitazawa Shigeyoshi (1), Kiriyama Shinya (1), Itoh Toshihiko (1), Nick Campbell (2)

(1) Department of Computer Science, Faculty of Information, Shizuoka University; (2) ATR Human Information Science Research Labs

Abstract A prosodic corpus of Japanese was developed as a scheduled project by university researchers in Japan. This paper describes the contents of the corpus: speakers, speaking styles, recording conditions, and prosodic annotations. The corpus is a Japanese version of the MULTEXT prosodic database of EUROM1. We adopted the J-ToBI prosodic labeling scheme, together with additional labels such as pitch range, prominence, devoicing, and nasalization. We also developed an automatic J-ToBI label generator. In evaluation, 71.6% of tone labels were placed at the correct positions with the correct symbols, and 73.7% of break index (BI) labels were generated correctly. The automatic label generator was evaluated by a team of expert labelers and a team of beginners, and was found to be helpful for both.
Keyword(s) MULTEXT, J-ToBI, Japanese, Prosodic Corpus 
Language(s) Japanese
