LREC 2000 2nd International Conference on Language Resources & Evaluation  

Title Portuguese Corpora at CLUL
Authors Bacelar do Nascimento Maria Fernanda (Centro de Linguistica da Universidade de Lisboa, Av. 5 de Outubro, Nº 85, 5º-6º, 1050-050 LISBOA, fbacelar.nascimento@clul.ul.pt)
Pereira Luisa (Centro de Linguistica da Universidade de Lisboa, Av. 5 de Outubro, Nº 85, 5º-6º, 1050-050 LISBOA, luisa.alice.sp@clul.ul.pt)
Saramago Joao (Centro de Linguistica da Universidade de Lisboa, Av. 5 de Outubro, Nº 85, 5º-6º, 1050-050 LISBOA, j.saramago@clul.ul.pt)
Keywords Applications, Oral Corpora, Portuguese Varieties, Tools, Written Corpora
Session Session WP7 - Corpus Projects
Abstract The Corpus de Referencia do Portugues Contemporaneo (CRPC) has been under development at the Centro de Linguistica da Universidade de Lisboa (CLUL) since 1988. Its aim is to broaden the empirical basis of research, so that concepts and hypotheses can be verified against data rather than resting solely on intuition. This open corpus is intended as an on-line, representative collection of samples of contemporary Portuguese in general usage: a large main corpus together with several specialized corpora. The CRPC currently contains around 92 million words. Following common practice in this area, the CRPC project intends to establish a linguistic database accessible to anyone interested in theoretical or practical studies or applications. The dialectal oral corpus of the Atlas Linguistico-Etnografico de Portugal e da Galiza (ALEPG) consists of approximately 3500 hours of speech collected by the CLUL Dialectal Studies Research Group and recorded on analogue audio tape. This corpus contains mainly directed speech: answers to a linguistic questionnaire that is essentially lexical but also covers some phonetic and morpho-phonological phenomena. A substantial portion of spontaneous speech also supports other kinds of study, such as syntactic, morphological, or phonetic ones.

 
