LREC 2000 2nd International Conference on Language Resources & Evaluation

Title The American National Corpus: A Standardized Resource for American English
Authors Macleod Catherine (Computer Science Department, New York University, New York, New York 10003-6806)
Ide Nancy (Department of Computer Science, Vassar College, Poughkeepsie, NY 12604-0520 USA)
Grishman Ralph (Department of Computer Science, New York University, U.S.A.)
Keywords Corpus, Corpus Architecture, Standards
Session Session WO12 - Language Resources: Infrastructural Issues
Full Paper, 196.pdf
Abstract At the first conference on Language Resources and Evaluation, Granada 1998, Charles Fillmore, Nancy Ide, Daniel Jurafsky, and Catherine Macleod proposed creating an American National Corpus (ANC) that would compare with the British National Corpus (BNC) both in balance and in size (one hundred million words). This paper reports on the progress made over the past two years in launching the project. At present, the ANC project is well underway, with commitments for support and contribution of texts from a number of publishers world-wide.