Discovery of (New) Knowledge and the Analysis of Text Corpora
Khurshid Ahmad (1), Maria Teresa Musacchio (2)
(1) Department of Computing, University of Surrey, Guildford GU2 5XH, United Kingdom, email@example.com; (2) Dipartimento di Lingue e Letterature AngloGermaniche e Slave, Università di Padova, Via Beldomandi 1, 35137 Padova (Italy), firstname.lastname@example.org
This paper describes how methods and techniques developed in corpus linguistics can be used to compare and contrast samples of language use over time and across genres. A diachronic Italian corpus of nuclear physics texts belonging to different genres is collected, organised, and analysed to demonstrate the use of language in shaping one of the key sciences of the 20th century.
terms, lexical information, domain specific language resources, information extraction