Title	Categorizing Web Pages as a Preprocessing Step for Information Extraction
Author(s)	Viktor Pekar, Richard Evans, Ruslan Mitkov (Computational Linguistics Group, University of Wolverhampton)
Session	O16-EW
Abstract	Information systems that combine crawling and information extraction (IE) technologies are currently attracting considerable research and industrial interest. In this paper, we present an algorithm that uses techniques for unsupervised IE pattern acquisition to help identify web pages containing information relevant to the IE task.
Keyword(s)	text categorization, information extraction, automated acquisition of IE patterns
Language(s)	English
Full Paper	534.pdf