SUMMARY: Session O31-W Corpus Construction & Annotation

 

Title Rule-Based Chunking and Reusability
Authors C. Grover, R. Tobin
Abstract In this paper we discuss a rule-based approach to chunking implemented using the LT-XML2 and LT-TTT2 tools. We describe the tools, the pipeline, and the grammars that have been developed for the chunking task. We show that our rule-based approach is easy to adapt to different chunking styles and that mark-up of further linguistic information, such as nominal and verbal heads, can be added to the rules at little extra cost. We evaluate our chunker against the CoNLL 2000 data and discuss discrepancies between our output and the CoNLL mark-up, as well as discrepancies within the CoNLL data itself. We contrast our results with the higher scores obtained using machine learning and argue that the portability and flexibility of our approach still make it a more practical solution.
Keywords Chunking, XML, mark-up tools, rule-based, reusability
Full paper Rule-Based Chunking and Reusability