Assessing the difficulty of finding people in texts


Constantin Orasan (University of Wolverhampton)

Richard Evans (University of Wolverhampton)


WO15: Semantic Tagging


In this paper several methods for animacy recognition are evaluated. Each method is more complex than the previous one and involves more resources and, as a result, more computation. When assessing the performance of these methods we consider three factors: the results of an intrinsic evaluation, the results of an extrinsic evaluation, and the complexity of the method. For the intrinsic evaluation, the accuracy of the overall classification is considered, as well as the precision and recall for each class. In the extrinsic evaluation, the animacy classifier is used to filter candidates in a pronominal anaphora resolution system. Given the wide variety of texts used, the performance of an anaphora resolution system could not itself serve as the measure for this evaluation because it depends upon the genre of the text being processed. For this reason, the reduction in the number of candidates, the reduction in the number of antecedents, and the increase in the number of pronouns left without any antecedent were recorded and used to differentiate between the systems. A comparison of the systems showed that the best one is the system which uses machine learning, and that the additional information supplied by further modules does not improve the success of the system, because of the errors those modules introduce.
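
The intrinsic measures named above (overall accuracy, plus precision and recall for each class) can be illustrated with a minimal sketch. The function name `evaluate`, the label strings "animate" and "inanimate", and the toy data are assumptions made for illustration only; they are not taken from the paper and do not reproduce its systems or results.

```python
from collections import Counter


def evaluate(gold, predicted):
    """Return (accuracy, {label: (precision, recall)}) for two label lists.

    Hypothetical helper: `gold` holds the reference labels and `predicted`
    holds a classifier's output for the same items, in the same order.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Overall accuracy: proportion of items labelled correctly.
    correct = sum(g == p for g, p in zip(gold, predicted))
    accuracy = correct / len(gold) if gold else 0.0

    gold_count = Counter(gold)       # items per class in the reference
    pred_count = Counter(predicted)  # items per class in the output
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)

    per_class = {}
    for label in sorted(set(gold) | set(predicted)):
        # Precision: correct predictions of this class / all predictions of it.
        precision = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        # Recall: correct predictions of this class / all reference items of it.
        recall = true_pos[label] / gold_count[label] if gold_count[label] else 0.0
        per_class[label] = (precision, recall)
    return accuracy, per_class


# Toy data, invented for illustration only.
gold = ["animate", "animate", "inanimate", "inanimate", "inanimate"]
predicted = ["animate", "inanimate", "inanimate", "inanimate", "animate"]
accuracy, per_class = evaluate(gold, predicted)
```

Reporting precision and recall per class alongside accuracy shows whether a classifier favours one class, which a single accuracy figure can hide when the classes are unevenly distributed.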


Learning animacy, Evaluation, System modularity, Machine learning
