Learning of word sense disambiguation rules by Co-training, checking co-occurrence of features


Hiroyuki Shinnou (Ibaraki University, 4-12-1 Nakanarusawa Hitachi Ibaraki 316-8511 JAPAN)


WO15: Semantic Tagging


In this paper, we propose a method to improve Co-training and apply it to word sense disambiguation problems. Co-training is an unsupervised learning method that addresses the high cost of obtaining labeled training data. It is theoretically promising, but it requires two feature sets that satisfy a conditional independence assumption. This assumption is too rigid: in practice there is no choice but to use incomplete feature sets, and the accuracy of the learned rules then reaches a limit. To avoid this undesirable situation, we check the co-occurrence between the two feature sets when adding unlabeled instances to the training data. In experiments, we applied our method to word sense disambiguation problems for the three Japanese words ‘koe’, ‘toppu’ and ‘kabe’, and demonstrated that it improved on Co-training.
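To make the idea concrete, the following is a minimal sketch of a Co-training loop with a co-occurrence filter, not the implementation evaluated in the paper. The abstract does not spell out the exact co-occurrence criterion, so `views_cooccur` is a hypothetical stand-in: it rejects an unlabeled instance whose features in both views already appear together in one labeled instance. The count-based classifier, the function names, the parameters, and the romanized context words are all illustrative assumptions.

```python
from collections import Counter, defaultdict

def train(labeled, view):
    """Count how often each feature of one view occurs with each sense label."""
    counts = defaultdict(Counter)
    for inst, label in labeled:
        counts[label].update(inst[view])
    return counts

def predict(counts, feats):
    """Return (label, margin) for one view; assumes at least one labeled seed."""
    scores = sorted(((sum(c[f] for f in feats), label)
                     for label, c in counts.items()), reverse=True)
    top_score, top_label = scores[0]
    second = scores[1][0] if len(scores) > 1 else 0
    return top_label, top_score - second

def views_cooccur(labeled, inst):
    """Hypothetical check (an assumption, not the paper's criterion):
    True if some labeled instance shares features with inst in BOTH views."""
    return any(a & inst[0] and b & inst[1] for (a, b), _ in labeled)

def co_train(labeled, unlabeled, rounds=10, per_view=1, check=True):
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        snapshot = list(labeled)  # both views train on the round-start data
        grew = False
        for view in (0, 1):
            counts = train(snapshot, view)
            cands = []
            for inst in pool:
                label, margin = predict(counts, inst[view])
                if margin > 0 and not (check and views_cooccur(snapshot, inst)):
                    cands.append((margin, inst, label))
            cands.sort(key=lambda c: -c[0])  # most confident first
            for _, inst, label in cands[:per_view]:
                pool.remove(inst)
                labeled.append((inst, label))
                grew = True
        if not grew:
            break
    return labeled

# Toy data for 'kabe' (illustrative only): view 0 = left context, view 1 = right context.
fs = frozenset
seeds = [((fs({"ie"}), fs({"nuru"})), "wall"),
         ((fs({"kotoba"}), fs({"koeru"})), "obstacle")]
u1 = (fs({"ie"}), fs({"kowasu"}))
u2 = (fs({"kotoba"}), fs({"butsukaru"}))
u3 = (fs({"heya"}), fs({"kowasu"}))
u4 = (fs({"ie", "niwa"}), fs({"nuru"}))  # overlaps a seed in both views
result = dict(co_train(seeds, [u1, u2, u3, u4]))
```

In this toy run the filter keeps `u4` out of the training data because it overlaps an existing labeled instance in both views, while `u3` becomes reachable only after `u1` has been labeled through the other view. Calling `co_train(..., check=False)` gives plain Co-training for comparison.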


Co-training, Koe, Toppu, Kabe
