Summary of the paper

Title M-CNER: A Corpus for Chinese Named Entity Recognition in Multi-Domains
Authors Qi Lu, YaoSheng Yang, Zhenghua Li, Wenliang Chen and Min Zhang
Abstract In this paper, we present a new corpus for Chinese Named Entity Recognition (NER) from three domains: human-computer interaction, social media, and e-commerce. The annotation procedure is conducted in two rounds. In the first round, each sentence is annotated independently by more than one person. In the second round, experts discuss the sentences on which the annotators disagree. Finally, we obtain a corpus that has five data sets in three domains. We further evaluate three popular models on the newly created data sets. The experimental results show that the system based on Bi-LSTM-CRF performs the best among the compared systems on all the data sets. The corpus can be used for further studies in the research community.
Topics Named Entity Recognition, Information Extraction, Information Retrieval, Corpus (Creation, Annotation, Etc.)
Bibtex @InProceedings{LU18.682,
  author = {Qi Lu and YaoSheng Yang and Zhenghua Li and Wenliang Chen and Min Zhang},
  title = "{M-CNER: A Corpus for Chinese Named Entity Recognition in Multi-Domains}",
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {May 7-12, 2018},
  address = {Miyazaki, Japan},
  editor = {Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {979-10-95546-00-9},
  language = {english}
}