Summary of the paper

Title WikiDragon: A Java Framework For Diachronic Content And Network Analysis Of MediaWikis
Authors Rüdiger Gleim, Alexander Mehler and Sung Y. Song
Abstract We introduce WikiDragon, a Java framework designed to give developers in computational linguistics an intuitive API to build, parse and analyze instances of MediaWikis such as Wikipedia, Wiktionary or Wikisource on their own computers. It covers current versions of pages as well as the complete revision history, gives diachronic access to both page source code and accurately parsed HTML, and supports the diachronic exploration of the page network. WikiDragon is self-contained and only requires an XML dump from the official Wikimedia Foundation website for import into an embedded database. No additional setup is required. We describe WikiDragon’s architecture and evaluate the framework on the Simple English Wikipedia with respect to the accuracy of link extraction, diachronic network analysis and the impact of using different Wikipedia frameworks on text analysis.
Topics Information Extraction, Information Retrieval, Tools, Systems, Applications, Other
BibTeX @InProceedings{GLEIM18.905,
  author = {Rüdiger Gleim and Alexander Mehler and Sung Y. Song},
  title = "{WikiDragon: A Java Framework For Diachronic Content And Network Analysis Of MediaWikis}",
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {May 7-12, 2018},
  address = {Miyazaki, Japan},
  editor = {Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {979-10-95546-00-9},
  language = {english}
  }