LREC 2014 Proceedings

INTRODUCTORY MESSAGES:

Nicoletta Calzolari - Introduction of the Conference Chair & Message from ELRA President
Khalid Choukri - Message from ELRA Secretary General and ELDA Managing Director
Eiríkur Rögnvaldsson - Message of the Chair of the Local Organizing Committee
Joseph Mariani - Message of the ELRA Honorary President

INVITED TALK:

Thórhallur Eythórsson - Icelandic Quirks: Testing Linguistic Theories and Language Technology

KEYNOTE SPEECHES:

Hassan Sawaf - Language Technology for Commerce, the eBay Way
Luc Steels - When will robots speak like you and me?

CLOSING SESSION:

Joseph Mariani, Patrick Paroubek, Gil Francopoulo, Olivier Hamon - Rediscovering 15 Years of Discoveries in Language Resources and Evaluation: The LREC Anthology Analysis

SESSIONS:

Day 1, Oral Sessions:

	Session O1 - NLP Workflows	Chairperson: Stephanie Strassel
11:35-11:55	Steve Cassidy, Dominique Estival, Timothy Jones, Denis Burnham and Jared Burghold	The Alveo Virtual Laboratory: A Web based Repository API
11:55-12:15	Xabier Artola, Zuhaitz Beloki and Aitor Soroa	A Stream Computing Approach Towards Scalable NLP
12:15-12:35	Hao Wu, Zhiye Fei, Aaron Dai, Mark Sammons, Dan Roth and Stephen Mayhew	ILLINOISCLOUDNLP: Text Analytics Services in the Cloud
12:35-12:55	Nancy Ide, James Pustejovsky, Christopher Cieri, Eric Nyberg, Di Wang, Keith Suderman, Marc Verhagen and Jonathan Wright	The Language Application Grid
12:55-13:15	George Christodoulides	Praaline: Integrating Tools for Speech Corpus Research

	Session O2 - Machine Translation and Evaluation (1)	Chairperson: Bente Maegaard
11:35-11:55	Anabela Barreiro, Johanna Monti, Brigitte Orliac, Susanne Preuß, Kutz Arrieta, Wang Ling, Fernando Batista and Isabel Trancoso	Linguistic Evaluation of Support Verb Constructions by OpenLogos and Google Translate
11:55-12:15	Sandipan Dandapat and Declan Groves	MTWatch: A Tool for the Analysis of Noisy Parallel Data
12:15-12:35	Thierry Etchegoyhen, Lindsay Bywood, Mark Fishel, Panayota Georgakopoulou, Jie Jiang, Gerard van Loenhout, Arantza del Pozo, Mirjam Sepesy Maucec, Anja Turner and Martin Volk	Machine Translation for Subtitling: A Large-Scale Evaluation
12:35-12:55	Joaquim Moré and Salvador Climent	Machine Translationness: Machine-likeness in Machine Translation Evaluation
12:55-13:15	Joke Daems, Lieve Macken and Sonia Vandepitte	On the Origin of Errors: a Fine-Grained Analysis of MT and PE Errors and their Relationship

	Session O3 - Grammar and Parsing (1)	Chairperson: Emily M. Bender
11:35-11:55	Fei Cheng, Kevin Duh and Yuji Matsumoto	Parsing Chinese Synthetic Words with a Character-based Dependency Model
11:55-12:15	Tomáš Jelínek	Improvements to Dependency Parsing Using Automatic Simplification of Data
12:15-12:35	Joost Bastings and Khalil Sima'an	All Fragments Count in Parser Evaluation
12:35-12:55	Maria Simi, Cristina Bosco and Simonetta Montemagni	Less is More? Towards a Reduced Inventory of Categories for Training a Parser for the Italian Stanford Dependencies
12:55-13:15	Anton Karl Ingason, Hrafn Loftsson, Eiríkur Rögnvaldsson, Einar Freyr Sigurðsson and Joel C. Wallenberg	Rapid Deployment of Phrase Structure Parsing for Related Languages: A Case Study of Insular Scandinavian

	Session O4 - Information Extraction and Knowledge Discovery	Chairperson: Ricardo Baeza-Yates
11:35-11:55	Clarissa Xavier and Vera Lima	Boosting Open Information Extraction with Noun-Based Relations
11:55-12:15	Travis Goodwin and Sanda Harabagiu	Clinical Data-Driven Probabilistic Graph Processing
12:15-12:35	Gongye Jin, Daisuke Kawahara and Sadao Kurohashi	A Framework for Compiling High Quality Knowledge Resources From Raw Corpora
12:35-12:55	Clément de Groc, Xavier Tannier and Claude de Loupy	Thematic Cohesion: Measuring Terms Discriminatory Power Toward Themes
12:55-13:15	Simon Scerri, Behrang Q. Zadeh, Maciej Dabrowski and Ismael Rivera	Extracting Information for Context-aware Meeting Preparation

	Session O5 - Linked Data (Special Session)	Chairperson: Asuncion Gomez-Perez
14:45-15:05	Marta Villegas, Maite Melero and Núria Bel	Metadata as Linked Open Data: mapping disparate XML metadata registries into one RDF/OWL registry.
15:05-15:25	Maud Ehrmann, Francesco Cecconi, Daniele Vannella, John Philip McCrae, Philipp Cimiano and Roberto Navigli	Representing Multilingual Data as Linked Data: the Case of BabelNet 2.0
15:25-15:45	Jorge Gracia, Elena Montiel-Ponsoda, Daniel Vila-Suero and Guadalupe Aguado-de-Cea	Enabling Language Resources to Expose Translations as Linked Data on the Web
15:45-16:05	Thierry Declerck, Karlheinz Mörth and Eveline Wandl-Vogt	A SKOS-based Schema for TEI encoded Dictionaries at ICLTT

	Session O6 - Audiovisual	Chairperson: Gilles Adda
14:45-15:05	Anindya Roy, Camille Guinaudeau, Herve Bredin and Claude Barras	TVD: A Reproducible and Multiply Aligned TV Series Dataset
15:05-15:25	James Pustejovsky and Zachary Yocum	Image Annotation with ISO-Space: Distinguishing Content from Structure
15:25-15:45	Arantza del Pozo, Carlo Aliprandi, Aitor Álvarez, Carlos Mendes, Joao P. Neto, Sérgio Paulo, Nicola Piccinini and Matteo Raffaelli	SAVAS: Collecting, Annotating and Sharing Audiovisual Language Resources for Automatic Subtitling
15:45-16:05	Pablo Ruiz, Aitor Álvarez and Haritz Arzelus	Phoneme Similarity Matrices to Improve Long Audio Alignment for Automatic Subtitling
16:05-16:25	Luis Javier Rodriguez-Fuentes, Mikel Penagarikano, Amparo Varona, Mireia Diez and German Bordel	KALAKA-3: a Database for the Recognition of Spoken European Languages on YouTube Audios

	Session O7 - Processing of Social Media	Chairperson: Paul Rayson
14:45-15:05	Dilek Kucuk, Guillaume Jacquet and Ralf Steinberger	Named Entity Recognition on Turkish Tweets
15:05-15:25	Nathan Schneider, Spencer Onuffer, Nora Kazour, Emily Danchik, Michael T. Mordowanec, Henrietta Conrad and Noah A. Smith	Comprehensive Annotation of Multiword Expressions in a Social Web Corpus
15:25-15:45	Clare Llewellyn, Claire Grover, Jon Oberlander and Ewan Klein	Re-using an Argument Corpus to Aid in the Curation of Social Media Collections
15:45-16:05	Marc Tomlinson, David Bracewell, Wayne Krug and David Hinote	#mygoal: Finding Motivations on Twitter
16:05-16:25	Andrew Yates, Jon Parker, Nazli Goharian and Ophir Frieder	A Framework for Public Health Surveillance

	Session O8 - Acquisition	Chairperson: Xavier Tannier
14:45-15:05	Ahmet Aker, Monica Paramita, Emma Barker and Robert Gaizauskas	Bootstrapping Term Extractors for Multiple Languages
15:05-15:25	Els Lefever, Marjan Van de Kauter and Veronique Hoste	Evaluation of Automatic Hypernym Extraction from Technical Corpora in English and Dutch
15:25-15:45	Lori Levin, Teruko Mitamura, Brian MacWhinney, Davida Fromm, Jaime Carbonell, Weston Feely, Robert Frederking, Anatole Gershman and Carlos Ramirez	Resources for the Detection of Conventionalized Metaphors in Four Languages
15:45-16:05	Yves Scherrer and Benoît Sagot	A Language-independent and fully Unsupervised Approach to Lexicon Induction and Part-of-Speech Tagging for Closely Related Languages
16:05-16:25	Stefan Bott and Sabine Schulte im Walde	Optimizing a Distributional Semantic Model for the Prediction of German Particle Verb Compositionality

	Session O9 - Sentiment Analysis and Social Media (1)	Chairperson: Stelios Piperidis
16:45-17:05	Hassan Saif, Miriam Fernandez, Yulan He and Harith Alani	On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of Twitter
17:05-17:25	Julien Velcin, Young-Min Kim, Caroline Brun, Jean-Yves Dormagen, Eric SanJuan, Leila Khouas, Anne Peradotto, Stéphane Bonnevay, Claude Roux, Julien Boyadjian, Alejandro Molina and Marie Neihouser	Investigating the Image of Entities in Social Media: Dataset Design and First Results
17:25-17:45	Ahmed Abbasi, Ammar Hassan and Milan Dhar	Benchmarking Twitter Sentiment Analysis Tools
17:45-18:05	Bart Desmet and Véronique Hoste	Recognising Suicidal Messages in Dutch Social Media

	Session O10 - Conversational (1)	Chairperson: Nick Campbell
16:45-17:05	Brigitte Bigi, Roxane Bertrand and Mathilde Guardiola	Automatic Detection of Other-Repetition Occurrences: Application to French Conversational Speech
17:05-17:25	Judith Muzerelle, Anaïs Lefeuvre, Emmanuel Schang, Jean-Yves Antoine, Aurore Pelletier, Denis Maurel, Iris Eshkol and Jeanne Villaneau	ANCOR_Centre, a Large Free Spoken French Coreference Corpus: Description of the Resource and Reliability Measures
17:25-17:45	Yi-Fen Liu, Shu-Chuan Tseng and J.-S. Roger Jang	Phone Boundary Annotation in Conversational Speech
17:45-18:05	Alexis Nasr, Frederic Bechet, Benoit Favre, Thierry Bazillon, Jose Deulofeu and Andre Valli	Automatically Enriching Spoken Corpora with Syntactic Information for Linguistic Studies

	Session O11 - Collaborative Resources (1)	Chairperson: Iryna Gurevych
16:45-17:05	Marta Sabou, Kalina Bontcheva, Leon Derczynski and Arno Scharl	Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines
17:05-17:25	Mikaël Morardo and Eric De La Clergerie	Towards an Environment for the Production and the Validation of Lexical Semantic Resources
17:25-17:45	Dan Flickinger, Emily M. Bender and Stephan Oepen	Towards an Encyclopedia of Compositional Semantics: Documenting the Interface of the English Resource Grammar
17:45-18:05	Octavian Popescu, Martha Palmer and Patrick Hanks	Mapping CPA Patterns onto OntoNotes Senses

	Session O12 - Semantics (1)	Chairperson: Eva Hajičová
16:45-17:05	Elisabetta Jezek, Bernardo Magnini, Anna Feltracco, Alessia Bianchini and Octavian Popescu	T-PAS; A resource of Typed Predicate Argument Structures for linguistic analysis and semantic processing
17:05-17:25	Mark Finlayson, Jeffry Halverson and Steven Corman	The N2 Corpus: a Semantically Annotated Collection of Islamist Extremist Stories
17:25-17:45	Maddalen Lopez de Lacalle, Egoitz Laparra and German Rigau	Predicate Matrix: extending SemLink through WordNet mappings
17:45-18:05	Archna Bhatia, Mandy Simons, Lori Levin, Yulia Tsvetkov, Chris Dyer and Jordan Bender	A Unified Annotation Scheme for the Semantic/Pragmatic Components of Definiteness

	Session O13 - Sentiment Analysis (1)	Chairperson: Paul Buitelaar
18:10-18:30	Isa Maks, Ruben Izquierdo, Francesca Frontini, Rodrigo Agerri, Piek Vossen and Andoni Azpeitia	Generating Polarity Lexicons with WordNet Propagation in 5 Languages
18:30-18:50	Muhammad Abdul-Mageed and Mona Diab	SANA: A Large Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis
18:50-19:10	Heeyoung Lee, Mihai Surdeanu, Bill MacCartney and Dan Jurafsky	On the Importance of Text Analysis for Stock Price Prediction

	Session O14 - Paralinguistics	Chairperson: Sophie Rosset
18:10-18:30	Jamie Bost and Johanna Moore	An Analysis of Older Users' Interactions with Spoken Dialogue Systems
18:30-18:50	Milan Rusko, Sakhia Darjaa, Marian Trnka, Marian Ritomsky and Robert Sabo	Alert!... Calm Down, There is Nothing to Worry About. Warning and Soothing Speech Synthesis
18:50-19:10	Ana Isabel Mata, Helena Moniz, Telmo Móia, Anabela Gonçalves, Fátima Silva, Fernando Batista, Inês Duarte, Fátima Oliveira and Isabel Falé	Prosodic, Syntactic, Semantic Guidelines for Topic Structures Across Domains and Corpora

	Session O15 - Multiword Expressions	Chairperson: Aline Villavicencio
18:10-18:30	Corina Dima, Verena Henrich, Erhard Hinrichs and Christina Hoppermann	How to Tell a Schneemann from a Milchmann: An Annotation Scheme for Compound-Internal Relations
18:30-18:50	Béatrice Daille and Amir Hazem	Semi-Compositional Method for Synonym Extraction of Multi-Word Terms
18:50-19:10	Valia Kordoni and Iliana Simova	Multiword Expressions in Machine Translation

	Session O16 - Spelling Normalisation	Chairperson: Hrafn Loftsson
18:10-18:30	Kay Berkling, Johanna Fay, Masood Ghayoomi, Katrin Hein, Rémi Lavalley, Ludwig Linhuber and Sebastian Stüker	A Database of Freely Written Texts of German School Students for the Purpose of Automatic Spelling Error Classification
18:30-18:50	Orphee De Clercq, Sarah Schulz, Bart Desmet and Veronique Hoste	Towards Shared Datasets for Normalization Research
18:50-19:10	Martin Reynaert	Synergy of Nederlab and @PhilosTEI: Diachronic and Multilingual Text-induced Corpus Clean-up

Day 1, Poster Sessions:

	Session P1 - Corpora and Annotation	Chair : Marko Tadić
11:35-13:15	AiTi Aw, Sharifah Mahani Aljunied, Nattadaporn Lertcheva and Sasiwimon Kalunsima	TaLAPi ― A Thai Linguistically Annotated Corpus for Language Processing
11:35-13:15	Guiyao Ke, Pierre-Francois Marteau and Gildas Menier	Variations on Quantitative Comparability Measures and their Evaluations on Synthetic French-English Comparable Corpora
11:35-13:15	Paul Felt, Eric Ringger, Kevin Seppi and Kristian Heal	Using Transfer Learning to Assist Exploratory Corpus Annotation
11:35-13:15	Miguel B. Almeida, Mariana S. C. Almeida, André F. T. Martins, Helena Figueira, Pedro Mendes and Cláudia Pinto	Priberam Compressive Summarization Corpus: A New Multi-Document Summarization Corpus for European Portuguese
11:35-13:15	Patrick Schone, Heath Nielson and Mark Ward	Corpus and Evaluation of Handwriting Recognition of Historical Genealogical Records
11:35-13:15	Milena Hnátková, Michal Křen, Pavel Procházka and Hana Skoumalová	The SYN-series Corpora of Written Czech
11:35-13:15	Karel Kučera and Martin Stluka	Corpus of 19th-century Czech Texts: Problems and Solutions
11:35-13:15	Maik Stührenberg	Extending Standoff Annotation
11:35-13:15	Stefan Höfler and Kyoko Sugisaki	Constructing and Exploiting an Automatically Annotated Resource of Legislative Texts

	Session P2 - Crowdsourcing	Chair : Alain Couillault
11:35-13:15	Yuan Luo, Thomas Boucher, Tolga Oral, David Osofsky and Sara Weber	A Study on Expert Sourcing Enterprise Question Collection and Classification
11:35-13:15	A.R Balamurali	Can the Crowd be Controlled?: A Case Study on Crowd Sourcing and Automatic Validation of Completed Tasks based on User Modeling
11:35-13:15	Mitesh M. Khapra, Ananthakrishnan Ramanathan, Anoop Kunchukuttan, Karthik Visweswariah and Pushpak Bhattacharyya	When Transliteration Met Crowdsourcing : An Empirical Study of Transliteration via Crowdsourcing using Efficient, Non-redundant and Fair Quality Control
11:35-13:15	Manjira Sinha, Tirthankar Dasgupta and Anupam Basu	Design and Development of an Online Computational Framework to Facilitate Language Comprehension Research on Indian Languages
11:35-13:15	Martin Benjamin	Collaboration in the Production of a Massively Multilingual Lexicon
11:35-13:15	Marco Marelli, Stefano Menini, Marco Baroni, Luisa Bentivogli, Raffaella Bernardi and Roberto Zamparelli	A SICK Cure for the Evaluation of Compositional Distributional Semantic Models
11:35-13:15	Wajdi Zaghouani and Kais Dukes	Can Crowdsourcing be used for Effective Annotation of Arabic?
11:35-13:15	Héctor Martínez Alonso and Lauren Romeo	Crowdsourcing as a Preprocessing for Complex Semantic Annotation Tasks
11:35-13:15	Christoph Draxler	Online Experiments with the Percy Software Framework - Experiences and some Early Results
11:35-13:15	Ryan Cotterell and Chris Callison-Burch	A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic

	Session P3 - Dialogue	Chair : Dan Cristea
11:35-13:15	Stefan Ultes, Hüseyin Dikme and Wolfgang Minker	First Insight into Quality-Adaptive Dialogue
11:35-13:15	Volha Petukhova, Martin Gropp, Dietrich Klakow, Gregor Eigner, Mario Topf, Stefan Srb, Petr Motlicek, Blaise Potard, John Dines, Olivier Deroo, Ronny Egeler, Uwe Meinz, Steffen Liersch and Anna Schmidt	The DBOX Corpus Collection of Spoken Human-Human and Human-Machine Dialogues
11:35-13:15	Dietmar Rösner, Rafael Friesen, Stephan Günther and Rico Andrich	Modeling and Evaluating Dialog Success in the LAST MINUTE Corpus
11:35-13:15	Layla El Asri, Rémi Lemonnier, Romain Laroche, Olivier Pietquin and Hatim Khouzaimi	NASTIA: Negotiating Appointment Setting Interface
11:35-13:15	Layla El Asri, Romain Laroche and Olivier Pietquin	DINASTI: Dialogues with a Negotiating Appointment Setting Interface
11:35-13:15	Thomas Pellegrini, Vahid Hedayati and Angela Costa	El-WOZ: a Client-Server Wizard-of-Oz Interface

	Session P4 - Phonetic Databases and Prosody	Chair : Philippe Martin
11:35-13:15	Claire Brierley, Majdi Sawalha and Eric Atwell	Tools for Arabic Natural Language Processing: a Case Study in Qalqalah Prosody
11:35-13:15	Johann-Mattis List and Jelena Prokić	A Benchmark Database of Phonetic Alignments in Historical Linguistics and Dialectology
11:35-13:15	Anne Lacheret, Sylvain Kahane, Julie Beliao, Anne Dister, Kim Gerdes, Jean-Philippe Goldman, Nicolas Obin, Paola Pietrandrea and Atanas Tchobanov	Rhapsodie: a Prosodic-Syntactic Treebank for Spoken French
11:35-13:15	Jean-Philippe Goldman, Tea Prsir and Antoine Auchlin	C-PhonoGenre: a 7-hour Corpus of 7 Speaking Styles in French: Relations between Situational Features and Prosodic Properties
11:35-13:15	Abir Masmoudi, Mariem Ellouze Khmekhem, Yannick Esteve, Lamia Hadrich Belguith and Nizar Habash	A Corpus and Phonetic Dictionary for Tunisian Arabic Speech Recognition
11:35-13:15	Yuichi Ishimoto, Tomoyuki Tsuchiya, Hanae Koiso and Yasuharu Den	Towards Automatic Transformation Between Different Transcription Conventions: Prediction of Intonation Markers from Linguistic and Acoustic Features
11:35-13:15	Tiberiu Boroș, Adriana Stan, Oliver Watts and Stefan Daniel Dumitrescu	RSS-TOBI - a Prosodically Enhanced Romanian Speech Corpus
11:35-13:15	Klim Peshkov and Laurent Prévot	Segmentation Evaluation Metrics, a Comparison Grounded on Prosodic and Discourse Units
11:35-13:15	Bistra Andreeva, William Barry and Jacques Koreman	A Cross-language Corpus for Studying the Phonetics and Phonology of Prominence
11:35-13:15	Liviu Dinu, Alina Maria Ciobanu, Ioana Chitoran and Vlad Niculae	Using a Machine Learning Model to Assess the Complexity of Stress Systems
11:35-13:15	Tanja Schultz and Tim Schlippe	GlobalPhone: Pronunciation Dictionaries in 20 Languages

	Session P5 - Speech Resources	Chair : Martine Adda-Decker
11:35-13:15	Juan Rafael Orozco-Arroyave, Julián David Arias-Londoño, Jesús Francisco Vargas-Bonilla, María Claudia Gonzalez-Rátiva and Elmar Nöth	New Spanish Speech Corpus Database for the Analysis of People Suffering from Parkinson's Disease
11:35-13:15	François Salmon and Félicien Vallet	An Effortless Way To Create Large-Scale Datasets For Famous Speakers
11:35-13:15	Florian Schiel and Thomas Kisler	German Alcohol Language Corpus - the Question of Dialect
11:35-13:15	Jetske Klatter, Roeland Van Hout, Henk van den Heuvel, Paula Fikkert, Anne Baker, Jan De Jong, Frank Wijnen, Eric Sanders and Paul Trilsbeek	Vulnerability in Acquisition, Language Impairments in Dutch: Creating a VALID Data Archive
11:35-13:15	Mirjam Ernestus, Lucie Kočková-Amortová and Petr Pollak	The Nijmegen Corpus of Casual Czech
11:35-13:15	Carlos Daniel Hernandez Mena and Abel Herrera Camacho	CIEMPIESS: A New Open-Sourced Mexican Spanish Radio Corpus
11:35-13:15	Marie Kopřivová, Hana Goláňová, Petra Klimešová and David Lukeš	Mapping Diatopic and Diachronic Variation in Spoken Czech: the Ortofon and Dialekt Corpora
11:35-13:15	Thomas Schmidt	The Research and Teaching Corpus of Spoken German ― FOLK
11:35-13:15	Niklas Vanhainen and Giampiero Salvi	Free Acoustic and Language Models for Large Vocabulary Continuous Speech Recognition in Swedish

	Session P6 - Endangered Languages	Chair : Laurette Pretorius
14:45-16:25	Kristiina Jokinen	Open-domain Interaction and Online Content in the Sami Language
14:45-16:25	Tjerk Hagemeijer, Michel Généreux, Iris Hendrickx, Amália Mendes, Abigail Tiny and Armando Zamora	The Gulf of Guinea Creole Corpora
14:45-16:25	Dagmar Jung, Katarzyna Klessa, Zsuzsa Duray, Beatrix Oszkó, Mária Sipos, Sándor Szeverényi, Zsuzsa Várnai, Paul Trilsbeek and Tamás Váradi	Languagesindanger.eu - Including Multimedia Language Resources to disseminate Knowledge and Create Educational Material on less-Resourced Languages
14:45-16:25	José Pedro Ferreira, Cristiano Chesi, Daan Baldewijns, Fernando Miguel Pinto, Margarita Correia, Daniela Braga, Hyongsil Cho, Amadeu Ferreira and Miguel Dias	Casa De La Lhéngua: a Set of Language Resources and Natural Language Processing Tools for Mirandese
14:45-16:25	Christian Curtis	A Finite-State Morphological Analyzer for a Lakota HPSG Grammar

	Session P7 - Evaluation Methodologies	Chair : Violeta Seretan
14:45-16:25	Adam Kilgarriff, Pavel Rychlý, Milos Jakubicek, Vojtěch Kovář, Vit Baisa and Lucia Kocincová	Extrinsic Corpus Evaluation with a Collocation Dictionary Task
14:45-16:25	Nancy Underwood, Bartolomé Mesa-Lao, Mercedes García Martínez, Michael Carl, Vicent Alabau, Jesús González-Rubio, Luis A. Leiva, Germán Sanchis-Trilles, Daniel Ortíz-Martínez and Francisco Casacuberta	Evaluating the Effects of Interactivity in a Post-Editing Workbench
14:45-16:25	Bogdan Ludusan, Maarten Versteegh, Aren Jansen, Guillaume Gravier, Xuan-Nga Cao, Mark Johnson and Emmanuel Dupoux	Bridging the Gap between Speech Technology and Natural Language Processing: An Evaluation Toolbox for Term Discovery Systems
14:45-16:25	Paula Lopez-Otero, Laura Docio-Fernandez and Carmen Garcia-Mateo	Introducing a Framework for the Evaluation of Music Detection Tools
14:45-16:25	Bartosz Broda, Bartłomiej Nitoń, Włodzimierz Gruszczyński and Maciej Ogrodniczuk	Measuring Readability of Polish Texts: Baseline Experiments
14:45-16:25	Jason Utt, Sylvia Springorum, Maximilian Köper and Sabine Schulte im Walde	Fuzzy V-Measure - An Evaluation Method for Cluster Analyses of Ambiguous Data
14:45-16:25	Andrea Horbach, Alexis Palmer and Magdalena Wolska	Finding a Tradeoff between Accuracy and Rater's Workload in Grading Clustered Short Answers
14:45-16:25	Petra Barancikova, Rudolf Rosa and Ales Tamchyna	Improving Evaluation of English-Czech MT through Paraphrasing
14:45-16:25	Chi-kiu Lo and Dekai Wu	On the Reliability and Inter-Annotator Agreement of Human Semantic MT Evaluation via HMEANT

	Session P8 - Language Resource Infrastructures	Chair : Georg Rehm
14:45-16:25	Nelleke Oostdijk and Henk van den Heuvel	The Evolving Infrastructure for Language Resources and the Role for Data Scientists
14:45-16:25	Dorte Haltrup Hansen, Lene Offersgaard and Sussi Olsen	Using TEI, CMDI and ISOcat in CLARIN-DK
14:45-16:25	Jonathan Chevelu, Gwénolé Lecorvé and Damien Lolive	ROOTS: a Toolkit for Easy, Fast and Consistent Processing of Large Sequential Annotated Data Collections
14:45-16:25	Matteo Abrate, Angelo Mario Del Grosso, Emiliano Giovannetti, Angelica Lo Duca, Damiana Luzzi, Lorenzo Mancini, Andrea Marchetti, Irene Pedretti and Silvia Piccini	Sharing Cultural Heritage: the Clavius on the Web Project
14:45-16:25	Verena Lyding, Lionel Nicolas and Egon Stemle	'interHist' - an Interactive Visual Interface for Corpus Exploration

	Session P9 - Machine Translation	Chair : Jordi Atserias
14:45-16:25	Chenhui Chu, Toshiaki Nakazawa and Sadao Kurohashi	Constructing a Chinese―Japanese Parallel Corpus from Wikipedia
14:45-16:25	Lise Rebout and Philippe Langlais	An Iterative Approach for Mining Parallel Sentences in a Comparable Corpus
14:45-16:25	Dan Tufiș	Large SMT Data-sets Extracted from Wikipedia
14:45-16:25	Juan Luo and Yves Lepage	Production of Phrase Tables in 11 European Languages using an Improved Sub-sentential Aligner
14:45-16:25	Hiroaki Shimizu, Graham Neubig, Sakriani Sakti, Tomoki Toda and Satoshi Nakamura	Collection of a Simultaneous Translation Corpus for Comparative Analysis
14:45-16:25	Sharid Loaiciga, Thomas Meyer and Andrei Popescu-Belis	English-French Verb Phrase Alignment in Europarl for Tense Translation Modeling
14:45-16:25	Bushra Jawaid and Ondrej Bojar	Two-Step Machine Translation with Lattices

	Session P10 - Metadata	Chair : Victoria Arranz
14:45-16:25	Matej Durco and Menzo Windhouwer	The CMD Cloud
14:45-16:25	Fritz Kliche, Andre Blessing, Ulrich Heid and Jonathan Sonntag	The eIdentity Text Exploration Workbench
14:45-16:25	Damir Cavar and Malgorzata Cavar	Visualization of Language Relations and Families: MultiTree

	Session P11 - MultiWord Expressions and Terms	Chair : Valeria Quochi
14:45-16:25	Pierre André Ménard and Caroline Barriere	Linked Open Data and Web Corpus Data for noun compound bracketing
14:45-16:25	Anita Rácz, István Nagy T. and Veronika Vincze	4FX: Light Verb Constructions in a Multilingual Parallel Corpus
14:45-16:25	Wan Yu Ho, Christine Kng, Shan Wang and Francis Bond	Identifying Idioms in Chinese Translations
14:45-16:25	Kara Warburton	Narrowing the Gap Between Termbases and Corpora in Commercial Environments
14:45-16:25	Rodrigo Boos, Kassius Prestes and Aline Villavicencio	Identification of Multiword Expressions in the brWaC
14:45-16:25	Lis Pereira, Elga Strafella and Yuji Matsumoto	Collocation or Free Combination? ― Applying Machine Translation Techniques to identify collocations in Japanese
14:45-16:25	Irina Temnikova, Andrea Varga and Dogan Biyikli	Building a Crisis Management Term Resource for Social Media: The Case of Floods and Protests

	Session P12 - Treebanks	Chair : Beatrice Daille
14:45-16:25	Riyaz Ahmad Bhat, Shahid Mushtaq Bhat and Dipti Misra Sharma	Towards Building a Kashmiri Treebank: Setting up the Annotation Pipeline
14:45-16:25	Shinsuke Mori, Hideki Ogura and Tetsuro Sasada	A Japanese Word Dependency Corpus
14:45-16:25	Chris Culy, Marco Passarotti and Ulla König-Cardanobile	A Compact Interactive Visualization of Dependency Treebank Query Results
14:45-16:25	Scott Martens and Marco Passarotti	Thomas Aquinas in the TüNDRA: Integrating the Index Thomisticus Treebank into CLARIN-D
14:45-16:25	Blanca Arias, Nuria Bel, Mercè Lorente, Montserrat Marimón, Alba Milà, Jorge Vivaldi, Muntsa Padró, Marina Fomicheva and Imanol Larrea	Boosting the Creation of a Treebank
14:45-16:25	Montserrat Marimon, Núria Bel, Beatriz Fisas, Blanca Arias, Silvia Vázquez, Jorge Vivaldi, Carlos Morell and Mercè Lorente	The IULA Spanish LSP Treebank
14:45-16:25	Per Erik Solberg, Arne Skjærholt, Lilja Øvrelid, Kristin Hagen and Janne Bondi Johannessen	The Norwegian Dependency Treebank
14:45-16:25	Mojgan Seraji, Carina Jahani, Beáta Megyesi and Joakim Nivre	A Persian Treebank with Stanford Typed Dependencies
14:45-16:25	Masood Ghayoomi and Jonas Kuhn	Converting an HPSG-based Treebank into its Parallel Dependency-based Treebank

	Session P13 - Discourse Annotation, Representation and Processing	Chair : Ann Bies
16:45-18:05	Kasia Budzynska, Mathilde Janier, Chris Reed, Patrick Saint-Dizier, Manfred Stede and Olena Yaskorska	A Model for Processing Illocutionary Structures and Argumentation in Debates
16:45-18:05	Manfred Stede and Arne Neumann	Potsdam Commentary Corpus 2.0: Annotation for Discourse Research
16:45-18:05	Magdalena Rysova	Verbs of Saying with a Textual Connecting Function in the Prague Discourse Treebank
16:45-18:05	Ryu Iida and Takenobu Tokunaga	Building a Corpus of Manually Revised Texts from Discourse Perspective
16:45-18:05	Lanjun Zhou, Binyang Li, Zhongyu Wei and Kam-Fai Wong	The CUHK Discourse TreeBank for Chinese: Annotating Explicit Discourse Connectives for the Chinese TreeBank
16:45-18:05	Thomas Bögel, Jannik Strötgen and Michael Gertz	Computational Narratology: Extracting Tense Clusters from Narrative Texts
16:45-18:05	Susana Bautista and Horacio Saggion	Can Numerical Expressions Be Simpler? Implementation and Demonstration of a Numerical Simplification System for Spanish
16:45-18:05	Cristina Grisot and Thomas Meyer	Cross-Linguistic Annotation of Narrativity for English/French Verb Tense Disambiguation

	Session P14 - Grammar and Syntax	Chair : Cristina Bosco
16:45-18:05	Richard Sproat, Bruno Cartoni, HyunJeong Choe, David Huynh, Linne Ha, Ravindran Rajakumar and Evelyn Wenzel-Grondie	A Database for Measuring Linguistic Information Content
16:45-18:05	Katerina Rysova and Jiří Mírovský	Valency and Word Order in Czech ― A Corpus Probe
16:45-18:05	Ludger Zeevaert	Mörkum Njálu. An Annotated Corpus to Analyse and Explain Grammatical Divergences Between 14th-century Manuscripts of Njál's Saga.
16:45-18:05	Roman Schneider	GenitivDB ― a Corpus-Generated Database for German Genitive Classification
16:45-18:05	Tibor Kiss, Francis Jeffry Pelletier and Tobias Stadtfeld	Building a Reference Lexicon for Countability in English

	Session P15 - Lexicons	Chair : Amália Mendes
16:45-18:05	Ismail El Maarouf, Jane Bradbury, Vít Baisa and Patrick Hanks	Disambiguating Verbs by Collocation: Corpus Lexicography meets Natural Language Processing
16:45-18:05	Nabil Hathout, Franck Sajous and Basilio Calderone	GLÀFF, a Large Versatile French Lexicon
16:45-18:05	John Richardson, Toshiaki Nakazawa and Sadao Kurohashi	Bilingual Dictionary Construction with Transliteration Filtering
16:45-18:05	Krasimir Angelov	Bootstrapping Open-Source English-Bulgarian Computational Dictionary
16:45-18:05	Mathieu Mangeot	MotàMot Project: Conversion of a French-Khmer Published Dictionary for Building a Multilingual Lexical System
16:45-18:05	Menzo Windhouwer, Justin Petro and Shakila Shayan	RELISH LMF: Unlocking the Full Power of the Lexical Markup Framework
16:45-18:05	Liviu Dinu and Alina Maria Ciobanu	Building a Dataset of Multilingual Cognates for the Romanian Lexicon
16:45-18:05	Palmira Marrafa, Raquel Amaro and Sara Mendes	LexTec - a Rich Language Resource for Technical Domains in Portuguese

	Session P16 - Morphology	Chair : Benoît Sagot
16:45-18:05	Fadoua Ataa Allah and Siham Boulaknadel	Amazigh Verb Conjugator
16:45-18:05	Menno van Zaanen, Gerhard Van Huyssteen, Suzanne Aussems, Chris Emmery and Roald Eiselen	The Development of Dutch and Afrikaans Language Resources for Compound Boundary Analysis.
16:45-18:05	Rico Sennrich and Beat Kunz	Zmorge: A German Morphological Lexicon Extracted from Wiktionary
16:45-18:05	Attila Novák	A New Form of Humor ― Mapping Constraint-Based Computational Morphologies to a Finite-State Representation
16:45-18:05	Veronika Vincze, Viktor Varga, Katalin Ilona Simkó, János Zsibrita, Ágoston Nagy, Richárd Farkas and János Csirik	Szeged Corpus 2.5: Morphological Modifications in a Manually POS-tagged Hungarian Corpus
16:45-18:05	Çağrı Çöltekin	A Set of Open Source Tools for Turkish Natural Language Processing
16:45-18:05	Magda Sevcikova and Zdenek Zabokrtsky	Word-Formation Network for Czech
16:45-18:05	Arfath Pasha, Mohamed Al-Badrashiny, Mona Diab, Ahmed El Kholy, Ramy Eskander, Nizar Habash, Manoj Pooleery, Owen Rambow and Ryan Roth	MADAMIRA: A Fast, Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic
16:45-18:05	Yvonne Adesam, Malin Ahlberg, Peter Andersson, Gerlof Bouma, Markus Forsberg and Mans Hulden	Computer-aided Morphology Expansion for Old Swedish
16:45-18:05	Marcin Woliński	Morfeusz Reloaded

	Session P17 - WordNet	Chair : Francis Bond
16:45-18:05	Antoni Oliver and Salvador Climent	Automatic Creation of WordNets from Parallel Corpora
16:45-18:05	Spandana Gella, Carlo Strapparava and Vivi Nastase	Mapping WordNet Domains, WordNet Topics and Wikipedia Categories to Generate Multilingual Domain Specific Resources
16:45-18:05	Quentin Pradet, Laurence Danlos and Gaël de Chalendar	Adapting VerbNet to French using Existing Resources
16:45-18:05	Gianluca Lebani, Veronica Viola and Alessandro Lenci	Bootstrapping an Italian VerbNet: data-driven analysis of verb alternations
16:45-18:05	Ahti Lohk, Kaarel Allik, Heili Orav and Leo Võhandu	Dense Components in the Structure of WordNet
16:45-18:05	Yuri Bizzoni, Federico Boschetti, Harry Diakoff, Riccardo Del Gratta, Monica Monachini and Gregory Crane	The Making of Ancient Greek WordNet
16:45-18:05	Gerard de Melo	Etymological WordNet: Tracing the History of Words

	Session P18 - Corpora and Annotation	Chair : Steve Cassidy
18:10-19:30	Angela Costa, Tiago Luís and Luísa Coheur	Translation Errors from English to Portuguese: an Annotated Corpus
18:10-19:30	Verginica Barbu Mititelu, Elena Irimia and Dan Tufiș	CoRoLa ― The Reference Corpus of Contemporary Romanian Language
18:10-19:30	Houda Bouamor, Nizar Habash and Kemal Oflazer	A Multidialectal Parallel Corpus of Arabic
18:10-19:30	Ahmed Salama, Houda Bouamor, Behrang Mohit and Kemal Oflazer	YouDACC: the Youtube Dialectal Arabic Comment Corpus
18:10-19:30	Miquel Esplà-Gomis, Filip Klubička, Nikola Ljubešić, Sergio Ortiz-Rojas, Vassilis Papavassiliou and Prokopis Prokopidis	Comparing Two Acquisition Systems for Automatically Building an English-Croatian Parallel Corpus from Multilingual Websites
18:10-19:30	Siim Orasmaa	Towards an Integration of Syntactic and Temporal Annotations in Estonian
18:10-19:30	Louise Deleger, Anne-Laure Ligozat, Cyril Grouin, Pierre Zweigenbaum and Aurelie Neveol	Annotation of Specialized Corpora using a Comprehensive Entity and Relation Scheme
18:10-19:30	Ritesh Kumar	Developing Politeness Annotated Corpus of Hindi Blogs
18:10-19:30	Adriane Boyd, Jirka Hana, Lionel Nicolas, Detmar Meurers, Katrin Wisniewski, Andrea Abel, Karin Schöne, Barbora Štindlová and Chiara Vettori	The MERLIN corpus: Learner Language and the CEFR
18:10-19:30	Luz Rello, Ricardo Baeza-Yates and Joaquim Llisterri	DysList: An Annotated Resource of Dyslexic Errors
18:10-19:30	Jena D. Hwang, Annie Zaenen and Martha Palmer	Criteria for Identifying and Annotating Caused Motion Constructions in Corpus Data
18:10-19:30	Ann Irvine, Joshua Langfus and Chris Callison-Burch	The American Local News Corpus

	Session P19 - Document Classification, Text Categorisation	Chair : Karën Fort
18:10-19:30	Mohamed Morchid, Richard Dufour and Georges Linares	A LDA-based Topic Classification Approach from highly Imperfect Automatic Transcriptions
18:10-19:30	Juan Soler and Leo Wanner	How to Use Less Features and Reach Better Performance in Author Gender Identification
18:10-19:30	Lucie Poláková, Pavlína Jínová and Jiří Mírovský	Genres in the Prague Discourse Treebank
18:10-19:30	Stefania Degaetano-Ortlieb, Peter Fankhauser, Hannah Kermes, Ekaterina Lapshinova-Koltunski, Noam Ordan and Elke Teich	Data Mining with Shallow vs. Linguistic Features to Study Diversification of Scientific Registers
18:10-19:30	Mahmoud El-Haj, Paul Rayson, Steve Young and Martin Walker	Detecting Document Structure in a Very Large Corpus of UK Financial Reports
18:10-19:30	Noushin Rezapour Asheghi, Serge Sharoff and Katja Markert	Designing and Evaluating a Reliable Corpus of Web Genres via Crowd-Sourcing
18:10-19:30	Ioannis Korkontzelos and Sophia Ananiadou	Locating Requests among Open Source Software Communication Messages
18:10-19:30	Thamar Solorio, Ragib Hasan and Mainul Mizan	Sockpuppet Detection in Wikipedia: A Corpus of Real-World Deceptive Writing for Linking Identities

	Session P20 - FrameNet	Chair : Alessandro Lenci
18:10-19:30	Ildikó Pilán and Elena Volodina	Reusing Swedish FrameNet for Training Semantic Roles
18:10-19:30	Marie-Claude L'Homme, Benoît Robichaud and Carlos Subirats Rüggeberg	Discovering Frames in Specialized Domains
18:10-19:30	Marie Candito, Pascal Amsili, Lucie Barque, Farah Benamara, Gaël de Chalendar, Marianne Djemaa, Pauline Haas, Richard Huyghe, Yvette Yannick Mathieu, Philippe Muller, Benoît Sagot and Laure Vieu	Developing a French FrameNet: Methodology and First results

	Session P21 - Semantics	Chair : Peter Anick
18:10-19:30	Reinhard Rapp	Corpus-Based Computation of Reverse Associations
18:10-19:30	Haritz Salaberri, Olatz Arregi and Beñat Zapirain	First Approach toward Semantic Role Labeling for Basque
18:10-19:30	Tomoko Izumi, Tomohide Shibata, Hisako Asano, Yoshihiro Matsuo and Sadao Kurohashi	Constructing a Corpus of Japanese Predicate Phrases for Synonym/Antonym Relations
18:10-19:30	Martin Riedl, Richard Steuer and Chris Biemann	Distributed Distributional Similarities of Google Books over the Centuries
18:10-19:30	Kostadin Cholakov, Chris Biemann, Judith Eckle-Kohler and Iryna Gurevych	Lexical Substitution Dataset for German
18:10-19:30	Nianwen Xue and Yuchen Zhang	Buy One Get One Free: Distant Annotation of Chinese Tense, Event Type and Modality
18:10-19:30	Dan Stefanescu, Rajendra Banjade and Vasile Rus	Latent Semantic Analysis Models on Wikipedia and TASA
18:10-19:30	Yuka Tateisi, Yo Shidahara, Yusuke Miyao and Akiko Aizawa	Annotation of Computer Science Papers for Semantic Relation Extraction
18:10-19:30	Moritz Wittmann, Marion Weller and Sabine Schulte im Walde	Automatic Extraction of Synonyms for German Particle Verbs from Parallel Data with Distributional Similarity as a Re-Ranking Feature
18:10-19:30	Gregor Titze, Volha Bryl, Cäcilia Zirn and Simone Paolo Ponzetto	DBpedia Domains: Augmenting DBpedia with Domain Information
18:10-19:30	Elena Cabrio, Serena Villata and Fabien Gandon	Classifying Inconsistencies in DBpedia Language Specific Chapters

	Session P22 - Speech Resources	Chair : Giuseppe Riccardi
18:10-19:30	Thomas Schmidt	The Database for Spoken German ― DGD2
18:10-19:30	Annika Hämäläinen, Jairo Avelar, Silvia Rodrigues, Miguel Sales Dias, Artur Kolesiński, Tibor Fegyó, Géza Németh, Petra Csobánka, Karine Lan and David Hewson	The EASR Corpora of European Portuguese, French, Hungarian and Polish Elderly Speech
18:10-19:30	Barbara Schuppler, Martin Hagmueller, Juan A. Morales-Cordovilla and Hannes Pessentheiner	GRASS: the Graz corpus of Read And Spontaneous Speech
18:10-19:30	Hanae Koiso, Yasuharu Den, Ken'ya Nishikawa and Kikuo Maekawa	Design and Development of an RDB Version of the Corpus of Spontaneous Japanese
18:10-19:30	Camille Fauth, Anne Bonneau, Frank Zimmerer, Juergen Trouvain, Bistra Andreeva, Vincent Colotte, Dominique Fohr, Denis Jouvet, Jeanin Jügler, Yves Laprie, Odile Mella and Bernd Möbius	Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process
18:10-19:30	Rosemary Orr, Marijn Huijbregts, Roeland van Beek, Lisa Teunissen, Kate Backhouse and David van Leeuwen	Semi-Automatic Annotation of the UCU Accents Speech Corpus
18:10-19:30	Ana Lúcia Santos, Michel Généreux, Aida Cardoso, Celina Agostinho and Silvana Abalada	A Corpus of European Portuguese Child and Child-directed Speech
18:10-19:30	Anna Polychroniou, Hugues Salamin and Alessandro Vinciarelli	The SSPNet-Mobile Corpus: Social Signal Processing Over Mobile Phones
18:10-19:30	Katarzyna Klessa and Dafydd Gibbon	Annotation Pro + TGA: Automation of Speech Timing Analysis
18:10-19:30	Björn Schuller, Felix Friedmann and Florian Eyben	The Munich Biovoice Corpus: Effects of Physical Exercising, Heart Rate, and Skin Conductance on Human Speech Production

Day 2, Oral Sessions:

	Session O17 - Infrastructures for LRs	Chairperson: Nancy Ide
9:45-10:05	Victoria Arranz, Khalid Choukri, Valérie Mapelli and Hélène Mazo	ELRA's Consolidated Services for the HLT Community
10:05-10:25	Georg Rehm, Hans Uszkoreit, Sophia Ananiadou, Núria Bel, Audroné Bielevičiené, Lars Borin, António Branco, Gerhard Budin, Nicoletta Calzolari, Walter Daelemans, Radovan Garabík, Marko Grobelnik, Carmen Garcia-Mateo, Josef van Genabith, Jan Hajic, Inma Hernaez, John Judge, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Linden, Bernardo Magnini, Joseph Mariani, John McNaught, Maite Melero, Monica Monachini, Asuncion Moreno, Jan Odijk, Maciej Ogrodniczuk, Piotr Pezik, Stelios Piperidis, Adam Przepiórkowski, Eiríkur Rögnvaldsson, Michael Rosner, Bolette Pedersen, Inguna Skadina, Koenraad De Smedt, Marko Tadić, Paul Thompson, Dan Tufiș, Tamás Váradi, Andrejs Vasiļjevs, Kadri Vider and Jolanta Zabarskaite	The Strategic Impact of META-NET on the Regional, National and International Level
10:25-10:45	Erhard Hinrichs and Steven Krauwer	The CLARIN Research Infrastructure: Resources and Tools for eHumanities Scholars
10:45-11:05	Stelios Piperidis, Harris Papageorgiou, Christian Spurk, Georg Rehm, Khalid Choukri, Olivier Hamon, Nicoletta Calzolari, Riccardo Del Gratta, Bernardo Magnini and Christian Girardi	META-SHARE: One Year After
11:05-11:25	Christopher Cieri, Denise DiPersio, Mark Liberman, Andrea Mazzucchi, Stephanie Strassel and Jonathan Wright	New Directions for Language Resource Development and Distribution

	Session O18 - Speech Resources Annotation	Chairperson: Satoshi Nakamura
9:45-10:05	Mārcis Pinnis, Ilze Auziņa and Kārlis Goba	Designing the Latvian Speech Recognition Corpus
10:05-10:25	Eunah Cho, Sarah Fünfer, Sebastian Stüker and Alex Waibel	A Corpus of Spontaneous Speech in Lectures: The KIT Lecture Corpus for Spoken Language Processing and Translation
10:25-10:45	Sian Alsop and Hilary Nesi	The Pragmatic Annotation of a Corpus of Academic Lectures
10:45-11:05	Sara Candeias, Dirce Celorico, Jorge Proença, Arlindo Veiga, Carla Lopes and Fernando Perdigão	HESITA(te) in Portuguese
11:05-11:25	Ana Aguiar, Mariana Kaiseler, Hugo Meinedo, Pedro Almeida, Mariana Cunha and Jorge Silva	VOCE Corpus: Ecologically Collected Speech Annotated with Physiological and Psychological Stress Assessments

	Session O19 - Summarisation	Chairperson: Horacio Saggion
9:45-10:05	Evelina Rennes and Arne Jonsson	The Impact of Cohesion Errors in Extraction Based Summaries
10:05-10:25	Matthew Shardlow	Out in the Open: Finding and Categorising Errors in the Lexical Simplification Pipeline
10:25-10:45	Annemarie Friedrich, Marina Valeeva and Alexis Palmer	LQVSumm: A Corpus of Linguistic Quality Violations in Multi-Document Summarization
10:45-11:05	Natalia Loukachevitch and Aleksey Alekseev	Summarizing News Clusters on the Basis of Thematic Chains
11:05-11:25	Kai Hong, John Conroy, Benoit Favre, Alex Kulesza, Hui Lin and Ani Nenkova	A Repository of State of the Art and Competitive Baseline Summaries for Generic News Summarization

	Session O20 - Grammar, Lexicon and Morphology	Chairperson: Lori Levin
9:45-10:05	Victoria Rosén, Petter Haugereid, Martha Thunes, Gyri S. Losnegaard and Helge Dyvik	The Interplay Between Lexical and Syntactic Resources in Incremental Parsebanking
10:05-10:25	László Laki and György Orosz	An Efficient Language Independent Toolkit for Complete Morphological Disambiguation
10:25-10:45	Shinsuke Mori and Graham Neubig	Language Resource Addition: Dictionary or Corpus?
10:45-11:05	Kristín Bjarnadóttir and Jón Daðason	Utilizing Constituent Structure for Compound Analysis
11:05-11:25	Kalliopi Zervanou, Elias Iosif and Alexandros Potamianos	Word Semantic Similarity for Morphologically Rich Languages

	Session O21 - Collaborative Resources (2)	Chairperson: Thierry Declerck
11:45-12:05	Georgios Petasis	Annotating Arguments: The NOMAD Collaborative Annotation Tool
12:05-12:25	Judit Ács	Pivot-based Multilingual Dictionary Building using Wiktionary
12:25-12:45	Tatiana Gornostay and Andrejs Vasiļjevs	Terminology Resources and Terminology Work Benefit from Cloud Services
12:45-13:05	Rachele Sprugnoli and Alessandro Lenci	Crowdsourcing for the Identification of Event Nominals: an Experiment

	Session O22 - Conversational (2)	Chairperson: Dafydd Gibbon
11:45-12:05	Deryle Lonsdale and Carl Christensen	Combining Elicited Imitation and Fluency Features for Oral Proficiency Measurement
12:05-12:25	David Escudero, Lourdes Aguilar-Cuevas, César González-Ferreras, Yurena Gutiérrez-González and Valentín Cardeñoso-Payo	On the Use of a Fuzzy Classifier to Speed Up the Sp ToBI Labeling of the Glissando Spanish Corpus
12:25-12:45	David Graff, Kevin Walker, Stephanie Strassel, Xiaoyi Ma, Karen Jones and Ann Sawyer	The RATS Collection: Supporting HLT Research with Degraded Audio Data
12:45-13:05	Heather Pon-Barry, Stuart Shieber and Nicholas Longenbaugh	Eliciting and Annotating Uncertainty in Spoken Language

	Session O23 - Text Mining	Chairperson: Lucia Specia
11:45-12:05	Claudiu Mihăilă and Sophia Ananiadou	The Meta-knowledge of Causality in Biomedical Scientific Discourse
12:05-12:25	Guiyao Ke and Pierre-Francois Marteau	Co-clustering of Bilingual Datasets as a Mean for Assisting the Construction of Thematic Bilingual Comparable Corpora
12:25-12:45	Piek Vossen, German Rigau, Luciano Serafini, Pim Stouten, Francis Irving and Willem Van Hage	NewsReader: Recording History from Daily News Streams
12:45-13:05	Peter Anick, Marc Verhagen and James Pustejovsky	Identification of Technology Terms in Patents

	Session O24 - Document Classification	Chairperson: Robert Frederking
11:45-12:05	Dasha Bogdanova and Angeliki Lazaridou	Cross-Language Authorship Attribution
12:05-12:25	Robert Remus and Dominique Ziegelmayer	Learning from Domain Complexity
12:25-12:45	Reinhard Rapp	Using Word Familiarities and Word Associations to Measure Corpus Representativeness
12:45-13:05	Marco Del Tredici and Malvina Nissim	A Modular System for Rule-based Text Categorisation

	Session O25 - Machine Translation and Evaluation (2)	Chairperson: Toru Ishida
14:55-15:15	Eleftherios Avramidis, Aljoscha Burchardt, Sabine Hunsicker, Maja Popović, Cindy Tscherwinka, David Vilar and Hans Uszkoreit	The taraXÜ Corpus of Human-Annotated Machine Translations
15:15-15:35	Irina Galinskaya, Valentin Gusev, Elena Mescheryakova and Mariya Shmatova	Measuring the Impact of Spelling Errors on the Quality of Machine Translation
15:35-15:55	Varvara Logacheva and Lucia Specia	A Quality-based Active Sample Selection Strategy for Statistical Machine Translation
15:55-16:15	Homa B. Hashemi and Rebecca Hwa	A Comparison of MT Errors and ESL Errors
16:15-16:35	Elisabet Comelles, Jordi Atserias, Victoria Arranz, Irene Castellon and Jordi Sesé	VERTa: Facing a Multilingual Experience of a Linguistically-based MT Evaluation

	Session O26 - Computer Aided Language Learning	Chairperson: Justus Roux
14:55-15:15	Catia Cucchiarini, Steve Bodnar, Bart Penning de Vries, Roeland Van Hout and Helmer Strik	ASR-based CALL Systems and Learner Speech Data: New Resources and Opportunities for Research and Development in Second Language Learning
15:15-15:35	Eric Sanders, Ineke van de Craats and Vanja de Lint	The Dutch LESLLA Corpus
15:35-15:55	Deryle Lonsdale and Benjamin Millard	Student Achievement and French Sentence Repetition Test Scores
15:55-16:15	Claudia Baur, Manny Rayner and Nikos Tsourakis	Using a Serious Game to Collect a Child Learner Speech Corpus
16:15-16:35	Renlong Ai, Marcela Charfuelan, Walter Kasper, Tina Klüwer, Hans Uszkoreit, Feiyu Xu, Sandra Gasber and Philip Gienandt	Sprinter: Language Technologies for Interactive and Multimedia Language Learning

	Session O27 - Information Extraction (1)	Chairperson: Sophia Ananiadou
14:55-15:15	Leonardo Sameshima Taba and Helena Caseli	Automatic Semantic Relation Extraction from Portuguese Texts
15:15-15:35	Raymond Shen and Hideaki Kikuchi	Estimation of Speaking Style in Speech Corpora Focusing on Speech Transcriptions
15:35-15:55	Prescott Klassen, Fei Xia, Lucy Vanderwende and Meliha Yetisgen	Annotating Clinical Events in Text Snippets for Phenotype Detection
15:55-16:15	Jennifer D'Souza and Vincent Ng	Annotating Inter-Sentence Temporal Relations in Clinical Notes
16:15-16:35	Mohamed Morchid, Georges Linares and Richard Dufour	Characterizing and Predicting Bursty Events: the Buzz Case Study on Twitter

	Session O28 - Lexicon	Chairperson: Núria Bel
14:55-15:15	Lucas Hilgert, Lucelene Lopes, Artur Freitas, Renata Vieira, Denise Hogetop and Aline Vanin	Building Domain Specific Bilingual Dictionaries
15:15-15:35	Benoît Sagot	DeLex, a Freely-available, Large-scale and Linguistically Grounded Morphological Lexicon for German
15:35-15:55	Adam Przepiórkowski, Elżbieta Hajnicz, Agnieszka Patejuk, Marcin Woliński, Filip Skwarski and Marek Świdziński	Walenty: Towards a Comprehensive Valence Dictionary of Polish
15:55-16:15	Marion Baranes and Benoît Sagot	A Language-independent Approach to Extracting Derivational Relations from an Inflectional Lexicon
16:15-16:35	Ting Liu, Kit Cho, G. Aaron Broadwell, Samira Shaikh, Tomek Strzalkowski, John Lien, Sarah Taylor, Laurie Feldman, Boris Yamrom, Nick Webb, Umit Boz, Ignacio Cases and Ching-Sheng Lin	Automatic Expansion of the MRC Psycholinguistic Database Imageability Ratings

	Session O29 - Sentiment Analysis (2)	Chairperson: Frédérique Segond
16:55-17:15	Ben Verhoeven and Walter Daelemans	CLIPS Stylometry Investigation (CSI) Corpus: a Dutch Corpus for the Detection of Age, Gender, Personality, Sentiment and Deception in Text
17:15-17:35	Motaz Saad, David Langlois and Kamel Smaili	Building and Modelling Multilingual Subjective Corpora
17:35-17:55	Subhabrata Mukherjee and Sachindra Joshi	Author-Specific Sentiment Aggregation for Polarity Prediction of Reviews
17:55-18:15	Mark Cieliebak, Oliver Dürr and Fatih Uzdilli	Meta-Classifiers Easily Improve Commercial Sentiment Detection Tools

	Session O30 - Multimodality	Chairperson: Michael Kipp
16:55-17:15	Dominique Estival, Steve Cassidy, Felicity Cox and Denis Burnham	AusTalk: an Audio-Visual Corpus of Australian English
17:15-17:35	Roberto Bartolini, Valeria Quochi, Irene De Felice, Irene Russo and Monica Monachini	From Synsets to Videos: Enriching ItalWordNet Multimodally
17:35-17:55	Veronica Perez-Rosas, Rada Mihalcea, Alexis Narvaez and Mihai Burzo	A Multimodal Dataset for Deception Detection
17:55-18:15	Jonathan Gratch, Ron Artstein, Gale Lucas, Giota Stratou, Stefan Scherer, Angela Nazarian, Rachel Wood, Jill Boberg, David DeVault, Stacy Marsella, David Traum, Albert "Skip" Rizzo and Louis-Philippe Morency	The Distress Analysis Interview Corpus of Human and Computer Interviews

	Session O31 - Under-resourced Languages	Chairperson: Andrejs Vasiļjevs
16:55-17:15	Yan Song and Fei Xia	Modern Chinese Helps Archaic Chinese Processing: Finding and Exploiting the Shared Properties
17:15-17:35	Lars Borin, Anju Saxena, Taraka Rama and Bernard Comrie	Linguistic Landscaping of South Asia using Digital Language Resources: Genetic vs. Areal Linguistics
17:35-17:55	David Kamholz, Jonathan Pool and Susan Colowick	PanLex: Building a Resource for Panlingual Lexical Translation
17:55-18:15	Fei Xia, William Lewis, Michael Wayne Goodman, Joshua Crowgey and Emily M. Bender	Enriching ODIN

	Session O32 - Parallel Corpora	Chairperson: Patrizia Paggio
16:55-17:15	Thomas Mayer and Michael Cysouw	Creating a Massively Parallel Bible Corpus
17:15-17:35	Najeh Hajlaoui, David Kolovratnik, Jaakko Väyrynen, Ralf Steinberger and Daniel Varga	DCEP - Digital Corpus of the European Parliament
17:35-17:55	Martin Volk, Johannes Graën and Elena Callegaro	Innovations in Parallel Corpus Search Tools
17:55-18:15	Valérie Hanoka and Benoît Sagot	An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora

	Session O33 - Linked Data and Semantic Web	Chairperson: Guadalupe Aguado-de-Cea
18:20-18:40	Hans-Ulrich Krieger and Thierry Declerck	TMO ― The Federated Ontology of the TrendMiner Project
18:40-19:00	Chahinez Benkoussas, Hussam Hamdan, Patrice Bellot, Frédéric Béchet and Elodie Faath	A Collection of Scholarly Book Reviews from the Platforms of Electronic Sources in Humanities and Social Sciences OpenEdition.org
19:00-19:20	Manuel Fiorelli, Maria Teresa Pazienza and Armando Stellato	A Meta-data Driven Platform for Semi-automatic Configuration of Ontology Mediators

	Session O34 - Dialogue (1)	Chairperson: Francesco Cutugno
18:20-18:40	Bayu Rahayudi, Ronald Poppe and Dirk Heylen	Twente Debate Corpus ― A Multimodal Corpus for Head Movement Analysis
18:40-19:00	Maike Paetzel, David Nicolas Racca and David DeVault	A Multimodal Corpus of Rapid Dialogue Games
19:00-19:20	Maria Koutsombogera, Samer Al Moubayed, Bajibabu Bollepalli, Ahmed Hussen Abdelaziz, Martin Johansson, José David Aguas Lopes, Jekaterina Novikova, Catharine Oertel, Kalin Stefanov and Gül Varol	The Tutorbot Corpus ― A Corpus for Studying Tutoring Behaviour in Multiparty Face-to-Face Spoken Dialogue

	Session O35 - Word Sense Annotation and Disambiguation	Chairperson: Patrik Lambert
18:20-18:40	Hugo Gonçalo Oliveira, Inês Coelho and Paulo Gomes	Exploiting Portuguese Lexical Knowledge Bases for Answering Open Domain Cloze Questions Automatically
18:40-19:00	Daisuke Kawahara and Martha Palmer	Single Classifier Approach for Verb Sense Disambiguation based on Generalized Features
19:00-19:20	Andrea Moro, Roberto Navigli, Francesco Maria Tucci and Rebecca J. Passonneau	Annotating the MASC Corpus with BabelNet

	Session O36 - Legal and Ethical Issues	Chairperson: Christopher Cieri
18:20-18:40	Pawel Kamocki	The Liability of Service Providers in e-Research Infrastructures: Killing the Messenger?
18:40-19:00	Alain Couillault, Karën Fort, Gilles Adda and Hugues de Mazancourt	Evaluating Corpora Documentation with regards to the Ethics and Big Data Charter
19:00-19:20	Erik Faessler, Johannes Hellrich and Udo Hahn	Disclose Models, Hide the Data - How to Make Use of Confidential Corpora without Seeing Sensitive Raw Data

Day 2, Poster Sessions:

	Session P23 - Collaborative Resource Construction	Chair : Christian Chiarcos
9:45-11:25	Włodzimierz Gruszczyński and Maciej Ogrodniczuk	Digital Library 2.0: Source of Knowledge and Research Collaboration Platform
9:45-11:25	Livio Robaldo, Guido Boella, Luigi Di Caro and Andrea Violato	Exploiting networks in Law
9:45-11:25	Alex Rudnick, Taylor Skidmore, Alberto Samaniego and Michael Gasser	Guampa: a Toolkit for Collaborative Translation
9:45-11:25	Billy T.M. Wong, Ian C. Chow, Jonathan J. Webster and Hengbin Yan	The Halliday Centre Tagger: An Online Platform for Semi-automatic Text Annotation and Analysis
9:45-11:25	Mauro Dragoni, Alessio Bosca, Matteo Casu and Andi Rexha	Modeling, Managing, Exposing, and Linking Ontologies with a Wiki-based Tool
9:45-11:25	Mathieu Lafourcade and Karën Fort	Propa-L: a Semantic Filtering Service from a Lexical Network Created using Games With A Purpose
9:45-11:25	Frederik Baumgardt, Giuseppe Celano, Gregory R. Crane, Stella Dee, Maryam Foradi, Emily Franzini, Greta Franzini, Monica Lent, Maria Moritz and Simona Stoyanova	Open Philology at the University of Leipzig
9:45-11:25	Joshua Elliot, Logan Kearsley, Jason Housley and Alan Melby	LexTerm Manager: Design for an Integrated Lexicography and Terminology System
9:45-11:25	Jonathan Wright	RESTful Annotation and Efficient Collaboration

	Session P24 - Corpora and Annotation	Chair : Maria Gavrilidou
9:45-11:25	Zhiyi Song, Stephanie Strassel, Haejoong Lee, Kevin Walker, Jonathan Wright, Jennifer Garland, Dana Fore, Brian Gainor, Preston Cabe, Thomas Thomas, Brendan Callahan and Ann Sawyer	Collecting Natural SMS and Chat Conversations in Multiple Languages: The BOLT Phase 2 Corpus
9:45-11:25	Daniel Hladek, Jan Stas and Jozef Juhar	The Slovak Categorized News Corpus
9:45-11:25	Matus Pleva and Jozef Juhar	TUKE-BNews-SK: Slovak Broadcast News Corpus Construction and Evaluation
9:45-11:25	Irina Temnikova, William A. Baumgartner Jr., Negacy D. Hailu, Ivelina Nikolova, Tony McEnery, Adam Kilgarriff, Galia Angelova and K. Bretonnel Cohen	Sublanguage Corpus Analysis Toolkit: a Tool for Assessing the Representativeness and Sublanguage Characteristics of Corpora
9:45-11:25	Csaba Oravecz, Tamás Váradi and Bálint Sass	The Hungarian Gigaword Corpus
9:45-11:25	Željko Agić and Nikola Ljubešić	The SETimes.HR Linguistically Annotated Corpus of Croatian
9:45-11:25	Nikola Ljubešić and Antonio Toral	caWaC - a Web Corpus of Catalan and its Application to Language Modeling and Machine Translation
9:45-11:25	Jerid Francom, Mans Hulden and Adam Ussishkin	ACTIV-ES: a Comparable, Cross-Dialect Corpus of "everyday" Spanish from Argentina, Mexico, and Spain
9:45-11:25	Vidas Daudaravicius	Language Editing Dataset of Academic Texts
9:45-11:25	Suguru Matsuyoshi, Ryo Otsuki and Fumiyo Fukumoto	Annotating the Focus of Negation in Japanese Text
9:45-11:25	Siddharth Jain, Archna Bhatia, Angelique Rein and Eduard Hovy	A Corpus of Participant Roles in Contentious Discussions

	Session P25 - Machine Translation	Chair : Holger Schwenk
9:45-11:25	Michael Carl, Mercedes Martínez García and Bartolomé Mesa-Lao	CFT13: a Resource for Research into the Post-editing Process
9:45-11:25	Nianwen Xue, Ondrej Bojar, Jan Hajic, Martha Palmer, Zdenka Uresova and Xiuhong Zhang	Not an Interlingua, But Close: Comparison of English AMRs to Chinese and Czech
9:45-11:25	Miriam Kaeshammer and Anika Westburg	On Complex Word Alignment Configurations
9:45-11:25	Anoop Kunchukuttan, Abhijit Mishra, Rajen Chatterjee, Ritesh Shah and Pushpak Bhattacharyya	Shata-Anuvadak: Tackling Multiway Translation of Indian Languages
9:45-11:25	Marco Turchi and Matteo Negri	Automatic Annotation of Machine Translation Datasets with Binary Quality Judgements
9:45-11:25	Violeta Seretan, Pierrette Bouillon and Johanna Gerlach	A Large-Scale Evaluation of Pre-editing Strategies for Improving User-Generated Content Translation
9:45-11:25	Nicolas Pécheux, Alexander Allauzen and François Yvon	Rule-based Reordering Space in Statistical Machine Translation
9:45-11:25	Kunal Sachdeva, Rishabh Srivastava, Sambhav Jain and Dipti Sharma	Hindi to English Machine Translation: Using Effective Selection in Multi-Model SMT

	Session P26 - Parallel Corpora	Chair : Dan Tufiș
9:45-11:25	Jayendra Rakesh Yeka, Prasanth Kolachina and Dipti Misra Sharma	Benchmarking of English-Hindi Parallel Corpora
9:45-11:25	Mircea Petic and Daniela Gîfu	Transliteration and Alignment of Parallel Texts from Cyrillic to Latin
9:45-11:25	Manuela Sanguinetti, Cristina Bosco and Loredana Cupi	Exploiting Catenae in a Parallel Treebank Alignment
9:45-11:25	Yves Scherrer, Luka Nerima, Lorenza Russo, Maria Ivanova and Eric Wehrli	SwissAdmin: a Multilingual Tagged Parallel Corpus of Press Releases
9:45-11:25	Liang Tian, Derek F. Wong, Lidia S. Chao, Paulo Quaresma, Francisco Oliveira and Lu Yi	UM-Corpus: A Large English-Chinese Parallel Corpus for Statistical Machine Translation
9:45-11:25	Raphael Rubino, Antonio Toral, Nikola Ljubešić and Gema Ramírez-Sánchez	Quality Estimation for Synthetic Parallel Data Generation
9:45-11:25	Raivis Skadiņš, Jörg Tiedemann, Roberts Rozis and Daiga Deksne	Billions of Parallel Words for Free: Building and Using the EU Bookshop Corpus
9:45-11:25	Ahmed Abdelali, Francisco Guzman, Hassan Sajjad and Stephan Vogel	The AMARA Corpus: Building Parallel Language Resources for the Educational Domain
9:45-11:25	Ann Bies, Justin Mott, Seth Kulick, Jennifer Garland and Colin Warner	Incorporating Alternate Translations into English Translation Treebank
9:45-11:25	Shikun Zhang, Wang Ling and Chris Dyer	Dual Subtitles as Parallel Corpora
9:45-11:25	Pavel Vondřička	Aligning Parallel Texts with InterText

	Session P27 - Sign Language	Chair : Thomas Hanke
9:45-11:25	Rosalee Wolfe, John McDonald, Larwan Berke and Marie Stumbo	Expanding N-gram Analytics in ELAN and a Case Study for Sign Synthesis
9:45-11:25	Matti Karppa, Ville Viitaniemi, Marcos Luzardo, Jorma Laaksonen and Tommi Jantunen	SLMotion - an Extensible Sign Language Oriented Video Analysis Tool
9:45-11:25	Ville Viitaniemi, Tommi Jantunen, Leena Savolainen, Matti Karppa and Jorma Laaksonen	S-pot - a Benchmark in Spotting Signs Within Continuous Signing
9:45-11:25	Mayumi Bono, Kouhei Kikuchi, Paul Cibulka and Yutaka Osugi	A Colloquial Corpus of Japanese Sign Language: Linguistic Resources for Observing Sign Language Conversations
9:45-11:25	Leah Geer and Jonathan Keane	Exploring Factors that Contribute to Successful Fingerspelling Comprehension
9:45-11:25	Jens Forster, Christoph Schmidt, Oscar Koller, Martin Bellgardt and Hermann Ney	Extensions of the Sign Language Recognition and Translation Corpus RWTH-PHOENIX-Weather
9:45-11:25	Julie Hochgesang	The Use of a FileMaker Pro Database in Evaluating Sign Language Notation Systems
9:45-11:25	Mark Dilsizian, Polina Yanovich, Shu Wang, Carol Neidle and Dimitris Metaxas	A New Framework for Sign Language Recognition Based on 3D Handshape Identification and Linguistic Modeling

	Session P28 - Information Extraction	Chair : Diana Maynard
11:45-13:25	Xavier Tannier	Extracting News Web Page Creation Time with DCTFinder
11:45-13:25	Hans-Ulrich Krieger, Christian Spurk, Hans Uszkoreit, Feiyu Xu, Yi Zhang, Frank Müller and Thomas Tolxdorff	Information Extraction from German Patient Records via Hybrid Parsing and Relation Extraction Strategies
11:45-13:25	Júlia Pajzs, Ralf Steinberger, Maud Ehrmann, Mohamed Ebrahim, Leonida Della Rocca, Stefano Bucci, Eszter Simon and Tamás Váradi	Media Monitoring and Information Extraction for the Highly Inflected Agglutinative Language Hungarian
11:45-13:25	Antje Schlaf, Claudia Bobach and Matthias Irmer	Creating a Gold Standard Corpus for the Extraction of Chemistry-Disease Relations from Patent Texts
11:45-13:25	Felice Dell'Orletta, Giulia Venturi, Andrea Cimino and Simonetta Montemagni	T2K^2: a System for Automatically Extracting and Organizing Knowledge from Texts
11:45-13:25	Johannes Kirschnick, Alan Akbik and Holmer Hemsen	Freepal: A Large Collection of Deep Lexico-Syntactic Patterns for Relation Extraction
11:45-13:25	Marc Poch, Núria Bel, Sergio Espeja and Felipe Navio	Ranking Job Offers for Candidates: Learning Hidden Knowledge from Big Data
11:45-13:25	Paul Buitelaar, Georgeta Bordea and Barry Coughlan	Hot Topics and Schisms in NLP: Community and Trend Analysis with Saffron on ACL and LREC Proceedings
11:45-13:25	Andre Blessing and Jonas Kuhn	Textual Emigration Analysis (TEA)

	Session P29 - Lexicons	Chair : Nianwen Xue
11:45-13:25	Tristan Miller and Iryna Gurevych	WordNet―Wikipedia―Wiktionary: Construction of a Three-way Alignment
11:45-13:25	Lei Zhang, Michael Färber and Achim Rettinger	xLiD-Lexica: Cross-lingual Linked Data Lexica
11:45-13:25	Begum Erten, Cem Bozsahin and Deniz Zeyrek	Turkish Resources for Visual Word Recognition
11:45-13:25	Martin Jansche	Computer-Aided Quality Assurance of an Icelandic Pronunciation Dictionary
11:45-13:25	Lars Borin, Jens Allwood and Gerard de Melo	Bring vs. MTRoget: Evaluating Automatic Thesaurus Translation
11:45-13:25	Mairidan Wushouer, Toru Ishida, Donghui Lin and Katsutoshi Hirayama	Bilingual Dictionary Induction as an Optimization Problem
11:45-13:25	Tommaso Caselli, Laure Vieu, Carlo Strapparava and Guido Vetere	Enriching the "Senso Comune" Platform with Automatically Acquired Data
11:45-13:25	Sameh Alansary	MUHIT: A Multilingual Harmonized Dictionary
11:45-13:25	Aurelie Neveol, Julien Grosjean, Stéfan Darmoni and Pierre Zweigenbaum	Language Resources for French in the Biomedical Domain
11:45-13:25	Pyry Takala, Pekka Malo, Ankur Sinha and Oskar Ahlgren	Gold-standard for Topic-specific Sentiment Analysis of Economic Texts

	Session P30 - Large Projects and Infrastructural Issues	Chair : Yohei Murakami
11:45-13:25	Peter Spyns and Remco van Veenendaal	A Decade of HLT Agency Activities in the Low Countries: from Resource Maintenance (BLARK) to Service Offerings (BLAISE)
11:45-13:25	Koenraad De Smedt, Erhard Hinrichs, Detmar Meurers, Inguna Skadina, Bolette Pedersen, Costanza Navarretta, Núria Bel, Krister Linden, Marketa Lopatkova, Jan Hajic, Gisle Andersen and Przemyslaw Lenkiewicz	CLARA: A New Generation of Researchers in Common Language Resources and Their Applications
11:45-13:25	Lina Henriksen, Dorte Haltrup Hansen, Bente Maegaard, Bolette Sandford Pedersen and Claus Povlsen	Encompassing a Spectrum of LT Users in the CLARIN-DK Infrastructure
11:45-13:25	Maarten Truyens and Patrick Van Eecke	Legal Aspects of Text Mining
11:45-13:25	Jan Odijk	CLARIN-NL: Major Results
11:45-13:25	Auður Hauksdóttir	An Innovative World Language Centre: Challenges for the Use of Language Technology
11:45-13:25	Joseph Mariani, Christopher Cieri, Gil Francopoulo, Patrick Paroubek and Marine Delaborde	Facing the Identification Problem in Language-Related Scientific Data Analysis
11:45-13:25	Frank Landsbergen, Carole Tiberius and Roderik Dernison	Taalportaal: an Online Grammar of Dutch and Frisian

	Session P31 - Opinion Mining and Reviews Analysis	Chair : Manfred Stede
11:45-13:25	Roman Klinger and Philipp Cimiano	The USAGE Review Corpus for Fine Grained Multi Lingual Opinion Analysis
11:45-13:25	Christian Haenig, Andreas Niekler and Carsten Wuensch	PACE Corpus: a Multilingual Corpus of Polarity-Annotated Textual Data from the Domains Automotive and CEllphone
11:45-13:25	Patrik Lambert and Carlos Rodriguez-Penagos	Adapting Freely Available Resources to Build an Opinion Mining Pipeline in Portuguese
11:45-13:25	Roser Saurí, Judith Domingo and Toni Badia	The NewSoMe Corpus: a Unifying Opinion Annotation Framework Across Genres and in Multiple Languages
11:45-13:25	André Bittar, Luca Dini, Sigrid Maurel and Mathieu Ruhlmann	The Dangerous Myth of the Star System
11:45-13:25	Wiltrud Kessler and Jonas Kuhn	A Corpus of Comparisons in Product Reviews

	Session P32 - Social Media Processing	Chair : Fei Xia
11:45-13:25	Clare Voss, Stephen Tratz, Jamal Laoudi and Douglas Briesch	Finding Romanized Arabic Dialect in Code-Mixed Tweets
11:45-13:25	Fabrizio Gotti, Philippe Langlais and Atefeh Farzindar	Hashtag Occurrences, Layout and Translation: A Corpus-driven Analysis of Tweets Published by the Canadian Government
11:45-13:25	Guoyu Tang, Yunqing Xia, Weizhi Wang, Raymond Lau and Fang Zheng	Clustering Tweets using Wikipedia Concepts
11:45-13:25	Eshrag Refaee and Verena Rieser	An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis
11:45-13:25	Iñaki Alegria, Nora Aranberri, Pere Comas, Victor Fresno, Pablo Gamallo, Lluís Padró, Iñaki San Vicente, Jordi Turmo and Arkaitz Zubiaga	TweetNorm_es: an Annotated Corpus for Spanish Microtext Normalization
11:45-13:25	Nikola Ljubešić, Darja Fišer and Tomaž Erjavec	TweetCaT: a Tool for Building Twitter Corpora of Smaller Languages
11:45-13:25	Tatjana Scheffler	A German Twitter Snapshot

	Session P33 - Treebanks	Chair : Montserrat Marimón
11:45-13:25	Elżbieta Hajnicz	The Procedure of Lexico-Semantic Annotation of Składnica Treebank
11:45-13:25	Marie Candito, Guy Perrier, Bruno Guillaume, Corentin Ribeyre, Karën Fort, Djamé Seddah and Eric de la Clergerie	Deep Syntax Annotation of the Sequoia French Treebank
11:45-13:25	Alina Wróblewska and Adam Przepiórkowski	Projection-based Annotation of a Polish Dependency Treebank
11:45-13:25	Željko Agić, Daša Berović, Danijela Merkler and Marko Tadić	Croatian Dependency Treebank 2.0: New Annotation Guidelines for Improved Parsing
11:45-13:25	Rachel Bawden, Marie-Amélie Botalla, Kim Gerdes and Sylvain Kahane	Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie
11:45-13:25	Kilian A. Foth, Arne Köhn, Niels Beuck and Wolfgang Menzel	Because Size Does Matter: The Hamburg Dependency Treebank
11:45-13:25	Rudolf Rosa, Jan Mašek, David Mareček, Martin Popel, Daniel Zeman and Zdeněk Žabokrtský	HamleDT 2.0: Thirty Dependency Treebanks Stanfordized
11:45-13:25	Munshi Asadullah, Patrick Paroubek and Anne Vilnat	Bidirectional Converter Between Syntactic Annotations: from French Treebank Dependencies to PASSAGE Annotations, and back
11:45-13:25	Mohamed Maamouri, Ann Bies, Seth Kulick, Michael Ciul, Nizar Habash and Ramy Eskander	Developing an Egyptian Arabic Treebank: Impact of Dialectal Morphology on Annotation and Tool Development

	Session P34 - Corpora and Annotation	Chair : Zygmunt Vetulani
14:55-16:35	Inès Zribi, Rahma Boujelbane, Abir Masmoudi, Mariem Ellouze, Lamia Belguith and Nizar Habash	A Conventional Orthography for Tunisian Arabic
14:55-16:35	Wajdi Zaghouani, Behrang Mohit, Nizar Habash, Ossama Obeid, Nadi Tomeh, Alla Rozovskaya, Noura Farra, Sarah Alkuhlani and Kemal Oflazer	Large Scale Arabic Error Annotation: Guidelines and Framework
14:55-16:35	Shinsuke Mori, Hirokuni Maeta, Yoko Yamakata and Tetsuro Sasada	Flow Graph Corpus from Recipe Texts
14:55-16:35	Marc Kupietz and Harald Lüngen	Recent Developments in DeReKo
14:55-16:35	Shu-Kai Hsieh	Why Chinese Web-as-Corpus is Wacky? Or: How Big Data is Killing Chinese Corpus Linguistics
14:55-16:35	Jannik Strötgen, Thomas Bögel, Julian Zell, Ayser Armiti, Tran Van Canh and Michael Gertz	Extending HeidelTime for Temporal Expressions Referring to Historic Dates
14:55-16:35	Thomas Eckart, Erla Hallsteinsdóttir, Sigrún Helgadóttir, Uwe Quasthoff and Dirk Goldhahn	A 500 Million Word POS-Tagged Icelandic Corpus
14:55-16:35	Shan Wang and Francis Bond	Building The Sense-Tagged Multilingual Parallel Corpus
14:55-16:35	Anik Dey and Pascale Fung	A Hindi-English Code-Switching Corpus
14:55-16:35	Andrea Abel, Aivars Glaznieks, Lionel Nicolas and Egon Stemle	KoKo: an L1 Learner Corpus for German
14:55-16:35	Vasile Rus, Rajendra Banjade and Mihai Lintean	On Paraphrase Identification Corpora
14:55-16:35	Anne Garcia-Fernandez, Anne-Laure Ligozat and Anne Vilnat	Construction and Annotation of a French Folktale Corpus
14:55-16:35	Shyam Sundar Agrawal, Mandal Abhimanue, Shweta Bansal and Minakshi Mahajan	Statistical Analysis of Multilingual Text Corpus and Development of Language Models
14:55-16:35	Vanessa Loza, Shibamouli Lahiri, Rada Mihalcea and Po-Hsiang Lai	Building a Dataset for Summarization and Keyword Extraction from Emails

	Session P35 - Grammar and Syntax	Chair : Tamás Váradi
14:55-16:35	Emily M. Bender	Language CoLLAGE: Grammatical Description with the LinGO Grammar Matrix
14:55-16:35	Anna Vernerová, Václava Kettnerová and Marketa Lopatkova	To Pay or to Get Paid: Enriching a Valency Lexicon with Diatheses
14:55-16:35	Georgios Petasis	The Ellogon Pattern Engine: Context-free Grammars over Annotations
14:55-16:35	Dana Dannells and Normunds Gruzitis	Extracting a Bilingual Semantic Grammar from FrameNet-Annotated Corpora
14:55-16:35	Kyoko Ohara	Relating Frames and Constructions in Japanese FrameNet
14:55-16:35	Lars Hellan, Dorothee Beermann, Tore Bruland, Mary Esther Kropp Dakubu and Montserrat Marimon	MultiVal - Towards a Multilingual Valence Lexicon
14:55-16:35	Emanuele Di Buccio, Giorgio Maria Di Nunzio and Gianmaria Silvello	A Vector Space Model for Syntactic Distances Between Dialects
14:55-16:35	Jana Sindlerova, Zdenka Uresova and Eva Fucikova	Resources in Conflict: A Bilingual Valency Lexicon vs. a Bilingual Treebank vs. a Linguistic Theory

	Session P36 - Metaphors	Chair : Walter Daelemans
14:55-16:35	Samira Shaikh, Tomek Strzalkowski, Ting Liu, George Aaron Broadwell, Boris Yamrom, Sarah Taylor, Laurie Feldman, Kit Cho, Umit Boz, Ignacio Cases, Yuliya Peshkova and Ching-Sheng Lin	A Multi-Cultural Repository of Automatically Discovered Linguistic and Conceptual Metaphors
14:55-16:35	Brian MacWhinney and Davida Fromm	Two Approaches to Metaphor Detection
14:55-16:35	Andrew Gargett and John Barnden	Mining Online Discussion Forums for Metaphors

	Session P37 - Named Entity Recognition	Chair : German Rigau
14:55-16:35	Kareem Darwish and Wei Gao	Simple Effective Microblog Named Entity Recognition: Arabic as an Example
14:55-16:35	Cyril Grouin	Biomedical Entity Extraction using Machine-Learning Based Approaches
14:55-16:35	Darina Benikova, Chris Biemann and Marc Reznicek	NoSta-D Named Entity Annotation for German: Guidelines and Dataset
14:55-16:35	Haibo Li, Masato Hagiwara, Qi Li and Heng Ji	Comparison of the Impact of Word Segmentation on Name Tagging for Chinese and Japanese
14:55-16:35	Dimitrios Kokkinakis, Jyrki Niemi, Sam Hardwick, Krister Lindén and Lars Borin	HFST-SweNER ― A New NER Resource for Swedish
14:55-16:35	Hege Fromreide, Dirk Hovy and Anders Søgaard	Crowdsourcing and Annotating NER for Twitter #drift
14:55-16:35	Guillaume Jacquet, Maud Ehrmann and Ralf Steinberger	Clustering of Multi-Word Named Entity Variants: Multilingual Evaluation
14:55-16:35	Daniela Amaral, Evandro Fonseca, Lucelene Lopes and Renata Vieira	Comparative Analysis of Portuguese Named Entities Recognition Tools
14:55-16:35	Cédric Lopez, Frédérique Segond, Olivier Hondermarck, Paolo Curtoni and Luca Dini	Generating a Resource for Products and Brandnames Recognition. Application to the Cosmetic Domain.
14:55-16:35	Younggyun Hahm, Jungyeul Park, Kyungtae Lim, Youngsik Kim, Dosam Hwang and Key-Sun Choi	Named Entity Corpus Construction using Wikipedia and DBpedia Ontology
14:55-16:35	Andrea Glaser and Jonas Kuhn	Exploring the Utility of Coreference Chains for Improved Identification of Personal Names
14:55-16:35	Joachim Bingel and Thomas Haider	Named Entity Tagging a Very Large Unbalanced Corpus: Training and Evaluating NE Classifiers

	Session P38 - Question Answering	Chair : António Branco
14:55-16:35	Peter Exner and Pierre Nugues	REFRACTIVE: An Open Source Tool to Extract Knowledge from Syntactic and Semantic Relations
14:55-16:35	Akira Fujita, Akihiro Kameda, Ai Kawazoe and Yusuke Miyao	Overview of Todai Robot Project and Evaluation Framework of its NLP-based Problem Solving
14:55-16:35	Kirk Roberts, Kate Masterton, Marcelo Fiszman, Halil Kilicoglu and Dina Demner-Fushman	Annotating Question Decomposition on Complex Medical Questions
14:55-16:35	Sérgio Curto, Ana C. Mendes, Pedro Curto, Luísa Coheur and Angela Costa	JUST.ASK, a QA System that Learns to Answer New Questions from Previous Interactions
14:55-16:35	Kugatsu Sadamitsu, Ryuichiro Higashinaka and Yoshihiro Matsuo	Extraction of Daily Changing Words for Question Answering
14:55-16:35	Artem Ostankov, Florian Röhrbein and Ulli Waltinger	LinkedHealthAnswers: Towards Linked Data-driven Question Answering for the Health Care Domain
14:55-16:35	Axel-Cyrille Ngonga Ngomo, Norman Heino, René Speck and Prodromos Malakasiotis	A Tool Suite for Creating Question Answering Benchmarks

	Session P39 - Speech Resources	Chair : Henk van den Heuvel
14:55-16:35	Luca Cristoforetti, Mirco Ravanelli, Maurizio Omologo, Alessandro Sosi, Alberto Abad, Martin Hagmueller and Petros Maragos	The DIRHA Simulated Corpus
14:55-16:35	Roberto Gretter	Euronews: a Multilingual Speech Corpus for ASR
14:55-16:35	Sakriani Sakti, Keigo Kubo, Sho Matsumiya, Graham Neubig, Tomoki Toda, Satoshi Nakamura, Fumihiro Adachi and Ryosuke Isotani	Towards Multilingual Conversations in the Medical Domain: Development of Multilingual Medical Data and A Network-based ASR System
14:55-16:35	Andrej Zgank, Ana Zwitter Vitez and Darinka Verdonik	The Slovene BNSI Broadcast News Database and Reference Speech Corpus GOS: Towards the Uniform Guidelines for Future Work
14:55-16:35	Jan Gorisch, Corine Astésano, Ellen Gurman Bard, Brigitte Bigi and Laurent Prévot	Aix Map Task Corpus: the French Multimodal Corpus of Task-oriented Dialogue
14:55-16:35	Carmen Garcia-Mateo, Antonio Cardenal, Xose Luis Regueira, Elisa Fernández Rei, Marta Martinez, Roberto Seara, Rocío Varela and Noemí Basanta	CORILGA: a Galician Multilevel Annotated Speech Corpus for Linguistic Analysis
14:55-16:35	Igor Odriozola, Inma Hernaez, María Inés Torres, Luis Javier Rodriguez-Fuentes, Mikel Penagarikano and Eva Navas	Basque Speecon-like and Basque SpeechDAT MDB-600: Speech Databases for the Development of ASR Technology for Basque
14:55-16:35	David Tavarez, Eva Navas, Daniel Erro, Ibon Saratxaga and Inma Hernaez	New Bilingual Speech Databases for Audio Diarization
14:55-16:35	Tobias Bocklet, Andreas Maier, Korbinian Riedhammer, Ulrich Eysholdt and Elmar Nöth	Erlangen-CLP: A Large Annotated Corpus of Speech from Children with Cleft Lip and Palate
14:55-16:35	Evgeny Stepanov, Giuseppe Riccardi and Ali Orkan Bayer	The Development of the Multilingual LUNA Corpus for Spoken Language System Porting

	Session P40 - Lexicons	Chair : Yoshihiko Hayashi
16:55-18:15	Bruno Guillaume, Karën Fort, Guy Perrier and Paul Bédaride	Mapping the Lexique des Verbes du français (Lexicon of French Verbs) to a NLP Lexicon using Examples
16:55-18:15	Satoshi Sato	Text Readability and Word Distribution in Japanese
16:55-18:15	Uwe Quasthoff, Dirk Goldhahn, Thomas Eckart, Erla Hallsteinsdóttir and Sabine Fiedler	High Quality Word Lists as a Resource for Multiple Purposes
16:55-18:15	Þórdís Úlfarsdóttir	ISLEX ― a Multilingual Web Dictionary
16:55-18:15	Eduard Bejček, Václava Kettnerová and Marketa Lopatkova	Automatic Mapping Lexical Resources: A Lexical Unit as the Keystone
16:55-18:15	Cédric Lopez, Reda Bestandji, Mathieu Roche and Rachel Panckhurst	Towards Electronic SMS Dictionary Construction: An Alignment-based Approach
16:55-18:15	Ahmet Aker, Monica Paramita, Marcis Pinnis and Robert Gaizauskas	Bilingual Dictionaries for All EU Languages
16:55-18:15	Tafseer Ahmed Khan	Automatic Acquisition of Urdu Nouns (along with Gender and Irregular Plurals)
16:55-18:15	Valeria de Paiva, Livy Real, Alexandre Rademaker and Gerard de Melo	NomLex-PT: A Lexicon of Portuguese Nominalizations

	Session P41 - Parsing	Chair : Simonetta Montemagni
16:55-18:15	Hen-Hsen Huang, Huan-Yuan Chen, Chang-Sheng Yu, Hsin-Hsi Chen, Po-Ching Lee and Chun-Hsun Chen	Sentence Rephrasing for Parsing Sentences with OOV Words
16:55-18:15	Cheikh M. Bamba Dione	Pruning the Search Space of the Wolof LFG Grammar Using a Probabilistic and a Constraint Grammar Parser
16:55-18:15	Elena Mitocariu, Daniel Anechitei and Dan Cristea	How Could Veins Speed Up the Process of Discourse Parsing
16:55-18:15	Achim Stein	Parsing Heterogeneous Corpora with a Rich Dependency Grammar
16:55-18:15	Angelina Ivanova and Gertjan van Noord	Treelet Probabilities for HPSG Parsing and Error Correction
16:55-18:15	Arda Celebi and Arzucan Özgür	Self-training a Constituency Parser using n-gram Trees
16:55-18:15	Natalia Silveira, Timothy Dozat, Marie-Catherine de Marneffe, Samuel Bowman, Miriam Connor, John Bauer and Chris Manning	A Gold Standard Dependency Corpus for English
16:55-18:15	Wolfgang Maier, Miriam Kaeshammer, Peter Baumann and Sandra Kübler	Discosuite - A Parser Test Suite for German Discontinuous Structures

	Session P42 - Part-of-Speech Tagging	Chair : Krister Linden
16:55-18:15	Timur Gilmanov, Olga Scrivner and Sandra Kübler	SWIFT Aligner, A Multifunctional Tool for Parallel Corpora: Visualization, Word Alignment, and (Morpho)-Syntactic Cross-Language Transfer
16:55-18:15	Saba Urooj, Sarmad Hussain, Asad Mustafa, Rahila Parveen, Farah Adeeba, Tafseer Ahmed Khan, Miriam Butt and Annette Hautli	The CLE Urdu POS Tagset
16:55-18:15	Kareem Darwish, Ahmed Abdelali and Hamdy Mubarak	Using Stem-Templates to Improve Arabic POS and Gender/Number Tagging
16:55-18:15	Gaël de Chalendar	The LIMA Multilingual Analyzer Made Free: FLOSS Resources Adaptation and Correction
16:55-18:15	Bushra Jawaid, Amir Kamran and Ondrej Bojar	A Tagged Corpus and a Tagger for Urdu
16:55-18:15	Sigrún Helgadóttir, Hrafn Loftsson and Eiríkur Rögnvaldsson	Correcting Errors in a New Gold Standard for Tagging Icelandic Text
16:55-18:15	Łukasz Kobyliński	PoliTa: a Multitagger for Polish

	Session P43 - Semantics	Chair : Marc Verhagen
16:55-18:15	Francesca Frontini, Valeria Quochi, Sebastian Padó, Monica Monachini and Jason Utt	Polysemy Index for Nouns: an Experiment on Italian using the PAROLE SIMPLE CLIPS Lexical Database
16:55-18:15	Muntsa Padró, Marco Idiart, Aline Villavicencio and Carlos Ramisch	Comparing Similarity Measures for Distributional Thesauri
16:55-18:15	Elisa Omodei, Jean-Philippe Cointet and Thierry Poibeau	Reconstructing the Semantic Landscape of Natural Language Processing
16:55-18:15	Olivier Ferret	Compounds and Distributional Thesauri
16:55-18:15	Kyle Richardson and Jonas Kuhn	UnixMan Corpus: A Resource for Language Learning in the Unix Domain
16:55-18:15	Tatiana Erekhinskaya, Meghana Satpute and Dan Moldovan	Multilingual eXtended WordNet Knowledge Base: Semantic Parsing and Translation of Glosses
16:55-18:15	Manel Zarrouk and Mathieu Lafourcade	Relation Inference in Lexical Networks ... with Refinements
16:55-18:15	Raquel Amaro	Extracting Semantic Relations from Portuguese Corpora using Lexical-Syntactic Patterns
16:55-18:15	David Jurgens	An Analysis of Ambiguity in Word Sense Annotations
16:55-18:15	Claire Bonial, Julia Bonn, Kathryn Conger, Jena D. Hwang and Martha Palmer	PropBank: Semantics of New Predicate Types
16:55-18:15	Michael Mohler, Marc Tomlinson, David Bracewell and Bryan Rink	Semi-Supervised Methods for Expanding Psycholinguistics Norms by Integrating Distributional Similarity with the Structure of WordNet
16:55-18:15	Gemma Bel Enguix, Reinhard Rapp and Michael Zock	A Graph-Based Approach for Computing Free Word Associations
16:55-18:15	Martin Gleize and Brigitte Grau	A Hierarchical Taxonomy for Classifying Hardness of Inference Tasks

	Session P44 - Speech Recognition and Synthesis	Chair : Denise DiPersio
16:55-18:15	Joris Pelemans, Kris Demuynck, Hugo Van hamme and Patrick Wambacq	Speech Recognition Web Services for Dutch
16:55-18:15	Maria Goryainova, Cyril Grouin, Sophie Rosset and Ioana Vasilescu	Morpho-Syntactic Study of Errors from Speech Recognition System
16:55-18:15	Daniel Luzzati, Cyril Grouin, Ioana Vasilescu, Martine Adda-Decker, Eric Bilinski, Nathalie Camelin, Juliette Kahn, Carole Lailler, Lori Lamel and Sophie Rosset	Human Annotation of ASR Error Regions: is "gravity" a Sharable Concept for Human Annotators?
16:55-18:15	Mohamed Elmahdy, Mark Hasegawa-Johnson and Eiman Mustafawi	Development of a TV Broadcasts Speech Recognition System for Qatari Arabic
16:55-18:15	Mohamed Elmahdy, Mark Hasegawa-Johnson and Eiman Mustafawi	Automatic Long Audio Alignment and Confidence Scoring for Conversational Arabic Speech
16:55-18:15	Giampiero Salvi and Niklas Vanhainen	The WaveSurfer Automatic Speech Recognition Plugin
16:55-18:15	Matti Varjokallio and Mikko Kurimo	A Toolkit for Efficient Learning of Lexical Units for Speech Recognition
16:55-18:15	Aimilios Chalamandaris, Pirros Tsiakoulis, Sotiris Karabetsos and Spyros Raptis	Using Audio Books for Training a Text-to-Speech System

	Session P45 - Anaphora and Coreference	Chair : Costanza Navarretta
18:20-19:20	Panot Chaimongkol, Akiko Aizawa and Yuka Tateisi	Corpus for Coreference Resolution on Scientific Papers
18:20-19:20	Liane Guillou, Christian Hardmeier, Aaron Smith, Jörg Tiedemann and Bonnie Webber	ParCor 1.0: A Parallel Pronoun-Coreference Corpus to Support Statistical MT
18:20-19:20	Nobal Niraula, Vasile Rus, Rajendra Banjade, Dan Stefanescu, William Baggett and Brent Morgan	The DARE Corpus: A Resource for Anaphora Resolution in Dialogue Based Intelligent Tutoring Systems
18:20-19:20	Christian Girardi, Manuela Speranza, Rachele Sprugnoli and Sara Tonelli	CROMER: a Tool for Cross-Document Event and Entity Coreference
18:20-19:20	Arturs Znotins and Peteris Paikens	Coreference Resolution for Latvian
18:20-19:20	Nadjet Bouayad-Agha, Alicia Burga, Gerard Casamayor, Joan Codina, Rogelio Nazar and Leo Wanner	An Exercise in Reuse of Resources: Adapting General Discourse Coreference Resolution for Detecting Lexical Chains in Patent Documentation
18:20-19:20	Anders Björkelund, Kerstin Eckart, Arndt Riester, Nadja Schauffler and Katrin Schweitzer	The Extended DIRNDL Corpus as a Resource for Coreference and Bridging Resolution
18:20-19:20	Marcos Garcia and Pablo Gamallo	Multilingual Corpora with Coreferential Annotation of Person Entities
18:20-19:20	Maciej Ogrodniczuk, Mateusz Kopeć and Agata Savary	Polish Coreference Corpus in Numbers

	Session P46 - Information Extraction and Information Retrieval	Chair : Dimitrios Kokkinakis
18:20-19:20	Véronique Moriceau and Xavier Tannier	French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime
18:20-19:20	Zdenka Uresova, Jan Hajic, Pavel Pecina and Ondrej Dusek	Multilingual Test Sets for Machine Translation of Search Queries for Cross-Lingual Information Retrieval in the Medical Domain
18:20-19:20	Huijing Deng and Grzegorz Chrupała	Semantic Approaches to Software Component Retrieval with English Queries
18:20-19:20	Hong Li, Sebastian Krause, Feiyu Xu, Hans Uszkoreit, Robert Hummel and Veselina Mironova	Annotating Relation Mentions in Tabloid Press
18:20-19:20	Shaoda He, Xiaojun Zou, Liumingjing Xiao and Junfeng Hu	Construction of Diachronic Ontologies from People's Daily of Fifty Years
18:20-19:20	Maria Evangelia Chatzimina, Cyril Grouin and Pierre Zweigenbaum	Use of Unsupervised Word Classes for Entity Recognition: Application to the Detection of Disorders in Clinical Reports
18:20-19:20	Alan Akbik and Thilo Michael	The Weltmodell: A Data-Driven Commonsense Knowledge Base
18:20-19:20	Marieke van Erp, Gleb Satyukov, Piek Vossen and Marit Nijsen	Discovering and Visualising Stories in News
18:20-19:20	Tomohide Shibata, Shotaro Kohama and Sadao Kurohashi	A Large Scale Database of Strongly-related Events in Japanese
18:20-19:20	Steven Bethard, Philip Ogren and Lee Becker	ClearTK 2.0: Design Patterns for Machine Learning in UIMA

	Session P47 - Language Identification	Chair : Michael Rosner
18:20-19:20	Dirk Goldhahn and Uwe Quasthoff	Vocabulary-Based Language Similarity using Web Corpora
18:20-19:20	Thomas Lavergne, Gilles Adda, Martine Adda-Decker and Lori Lamel	Automatic Language Identity Tagging on Word and Sentence-Level in Multilingual Text Sources: a Case-Study on Luxembourgish
18:20-19:20	Marcos Zampieri and Binyam Gebre	VarClass: An Open-source Language Identification Tool for Language Varieties
18:20-19:20	Xiao Jiang, Yufan Guo, Jeroen Geertzen, Dora Alexopoulou, Lin Sun and Anna Korhonen	Native Language Identification Using Large, Longitudinal Data
18:20-19:20	Liviu Dinu and Alina Maria Ciobanu	On the Romance Languages Mutual Intelligibility

	Session P48 - Morphology	Chair : Karel Pala
18:20-19:20	Senka Drobac, Krister Lindén, Tommi Pirinen and Miikka Silfverberg	Heuristic Hyper-minimization of Finite State Lexicons
18:20-19:20	Claudia Borg and Albert Gatt	Crowd-sourcing Evaluation of Automatically Acquired, Morphologically Related Word Groupings
18:20-19:20	Patrick Littell, Kaitlyn Price and Lori Levin	Morphological Parsing of Swahili using Crowdsourced Lexical Resources
18:20-19:20	Carla Parra Escartín	Chasing the Perfect Splitter: A Comparison of Different Compound Splitting Tools
18:20-19:20	Vincent Claveau and Ewa Kijak	Generating and using Probabilistic Morphological Resources for the Biomedical Domain
18:20-19:20	Peter Baumann and Janet Pierrehumbert	Using Resource-Rich Languages to Improve Morphological Analysis of Under-Resourced Languages
18:20-19:20	Ozlem Cetinoglu	Turkish Treebank as a Gold Standard for Morphological Disambiguation and Its Influence on Parsing
18:20-19:20	Krešimir Šojat, Matea Srebačić, Marko Tadić and Tin Pavelić	CroDeriV: a New Resource for Processing Croatian Morphology
18:20-19:20	Jan Šnajder	DerivBase.hr: A High-Coverage Derivational Morphology Resource for Croatian
18:20-19:20	Jonathan Washington, Ilnar Salimzyanov and Francis Tyers	Finite-State Morphological Transducers for Three Kypchak Languages

	Session P49 - Multimodality	Chair : Volker Steinbiss
18:20-19:20	Brigitte Bigi, Tatsuya Watanabe and Laurent Prévot	Representing Multimodal Linguistic Annotated Data
18:20-19:20	Michael Kipp, Levin Freiherr von Hollen, Michael Christopher Hrstka and Franziska Zamponi	Single-Person and Multi-Party 3D Visualizations for Nonverbal Communication Analysis
18:20-19:20	Huseyin Cakmak, Jerome Urbain, Thierry Dutoit and Joelle Tilmanne	The AV-LASYN Database: a Synchronous Corpus of Audio and 3D Facial Marker Data for Audio-Visual Laughter Synthesis
18:20-19:20	Vincent Vandeghinste and Ineke Schuurman	Linking Pictographs to Synsets: Sclera2Cornetto
18:20-19:20	Dietmar Schabus, Michael Pucher and Phil Hoole	The MMASCS Multi-Modal Annotated Synchronous Corpus of Audio, Video, Facial Motion and Tongue Motion Data of Normal, Fast and Slow Speech
18:20-19:20	Mathieu Chollet, Magalie Ochs and Catherine Pelachaud	Mining a Multimodal Corpus for Non-Verbal Behavior Sequences Conveying Attitudes
18:20-19:20	Massimo Moneglia, Susan Brown, Francesca Frontini, Gloria Gagliardi, Fahad Khan, Monica Monachini and Alessandro Panunzi	The IMAGACT Visual Ontology. An Extendable Multilingual Infrastructure for the Representation of Lexical Encoding of Action
18:20-19:20	Kodai Takahashi and Masashi Inoue	Multimodal Dialogue Segmentation with Gesture Post-Processing
18:20-19:20	Shannon Hennig, Ryad Chellali and Nick Campbell	The D-ANS Corpus: the Dublin-Autonomous Nervous System Corpus of Biosignal and Multimodal Recordings of Conversational Speech

Day 3, Oral Sessions:

	Session O37 - Sentiment Analysis and Social Media (2)	Chairperson: Piek Vossen
9:45-10:05	Diana Maynard and Mark Greenwood	Who cares about Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis.
10:05-10:25	Olga Uryupina, Barbara Plank, Aliaksei Severyn, Agata Rotondi and Alessandro Moschitti	SenTube: A Corpus for Sentiment Analysis on YouTube Social Media
10:25-10:45	Reid Swanson, Stephanie Lukin, Luke Eisenberg, Thomas Corcoran and Marilyn Walker	Getting Reliable Annotations for Sarcasm in Online Dialogues
10:45-11:05	Francesco Barbieri and Horacio Saggion	Modelling Irony in Twitter: Feature Analysis and Evaluation
11:05-11:25	Alexandra Balahur, Marco Turchi, Ralf Steinberger, Jose Manuel Perea-Ortega, Guillaume Jacquet, Dilek Kucuk, Vanni Zavarella and Adil El Ghali	Resource Creation and Evaluation for Multilingual Sentiment Analysis in Social Media Texts

	Session O38 - Paraphrases	Chairperson: Bernardo Magnini
9:45-10:05	Marianna Apidianaki, Emilia Verzeni and Diana McCarthy	Semantic Clustering of Pivot Paraphrases
10:05-10:25	Juri Ganitkevitch and Chris Callison-Burch	The Multilingual Paraphrase Database
10:25-10:45	Van-Minh Pho, Thibault André, Anne-Laure Ligozat, Brigitte Grau, Gabriel Illouz and Thomas Francois	Multiple Choice Question Corpus Analysis for Distractor Characterization
10:45-11:05	Sander Wubben, Antal van den Bosch and Emiel Krahmer	Creating and Using Large Monolingual Parallel Corpora for Sentential Paraphrase Generation
11:05-11:25	Michaela Regneri, Rui Wang and Manfred Pinkal	Aligning Predicate-Argument Structures for Paraphrase Fragment Extraction

	Session O39 - Information Extraction (2)	Chairperson: Eduard Hovy
9:45-10:05	Silvia Necsulescu, Sara Mendes and Núria Bel	Combining Dependency Information and Generalization in a Pattern-based Approach to the Classification of Lexical-Semantic Relation Instances
10:05-10:25	Yifan He and Adam Meyers	Corpus and Method for Identifying Citations in Non-Academic Text
10:25-10:45	Sebastian Krause, Hong Li, Feiyu Xu, Hans Uszkoreit, Robert Hummel and Luise Spielhagen	Language Resources and Annotation Tools for Cross-Sentence Relation Extraction
10:45-11:05	Carlo Strapparava, Lorenzo Gatti, Marco Guerini and Oliviero Stock	Creative Language Explorations through a High-Expressivity N-grams Query Language
11:05-11:25	Milen Kouylekov and Stephan Oepen	Semantic Technologies for Querying Linguistic Annotations: An Experiment Focusing on Graph-Structured Data

	Session O40 - Lexicons and Ontologies	Chairperson: Gudrun Magnusdottir
9:45-10:05	Ingrid Falk, Delphine Bernhard and Christophe Gérard	From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers
10:05-10:25	Nitsan Chrizman and Alon Itai	How to Construct a Multi-Lingual Domain Ontology
10:25-10:45	Paweł Kędzia and Maciej Piasecki	Ruled-based, Interlingual Motivated Mapping of plWordNet onto SUMO Ontology
10:45-11:05	Yulia Tsvetkov, Nathan Schneider, Dirk Hovy, Archna Bhatia, Manaal Faruqui and Chris Dyer	Augmenting English Adjective Senses with Supersenses
11:05-11:25	Lauren Romeo, Gianluca Lebani, Núria Bel and Alessandro Lenci	Choosing which to Use? A Study of Distributional Models for Nominal Lexical Semantic Classification

	Session O41 - Machine Translation	Chairperson: Alan Melby
11:45-12:05	Mikel Forcada	On the Annotation of TMX Translation Memories for Advanced Leveraging in Computer-aided Translation
12:05-12:25	Teresa Herrmann, Jan Niehues and Alex Waibel	Manual Analysis of Structurally Informed Reordering in German-English Machine Translation
12:25-12:45	Gregor Thurmair	Conceptual Transfer: Using Local Classifiers for Transfer Selection
12:45-13:05	Grégoire Détrez, Víctor M. Sánchez-Cartagena and Aarne Ranta	Sharing Resources Between Free/Open-Source Rule-based Machine Translation Systems: Grammatical Framework and Apertium
13:05-13:25	Friedel Wolff, Laurette Pretorius and Paul Buitelaar	Missed Opportunities in Translation Memory Matching

	Session O42 - Dialogue (2)	Chairperson: Shyam Agrawal
11:45-12:05	Volha Petukhova, Andrei Malchanau and Harry Bunt	Interoperability of Dialogue Corpora through ISO 24617-2-based Querying
12:05-12:25	Sabrina Campano, Jessica Durand and Chloé Clavel	Comparative Analysis of Verbal Alignment in Human-Human and Human-Agent Interactions
12:25-12:45	Matěj Korvas, Ondřej Plátek, Ondřej Dušek, Lukáš Žilka and Filip Jurčíček	Free English and Czech Telephone Speech Corpus Shared Under the CC-BY-SA 3.0 License
12:45-13:05	Hiroaki Noguchi, Yasuhiro Katagiri and Yasuharu Den	Japanese Conversation Corpus for Training and Evaluation of Backchannel Prediction Model.
13:05-13:25	Andrew Gargett, Sam Hellmuth and Ghazi AlGethami	DiVE-Arabic: Gulf Arabic Dialogue in a Virtual Environment

	Session O43 - Semantics (2)	Chairperson: James Pustejovsky
11:45-12:05	Tamara Polajnar, Laura Rimell and Stephen Clark	Evaluation of Simple Distributional Compositional Operations on Longer Texts
12:05-12:25	Akira Utsumi	A Character-based Approach to Distributional Semantic Models: Exploiting Kanji Characters for Constructing Japanese Word Vectors
12:25-12:45	Lauren Romeo, Sara Mendes and Núria Bel	A Cascade Approach for Complex-type Classification
12:45-13:05	Maximilian Köper and Sabine Schulte im Walde	A Rank-based Distance Measure to Detect Polysemy and to Determine Salient Vector-Space Features for German Prepositions
13:05-13:25	Daniel Peterson, Martha Palmer and Shumin Wu	Focusing Annotation for Semantic Role Labeling

	Session O44 - Grammar and Parsing (2)	Chairperson: Sadao Kurohashi
11:45-12:05	Dirk Hovy, Barbara Plank and Anders Søgaard	When POS Data Sets Don't Add Up: Combatting Sample Bias
12:05-12:25	Guntis Barzdins, Didzis Gosko, Laura Rituma and Peteris Paikens	Using C5.0 and Exhaustive Search for Boosting Frame-Semantic Parsing Accuracy
12:25-12:45	Eckhard Bick	ML-Optimization of Ported Constraint Grammars
12:45-13:05	Simon Fuller, Phil Maguire and Philippe Moser	A Deep Context Grammatical Model For Authorship Attribution
13:05-13:25	Fabienne Braune, Daniel Bauer and Kevin Knight	Mapping Between English Strings and Reentrant Semantic Graphs

	Session O45 - Environment and Machine Interactions - Special Session	Chairperson: Laurence Devillers
14:55-15:15	Michel Vacher, Benjamin Lecouteux, Pedro Chahuara, François Portet, Brigitte Meillon and Nicolas Bonnefond	The Sweet-Home Speech and Multimodal Corpus for Home Automation Interaction
15:15-15:35	João Freitas, António Teixeira and Miguel Dias	Multimodal Corpora for Silent Speech Interaction
15:35-15:55	Bo Liu, Jingjing Liu, Xiang Yu, Dimitris Metaxas and Carol Neidle	3D Face Tracking and Multi-Scale, Spatio-temporal Analysis of Linguistically Significant Facial Expressions and Head Positions in ASL
15:55-16:15	Emanuele Bastianelli, Giuseppe Castellucci, Danilo Croce, Luca Iocchi, Roberto Basili and Daniele Nardi	HuRIC: a Human Robot Interaction Corpus

	Session O46 - Event Extraction and Event Coreference	Chairperson: Martha Palmer
14:55-15:15	Kevin Reschke, Martin Jankowiak, Mihai Surdeanu, Christopher Manning and Daniel Jurafsky	Event Extraction Using Distant Supervision
15:15-15:35	Chen Chen and Vincent Ng	SinoCoreferencer: An End-to-End Chinese Event Coreference Resolver
15:35-15:55	Zhengzhong Liu, Jun Araki, Eduard Hovy and Teruko Mitamura	Supervised Within-Document Event Coreference using Information Propagation
15:55-16:15	Agata Cybulska and Piek Vossen	Using a Sledgehammer to Crack a Nut? Lexical Diversity and Event Coreference Resolution
16:15-16:35	Jun Araki, Zhengzhong Liu, Eduard Hovy and Teruko Mitamura	Detecting Subevent Structure for Event Coreference Resolution

	Session O47 - Standards and Interoperability	Chairperson: Key-Sun Choi
14:55-15:15	Eva Hajičová	Three Dimensions of the so-called "Interoperability" of Annotation Schemes
15:15-15:35	Daan Broeder, Ineke Schuurman and Menzo Windhouwer	Experiences with the ISOcat Data Category Registry
15:35-15:55	Christian Chiarcos	Towards Interoperable Discourse Annotation. Discourse Features in the Ontologies of Linguistic Annotation
15:55-16:15	Emanuele Lapponi, Erik Velldal, Stephan Oepen and Rune Lain Knudsen	Off-Road LAF: Encoding and Processing Annotations in NLP Workflows
16:15-16:35	Marie-Catherine de Marneffe, Timothy Dozat, Natalia Silveira, Katri Haverinen, Filip Ginter, Joakim Nivre and Christopher D. Manning	Universal Stanford Dependencies: a Cross-Linguistic Typology

	Session O48 - Information Extraction and Text Structure	Chairperson: Mark Liberman
14:55-15:15	Giuseppe Rizzo, Marieke van Erp and Raphaël Troncy	Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web
15:15-15:35	Adam Meyers, Giancarlo Lee, Angus Grieve-Smith, Yifan He and Harriet Taber	Annotating Relations in Scientific Articles
15:35-15:55	Eric Charton, Marie-Jean Meurs, Ludovic Jean-Louis and Michel Gagnon	Improving Entity Linking using Surface Form Refinement
15:55-16:15	Karteek Addanki and Dekai Wu	Evaluating Improvised Hip Hop Lyrics - Challenges and Observations
16:15-16:35	Jessica Ouyang and Kathy McKeown	Towards Automatic Detection of Narrative Structure

Day 3, Poster Sessions:

	Session P50 - Crowdsourcing	Chair : Cristina Vertan
9:45-11:25	Jean-Philippe Goldman, Adrian Leemann, Marie-José Kolly, Ingrid Hove, Ibrahim Almajai, Volker Dellwo and Steven Moran	A Crowdsourcing Smartphone Application for Swiss German: Putting Language Documentation in the Hands of the Users
9:45-11:25	Theodosia Togia and Ann Copestake	TagNText: a Parallel Corpus for the Induction of Resource-specific non-Taxonomical Relations from Tagged Images
9:45-11:25	Shinsuke Goto, Donghui Lin and Toru Ishida	Crowdsourcing for Evaluating Machine Translation Quality
9:45-11:25	George Kiomourtzis, George Giannakopoulos, Georgios Petasis, Pythagoras Karampiperis and Vangelis Karkaletsis	NOMAD: Linguistic Resources and Tools Aimed at Policy Formulation and Validation
9:45-11:25	Darja Fišer, Aleš Tavčar and Tomaž Erjavec	sloWCrowd: a Crowdsourcing Tool for Lexicographic Tasks

	Session P51 - Emotion Recognition and Generation	Chair : Patrick Paroubek
9:45-11:25	Maxim Sidorov, Stefan Ultes and Alexander Schmitt	Comparison of Gender- and Speaker-adaptive Emotion Recognition
9:45-11:25	Maxim Sidorov, Christina Brester, Wolfgang Minker and Eugene Semenkin	Speech-Based Emotion Recognition: Feature Selection by Self-Adaptive Multi-Criteria Genetic Algorithm
9:45-11:25	Nesrine Fourati and Catherine Pelachaud	Emilya: Emotional Body Expression in Daily Actions Database
9:45-11:25	Juan-María Garrido, Yesika Laplaza, Benjamin Kolz and Miquel Cornudella	TexAFon 2.0: a Text Processing Tool for the Generation of Expressive Speech in TTS Applications
9:45-11:25	Giovanni Costantini, Iacopo Iaderola, Andrea Paoloni and Massimiliano Todisco	EMOVO Corpus: an Italian Emotional Speech Database
9:45-11:25	Virginie Demulier, Elisabetta Bevacqua, Florian Focone, Tom Giraud, Pamela Carreno, Brice Isableu, Sylvie Gibet, Pierre De Loor and Jean-Claude Martin	A Database of Full Body Virtual Interactions Annotated with Expressivity Scores
9:45-11:25	Sophia Lee, Shoushan Li and Chu-Ren Huang	Annotating Events in an Emotion Corpus

	Session P52 - Linked Data	Chair : John Philip McCrae
9:45-11:25	Tomáš Kliegr and Ondřej Zamazal	Towards Linked Hypernyms Dataset 2.0: Complementing DBpedia with Hypernym Discovery
9:45-11:25	Mohamed Sherif, Sandro Coelho, Ricardo Usbeck, Sebastian Hellmann, Jens Lehmann, Martin Brümmer and Andreas Both	NIF4OGGD - NLP Interchange Format for Open German Governmental Data
9:45-11:25	Michael Röder, Ricardo Usbeck, Sebastian Hellmann, Daniel Gerber and Andreas Both	N³ - A Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format
9:45-11:25	Riccardo Del Gratta, Gabriella Pardelli and Sara Goggi	The LRE Map disclosed
9:45-11:25	Clara Bacciu, Angelica Lo Duca, Andrea Marchetti and Maurizio Tesconi	Accommodations in Tuscany as Linked Data
9:45-11:25	David Lewis, Rob Brennan, Leroy Finn, Dominic Jones, Alan Meehan, Declan O'Sullivan, Sebastian Hellmann and Felix Sasaki	Global Intelligent Content: Active Curation of Language Resources using Linked Data

	Session P53 - Machine Translation	Chair : Mikel Forcada
9:45-11:25	Ondrej Bojar, Vojtěch Diatka, Pavel Rychlý, Pavel Stranak, Vit Suchomel, Aleš Tamchyna and Daniel Zeman	HindEnCorp - Hindi-English and Hindi-only Corpus for Machine Translation
9:45-11:25	Mara Chinea-Rios, Germán Sanchis Trilles, Daniel Ortiz-Martínez and Francisco Casacuberta	Online Optimisation of Log-linear Weights in Interactive Machine Translation
9:45-11:25	Kashif Shah, Marco Turchi and Lucia Specia	An Efficient and User-friendly Tool for Machine Translation Quality Estimation
9:45-11:25	Santanu Pal, Sudip Kumar Naskar and Sivaji Bandyopadhyay	Word Alignment-Based Reordering of Source Chunks in PB-SMT
9:45-11:25	Bruno Laranjeira, Viviane Moreira, Aline Villavicencio, Carlos Ramisch and Maria José Finatto	Comparing the Quality of Focused Crawlers and of the Translation Resources Obtained from them
9:45-11:25	Christian Buck, Kenneth Heafield and Bas van Ooyen	N-gram Counts and Language Models from the Common Crawl
9:45-11:25	Guillaume Wisniewski, Natalie Kübler and François Yvon	A Corpus of Machine Translation Errors Extracted from Translation Students Exercises
9:45-11:25	Alexandru Ceausu and Sabine Hunsicker	Pre-ordering of Phrase-based Machine Translation Input in Translation Workflow
9:45-11:25	Jennifer Drexler, Pushpendre Rastogi, Jacqueline Aguilar, Benjamin Van Durme and Matt Post	A Wikipedia-based Corpus for Contextualized Machine Translation

	Session P54 - Multimodality	Chair : Kristina Jokinen
9:45-11:25	Costanza Navarretta and Magdalena Lis	Transfer Learning of Feedback Head Expressions in Danish and Polish Comparable Multimodal Corpora
9:45-11:25	Onno Crasborn and Han Sloetjes	Improving the Exploitation of Linguistic Annotations in ELAN
9:45-11:25	Yoshihiko Hayashi	Web-imageability of the Behavioral Features of Basic-level Concepts
9:45-11:25	Zoraida Callejas, Brian Ravenet, Magalie Ochs and Catherine Pelachaud	A Model to Generate Adaptive Multimodal Job Interviews with a Virtual Recruiter
9:45-11:25	Coline Claude-Lachenaud, Eric Charton, Benoit Ozell and Michel Gagnon	A Multimodal Interpreter for 3D Visualization and Animation of Verbal Concepts
9:45-11:25	Philippe Martin	New Functions for a Multipurpose Multimodal Tool for Phonetic and Linguistic Analysis of Very Large Speech Corpora
9:45-11:25	Mariette Soury and Laurence Devillers	Smile and Laughter in Human-Machine Interaction: a Study of Engagement
9:45-11:25	Hendrik Buschmeier, Zofia Malisz, Joanna Skubisz, Marcin Wlodarczak, Ipke Wachsmuth, Stefan Kopp and Petra Wagner	ALICO: a Multimodal Corpus for the Study of Active Listening
9:45-11:25	Przemyslaw Lenkiewicz, Olha Shkaravska, Twan Goosen, Daan Broeder, Menzo Windhouwer, Stephanie Roth and Olof Olsson	The DWAN Framework: Application of a Web Annotation Framework for the General Humanities to the Domain of Language Resources
9:45-11:25	Nicolas Auguin and Pascale Fung	Co-Training for Classification of Live or Studio Music Recordings

	Session P55 - Ontologies	Chair : Monica Monachini
9:45-11:25	Chetana Gavankar, Ashish Kulkarni and Ganesh Ramakrishnan	Efficient Reuse of Structured and Unstructured Resources for Ontology Population
9:45-11:25	Maria Pia di Buono and Mario Monteleone	From Natural Language to Ontology Population in the Cultural Heritage Domain. A Computational Linguistics-based Approach
9:45-11:25	Alessio Bosca, Matteo Casu, Matteo Dragoni and Nikolaos Marianos	A Gold Standard for CLIR evaluation in the Organic Agriculture Domain
9:45-11:25	Bernardo Severo, Cassia Trojahn and Renata Vieira	VOAR: A Visual and Integrated Ontology Alignment Environment

	Session P56 - Corpora and Annotation	Chair : Tomaž Erjavec
11:45-13:25	Goran Glavaš, Jan Šnajder, Marie-Francine Moens and Parisa Kordjamshidi	HiEve: A Corpus for Extracting Event Hierarchies from News Stories
11:45-13:25	Masaya Yamaguchi	Building a Database of Japanese Adjective Examples from Special Purpose Web Corpora
11:45-13:25	Antonio Toral	TLAXCALA: a Multilingual Corpus of Independent News
11:45-13:25	Nathan Green and Septina Dian Larasati	Votter Corpus: A Corpus of Social Polling Language
11:45-13:25	Roald Eiselen and Martin Puttkammer	Developing Text Resources for Ten South African Languages
11:45-13:25	Paul Felt, Robbie Haertel, Eric Ringger and Kevin Seppi	Momresp: A Bayesian Model for Multi-Annotator Document Labeling
11:45-13:25	Maciej Ogrodniczuk and Mateusz Kopeć	The Polish Summaries Corpus

	Session P57 - Information Extraction and Information Retrieval	Chair : Feiyu Xu
11:45-13:25	Clément de Groc and Xavier Tannier	Evaluating Web-as-corpus Topical Document Retrieval with an Index of the OpenDirectory
11:45-13:25	Jordan Schmidek and Denilson Barbosa	Improving Open Relation Extraction via Sentence Re-Structuring
11:45-13:25	Pavel Smrz and Jan Kouril	Semantic Search in Documents Enriched by LOD-based Annotations
11:45-13:25	Antske Fokkens, Serge Ter Braake, Niels Ockeloen, Piek Vossen, Susan Legêne and Guus Schreiber	BiographyNet: Methodological Issues when NLP Supports Historical Research
11:45-13:25	Tilia Ellendorff, Fabio Rinaldi and Simon Clematide	Using Large Biomedical Databases as Gold Annotations for Automatic Relation Extraction
11:45-13:25	Yutaka Mitsuishi, Vit Novacek and Pierre-Yves Vandenbussche	A Method for Building Burst-Annotated Co-Occurrence Networks for Analysing Trends in Textual Data

	Session P58 - Lexicons	Chair : Kiril Simov
11:45-13:25	Antonio San Martín and Marie-Claude L'Homme	Definition Patterns for Predicative Terms in Specialized Lexical Resources
11:45-13:25	Tim vor der Brück, Alexander Mehler and Zahurul Islam	ColLex.en: Automatically Generating and Evaluating a Full-form Lexicon for English
11:45-13:25	Ajay Dubey, Parth Gupta, Vasudeva Varma and Paolo Rosso	Enrichment of Bilingual Dictionary through News Stream Data
11:45-13:25	Thomas Francois, Núria Gala, Patrick Watrin and Cédrick Fairon	FLELex: a graded Lexical Resource for French Foreign Learners
11:45-13:25	Anabela Barreiro, Fernando Batista, Ricardo Ribeiro, Helena Moniz and Isabel Trancoso	OpenLogos Semantico-Syntactic Knowledge-Rich Bilingual Dictionaries
11:45-13:25	Mona Diab, Mohamed AlBadrashiny, Maryam Aminian, Mohammed Attia, Heba Elfardy, Nizar Habash, Abdelati Hawwari, Wael Salloum, Pradeep Dasigi and Ramy Eskander	Tharwa: A Large Scale Dialectal Arabic - Standard Arabic - English Lexicon
11:45-13:25	Michael Rosner and Kurt Sultana	Automatic Methods for the Extension of a Bilingual Dictionary using Comparable Corpora
11:45-13:25	Kevin Black, Eric Ringger, Paul Felt, Kevin Seppi, Kristian Heal and Deryle Lonsdale	Evaluating Lemmatization Models for Machine-Assisted Corpus-Dictionary Linkage

	Session P59 - Language Resource Infrastructures	Chair : Martin Wynne
11:45-13:25	Menzo Windhouwer and Ineke Schuurman	Linguistic Resources and Cats: How to Use ISOcat, RELcat and SCHEMAcat
11:45-13:25	Lluís Padró, Zeljko Agic, Xavier Carreras, Blaz Fortuna, Esteban García-Cuesta, Zhixing Li, Tadej Stajner and Marko Tadić	Language Processing Infrastructure in the XLike Project
11:45-13:25	Piotr Banski, Nils Diewald, Michael Hanl, Marc Kupietz and Andreas Witt	Access Control by Query Rewriting: the Case of KorAP
11:45-13:25	Rodrigo Agerri, Josu Bermudez and German Rigau	IXA pipeline: Efficient and Ready to Use Multilingual NLP tools
11:45-13:25	Trang Mai Xuan, Yohei Murakami, Donghui Lin and Toru Ishida	Integration of Workflow and Pipeline for Language Service Composition
11:45-13:25	Rafal Rak, Jacob Carter, Andrew Rowley, Riza Theresa Batista-Navarro and Sophia Ananiadou	Interoperability and Customisation of Annotation Schemata in Argo

	Session P60 - Metadata	Chair : Gil Francopoulo
11:45-13:25	Penny Labropoulou, Christopher Cieri and Maria Gavrilidou	Developing a Framework for Describing Relations among Language Resources
11:45-13:25	Thorsten Trippel, Daan Broeder, Matej Durco and Oddrun Ohren	Towards Automatic Quality Assessment of Component Metadata

	Session P61 - Opinion Mining and Sentiment Analysis	Chair : Gerard de Melo
11:45-13:25	Chantal van Son, Marieke van Erp, Antske Fokkens and Piek Vossen	Hope and Fear: How Opinions Influence Factuality
11:45-13:25	Nathan Hartmann, Lucas Avanço, Pedro Balage, Magali Duran, Maria das Graças Volpe Nunes, Thiago Pardo and Sandra Aluísio	A Large Corpus of Product Reviews in Portuguese: Tackling Out-Of-Vocabulary Words
11:45-13:25	Thierry Declerck and Hans-Ulrich Krieger	Harmonization of German Lexical Resources for Opinion Mining
11:45-13:25	Anne Garcia-Fernandez, Olivier Ferret and Marco Dinarelli	Evaluation of Different Strategies for Domain Adaptation in Opinion Mining
11:45-13:25	Amel Fraisse and Patrick Paroubek	Toward a Unifying Model for Opinion, Sentiment and Emotion Information Extraction

	Session P62 - Speech Resources	Chair : Christopher Draxler
11:45-13:25	Michael Stadtschnitzer, Jochen Schwenninger, Daniel Stein and Joachim Koehler	Exploiting the Large-Scale German Broadcast Corpus to Boost the Fraunhofer IAIS Speech Recognition System
11:45-13:25	Ilaine Wang, Sylvain Kahane and Isabelle Tellier	Macrosyntactic Segmenters of a French Spoken Corpus
11:45-13:25	Iolanda Alfano, Francesco Cutugno, Aurelio De Rosa, Claudio Iacobini, Renata Savy and Miriam Voghera	VOLIP: a Corpus of Spoken Italian and a Virtuous Example of Reuse of Linguistic Resources
11:45-13:25	George Christodoulides, Mathieu Avanzi and Jean-Philippe Goldman	DisMo: A Morphosyntactic, Disfluency and Multi-Word Unit Annotator. An Evaluation on a Corpus of French Spontaneous and Read Speech
11:45-13:25	Vera Cabarrão, Helena Moniz, Fernando Batista, Ricardo Ribeiro, Nuno Mamede, Hugo Meinedo, Isabel Trancoso, Ana Isabel Mata and David Martins de Matos	Revising the Annotation of a Broadcast News Corpus: a Linguistic Approach
11:45-13:25	Ana Isabel Mata, Helena Moniz, Fernando Batista and Julia Hirschberg	Teenage and Adult Speech in School Context: Building and Processing a Corpus of European Portuguese
11:45-13:25	Arjan van Hessen, Franciska de Jong, Stef Scagliola and Tanja Petrovic	Croatian Memories
11:45-13:25	Ines Rehbein, Sören Schalowski and Heike Wiese	The KiezDeutsch Korpus (KiDKo) Release 1.0
11:45-13:25	Anthony Rousseau, Paul Deléglise and Yannick Estève	Enhancing the TED-LIUM Corpus with Selected Data for Language Modeling and More TED Talks
11:45-13:25	Jan Strunk, Florian Schiel and Frank Seifart	Untrained Forced Alignment of Transcriptions and Audio for Language Documentation Corpora using WebMAUS

	Session P63 - Computer-Assisted Language Learning	Chair : Keith Miller
14:55-16:35	Xiaoyun Wang, Jinsong Zhang, Masafumi Nishida and Seiichi Yamamoto	Phoneme Set Design Using English Speech Database by Japanese for Dialogue-Based English CALL Systems
14:55-16:35	Lianet Sepúlveda Torres, Magali Sanches Duran and Sandra Aluísio	Generating a Lexicon of Errors in Portuguese to Support an Error Identification System for Spanish Native Learners
14:55-16:35	Veronika Vincze, János Zsibrita, Péter Durst and Martina Katalin Szabó	Automatic Error Detection Concerning the Definite and Indefinite Conjugation in the HunLearner Corpus
14:55-16:35	Gabriele Pallotti, Francesca Frontini, Fabio Affè, Monica Monachini and Stefania Ferrari	Presenting a System of Human-Machine Interaction for Performing Map Tasks
14:55-16:35	Valentín Cardeñoso-Payo, César González-Ferreras and David Escudero	Assessment of Non-native Prosody for Spanish as L2 using Quantitative Scores and Perceptual Evaluation
14:55-16:35	Elena Volodina, Ildikó Pilán, Lars Borin and Therese Lindström Tiedemann	A Flexible Language Learning Platform Based on Language Resources and Web Services
14:55-16:35	Renlong Ai and Marcela Charfuelan	MAT: a Tool for L2 Pronunciation Errors Annotation
14:55-16:35	Chris Hokamp, Rada Mihalcea and Peter Schuelke	Modeling Language Proficiency Using Implicit Feedback

	Session P64 - Evaluation Methodologies	Chair : Kevin Bretonnel Cohen
14:55-16:35	Mohamed Ben Jannet, Martine Adda-Decker, Olivier Galibert, Juliette Kahn and Sophie Rosset	ETER: a New Metric for the Evaluation of Hierarchical Named Entity Recognition
14:55-16:35	Olivier Galibert, Jeremy Leixa, Gilles Adda, Khalid Choukri and Guillaume Gravier	The ETAPE Speech Processing Evaluation
14:55-16:35	Achim Rettinger, Lei Zhang, Daša Berović, Danijela Merkler, Matea Srebačić and Marko Tadić	RECSA: Resource for Evaluating Cross-lingual Semantic Annotation
14:55-16:35	Helen Hastie and Anja Belz	A Comparative Evaluation Methodology for NLG in Interactive Systems
14:55-16:35	Juris Borzovs, Ilze Ilziņa, Iveta Keiša, Mārcis Pinnis and Andrejs Vasiļjevs	Terminology Localization Guidelines for the National Scenario

	Session P65 - MultiWord Expressions and Terms	Chair : Valia Kordoni
14:55-16:35	Kris Heylen, Stephen Bond, Dirk De Hertog, Ivan Vulić and Hendrik Kockaert	TermWise: A CAT-tool with Context-Sensitive Terminological Support
14:55-16:35	Pollet Samvelian, Pegah Faghiri and Sarra El Ayari	Extending the Coverage of a MWE Database for Persian CPs Exploiting Valency Alternations
14:55-16:35	Behrang Zadeh and Siegfried Handschuh	Evaluation of Technology Term Recognition with Random Indexing
14:55-16:35	Johannes Hellrich, Simon Clematide, Udo Hahn and Dietrich Rebholz-Schuhmann	Collaboratively Annotating Multilingual Parallel Corpora in the Biomedical Domain - some MANTRAs
14:55-16:35	Anca Dinu, Liviu Dinu and Ionut Sorodoc	Aggregation Methods for Efficient Collocation Detection
14:55-16:35	Sandra Antunes and Amália Mendes	An Evaluation of the Role of Statistical Measures and Frequency for MWE Identification

	Session P66 - Parsing	Chair : Giuseppe Attardi
14:55-16:35	Weston Feely, Mehdi Manshadi, Robert Frederking and Lori Levin	The CMU METAL Farsi NLP Approach
14:55-16:35	Masood Ghayoomi, Kiril Simov and Petya Osenova	Constituency Parsing of Bulgarian: Word- vs Class-based Parsing
14:55-16:35	Kiril Simov, Iliana Simova, Ginka Ivanova, Maria Mateva and Petya Osenova	A System for Experiments with Dependency Parsers
14:55-16:35	Wolfgang Seeker and Jonas Kuhn	An Out-of-Domain Test Suite for Dependency Parsing of German
14:55-16:35	Lauma Pretkalniņa, Artūrs Znotiņš, Laura Rituma and Didzis Goško	Dependency Parsing Representation Effects on the Accuracy of Semantic Applications - an Example of an Inflective Language
14:55-16:35	Ophélie Lacroix and Denis Béchet	Validation Issues induced by an Automatic Pre-Annotation Mechanism in the Building of Non-projective Dependency Treebanks
14:55-16:35	Jianqiang Ma	Automatic Refinement of Syntactic Categories in Chinese Word Structures

	Session P67 - Part-of-Speech Tagging	Chair : Daniel Flickinger
14:55-16:35	Stephen Wattam, Paul Rayson, Marc Alexander and Jean Anderson	Experiences with Parallelisation of an Existing NLP Pipeline: Tagging Hansard
14:55-16:35	Heike Zinsmeister, Ulrich Heid and Kathrin Beck	Adapting a Part-of-Speech Tagset to Non-Standard Text: the Case of STTS
14:55-16:35	Antonio Balvet, Dejan Stosic and Aleksandra Miletic	TALC-Sef: a Manually-revised POS-Tagged Literary Corpus in Serbian, English and French
14:55-16:35	Cristina Sánchez Marco	An Open Source Part-of-Speech Tagger for Norwegian: Building on Existing Language Resources
14:55-16:35	Antonio Pareja-Lora, Guillermo Cárcamo-Escorza and Alicia Ballesteros-Calvo	Standardisation and Interoperation of Morphosyntactic and Syntactic Annotation Tools for Spanish and their Annotations

	Session P68 - Tools, Systems, Applications	Chair : Lluís Padró
14:55-16:35	Peter Fankhauser, Jörg Knappen and Elke Teich	Exploring and Visualizing Variation in Language Resources
14:55-16:35	Raphael Winkelmann and Georg Raess	Introducing a Web Application for Labeling, Visualizing Speech and Correcting Derived Speech Signals
14:55-16:35	Maha Althobaiti, Udo Kruschwitz and Massimo Poesio	AraNLP: a Java-based Library for the Processing of Arabic Text
14:55-16:35	Silvia Rodríguez Vázquez, Pierrette Bouillon and Anton Bolfing	Applying Accessibility-Oriented Controlled Language (CL) Rules to Improve Appropriateness of Text Alternatives for Images: an Exploratory Study
14:55-16:35	Jonathan Sonntag and Manfred Stede	GraPAT: a Tool for Graph Annotations
14:55-16:35	Vincenzo Galatà, Alberto Benin, Piero Cosi, Giuseppe Riccardo Leone, Giulio Paci, Giacomo Sommavilla and Fabio Tesser	Discovering the Italian Literature: Interactive Access to Audio-indexed Text Resources
14:55-16:35	Horacio Saggion	Creating Summarization Systems with SUMMA

Powered by ELDA © 2014 ELDA/ELRA