| Title | The Cambridge Cookie-Theft Corpus: A Corpus of Directed and Spontaneous Speech of Brain-Damaged Patients and Healthy Individuals | 
  
  | Authors | Caroline Williams, Andrew Thwaites, Paula Buttery, Jeroen Geertzen, Billi Randall, Meredith Shafto, Barry Devereux and Lorraine Tyler | 
  
  | Abstract | Investigating differences in linguistic usage between individuals who have suffered brain injury (hereafter patients) and those who have not can yield a number of benefits. It provides a better understanding of the precise way in which impairments affect patients' language, improves theories of how the brain processes language, and offers heuristics for diagnosing certain types of brain damage based on patients' speech. One method for investigating usage differences involves the analysis of spontaneous speech. In the work described here we construct a text corpus consisting of transcripts of individuals' speech produced during two tasks: the Boston cookie-theft picture description task (Goodglass and Kaplan, 1983) and a spontaneous speech task, which elicits a semi-prompted monologue and/or free speech. Interviews with patients aged 19 to 89 years were transcribed, as were interviews with a comparable number of healthy individuals (aged 20 to 89 years). Structural brain images are available for approximately 30% of participants. This unique data source provides a rich resource for future research in many areas of language impairment and has been constructed to facilitate analysis with natural language processing and corpus linguistics techniques. | 
  
  | Topics | Speech resource/database, Corpus (creation, annotation, etc.), Cognitive methods | 
  
  | Full paper  | The Cambridge Cookie-Theft Corpus: A Corpus of Directed and Spontaneous Speech of Brain-Damaged Patients and Healthy Individuals | 
  
  | Slides  | The Cambridge Cookie-Theft Corpus: A Corpus of Directed and Spontaneous Speech of Brain-Damaged Patients and Healthy Individuals | 
  
  | Bibtex | @InProceedings{WILLIAMS10.327, author =  {Caroline Williams and Andrew Thwaites and Paula Buttery and Jeroen Geertzen and Billi Randall and Meredith Shafto and Barry Devereux and Lorraine Tyler},
 title =  {The Cambridge Cookie-Theft Corpus: A Corpus of Directed and Spontaneous Speech of Brain-Damaged Patients and Healthy Individuals},
 booktitle =  {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
 year =  {2010},
 month =  {may},
 date =  {19-21},
 address =  {Valletta, Malta},
 editor =  {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
 publisher =  {European Language Resources Association (ELRA)},
 isbn =  {2-9517408-6-7},
 language =  {english}
 }
 |