Management of Metadata in Linguistic Fieldwork: Experience from the ACLA Project
Baden Hughes (1), David Penton (1), Steven Bird (1), Catherine Bow (1), Gillian Wigglesworth (2), Patrick McConvell (3), Jane Simpson (4)
(1) Department of Computer Science and Software Engineering, University of Melbourne; (2) Department of Linguistics and Applied Linguistics, University of Melbourne; (3) Australian Institute of Aboriginal and Torres Strait Islander Studies; (4) Department of Linguistics, University of Sydney
Many linguistic research projects collect large amounts of multimodal data in digital formats. Despite the plethora of data collection applications available, researchers often find it difficult to identify and integrate applications that support the management of collections of multimodal data as well as the collection process itself. In research projects that involve substantial data analysis, data management becomes a critical issue. Whilst best practice recommendations for data formats are propagated through projects such as EMELD, HRELP and DOBES, little corresponding guidance is available on best practice for field metadata management beyond the standards provided by entities such as OLAC and IMDI. These general problems are exacerbated when multiple researchers work in geographically disparate or connectivity-challenged locations. We describe the design of a solution for a group of researchers collecting data on child language acquisition in Australian indigenous communities: we set out the context, identify pertinent issues, outline the mechanics of a solution, and finally report on its implementation. In doing so, we provide an alternative model and an open source software application suite that aims to be sufficiently general for other research groups to consider adopting some or all of the infrastructure.
field recordings, multimodal data management, metadata