The LangAge corpus aims to represent spoken language in a way that respects its specific properties. However, transcription is never neutral: corpus data are always the product of abstraction, reduction and transformation, and they are never objective or authentic in any genuine sense. Each corpus is meant to provide data for a set of research questions, and these questions guide the choices made in the complex process of corpus building: the many decisions involved in transcribing the audio data, the selection of events accompanying the interaction, and finally the linguistic annotation. All of these decisions define the possibilities, but also the limits, of corpus exploration. Rather than aiming at a complete transcription that meets the needs of every possible research question, in what follows we explain the rationale of the transcription in a transparent way.
Please cite as:
Gerstenberg, Annette, Valerie Hekkel & Julie Kairet. 2018. Corpus LangAge: Transcription Guide. University of Potsdam: Department of Romance Studies. doi.org/10.5281/zenodo.6444538