Naomi Nagy

Linguistics at U of T

Heritage Language Variation and Change: Corpus Construction & Use 

Naomi Nagy

Methods 14 Panel: "Dialect and Heritage Language Corpora for the Google Generation"


I will present aspects of an ongoing project documenting and describing six heritage languages spoken in Toronto (Nagy et al. 2009). The presentation will focus on corpus construction techniques and the integration of teaching and research goals. The Heritage Language Documentation Corpus (HerLD, Nagy 2009) contains time-aligned transcriptions of ~1-hour interviews from 40 speakers of each of 6 languages (Cantonese, Faetar, Italian, Korean, Russian, Ukrainian), representing 3 generations of speakers, balanced across a range of ages (12-92 yrs.). For each speaker, self-reports of language use patterns and of linguistic and cultural attitudes are elicited and quantified. Using consistent methods of data collection, transcription, and analysis across 6 languages, with appropriate metadata, is an innovation designed to further our understanding of contact-induced language change. We are working to provide useful material to Toronto's heritage communities, as well as to academic colleagues.

My presentation of the corpus construction techniques will describe the involvement of heritage speakers in recruiting, interviewing, analyzing, and reporting findings, both as research assistants and as students using the corpus for course assignments.

Specific features of the project to be described include: the use of the program ELAN for transcription and data coding (and how it may be quickly introduced to students and research assistants at all levels); a multi-level consent process that allows participants to determine how their data may be used and shared; the development of a searchable online database of transcription and audio files (Dept. of Linguistics, U of T 2010); and assignments developed for an undergraduate course called "Exploring Heritage Languages" (Nagy 2010), which both uses the HerLD corpus and contributes outreach material to the HLVC project. Prototype webpages, designed by students to describe available heritage language (HL) resources, illustrate one result of integrating research, teaching, and outreach.


Department of Linguistics – University of Toronto. 2010. Corpora in the Classroom.

Nagy, N. 2009. Heritage Language Variation and Change in Toronto.

Nagy, N. 2010. Exploring Heritage Languages. Course website for TBB 199, University of Toronto.

Nagy, N., Y. Kang, A. Kochetov and J. Walker. 2009. Heritage Languages in Toronto: A New Project. Heritage Language Workshop, University of Toronto.


email: naomi dot nagy at utoronto dot ca