Naomi Nagy

Linguistics at U of T

Primary Research #1:
Learning about the structure of a Heritage Language

For this assignment, work in small groups. Each group will include one student who understands a heritage language that is part of the HLVC project and one or more students who do not know that language. Each group should include one person with good organizational skills and one person with good computer skills.

This assignment has three purposes:

  • to share some knowledge about a language that one of you knows
  • to learn to transcribe, translate and annotate a sociolinguistic interview, learning to use the software program ELAN
  • to learn to describe linguistic variation based on empirical evidence

These will prepare you for the opportunity to conduct linguistic analysis of a heritage language in Primary Research Project #2.

As a group, here's what you need to do:

Part 1: GETTING READY

  1. Select one of the speakers available from HerLD, the online database of transcriptions and recordings for the HLVC project. These files are stored in the Corpora in the Classroom website.
  2. To access these files, you must EACH carefully read and electronically sign the Ethics Form for the Heritage Language corpus in the Corpora in the Classroom website (https://corpora.chass.utoronto.ca). Type "TBB 199" in the field that asks about the purpose of the study.
  3. Use the Search function to select a speaker from the Heritage Language Documentation corpus.
    • Click on the column headers in the resulting table to sort speakers by their language, generation, age or sex.
    • Choose a speaker for whom there is both a SPEAKERCODE_IV.zip and a SPEAKERCODE_IV.eaf file available.
    • The SPEAKERCODE gives lots of info about the speaker.
  4. Download both the .zip and the .eaf to the computer where you will be working.
  5. Unzip the .zip file to get the .wav (audio) file.
  6. Save these two files in the same folder.
  7. Download ELAN, a time-aligned transcription program, and its manual (or bookmark the online manual). ELAN is freeware and works on Mac, Windows and Unix platforms.
  8. Once you have ELAN on your computer, use it to open the SPEAKERCODE_IV.eaf file that you selected. You may be asked to locate the related SPEAKERCODE_IV.wav. (Look back at the class PPT for help with ELAN.)
  9. Try clicking the Play button in the middle tier of the ELAN display. Can you hear a speaker?
  10. Save the file.
  11. Choose the Automatic Backup > 1 minute option in the File menu.
  12. Click around and listen to bits.
  13. Explore the different buttons and menus in ELAN for a few minutes. You may find parts of this file helpful: http://projects.chass.utoronto.ca/ngn/pdf/ELAN_annotation_tips.pdf.
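
If you prefer to script steps 4 to 6 instead of unzipping by hand, the following is a minimal Python sketch. The function name `unpack_interview` and the folder names in the usage note are hypothetical, and the sketch assumes the .zip contains a single .wav file. It extracts the audio, puts the .eaf next to it, and lists what ended up in the working folder.

```python
import shutil
import zipfile
from pathlib import Path


def unpack_interview(zip_path, eaf_path, work_dir):
    """Extract the .wav from the .zip and place the .eaf in the same folder.

    Returns the sorted names of the .wav and .eaf files now in work_dir.
    """
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            base = Path(name).name
            # Skip non-audio entries and the hidden "._" files some Macs add.
            if not base.lower().endswith(".wav") or base.startswith("._"):
                continue
            # Write the audio directly into work_dir, ignoring inner folders.
            (work_dir / base).write_bytes(archive.read(name))

    eaf_target = work_dir / Path(eaf_path).name
    if Path(eaf_path).resolve() != eaf_target.resolve():
        shutil.copy2(eaf_path, eaf_target)

    return sorted(
        p.name for p in work_dir.iterdir()
        if p.suffix.lower() in {".wav", ".eaf"}
    )
```

For example, `unpack_interview("Downloads/SPEAKERCODE_IV.zip", "Downloads/SPEAKERCODE_IV.eaf", "PRP1")` should return both file names if everything is in place. If only one name comes back, ELAN will probably ask you to locate the missing audio in step 8.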

Part 2: FAMILIARIZING YOURSELF WITH THE DATA AND PROGRAM

  1. Find the speaker telling a story or describing something interesting.
  2. Make a note of the time-stamp where this part starts and stops.
  3. Create a new tier and label it "Translation." Use the SPEAKERCODE to label the Participant and your initials to indicate the Annotators.
  4. Together, translate each sentence of the story you found. The translation should read smoothly in English -- it might not be word-for-word. Translate at least two minutes of speech. Create one annotation to match each annotation in the speaker's original tier. (You can ignore anything that another speaker says during the story.) It's ok to check a dictionary or ask a friend.
  5. Create another tier and label it "Gloss." On this tier, create an annotation for each word in the first minute of your story. Match each to the relevant part of the recording. Put in a translation of each word. (This may be hard for some words. Do your best.) You can either split up this tier by hand and type in the words, or try this fancy trick instead of creating the Gloss tier directly:
    1. Create a new type called "for-tokenizing." Make it a "Time Subdivision" stereotype.
    2. Create a new tier called "Tokenized." Make its parent the SPEAKERCODE (original transcription) tier. Choose the type you just created.
    3. Choose Tokenize Tier from the Tiers menu. Use the SPEAKERCODE (original transcription) tier as the Source and your new "Tokenized" tier as the Destination tier. Click the "Start" button to Tokenize.
    4. Create another new type called "for-glossing." Make it a "Symbolic Association" stereotype.
    5. Create a new tier called "Gloss." Make its parent be the "Tokenized" tier. Make its type "for-glossing".
    6. Now, when you double-click in this "Gloss" tier, it will create an empty annotation that matches the tokenized word just above in the Tokenized tier. You can type in the English gloss for each word.
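
If it helps to picture what the tokenizing trick in step 5 does, here is a minimal Python sketch of the idea. It is not ELAN's code. It assumes that words are separated by spaces and that the time span of the original annotation is divided equally among the words. The function name, the Italian sample words and the glosses are all invented for illustration.

```python
def tokenize_annotation(text, start_ms, end_ms):
    """Split one annotation into one (start, end, word) triple per word.

    Assumption: each word gets an equal share of the parent annotation's
    time span, which is how a "Time Subdivision" child tier is pictured here.
    """
    words = text.split()
    if not words:
        return []
    step = (end_ms - start_ms) / len(words)
    return [
        (round(start_ms + i * step), round(start_ms + (i + 1) * step), word)
        for i, word in enumerate(words)
    ]


# Invented example: a three-word annotation lasting three seconds.
tokens = tokenize_annotation("uno due tre", 0, 3000)

# A "Symbolic Association" tier carries no times of its own: each gloss
# simply attaches to the token above it.
glosses = ["one", "two", "three"]
for (start, end, word), gloss in zip(tokens, glosses):
    print(start, end, word, gloss)
```

Because the split is by equal time and not by listening, the word boundaries will only be approximate. You may still need to drag boundaries in ELAN so that each token lines up with the recording.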

Part 3: DESCRIBE SOME INTERESTING THINGS ABOUT THIS LANGUAGE

You probably noticed some interesting things as you translated and glossed. Your observations may help with this part. Choose any 4 of the following 6 questions to answer.
  A. What are two sounds that this language has but English does not? Be sure to think about sounds, not letters. Make a tier called "new sounds" and mark a few words that contain each of these sounds.
  B. Can you see any patterns about where certain sounds appear? For example, is there a sound that only appears at the beginning of words? Or only at the end of words? Or only next to certain (kinds of) sounds, like vowels or consonants? Make a tier called "sound patterns" and mark a few examples of the pattern.

     Example from English: The sound [h] does not appear at the end of any words.

  C. What are two differences in the order that words appear in sentences between this language and English? Make a tier called "word order" and mark a few examples of each. For example: Do articles come before nouns or after? Does the subject of the verb come before the verb or after? What about the parts of a prepositional phrase?
  D. Can you propose a descriptive rule for the order of certain parts of the sentence? Make sure that it is true for the whole two-minute story you have been examining, at least. Is this the same as or different from English?

     Example of a descriptive rule for English: The adjective comes right before the noun it modifies, as in "the blue page."

  E. For one of the sounds that does not appear in English, describe how it sounds. Then find out how many times it appears in this recording. Use the Find function in the Search menu.
  F. Is there a word or word type that seems to have no equivalent in English? Mark some examples of it. What does it mean? How many times does it appear in the recording? (Use the Find function in the Search menu.)
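
ELAN's Find function is all you need for the counting questions. If you are curious, though, an .eaf file is plain XML, so you can double-check a count outside ELAN. The sketch below is a minimal illustration, not part of the assignment. The function names are hypothetical, it reads only time-aligned annotations (the `ALIGNABLE_ANNOTATION` elements of the .eaf format), and it counts written character sequences, which are not necessarily the same thing as sounds.

```python
import xml.etree.ElementTree as ET


def load_tier(eaf_path, tier_id):
    """Return (start_ms, text) pairs for the time-aligned annotations on one tier."""
    root = ET.parse(eaf_path).getroot()
    # Map each time-slot ID to its value in milliseconds (may be missing).
    times = {
        slot.get("TIME_SLOT_ID"): slot.get("TIME_VALUE")
        for slot in root.iter("TIME_SLOT")
    }
    rows = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            text = ann.findtext("ANNOTATION_VALUE") or ""
            start = times.get(ann.get("TIME_SLOT_REF1"))
            rows.append((int(start) if start is not None else None, text))
    return rows


def count_matches(rows, target):
    """Count how many times the string `target` occurs across all annotations."""
    return sum(text.count(target) for _, text in rows)


def format_timestamp(ms):
    """Turn milliseconds into mm:ss.mmm for use in a write-up."""
    minutes, rest = divmod(ms, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"
```

For example, `count_matches(load_tier("SPEAKERCODE_IV.eaf", "new sounds"), "gl")` counts the letter sequence "gl" on a tier named "new sounds". Both the tier name and the sequence are placeholders here. Annotations on tiers built with the "Symbolic Association" stereotype are stored differently in the file, so this sketch will not see them.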

Part 4: TO SUBMIT

  • Write up your answers to questions A-F (choose 4) clearly, making sure I know where to look in your annotated .eaf file for related examples. Use the names of the tiers you created and the time-stamps to identify where examples are.
  • Make sure that all group members' names appear on the assignment when you submit it. (You will submit one write-up for the group.)
  • Save the file as SPEAKERCODE_PRP1_YOURNAMES.eaf. Save your write-up as SPEAKERCODE_PRP1_YOURNAMES.doc. Submit both in Quercus.
Updated Jan. 2, 2019.
email: naomi dot nagy at utoronto dot ca