The final project is a phonetic (and potentially phonological) description of a language you have no literacy in, accompanied by recordings of a speaker of that language. The project consists of four main steps, a presentation, and additional steps for those working in groups:
You may work on the project by yourself, or in a group of up to three people. There will be additional requirements in the later stages of the project for each additional person in the group.
The language you choose must be a language that no one in your group can read or write. That will rule out English, and likely a number of other languages. You may choose a language that someone in your group speaks as a heritage language (i.e., they grew up speaking it within their family, but have been mostly isolated from a larger community that uses it), as long as no one in the group has studied the language in school (as a first language or otherwise), or learned to read/write it by any other means.
The language consultant you choose must be a native speaker of the language you have selected: they must have acquired the language in their home starting from infancy, and maintained use of it (more or less) ever since. Your consultant can be a student, professor, or member of the local community, but it may not be someone in your group (or the class). You are responsible for finding a reliable consultant that you can easily meet with for a few hours at a time several times this semester, so please choose carefully! If you're having trouble finding an appropriate consultant, see me for help.
If your consultant agrees to work with you purely out of a commitment to science and/or pride in their language, that’s great, but you should also be prepared to offer some reciprocation to compensate them for their time and to express your gratitude for their help. Buying lunch or gift certificates are fairly easy solutions (the typical rate for consulting of this form is about $10/hour), but you could also provide non-monetary compensation: tutoring, proof-reading services, English instruction, baby/pet/house-sitting, etc. Whatever you choose to do, it should be appropriate, considerate, and respectful, not only of their time and effort, but also of their wishes and their cultural expectations and taboos. Also, I understand that the Linguistics Department can provide funding to compensate a consultant for any group that requests it.
Much of your phonetic work for this semester will require suitable recording equipment. You may use a laptop with a built-in microphone, though I recommend against Praat for long recordings (try Audacity). If you are able to make lossless audio recordings (e.g., saved directly to FLAC format) on your phone, tablet, or other mobile device, that would also be a possibility. If your group does not have access to appropriate equipment, or you'd like to use a field recorder for whatever reason, talk to me as soon as possible! You'll need to find a quiet place to make the recordings to avoid as much background noise as possible (people talking, automobile traffic, air conditioners, birds chirping, dogs barking, etc.). There are a few facilities on campus that are specially designed for making audio recordings, including the phonetics lab. If you wish, you may arrange to make recordings in the phonetics lab once renovations are done.
As part of the first assignment, you will need to learn how to use your equipment to get clear recordings of your consultant. This may take a little trial and error, and you should try to sort out these issues at the beginning of your first recording session. You should also familiarise yourself with the equipment ahead of time, and try to make sure you're aware of any potential pitfalls. This includes issues like whether the equipment can handle recordings that are as long as (or longer than) you expect to need for one session, and whether you'll lose the entire recording if your battery dies, the power goes out, or your program or computer crashes halfway through the recording session. Please also see the note on recording quality below.
Each time you meet with your consultant, be sure to record yourself giving the date, the session's purpose, and the language and consultant’s name: e.g., "October 10th, 2016, preliminary word list elicitation for Farsi, with consultant Darya Rozati."
Recordings should be made in a lossless audio format (preferably FLAC), not a compressed format (MP3, Vorbis, AAC, etc.), at a sampling rate of at least 44.1 kHz with a resolution of at least 16 bits. All modern recording hardware and software worth anything supports these parameters. If you're not sure how to make sure you're using these parameters, I'll be happy to have a look.
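If you'd like to double-check these parameters yourself, here is a minimal sketch using Python's standard `wave` module. It is only an illustration: it reads uncompressed WAV files (not FLAC), and the function name `check_wav` is hypothetical, not part of any course tool.

```python
import wave

MIN_SAMPLE_RATE_HZ = 44100   # 44.1 kHz minimum from the requirements above
MIN_BITS_PER_SAMPLE = 16     # 16-bit minimum from the requirements above


def check_wav(path):
    """Return a list of problems found in a WAV file (empty if it meets the minimums)."""
    with wave.open(path, "rb") as recording:
        sample_rate = recording.getframerate()
        bits = recording.getsampwidth() * 8  # sample width is reported in bytes

    problems = []
    if sample_rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sampling rate {sample_rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if bits < MIN_BITS_PER_SAMPLE:
        problems.append(f"resolution {bits} bits is below {MIN_BITS_PER_SAMPLE} bits")
    return problems
```

An empty list means the file meets both minimums; it says nothing about background noise or clipping, which you still need to check by listening.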
It will make it easier to load your recordings in Praat if you break them up into separate files less than about 30 seconds each. You should then name the split-up files using a convention that will facilitate searching for specific words. For example, something like the following might make sense to you:
Here each filename starts with 01 to indicate that these files come from the first elicitation session, and then has a letter indicating the order in which the files were recorded (first a, then b, etc.), so that in a sorted view on your computer, they will show up in chronological order. Finally, each filename contains a descriptive term identifying which part of the list it comes from, with suffixed number used for sections too long to fit into a single 30-second file. If you are sufficiently motivated, you could even divide your files up by word, so that each word (along with its English meaning) would be in a single file.
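The convention described here (session number, order letter, descriptive term, optional part number) can be sketched as a small helper. The hyphen separator, the section names, and the function name `recording_filename` are all assumptions made for illustration; use whatever scheme works for your group.

```python
import string


def recording_filename(session, order, section, part=None, extension="flac"):
    """Build a filename like '01a-colors.flac' (hypothetical naming scheme).

    session: elicitation session number (1 -> '01')
    order:   0-based recording order within the session (0 -> 'a', 1 -> 'b', ...);
             only 26 letters are available, so larger values raise IndexError
    section: descriptive term for the part of the word list
    part:    optional number for sections split across several files
    """
    letter = string.ascii_lowercase[order]
    suffix = "" if part is None else str(part)
    return f"{session:02d}{letter}-{section}{suffix}.{extension}"


names = [
    recording_filename(1, 0, "colors"),
    recording_filename(1, 1, "bodyparts", part=1),
    recording_filename(1, 2, "bodyparts", part=2),
]
print(sorted(names))  # alphabetical order matches recording order
```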
You may submit your files in any way that you need to: you can burn the files to a disk, bring a flash drive to my office, etc. My preferred method, however, is that you make a folder on Google Drive (which you should have access to through your Swarthmore account) and share it with me. Be sure that I can figure out what's what. Add a README file to the folder to explain anything to me, e.g. if you use a naming convention like the one above. Note that Google Drive is also a good way to share materials among members in your group, and even to collaborate on the assignment (multiple people can edit the same Google Doc at the same time, even from different locations). This should all be pretty straightforward, but let me know if you're having trouble.
You will be working with secondary sources and your speaker to develop a list of words illustrating the phonemes of the language. You do not need many different secondary sources; one good, detailed and careful source will usually be enough. You need a basic description of the sound inventory of the language, if one exists. A dictionary is also helpful for coming up with the list of words. For each of the steps in the project, you'll need to cite all the sources you've used.
If a source for the language exists, you will start with that, and once you have come up with a list of words illustrating the phonemes (as you can determine from the available sources), you will start working with your speaker.
Use only peer-reviewed scholarly materials! This includes published books, journal articles, etc.—preferably written by established scholars in the field. While web resources are a convenient way to find peer-reviewed publications, they can themselves be unreliable, and so you won't be allowed to use them as a source. There are a few exceptions, like Ethnologue. If in doubt about what's appropriate, just ask.
Find and examine reliable sources (as noted above) to address the following questions:
Write up your findings in polished essay form: a page or two of clear, concise, and coherent prose, using full sentences, proper spelling, etc. Do not simply answer the questions above, one by one. Instead, try to write a cohesive narrative about your language.
Record your consultant saying the following words and phrases in their language. You should say the English word aloud and have them respond in their language, so that your recording will contain both. They should repeat each word twice. This can be a tiring process for the consultant, so be sure to take frequent breaks. As described above, your recordings should have little or no background noise (find a quiet room!), and the voices should be at an appropriate volume: loud enough to be clearly heard, but not so loud as to cause distortion in the recording.
Also try to keep the speech rate constant and normal. Don't speed through the words in rapid-fire succession; pause between each, so that there is a clear break. You may well find that some languages don't have a word that's a perfect translation for each of these, or that your consultant may not know a given word: maybe leg and foot are the same word, or your language has words for older vs. younger siblings instead of brother and sister. This is all fine; just do your best.
Finally, try your hand at transcription, by giving a narrow phonetic transcription of just the five color words (‘black’, ‘white’, ‘red’, ‘yellow’, and ‘green’), based on how your consultant says them. Do not use a dictionary for this part! Use only the recordings and your knowledge of phonetics to this point. As needed, you may ask your consultant to repeat the words, or parts of words, to help you identify sounds. You can also attempt to pronounce the words or sounds, and ask your consultant if you are saying them correctly.
Put as much detail into the transcriptions as you can, and discuss any problematic sounds you could not fully identify. I don't expect your transcriptions to be completely correct, because there are likely to be sounds in your language that we have not yet discussed in class. You are not aiming for perfection here, just demonstration of your ability to seriously apply your current phonetic knowledge to real-world speech. Include your transcriptions and discussion as part of your overall writeup, similar to the following sample partial writeup:
- ‘black’ [ˈtʰʊkɑ] or maybe [ˈtʰʌkɑ]
- ‘white’ [ˈbɛldi] or maybe [ˈpɛldi]
- ‘red’ [ˈgɑmo] or maybe [ˈkɑmo]
- ‘yellow’ [ʃife]
- ‘green’ [jote]
The first vowel in [ˈtʰʊkɑ] ‘black’ is hard to identify. It doesn’t really match either [ʊ] or [ʌ] in English; it sounds like something in between. It’s definitely not [u]. The consultant's lips appear to be slightly rounded sometimes when pronouncing this word, but not always.
The initial voiced stops [b] and [g] in [ˈbɛldi] ‘white’ and [ˈgɑmo] ‘red’ sound weird, not like English [b] and [g]. Could they be unaspirated voiceless stops instead, [p] and [k]?
The initial [ʃ] in [ʃife] ‘yellow’ is similar to English [ʃ], but it's definitely not exactly the same sound. We can't really tell exactly what is going on here. The speaker's lips do not appear to be as rounded as for English [ʃ], so maybe that is a crucial difference.
In this step of the project, you will be working out the consonant system of your language, both phonetically and phonologically. You will need to work with any materials you've found about your language as well as your consultant to determine the patterning of consonants, and record some of them in certain combinations. In your analysis, you may additionally make use of the material you recorded for the previous step, or re-record any material that was of unsatisfactory quality.
List all the consonants you understand to exist in the language. Initially this will be based on the data you recorded in Step I and any sources you were able to find on your language (making sure to cite them, and describing how you know what you know). You should also explore this topic with your consultant, e.g. by asking questions like "Can you think of any words which contain a sound like [q]?" Be aware that your consultant may interpret what you say as a slightly different sound, so have them repeat the word (and tell you the meaning), and pay attention to phonetic detail. It may be useful to bring an IPA chart and mark the sounds as you find them. Also consider that there may exist consonant contrasts which are represented in other ways than simply alternating the character, e.g. aspiration ([t] versus [tʰ]), pharyngealisation ([t] versus [tˁ]), palatalisation ([t] versus [tʲ]), etc.
For each consonant, try to find a set of words where there's a word that fits each of the following. It's possible that there are large gaps in your language with regard to where certain consonants may occur.
You'll want to have an audio recording of all of these words, so you'll probably want to record the whole session. However, this will likely not result in "clean" recordings, since the words will be embedded in sentences like "Yeah, there's the word [kʰaʊ], it means ‘cow’." To get clean recordings, you have a few possibilities. Either way, you'll probably want to make a list of the words as you go, using quick impressionistic transcriptions, and writing the meanings (e.g., by adding "[kʰaʊ] - cow" in a notebook). From there, you may do one of three things:
Document what you did for this.
Arrange the oral stop consonants into a list sorted by place of articulation, and expand it into a chart, with place of articulation as the vertical axis, with a second vertical axis of phonation type, and the contexts from the previous step (like "beginning of word", "intervocalic", etc.) as the horizontal axis. Fill in the words you were able to find. It should look something like an expanded version of the following, with the appropriate contrasts and words of your language:
|place||phonation||stop||beginning of word||intervocalic||...|
|bilabial||voiced||[b]||[biːn] ‘bean’||[əˈbʌv] ‘above’||...|
|bilabial||voiceless||[p]||[ˈpɑkɨʔ] ‘pocket’||[ˈhoʊpɪŋ] ‘hoping’||...|
Are there any gaps in the chart? Are they systematic? For example, in some languages, you may find that voiced stops never occur word-finally. Is there evidence to lead you to think that any of the stops may be in complementary distribution? E.g., in English if you had thought [p] and [pʰ] were distinct phonemes, you would find that they never occur in the same contexts, though it might be difficult to discern exactly what the pattern is from the contexts elicited here.
Choose at least six oral stops to perform an acoustic analysis on, where at least three of them occur in at least two contexts. I.e., you will be measuring stops in words from at least nine of the cells in the chart. Your six oral stops should be from a range of places of articulation and, if possible, phonation types. They should contrast nicely, so /p b t d k g/ is a good set, while /p b tʰ ɗ k ɢ/ is less optimal, because it's harder to make meaningful, systematic comparisons with so many different variables.
You will need to measure the following three properties of each stop: closure duration, voice onset time, and center of gravity of the release burst. You should measure all four recorded instances of each word. If there is an utterance missing (e.g., if the consultant only said a word three times), try to find two recordings of the same stop in the same context to measure in addition.
Closure duration only need be measured in a context where it's easy to see where the stop begins and ends, such as between vowels, between a fricative and a vowel, or at the end of a word if the release burst is detectable (not common in non-careful English, but common enough in other languages). Use the same kind of environment for all of your oral stops.
Voice onset time should be measured for each of the six stops in each context it occurs in, from the beginning of the stop's release to the beginning of voicing (usually the first blue glottal pulse mark found by Praat). To get the glottal pulses to appear in Praat, select Show pulses from the Pulses menu in the sound viewing window. Note that for (partially) voiced stops, the VOT will be negative, since the first pulse will occur before the release, and for fully voiced stops, the VOT will be negative with the same magnitude as the closure duration.
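The arithmetic behind these two duration measurements can be sketched as follows, assuming you have noted three time points (in seconds) from the Praat editor for each token. The function names and the example times are hypothetical, chosen only to show the sign convention.

```python
def closure_duration_ms(closure_start_s, release_s):
    """Closure duration: from the start of the closure to the release, in ms."""
    return (release_s - closure_start_s) * 1000.0


def voice_onset_time_ms(release_s, voicing_onset_s):
    """VOT: from the release to the first glottal pulse, in ms.

    Negative when voicing starts before the release (voiced stops).
    """
    return (voicing_onset_s - release_s) * 1000.0


# Hypothetical aspirated stop: voicing begins after the release, so VOT is positive.
print(round(voice_onset_time_ms(release_s=1.250, voicing_onset_s=1.310), 1))

# Hypothetical fully voiced stop: voicing begins at the start of the closure,
# so VOT is negative with the same magnitude as the closure duration.
print(round(closure_duration_ms(closure_start_s=2.000, release_s=2.085), 1))
print(round(voice_onset_time_ms(release_s=2.085, voicing_onset_s=2.000), 1))
```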
The center of gravity of the release burst should be measured for each stop consonant in each context. Use the cursor to highlight the entire release burst, and obtain a spectral slice by selecting View spectral slice from the Spectrum menu. This will open a window showing the spectral slice and create a new object in your Praat Objects. Select this spectral slice, and then select Get centre of gravity from the Query menu.
You don't need to measure center of gravity of the release bursts if you don't want to—but know that this is a characteristic of stops.
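For those curious about what the center of gravity represents, it is a weighted average of frequency, where each frequency is weighted by the spectrum's magnitude raised to a power (2 by default in Praat). The sketch below computes this with a plain-Python discrete Fourier transform so that it is self-contained; it is not Praat's implementation, and its values will not match Praat's exactly because no windowing is applied. The function name is hypothetical.

```python
import cmath
import math


def centre_of_gravity(samples, sample_rate, power=2.0):
    """Spectral centre of gravity in Hz: sum(f * |S(f)|^p) / sum(|S(f)|^p)."""
    n = len(samples)
    weighted_sum = 0.0
    total_weight = 0.0
    for k in range(n // 2 + 1):  # bins from 0 Hz up to the Nyquist frequency
        # Naive DFT for bin k (slow, but fine for a short release burst)
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples))
        weight = abs(s) ** power
        weighted_sum += (k * sample_rate / n) * weight
        total_weight += weight
    if total_weight == 0:
        raise ValueError("signal is silent; centre of gravity is undefined")
    return weighted_sum / total_weight


# Sanity check with a synthetic 1000 Hz sine: all energy sits at one frequency,
# so the centre of gravity should come out very close to 1000 Hz.
tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(400)]
print(centre_of_gravity(tone, sample_rate=8000))
```

A burst with most of its energy at high frequencies gives a higher center of gravity than one with energy concentrated lower down, which is why the measure is useful for comparing stops and fricatives.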
You should end up with a chart like the following for every oral stop in each context you examined (with empty closure duration columns in the contexts where you did not measure it):
|closure duration||voice onset time||center of gravity (Hz)|
Be sure to recheck any anomalous measurements. Discuss your results, including any notable patterns (comparing across whichever features possible, such as context, phonation type, place of articulation, etc.) and how the measured properties of the sounds compare to our expectations.
In this section you will perform an acoustic analysis on all the voiceless fricatives of your language. You should measure fricative duration and fricative center of gravity over the entire duration. These measurements should be presented in tables like the following, one for each fricative in each context:
Optional: If you're feeling adventurous, you can include images of the spectral slice for each fricative. To get an average spectral slice for all four instances of the same fricative, you can cut a large section of the same duration out of each fricative and paste them side-by-side in the same sound file. Then, select the entire new “mega-fricative” and generate a spectral slice. The resulting spectral slice will be a reasonable average of what the individual spectral slices would have been.
Be sure to recheck any anomalous measurements. Discuss your results, including any notable patterns (comparing across whichever features possible, such as context, place of articulation, etc.) and how the measured properties of the sounds compare to our expectations.
What were the remaining consonants that appeared in the words you've recorded? Describe in clear, coherent prose any sounds from the consonant phonemes in your language that you have trouble identifying precisely or that you find particularly interesting. Give as much detail about these sounds as you can, including whatever acoustic information you have been able to glean with Praat from the recordings you’ve made.
Choose at least five pairs of consonants, and present an analysis of what kind of distribution they are in (e.g., contrastive, complementary distribution, etc.), citing evidence from your word list or measurements that supports your analysis.
Summarise all the contrastive consonants you have identified (stops, fricatives, and everything else) in an IPA-like consonant chart, making note of any allophones in an accompanying list. Indicate somehow (making clear how) the consonants you aren't sure of.
In this step of the project, you will be conducting detailed work on the vowel system of the language you are studying, including phonetic analysis of its oral diphthongs and phonological analysis of its vowel inventory.
You can use your previously recorded material from the last two steps, re-record this material if the original material was unsatisfactory, and/or record new material. Whatever recorded material you use (including old material) for your analysis should be submitted along with your write-up.
For the measurements in this step, be sure to take measurements from as many different similarly structured words as possible (at least four), rather than just one measurement from a single word. As always, give all units, and report the raw and rounded mean, as well as the standard deviation for every set of measured data.
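If you prefer to compute these summary values outside a spreadsheet, a minimal sketch with Python's standard `statistics` module might look like this. The measurements are made-up numbers, and the choice of the sample standard deviation (`statistics.stdev`) rather than the population version (`statistics.pstdev`) is an assumption; use whichever we've been using in class.

```python
import statistics


def summarise(measurements):
    """Return the raw mean, rounded mean, and sample standard deviation."""
    raw_mean = statistics.mean(measurements)
    return {
        "raw_mean": raw_mean,
        "rounded_mean": round(raw_mean),
        "stddev": statistics.stdev(measurements),  # assumption: sample (n - 1) version
    }


# Hypothetical F1 measurements (Hz) from four tokens of one vowel
f1_hz = [395.2, 410.7, 401.9, 420.3]
summary = summarise(f1_hz)
print(f"F1: {summary['rounded_mean']} ± {summary['stddev']:.0f} Hz "
      f"(raw mean {summary['raw_mean']:.2f} Hz)")
```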
For each short (non-long) oral monophthong in your language, determine its first three formants using Praat to measure the formant values over a portion of the vowel that is at least 100ms long, is fully voiced, and is centered at the middle of the vowel. For each of the vowel categories you measure, you should have a table like the following:
| ||F1 (Hz)||F2 (Hz)||F3 (Hz)|
|mean ± stddev||407 ± 14||1470 ± 33||2487 ± 21|
If Praat continues to give you obviously unusual measurements for a particular vowel (i.e. formants significantly different from those in section 2.4 of the course packet, e.g., F1 > 1500 Hz, F2 higher for a back vowel than a corresponding front vowel, F3 significantly outside the range 2000–4000 Hz, etc.), even after adjusting the formant settings, try using spectral slices instead. To do this, get the spectral slice for the selected region as in Homework #3. Then use LPC smoothing on the spectral slice to generate an LPC-smoothed slice, asking Praat to find 20 peaks. The smoothed slice should give you clear peaks at reasonable values for F1, F2, and F3. If not, try fine-tuning the setting for LPC smoothing to produce more or fewer peaks, until you get reasonable results. Ignore the apparent peak at 0 Hz; this is just the mathematical distortion inherent to the LPC calculation. If you collect any formant values with LPC smoothing, be sure to describe what you did carefully and thoroughly.
Using the rounded means of F1 and F2, plot all of the vowel phonemes in a formant plot as you did for Homework #3. Around each plotted vowel, draw an ellipse using the standard deviations of your measurements of F1 and F2 as the radii of the ellipse, as in the diagram to the right. Don't worry about drawing an ellipse if your standard deviation is so small that your ellipse would be indistinguishable from a dot. You will probably want to draw the ellipse manually, which means either exporting the vowel plot and using an image manipulation program, or printing it out, using a writing implement, and scanning it.
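If you would rather generate the ellipse than draw it by hand, the outline can be computed from the same four numbers. The sketch below returns points on an axis-aligned ellipse whose radii are the F2 and F1 standard deviations; the function name is hypothetical, and the example values are the F1 and F2 figures from the sample table above. Plotting the points (e.g., in a spreadsheet or any plotting tool) is left to you.

```python
import math


def ellipse_outline(mean_f1, mean_f2, sd_f1, sd_f2, n_points=72):
    """Return (F2, F1) points tracing an ellipse centred on the vowel's mean.

    The horizontal radius is the F2 standard deviation and the vertical
    radius is the F1 standard deviation. The first point is repeated at
    the end so the outline closes.
    """
    points = []
    for i in range(n_points + 1):
        angle = 2 * math.pi * i / n_points
        f2 = mean_f2 + sd_f2 * math.cos(angle)
        f1 = mean_f1 + sd_f1 * math.sin(angle)
        points.append((f2, f1))
    return points


# Example with F1 = 407 ± 14 Hz and F2 = 1470 ± 33 Hz
outline = ellipse_outline(mean_f1=407, mean_f2=1470, sd_f1=14, sd_f2=33)
for f2, f1 in outline[:3]:
    print(f"F2 = {f2:.1f} Hz, F1 = {f1:.1f} Hz")
```

Remember that formant plots conventionally have both axes reversed, so the points should be plotted on the same reversed axes as the vowel means.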
You should turn in the recordings you measured, your spreadsheet, your formant plot, and a list of the words you measured for each vowel token and which file it's in (e.g., a bunch of lines like "[ɑ] #1 — [ˈkɑmo] — awords.flac, word #3").
Try to work out what the full vowel inventory of your language is. Are there any minimal pairs or near minimal pairs (in terms of vowels) in your data? Do any vowels appear to be relegated to specific segmental (e.g., after palatals) or prosodic (e.g., in stressed/unstressed syllables) contexts?
Aside from the short monophthongs, what additional features exist in the language? Are there diphthongs in the language? Besides vowel quality, can you say anything about vowel length, phonation types, nasalisation, tones, etc.? What do sources say about what's contrastive in the language? If none of these aspects of vowels are contrastive, can you find evidence for them being allophonic, e.g. like nasalisation or length in English? Do these topics affect all vowels equally (e.g., do all the vowels have nasalised counterparts?)? Please try to explore at least one of these topics in as much depth as you can, e.g. by providing examples from your data (in transcription) of the phenomenon and whatever you could work out, including your evidence (phonetic or phonological) for it.
For this part, write up your findings by presenting a full vowel chart (not necessarily acoustic, but based to some extent on your findings of the acoustic measurements in the previous section) followed by examples and concise explanations of your evidence for both the short monophthongs and any additional features. Try to apply minimal featural representations to the vowel inventory: do your best to show what features are necessary to contrast each vowel from the rest, and discuss what's problematic to the theory of features we've been using in class. Discuss not just what you think might exist, but what you're not sure about, including any other interesting (or troubling) aspects of the data you collected.
Your goal is to give a presentation of about 10 minutes (and no more than 15 minutes) on the phonetics and phonology of the language you've been working on for your final project. Your presentation should satisfy the following requirements:
Be creative, and feel free to bring in any outside knowledge or skills to enhance your project. For example, if you are adept at statistical analysis, you could do more rigorous analysis of results beyond the simple standard deviation method we've been using in class (such as variance, outliers, statistical significance, correlation, etc.).
Your group's grade on this lab will be determined by your thoroughness, accuracy, coherence, analysis, ingenuity, and overall quality of presentation; your individual grade will be based on your group's grade and your portion of the presentation. Note: this is a phonetics and phonology course, so you are not concerned with the writing system, the syntax, the sociopolitical climate, etc. Don't spend significant time on such topics; the vast majority of your presentation should be devoted to the sounds and sound system of your language.
The official write-up for your final project should be a coherent, fully-formed report that synthesizes all previous steps, including the presentation. There is no strict page limit, but a typical final project will be about ten pages long (a few pages more or less is fine), depending on how extensive and detailed your analysis is, and especially how many graphics you include.
Your report should be structured as follows:
As with your presentation, be creative, and feel free to bring in any outside knowledge or skills to enhance your project. Your group’s grade on the project will be determined by your thoroughness, accuracy, coherence, analysis, ingenuity, and overall quality and polish of the write-up.
If you're working in a group, you must add one of the additional steps outlined here for each additional person. I.e., a group of two must add one of these steps, and a group of three must add two.
Identify a phonologically contrastive feature of the language and perform an acoustic, articulatory, or perceptual analysis of this feature. The feature may be anything we haven't done already in one of the assignments, or the implementation of a phonological process. Options may include tone, stress, palatalisation, vowel harmony, geminates, etc. Acoustic analyses are by far the easiest to perform and interpret (and it's all I've trained you to do), so if you want to do an articulatory or perceptual analysis, please talk to me first. The words you record for this and the type of analysis you perform will depend on the process, and I encourage you to talk to me about this as well.
Investigate the phonotactics of your language. What syllable structures are allowed? What consonant clusters can appear in onsets, codas, and across syllable boundaries? Is vowel hiatus allowed? Are there required or preferred root shapes (number of syllables, must be C-initial or V-final, etc.)? Are there co-occurrence restrictions for certain segments? If not strict rules, what about tendencies (i.e., do a statistical analysis of all the words you've collected)? Are round vowels and labial consonants likely to co-occur? Do certain segments only ever show up in certain positions? Does your language count moras? What's the stress pattern of the language? Anything else interesting?
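As one possible starting point for looking at tendencies, you could tally the consonant/vowel shapes of the words you've collected. The sketch below is only an illustration: it uses the five color words from the sample writeup in Step I, assumes you have already split each transcription into segments by hand (stress marks omitted), and assumes the listed vowel set. It produces CV skeletons, not syllabifications.

```python
from collections import Counter

# Assumption: the vowels of the sample language from the Step I writeup
VOWELS = {"ʊ", "ɑ", "ɛ", "i", "o", "e"}

# Transcriptions pre-split into segments (so that e.g. "tʰ" counts as one consonant)
WORDS = {
    "black": ["tʰ", "ʊ", "k", "ɑ"],
    "white": ["b", "ɛ", "l", "d", "i"],
    "red": ["g", "ɑ", "m", "o"],
    "yellow": ["ʃ", "i", "f", "e"],
    "green": ["j", "o", "t", "e"],
}


def cv_skeleton(segments, vowels=VOWELS):
    """Map each segment to 'V' if it is in the vowel set, otherwise 'C'."""
    return "".join("V" if segment in vowels else "C" for segment in segments)


def shape_counts(words):
    """Count how many words have each CV skeleton."""
    return Counter(cv_skeleton(segments) for segments in words.values())


for shape, count in shape_counts(WORDS).most_common():
    print(shape, count)
```

With a real word list, the same tallying approach extends to other questions, such as which consonants appear word-finally or which clusters occur.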
Explore the active phonological processes in your language. How do various phonemes surface in different environments? Are these simply phonetic changes (like how /n/ surfaces as [n̪] in English before a dental segment, as in tenth [tɛn̪θ]), or are they alternations of otherwise contrasting phones? What are the conditioning environments? Do they occur at morphological boundaries, morpheme-internally, or both? Are they phonetically natural? Is there reduplication (full or partial)? Tone sandhi? Infixation? Find a few interesting phenomena, and come up with an analysis of each one, providing your evidence for the alternations you describe.