* Lecture notes by Edward Loper
* Course: 9.59 (Psycholinguistics)
* Professor: Ted Gibson
* Institution: Massachusetts Institute of Technology

[09/13/99 10:27 AM]

> Communication
- mapping between signal <-> meaning
- usually this mapping is (fairly) arbitrary
- occasionally non-arbitrary (= iconicity), e.g., gestures, some sign gestures, onomatopoeia
- ambiguity exists for non-1-to-1 mappings between signal & meaning

> Human Language (vs. basic communication)
- human language has multiple levels of related representations
- human language has rules which organize representations within and between levels
- human language can refer to things other than the here-and-now
- human language is productive ("Bob is not an x", for any x)
- human language can recurse (x within x)
- human language has lots of ambiguity

> Levels of representation
- discourse: John was going outside. Bill stayed inside.
- sentence: John was going outside.
- word: going
- morpheme: go + ing
- phoneme: /g/
It's a related hierarchy -- each level is composed of units from the next level down.

Phones vs. phonemes vs. allophones:
- phonemes: /p/, /t/, etc. (the abstract contrastive categories, written in slashes)
- phones: [p^h], [p], [t^h] (the actual sounds, written in brackets)
- allophones of /p/: [p^h], [p]
- English has ~40 phonemes. The cross-linguistic minimum is ~10 phonemes (not many consonants); the maximum is ~140 phonemes.
- yay for IPA! :)

Why are phonemes distributed as they are? Ease of production and comprehension. E.g., all languages have stops & non-stops, since they are easy to produce and the distinction is easy to hear.

Phonemes aren't always at their target position. E.g., /h/ is supposedly a velar sound, but in /hi/ ("he") it seems to be front. Is /h/ unspecified for position, or is it assimilating? If we say /h/ by itself, it seems velar... if it were unspecified, we would expect mid-central?

Vowel features: height (high, mid, low), position (front, central, back), nasalization, rounding, tone.

Coarticulation: makes comprehension harder in some senses, but adds a degree of redundancy.

[09/15/99 10:34 AM]

Assimilation with plurals:
# /C[+cons +voice]/ __ -> /Cz/
# /C[+cons -voice]/ __ -> /Cs/
Dissimilation with plurals:
# /C[+fric +alveolar or palatal]/ __ -> /C schwa z/   (C can be sh, z, s)
Note that the dissimilation rule applies first.
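As a quick illustration of that rule ordering, here is a minimal Python sketch of the plural rules above, assuming a toy transcription in which each word is a list of segment symbols; the segment classes and spellings (e.g., "sh", "@" for schwa) are my own simplification, not the lecture's.

    # Toy segment classes; illustrative, not an exhaustive inventory.
    SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}           # +fric, alveolar or palatal
    VOICELESS = {"p", "t", "k", "f", "th", "s", "sh", "ch"}

    def plural(segments):
        """Return the (toy-transcribed) plural form of a noun stem."""
        last = segments[-1]
        # Dissimilation rule first: sibilant-final stems take schwa + /z/.
        if last in SIBILANTS:
            return segments + ["@", "z"]
        # Then assimilation in voicing: voiceless -> /s/, voiced -> /z/.
        if last in VOICELESS:
            return segments + ["s"]
        return segments + ["z"]

    print(plural(["k", "ae", "t"]))   # cat -> ends in "s"
    print(plural(["d", "o", "g"]))    # dog -> ends in "z"
    print(plural(["b", "uh", "s"]))   # bus -> ends in "@", "z"

Checking the sibilant (dissimilation) case first matters: a stem ending in /s/, like "bus", would otherwise match the voiceless rule and wrongly take /s/ instead of schwa + /z/.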
> Variation across languages
Why does variation between languages exist, and what can it tell us?

Some sounds, e.g., /m/, /k/, and /y/, are more common across languages than others. This could be because they are easy to tell apart and easy to produce.
- Compare /t/ and /s/: they differ only in +/- continuant (stop vs. non-stop).
- Compare /t/ and dental /t/: they differ only in place of articulation.
The first pair is much easier to distinguish acoustically than the second -- some features are primary (e.g., continuant) and some are secondary. The claim is that there are 3 primary features: continuant, sonorant, and coronal (i.e., the tongue blade is raised).

> Nonsegmentals
Suprasegmentals: pitch contour, loudness, duration (speech rate)
Registers: falsetto, whisper, breathy voice, creaky voice

> Source-Filter Theory
The vocal cords produce a source sound; filter it through the mouth, etc., and you get the output. Coarticulation leads to lack of invariance: perception must be context-sensitive.

> The Segmentation Problem
How do we find word boundaries? It's hard to tell from a spectrogram.

[09/20/99 10:37 AM]

> Categorical Perception of Speech
Despite the fact that there are continuous differences between sounds, we group them categorically.

Other animals also have categorical perception of sounds -- this suggests that categorical perception isn't caused by our ability to speak.

Babies and habituation -- sucking responses show that they distinguish sounds.

Are categorical distinctions acquired or innate (with similarities then acquired)? Somewhere in between: some are innate, some acquired.

Dichotic listening: different materials are presented to each ear. There is a right-ear advantage for linguistic material, since language is in the left hemisphere (and a left-ear advantage for music).

Mann & Liberman -- present steady-state formants in one ear and the transitions in the other ear. Two percepts result: (i) hear it as a speech sound (sounds centered); (ii) attend to one side (the transition sounds like a chirp). You get a category shift for (i), not for (ii).

McGurk effect -- visual cues affect speech perception.

> (Auditory) Word Perception
Bottom-up information: features, phonemes.
1. priming evidence in lexical decision: semantic priming exists
2. evidence from gating (give the beginning of a word and ask subjects to complete it)
Top-down information: from words.

[09/22/99 10:35 AM]

>> Bottom-up flow of information (from the phonemes & features)
Cohort theory: a word is recognized through left-to-right activation of phonemes. The cohort narrows as incoming sounds rule out alternatives. Basically, it theorizes that we go "*" -> "k*" -> "kæ*" -> "kæt*" -> etc. as we hear a sound. This implies some sort of phonemically sorted table of words (a toy sketch follows the evidence list below).
1. Gating paradigm -- present the subject with successively longer pieces of a word, and ask what the word is (and how confident they are).
2. Lexical decision -- present the subject with a sequence of phonemes, and ask whether it's a word. There is semantic priming, and "beaker" also primes "beetle" and "bug" -- consistent with cohort theory.
3. Phoneme restoration -- present the subject with a word, replacing one phoneme with noise, and ask the subject to identify the word. A later version had 2 conditions: add noise to the phoneme, or replace the phoneme with noise. For non-words, subjects could tell the conditions apart, but for words, they couldn't. This suggests that subjects aren't just guessing.
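Here is a minimal sketch of that left-to-right narrowing, assuming a toy lexicon of invented phoneme-string transcriptions; the lexicon, the spellings, and the cohorts() helper are illustrative only, not part of the lecture.

    # Toy lexicon: transcription -> orthographic word. Spellings are invented.
    LEXICON = {"kaet": "cat", "kaept@n": "captain", "kaemr@": "camera", "kait": "kite"}

    def cohorts(phonemes):
        """Yield the shrinking candidate set after each successive phoneme."""
        prefix = ""
        for p in phonemes:
            prefix += p
            yield prefix, {word for trans, word in LEXICON.items() if trans.startswith(prefix)}

    for prefix, cohort in cohorts(["k", "ae", "t"]):
        print(prefix, sorted(cohort))
    # k    ['camera', 'captain', 'cat', 'kite']
    # kae  ['camera', 'captain', 'cat']
    # kaet ['cat']

The point is just that candidates are eliminated monotonically as phonemes arrive; the evidence under "Top-down flow" below is what this hard, purely bottom-up filtering fails to capture.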
>> Top-down flow of information (lexical effects)
Evidence against a purely bottom-up cohort theory:
1. Right-context effects -- ask subjects to repeat the stimulus they hear, using sounds at various points along a continuum between "dash" and "tash". People perceive the sound (/d/ or /t/) depending on whether the rest of the sound forms a valid word.
2. Rhyme effects in eye movements -- "beaker" primes "bug" (via "beetle"), but not "stereo" (via "speaker"), in lexical decision. But in a more natural task, such as looking at whatever matches what you hear, you do get rhyme priming effects.

TRACE theory (activation-based) allows this kind of top-down (lexical -> phoneme) influence.

> Taxonomy of writing systems
Deep vs. shallow orthography (i.e., how transparently spelling maps onto sound).

[09/29/99 10:36 AM]

> Syntactic Structure

[10/04/99 10:43 AM]

For Japanese (head-final) word order: flip everything but specifiers? So:
# (IP (NP subject) (I' (VP (modifier) (V' (argument) (V verb))) I))

[10/13/99 10:41 AM]

> Parsing in real time... Mmm... Disambiguation and garden paths and plausibility, oh my.

[10/18/99 10:42 AM]

Difficulty in understanding sentences comes not from too many incomplete dependencies, but from the distance between dependencies... OK, so how do we do this DLT (Dependency Locality Theory) thingy? Well, first mark new discourse referents.

# (ed (was ((given (an (alligator)) (for (his birthday))))))
# (1 (was ((1 1) (for 1))))
Step by step:
# (1 ...)
# (1 was ((1 ...)))
# (1 was ((1 1) ...))
# (1 was ((1 1) (for 1)))

# ((that ((the comment
#     (that (the star athlete
#         (was (receiving bribes)))))
#     (upset the coach))
#   ((was (supported (by the university official))))))
# ((that ((N (that (N (V N)))) (V N))) (V (by N)))
# ((that [(N (that (N (V N)))) (V N)]) [V (by N)])
# ((that ((N (that [(N) (V N)])) [V N])) [V (by N)])   *
# ((that ((N (that (N [V N]))) [V N])) [V (by N)])     *
# ((that ((N (that (N (V [N])))) [V N])) [V (by N)])   *
# ((that ((N (that (N (V N)))) [V N])) [V (by N)])
# ((that ((N (that (N (V N)))) (V [N]))) [V (by N)])
# ((that ((N (that (N (V N)))) (V N))) [V (by N)])
# ((that ((N (that (N (V N)))) (V N))) (V {by N}))
# ((that ((N (that (N (V N)))) (V N))) (V (by [N])))
# ((that ((N (that (N (V N)))) (V N))) (V (by N)))
# ((that ((N (that (N (V N)))) (V N))) (V (by N)))
# 0 1 0 1

# The professor who the student who I met at the party liked ate the cheese-ball.
# ((N (who_j ((N ((who_i (I ((V t_i) {at N}))))) (V t_j)))) (V N))
# 1 0 1 0 0 1+1 1 1+4 1 1

# The reporter who_i the senator attacked t_i disliked Bob
# 0 1 0 0 1 1+2 3 1

# the reporter who_i the photographer sent t_i
# 0 1 0 0 1 2+1
# to the editor hoped for a good story
# 0 0 1 4 0 0 0 1

So integration cost = the number of things intervening between the head and the adjoined item. E.g., "disliked" is 3 because the head of the NP that it's being adjoined to is "reporter", and there are 2 items between them; add one for "disliked" itself.

Cost function: the cost to adjoin to something is the distance between it and the head that it's being adjoined to, in terms of the number of discourse elements added. Note that this will be language-specific, because of ordering effects...

# CI([XP: Spec/XP X']) = CI(X')
# CI([X'_1: X'_2 YP]) = CI(X'_2) + cost(YP)
# CI([X': X YP]) = CI(X) + cost(YP)
# cost([YP: Spec/YP Y']) = cost(Spec/YP) + cost(Y')
# cost([Y'_1: Y'_2 ZP]) = cost(Y'_2) + cost(ZP)
# cost([Y': Y ZP]) = cost(Y) + cost(ZP)
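The CI/cost recursion above bottoms out in counting new discourse referents. Here is a rough Python sketch of that intervening-referent count for the "reporter" example, assuming a hand-annotated flag for which words introduce a new discourse referent; the annotation scheme and the integration_cost helper are my own simplification, not the notes' full definition.

    def integration_cost(new_ref, head_pos, dep_pos):
        """Count new discourse referents strictly between the head and the
        dependent, plus one if the dependent itself introduces a new referent."""
        lo, hi = sorted((head_pos, dep_pos))
        between = sum(new_ref[i] for i in range(lo + 1, hi))
        return between + (1 if new_ref[dep_pos] else 0)

    # "The reporter who the senator attacked disliked Bob"
    words   = ["The", "reporter", "who", "the", "senator", "attacked", "disliked", "Bob"]
    new_ref = [ 0,     1,          0,     0,     1,         1,          1,          1  ]

    # "disliked" (position 6) attaches back to the head of its subject NP,
    # "reporter" (position 1): 2 intervening referents (senator, attacked) + 1 for itself.
    print(integration_cost(new_ref, head_pos=1, dep_pos=6))   # -> 3

With this annotation, "disliked" costs 2 + 1 = 3, matching the 3 in the cost line for that sentence above.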