Acoustic cues to lexical segmentation: a study of resynthesized speech

J Acoust Soc Am. 2007 Dec;122(6):3678-87. doi: 10.1121/1.2801545.

Abstract

It has been posited that the role of prosody in lexical segmentation is elevated when the speech signal is degraded or unreliable. Using predictions from Cutler and Norris' [J. Exp. Psychol. Hum. Percept. Perform. 14, 113-121 (1988)] metrical segmentation strategy hypothesis as a framework, this investigation examined how individual suprasegmental and segmental cues to syllabic stress contribute differentially to the recognition of strong and weak syllables for the purpose of lexical segmentation. Syllabic contrastivity was reduced in resynthesized phrases by systematically (i) flattening the fundamental frequency (F0) contours, (ii) equalizing vowel durations, (iii) weakening strong vowels, (iv) combining the two suprasegmental cues, i.e., F0 and duration, and (v) combining the manipulation of all cues. Results indicated that, despite similar decrements in overall intelligibility, F0 flattening and the weakening of strong vowels had a greater impact on lexical segmentation than did equalizing vowel duration. Both combined-cue conditions resulted in greater decrements in intelligibility, but with no additional negative impact on lexical segmentation. The results support the notion of F0 variation and vowel quality as primary conduits for stress-based segmentation and suggest that the effectiveness of stress-based segmentation with degraded speech must be investigated relative to the suprasegmental and segmental impoverishments occasioned by each particular degradation.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Acoustic Stimulation
  • Adult
  • Cues*
  • Female
  • Humans
  • Male
  • Middle Aged
  • Pitch Perception*
  • Semantics*
  • Sound Spectrography
  • Speech Acoustics*
  • Speech Intelligibility*
  • Speech Perception*
  • Time Factors
  • Voice Quality*