7

I have long felt that [f] and [s] are hard to tell apart on the phone, especially when spelling out words letter by letter. As a non-linguist (but an audio engineer), it seems to me that the frequencies needed to distinguish them usually lie above the bandwidth of the phone codec.

Today I fell foul of this in a most unfortunate way, and managed to get a parking penalty charge because the automated pay-by-phone system for the car park registered the wrong license plate.

To help me prepare my case for contesting the parking ticket, can anyone provide a citation for this fact?

Sideshow Bob

1 Answer

10

This paper may be useful for its collection of references. This paper is a single, simple read. The problem is that the spectral properties of fricatives are usually reported in terms of the frequency of the spectral peak; by that measure /f/ and /s/ are clearly different, but the peaks also lie above the phone's cutoff frequency. Another part of the black box that you're up against is that the ASR system doesn't involve a small, phonetically trained human making spectrograms, so you'd need to come up with a reason to think that the system they used has problems. (It isn't generally the case that nobody can distinguish [s] and [f] on the phone, and I think that some ASR systems introduce problems of their own, so learning how ASR works could be helpful.)
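To make the bandwidth point concrete, here is a minimal sketch of the arithmetic. It assumes a narrowband phone channel sampled at 8 kHz with a nominal 300-3400 Hz passband, which is typical of traditional telephony but may not match the system the car park used. The peak frequencies are hypothetical placeholders chosen only for illustration; they are not figures taken from either paper.

```python
# Illustrative only: every number here is an assumption, not a measurement.
SAMPLE_RATE_HZ = 8000        # assumed narrowband telephone sampling rate
PASSBAND_HZ = (300, 3400)    # assumed nominal narrowband voice passband

# Hypothetical spectral-peak frequencies, NOT values from the cited papers.
HYPOTHETICAL_PEAKS_HZ = {"f": 7500, "s": 6500}


def nyquist_limit(sample_rate_hz):
    """Return the highest frequency a signal sampled at this rate can represent."""
    return sample_rate_hz / 2


def peak_survives(peak_hz, passband_hz=PASSBAND_HZ):
    """Return True if a spectral peak falls inside the channel passband."""
    low, high = passband_hz
    return low <= peak_hz <= high


def surviving_peaks(peaks_hz, passband_hz=PASSBAND_HZ):
    """Map each sound to whether its spectral peak is inside the passband."""
    return {sound: peak_survives(hz, passband_hz) for sound, hz in peaks_hz.items()}


if __name__ == "__main__":
    print("Nyquist limit:", nyquist_limit(SAMPLE_RATE_HZ), "Hz")
    for sound, kept in surviving_peaks(HYPOTHETICAL_PEAKS_HZ).items():
        print(f"/{sound}/ peak inside passband: {kept}")
```

Under these assumptions neither peak survives the channel, which is the sense in which the peak-frequency measure does not help on the phone. It does not show that the two sounds become indistinguishable, since cues other than the peak location may remain below the cutoff.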

[EDIT]

The citation for the Jongman paper is Jongman, A., Wayland, R., & Wong, S. (1998). 'Acoustic characteristics of English fricatives: I. Static cues'. Working Papers of the Cornell Phonetics Laboratory. 12: 195-205. The current URL is http://conf.ling.cornell.edu/plab/paper/wpcpl12-Jongman.pdf.

user6726
  • 3
    +1. I would add that, while not all [s] and [f] sounds are ambiguous to humans on the phone all of the time, they are more likely to be confused in the very context in question--word-finally. See my answer to the related question linked in my comment above. – musicallinguist Apr 15 '15 at 02:27
  • 1
    Note, too, that in ordinary speech the context provides strong cues for distinction; for instance, it is hard to imagine a context in which sail and fail are equally likely. But reciting an arbitrary string of letters provides no context which might resolve an ambiguity. – StoneyB on hiatus Apr 15 '15 at 16:13
  • The second paper is no longer available. Can you provide a link to a different source? – tmh Dec 05 '15 at 18:17