Monday, November 17, 2014
Every language has filler words that speakers use in nervous moments or to buy time while thinking. Two of the most common of these in English are “uh” and “um.” They might seem interchangeable, but data show that their usage breaks down along surprising geographic lines. Hmm.
[...] To uncover the geography of filler words, Grieve ran through the Twitter corpus to find how often a given American county uses “um” over “uh” and vice versa. After that, he used an algorithm known as “hot-spot testing” to smooth out the results and make them more meaningful.
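For readers curious what the counting step might look like, here is a minimal sketch in Python. It is not Grieve's actual code: the input format (a list of `(county, text)` pairs), the function name `um_share_by_county`, and the simple word-boundary regex are all assumptions made for illustration. It covers only the per-county tally of “um” versus “uh”; the hot-spot smoothing step is not attempted.

```python
import re
from collections import defaultdict

# Hypothetical pattern: matches "um" or "uh" as whole words, any case.
# It does not handle variants such as "umm" or "uhh", and it will count
# the "uh" in "uh-huh"; a real analysis would need to decide on those.
FILLER = re.compile(r"\b(um|uh)\b", re.IGNORECASE)

def um_share_by_county(tweets):
    """Return {county: share of fillers that are 'um'}.

    `tweets` is assumed to be an iterable of (county, text) pairs.
    Counties with no fillers at all are left out of the result.
    """
    counts = defaultdict(lambda: {"um": 0, "uh": 0})
    for county, text in tweets:
        for match in FILLER.finditer(text):
            counts[county][match.group(1).lower()] += 1

    shares = {}
    for county, c in counts.items():
        total = c["um"] + c["uh"]  # always > 0: entries exist only after a match
        shares[county] = c["um"] / total
    return shares
```

A share near 1 would mark an “um” county and a share near 0 an “uh” county. Raw shares like these are noisy for counties with few tweets, which is presumably why a smoothing step follows.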
The smoothed-out version has a lot to say. The regional breakdown is clear, and it doesn’t look much like other maps that try to show where some phenomenon or other is happening in the United States. Grieve said the use of “uh” looks to follow the elusive “Midland dialect,” which linguists have suspected follows the Ohio River southwest from central Pennsylvania. That accounts for most of the blue that sweeps from West Virginia all the way to Arizona. Grieve said the “uh” and “um” analysis is the first time his research has shown clear evidence of the Midland dialect.