Google Search: What are you trying to autocomplete?

Wired magazine had a good article on offensive Google autocomplete answers to questions like “Are Jews…” or “Blacks/Feminists are….”

There have been lots of good articles on built-in bias in different algorithms such as evaluations of credit worthiness which perpetuate unconscious bad assumptions already built in to traditional financial evaluations.

But…what is the question?

In this case though, I wonder if the customer isn’t partly to blame. To me, the very question “Are XXX…” sounds like a non-XXX person trying to check on an aspect of XXX culture. A benign scenario could be to clarify a stereotype or just to learn more about the XXX culture. I can even see people in the community wanting to search for answers to questions they may have.

But there are plenty of negative reasons people are looking up information about “those” XXX people out there. The fact that Google is prone to bring up offensive stereotypes in the answers suggests a lot of negative searching is happening under the umbrella “Are XXX…” Even today, I asked the question “Are XXX…” for various groups and got some odd answers like “Are Pennsylvanians rude?” (or are they just weird?)

Google search for Are Pennsylvanians with autocomplete answers rude/weird

Can you ask this instead?

I do use Google to look up cultural information about all sorts of XXX cultures, but I confess I haven’t run into this problem. For one thing, I rarely use words like “are/is” in my search terms. Instead I use nouns and adjectives, which are what I’m really interested in. It’s up to you if you want to enter something neutral or more loaded.

In terms of searching, I had thought of “to be” as a semantically empty (although grammatically important) verb that wouldn’t affect my search results one way or another. Apparently that verb is more powerful than I thought.

The Tennis Ball Color is…5GY (Chartreuse)

Tennis ball in standard color
Tennis ball used in 2011 Japan Open. Photo by Christopher Johnson. Licensed by Creative Commons.

Today the most recent color debate is about those bright fluorescent tennis balls – the question being are they yellow or green? The answer probably is …both.

Focal Colors

This issue points to how cultures do divide color space and supports the Berlin and Kay theory of focal color. Although humans can see thousands, if not millions of colors, most languages assign primary names for only a small percentage of them. In English, two of these color words are “yellow” and “green” (along with red, blue, orange, purple, pink, white, black, gray and brown).

Of course these colors are umbrella terms for a range of colors. For instance, we may speak of pine green (dark and a little blue), sea green (pale green), olive green (like the green olive) and so forth. But when “green” stands alone, we may be thinking of the prototypical or focal color – a bright green associated with emeralds, leaves or “Kelly green”. This green is what is normally seen in national flags, corporate logos or many sports team logos.

Similarly, although yellow comes in different shades including mustard (slightly darker), lemon (slightly paler) and saffron (with a touch of orange), the word “yellow” refers to the shade of yellow used in many national flags and logos such as John Deere (tractors). The focal colors are also what is taught to children when they are exposed to color words as can be seen in the sample images below.

What About the Tennis Ball?

An answer that I think many people have guessed is that the color of the fuzzy round object is right on the mental border between a bright yellow and a yellowish pale spring green. If “yellow” and “green” are the only available options, it appears some think yellow and others green. To be clear, the International Tennis Federation (ITF) (and tennis player Roger Federer) call this color “yellow”, but if you want an ITF approved model, it may be green.

Another factor to consider is that lighting may make a tennis ball appear greener in some photos and yellower in others. Even in the Japan Open photo, the ball appears yellower in direct light and greener in the shadow.

For me, I couldn’t really classify this color as either yellow or green…just fluorescent and perhaps fluorescent chartreuse. Interestingly a lesson artists have to learn about color is that 1) there are a lot of them and 2) most professional color wheels have lots of color divisions on the edge, often including chartreuse (or 5GY in the Munsell system).

From a linguistic point of view, this does show what happens when the color of an object sits on a mental color border, and apparently it’s not pretty.

P.S. Japanese Blue Traffic Lights

Ever heard that Japanese speakers label the Go traffic light as “blue” (ao) even when it’s the same color as U.S. traffic lights? There are differences between the English and Japanese color systems, but if you look closely, you’ll see that traffic light “green” is actually pushed towards a bluer (cyan) shade. In some photos, the correct color may be “cyan.”

Some Near Rhymes from Nashville

A near rhyme refers to words paired in songs or verse which sound similar, but not quite enough to be called a “perfect rhyme”…at least not in Standard English. They can be interesting for revealing insights into spoken phonology, in this case the phonology of country songs.

Here are some interesting cases I found on a recent mix CD playing in my car.

Hell On Heels (Say What You Will)

In Hell On Heels (by the Pistol Annies featuring Miranda Lambert), there are lots of near rhymes. Just for context, the song chronicles the past exploits of a classic “gold digger.”

Most near rhymes have the same vowel, but end with a slightly different consonant, although usually with some phonological similarities. For example:

(1) This diamond ring on my hand
Is the only good thing that came from that man

(2) Poor ol’ Billy, bless his heart
I’m still using his credit card

Note: both /d/ and /t/ are coronal stops.

And even one actual rhyme that features two words with different spellings

(3) Then there’s Jim, I almost forgot
I ran him off, but I took the yacht

Both end with [at] in this dialect.

For me though the most interesting near rhyme is one which pairs deal /dil/ and heels /hilz/ with will /wɪl/. It would be interesting to see how the spectrograms compare here.

(4) I’m hell on heels
Say what you will
I’ve done made the devil a deal

In this case, the vowels are different even though the final consonant is the same. However, in terms of the English vowel space, they are very close – that is both /i/ and /ɪ/ are phonemically high front unrounded vowels. Interestingly, it seems like the first hell on heels is pronounced closer to [ɪ] to emphasize the rhyme, but closer to [i] or even [iə] thereafter.

Note: Another interesting case of /i/ in a near rhyme is from the Addams Family theme song which pairs scream and see’um.
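
For readers who like to see the pattern spelled out, here is a minimal sketch of the three rhyme types seen in this song. The function name classify_rhyme and the (onset, vowel, coda) table are hypothetical, and the broad transcriptions for hand, man, heart and card are my own assumptions (only deal, will and the [at] of forgot/yacht are transcribed above). It compares only the vowel and final consonants of one syllable, so it is far cruder than a real phonological analysis.

```python
# Broad, hand-made transcriptions of the rhyming syllable as (onset, vowel, coda).
# ASSUMPTION: these are illustrative values, not measurements from the recording.
SYLLABLES = {
    "hand": ("h", "æ", "nd"),
    "man": ("m", "æ", "n"),
    "heart": ("h", "ɑ", "rt"),
    "card": ("k", "ɑ", "rd"),
    "forgot": ("g", "a", "t"),
    "yacht": ("j", "a", "t"),
    "deal": ("d", "i", "l"),
    "will": ("w", "ɪ", "l"),
}


def classify_rhyme(word_a, word_b):
    """Label a word pair by comparing the vowel and coda of each syllable."""
    _, vowel_a, coda_a = SYLLABLES[word_a]
    _, vowel_b, coda_b = SYLLABLES[word_b]
    if vowel_a == vowel_b and coda_a == coda_b:
        return "perfect rhyme"
    if vowel_a == vowel_b:
        return "near rhyme: same vowel, different coda"
    if coda_a == coda_b:
        return "near rhyme: different vowel, same coda"
    return "no rhyme"


for pair in [("hand", "man"), ("heart", "card"), ("forgot", "yacht"), ("deal", "will")]:
    print(pair, "->", classify_rhyme(*pair))
```

A fuller version would also score how close two different vowels or consonants are (for example, /i/ and /ɪ/ are both high front vowels), which is exactly what this all-or-nothing comparison misses.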

Psycho Girlfriend

Another song with an interesting set of rhymes is Psycho Girlfriend by Jessie James (Decker) about a woman who has “issues” talking to her boyfriend who can’t seem to quit her.

Again, we have a great near rhyme where the vowel is the same, but the final consonant is different.

(1) Did I forget to mention
I need all your attention
Or else I’ll throw a tantrum [for it?]

This pairs both /ʌn/ in -tion and /ʌm/ in -um. Even better, all the words actually end with [ntʌN] where N stands for a nasal stop (/m/ or /n/).

And here’s another with different underlying pronunciations, but actually more closely rhyming in the colloquial pronunciation

(2) I’ll call you when you’re workin’ /i.e. working/
Over n over again

Underlyingly, the first line ends with working /wərkɪŋ/ and the second with again /əgɛn/. But phonetically, both end with [ɨn] in the actual lyrics. The vowel change is part of a larger pattern where many unstressed vowels reduce to [ɨ] in American English, causing spelling nightmares for American school children everywhere.

The last case is interesting to me because I think it’s meant to be a rhyme, but doesn’t work for me in my idiolect.

(3) Insecure, in denial
Immature, like a child

The way Jessie James sings it, both denial and child have two syllables [aj.ɨl] or [aj.l̩] (with vocalic /l/). In my grammar though, I feel like child has only one syllable, so the rhyme doesn’t quite work for me. Checking the Oxford English Dictionary pronunciations, they too transcribe child with one syllable and denial with two.

So this appears to be a dialectal difference interfering with the rhyme. On the other hand, I suspect that if I took a spectrogram of my own speech, I might find that child and denial were actually more similar than I think. The status of /ajl/ and /ajr/ syllables is interesting in general.

How to Pronounce “Gal Gadot”

With the Wonder Woman movie about to come out, it’s important to review this timely pronunciation article from Vox about how to pronounce the name of star Gal Gadot (גל גדות). As they point out, Gadot is NOT the same as the French name in the play title Waiting for Godot, but rather an Israeli name.

So bring out that final /t/ and say something like “Gal Ga-duht” /gæl gadɔt/ (stress on the final syllable). The first vowel, in Gal, is fairly close to English short A, but the o in Gadot is between English “uh” and “oh”.

I Give Arrival an A-

A good friend of mine commented that he liked how linguistics was depicted in the recent sci-fi movie Arrival, so I did feel duty bound to view the movie. The good news is that yes, the mechanics of linguistics are portrayed fairly well. Still I was a tad disappointed that some clichés, particularly the Sapir-Whorf hypothesis, are still being depicted as the most important thing about linguistics. To have the author Ted Chiang and the screenwriters focus on this means, to me, that they have missed one of the most important lessons of theoretical linguistics.

Spoiler Alerts – I will minimize this, but if you want to be completely surprised, watch the movie first. The first spoiler is – Amy Adams plays a linguist Louise who is asked to decipher an alien language when some mysterious objects park themselves in different parts of the world, including of course rural Montana.

The Good

Before I point out the clichés, I will point out the positives. Namely

  1. Linguist Louise (Amy Adams) is hired based on her “translation” expertise, including some recent work involving Farsi-speaking terrorists (Farsi is from Iran). BUT she points out that translating a language she already knows how to speak is different from translating a completely unknown language. Therefore she will need more data than a 30-sec audio clip. Duh.
    Note: The fact that she has to explain shows how little common sense some people have about how language works.
  2. When the military wonders why Louise is starting with basic vocabulary, she does a good job explaining how she needs to know basic grammar to frame the question “What is your purpose?” To ask this question, we will need to understand how to build a sentence, make sure we use pronouns correctly and more importantly, understand what they have to tell us.
    Note: This part shows how linguists focus on “grammatical crap” that makes other people’s eyes glaze over. But that’s because you can’t become fluent until this knowledge is automatic, and you have to learn how a grammar works to communicate effectively in a new language. Fortunately, most linguists begin life as grammar geeks, so we actually find this very interesting.
  3. Louise’s fieldwork followed by intense scrutiny of the language samples is pretty realistic. If you know nothing about the target language, it will take much time to decipher everything, even if the other party is fairly cooperative.
  4. The investigation team includes a physicist who comments “You approach this very mathematically.” Yes…linguistics is actually a science. We just use different math notation than calculus.

Clichés and Questions

It wouldn’t be Hollywood without a few of these.

  1. As usual the movie assumes a linguist can speak any weird combination of languages – in this case Farsi, Sanskrit (these two can go together) and Chinese. That’s sort of like assuming a random linguist can speak Polish and Swahili. It can happen, but since those two languages are fairly distant geographically, culturally and linguistically…it would be fairly unusual.
    Note: In addition to general geographic literacy, some linguistic/cultural literacy would be a good idea.
  2. In the beginning of the movie, Louise is prepared to lecture about the history of Portuguese to a large lecture hall. But which class is this? I would only expect this in the history of Romance languages…and that class rarely fills a lecture hall.
    Note: But bonus points for connecting the origins of Portuguese to the kingdom of Galicia.
  3. Louise also comments that the proto-Portuguese speakers valued their poetry and literary culture…But EVERY culture I have encountered has valued the poetry of their language. Even when a language isn’t written or isn’t used for education, native speakers understand their language’s unique charm – just ask any hip-hop or country song artist.
    Note: There is a paradox that many linguists consider all languages “equal” but also each language “special”. Still it never hurts to play a few indigenous music lyrics in class.

Major Spoiler Alerts Here

And then…Sapir-Whorf Hits Us

I was disappointed that a key plot point revolved around the Sapir-Whorf hypothesis, which maintains that language strongly influences thought. Specifically when Louise learns the alien language at a “deep level”, their different tense system causes Louise to gain the ability to see into the future. Um no.

For the record, I don’t dispute that the aliens can perceive time differently than humans. After all, they are aliens. But I don’t think learning a new language has ever affected a human that deeply. Being exposed to a new culture can be definitely life changing, and the CONCEPTS behind a foreign language’s words can be different. But grammar doesn’t have the impact people think it has.

Consider an example from my experience – I have been exposed to Spanish, a language that classifies nouns (and the adjectives that agree with them) as “masculine” or “feminine”. I understand how the system works and can properly implement it (mostly), but I have never transferred the concept to English. For instance, I can’t necessarily tell you if a fan is masculine or feminine. I’m not sure a Spanish speaker could either except by knowing what the final vowel of the word is.

And in fact the original story’s author Ted Chiang uses English tenses creatively to distinguish when Louise is having a memory from the future. In other words, if people could see into the future, the language’s tense system could make the adjustment. FWIW – Since Louise was exposed to the aliens’ foggy atmosphere at one point, I will assume that’s how she got her new time sense.


There are some subtle influences of language – such as an enhanced ability to distinguish red from orange if your language has those two color terms. On the other hand, other forms of training can override this default. A trained artist can distinguish lots of colors, including ones that may not have common words in a language.

Major Major Linguistic Spoiler

“Phonology” Questions

By focusing on Sapir-Whorf, the movie misses an interesting question about the alien language. Initially the scientists focus on the sounds the aliens make, but Louise wonders if we could communicate better by writing. It turns out that the aliens, which are vaguely squid like, can generate black ink circular signs from their tentacles. These signs float in their native white fog until they are dissolved.

For humans, language is normally spoken with writing learned later. Language can be combined with different gestural motions, which enhance the communication, but aren’t always consistent.

For the aliens, I think it’s the reverse. The signs are the primary linguistic form with audio cues enhancing communication, but not necessarily consistently. Unlike humans, the aliens don’t necessarily need tools to “write” just as humans normally don’t need tools to speak in person. With a foggy atmosphere, I could see that hovering black circles could be a more robust signal than audio alone, so that could be the main language signal.

Eventually, the scientists create an app to replicate the circles (yeah), but I would be curious to see if the circles contain words, phrases or sentences. And you don’t need time travel to understand the shape of the circle – it could definitely be a byproduct of how each tentacle ends with multiple mini tentacles in a circular formation. Circles would definitely be easier to make than a line with that anatomy. The aliens can also create sequences of circles which shows that there is in fact a linearity in their longer utterances.

This is where the good stuff lies….

Teaching Standard English…Jeopardy Style!

Some urban (and rural) school districts have quietly introduced a curriculum that teaches children who don’t natively speak Standard English to “translate” or “code switch” between their native dialect and Standard English. One teacher has turned the grammar class into a Jeopardy style review. You can see that the kids are having fun figuring out arcane grammar rules. Generally speaking, it’s a lot more motivating and effective in encouraging literacy than constantly correcting a child’s grammar.

P.S. As one educator Noma LeMoine explains, this effort has never been about “teaching” Ebonics to students, because “We don’t need to teach African American Vernacular English…They already know it.”

Linguistics for Young Readers?

I was watching one of the Turnitin Writing X Tech 2016 Webinars on Teaching the Writing Brain and I was shocked to see that the presentation included the words morphonemic as well as morphology and phonology. You mean linguistics might be useful for understanding how children need to learn to decode the written word? Shocking!

Spelling and Linguistics

FYI – The word morphonemic was related to the issue of teaching spelling. The presenter Virginia Berninger emphasized that children do need to understand that prefixes and suffixes not only affect the meaning of a word, but can also affect pronunciation (as in the first vowel of nation vs. nation+al). She also mentions another controversial word, phonics, to illustrate that English spelling (“orthography”) is supposed to be phonetically based, and she recommends that children learn the phonological structure of English spelling alongside all of our native spelling system quirks (that is, orthographic awareness).

And (OMG!) you might want to consider word origin (etymology) when teaching spelling. That’s because English borrows a foreign language’s spelling rules when it borrows the words. Linguists definitely know this, but you don’t see this mentioned as a strategy except in spelling bee competitions.

Building a Communication Bridge

For me as a linguist, the idea of teaching children phonics, word structure and matching spelling quirks to pronunciation seems fairly obvious as is the idea that writing teachers should have some linguistic training. Unfortunately linguists and more traditional “English” teachers have often seen each other as the enemy, and I will admit to mocking bad prescriptive grammatical rules. As a result, I often see many language teachers (even foreign language teachers) discuss teaching “culture” or “ideas” instead of “grammar” (As if we can’t teach both!)

While I sympathize with frustrated linguists, I have to admit we have done a terrible job of explaining how linguistics applies to real world teaching and writing situations until fairly recently. That’s why I’m so happy that a seminar for writing instructors included neurological research supporting basic linguistic analysis. Linguistics could be starting to enter the world of general academic knowledge. Even Grammar Girl sometimes mentions linguistics in a positive light (you go girl).

For linguists, I do think we need to work better to appreciate the role of traditional prescriptive rules. While it is important to understand the structure of non-Standard English dialects (e.g. African American Vernacular English (AAVE), Southern dialects, etc.), we have to acknowledge that linguists always write standard academic English in their journal articles. As with other educated speakers, linguists have learned to write and spell in a particular fashion that is at least a little bit different from their spoken forms (unless they are speaking like Sheldon Cooper from the Big Bang Theory).

Some traditional grammar instruction is needed, but we also need to help teachers understand the role of linguistics in teaching those who don’t speak Standard English at home or those who have a learning disability related to reading and writing. I hope research like this can help build that bridge.

BBC: Evolution of the “Queen’s English”

Fresh off the BBC – an interesting article on how the pronunciation of the Queen (Elizabeth II) and RP Standard British English has shifted over time.

You can definitely hear a difference in the Queen’s Christmas speeches over the decades. In the 1957 Christmas speech video, the accent sounds a little archaic, but by 2015 the Queen has the same charming accent as Helen Mirren. It’s still RP, but a more modern form of it.

I should add that the context of the Christmas speeches has changed. The 1957 speech is set up very formally with the Queen in full formal regalia. By 1968, she was dressed in a day dress and by 1986, she was broadcasting from the stables, and her accent had shifted as well.

The article also points out that the Windsor social circles have become less isolated than in decades past. Thanks to the late Princess of Wales, her children and grandchildren have much more contact outside royal residences than previous royal generations did.

Even so, it is difficult for the public to truly ascertain how the Queen speaks “colloquially”. By design, she has created a very formal persona and does not normally allow the public to see her speak except in formal speeches. Even when she is interviewed, her speech remains very formal, although this 2013 clip does show her relaxing just a bit. However, she still uses the impersonal one very frequently to describe her own daily duties.

Speeches Over the Decades

Overly Detailed Facts about the Welsh Word for corgi

As a reader of this blog, you need to know that 1) I wrote a dissertation on Celtic mutations and 2) I own a Welsh corgi (see below). His name is Owain Glyndwr Alan Jackson Cooperlee Corgi McCay Pyatt (or Glyndwr for short). Over the years, I have become aware of the meaning of the word corgi and some related words that linguists, Indo-Europeanists, Celticists and corgïsts may appreciate. The rest of you may want to move on.

extremely fluffy corgi with paws pushed out

Corgi is a Compound

Most corgi owners are aware that corgi is from Welsh and literally means ‘dwarf dog’ (cor ‘dwarf’ + ci ‘dog’), a reference to the short legs. In fact corgis literally have dwarfism in their legs which is why you have to be careful how much they bound about, especially as they get older.

You may have noticed that although corgis are a type of ci ‘dog’, they are not a *corci. The ci ‘dog’ element undergoes the Welsh soft mutation changing c /k/ to g /g/. How Welsh!
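
For the rule-minded, the mutation in corgi can be sketched as a small lookup. This is a minimal illustration only: the function names soft_mutate and compound are hypothetical, the table covers just the standard soft mutation of word-initial consonants, and it assumes the second element of a compound always mutates, which real Welsh compounds do not always do.

```python
# Standard Welsh soft mutation of word-initial consonants.
# ASSUMPTION: a simplified table for illustration; "g" is deleted entirely.
SOFT_MUTATION = {
    "ll": "l", "rh": "r",
    "p": "b", "t": "d", "c": "g",
    "b": "f", "d": "dd", "m": "f",
    "g": "",
}

# Digraphs that count as single letters in Welsh and do not soft-mutate.
UNMUTATED_DIGRAPHS = ("ch", "dd", "ff", "ng", "ph", "th")


def soft_mutate(word):
    """Return the word with its initial consonant soft-mutated, if it has one."""
    if word.startswith(UNMUTATED_DIGRAPHS):
        return word
    # Check the two-letter initials (ll, rh) before single letters.
    for initial in ("ll", "rh"):
        if word.startswith(initial):
            return SOFT_MUTATION[initial] + word[2:]
    first = word[:1]
    if first in SOFT_MUTATION:
        return SOFT_MUTATION[first] + word[1:]
    return word  # vowels and other consonants are unchanged


def compound(first, second):
    """Join two elements, soft-mutating the second one."""
    return first + soft_mutate(second)


print(compound("cor", "ci"))    # cor + ci, with c mutated to g
print(compound("cor", "coed"))  # cor + coed 'trees'
```

With this table, cor + ci comes out as corgi rather than *corci, and a vowel-initial second element such as iarll ‘earl’ is left unchanged.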

Corgis have an /n/-Stem plural option

The most common plural of corgi is corgwn /korgun/ which basically incorporates the plural cŵn /ku:n/ ‘dogs’ (note that Welsh w is always the vowel /u/ when not with another vowel). The plural shows that the Welsh dog word is actually related to Latin canis, French chien and even English hound. The root also appears in part of the name for yoga downward dog which is a svanasana (śvan- + asana, lit. a ‘dog asana’). What about the Celtic language Old Irish? The word for dog in Old Irish is cú, but in other case forms it is con-, a common name element in Irish.

But…Welsh plurals are not always regular. The singular corgi can also be plural corgïaid, at least according to the Geiriadur Prifysgol Cymru. Because you can’t pin a corgi down.

Don’t Forget the Ladies

Female corgis have their own Welsh words too. The word for a female dog (or “bitch”) in Welsh is gast, and sure enough you can own a corgast or a coriast. Or if you object to the term “bitch”, you can have a corgïes /korgiɛs/ where -es is a generic feminine ending. By the way, this ending makes me an Americanes “American woman”.


I have determined that corgine should be used to indicate the high state of being a corgi. Not that being an ordinary canine is a bad thing, but not all English speakers understand the specialness of being a dog. We should not forget that legends tell us that corgis were ridden by fairies but given to humans in gratitude to a human who fixed a carriage (and other dog breeds have similar origins I assume).

In any case, the adjectival form of corgi is corgïaidd /korgiajð/ (not to be confused with the alternate plural corgïaid /korgiajd/). The dd is “soft th” or /ð/ in Welsh.

Other Little Things

The prefix cor can be found in other Welsh words, notably corgoed ‘dwarf tree’ and coriarll ‘viscount’ or literally ‘little earl’. The Welsh are very productive and clever compoundists.

One Could Use Singular They, You Know

A question that I am sometimes asked as a linguist is why English can’t adopt a gender neutral pronoun alongside he, she and it. The irony is that English actually already has two options available, but they are rarely mentioned as being acceptable alternatives.

Singular Impersonal They

Any linguist worth their salt will tell you that colloquial English widely uses singular impersonal they as a common substitute for an unspecified person of any gender. This version of they shows singular agreement as can be seen in the examples below.

  • “A football player with a head injury must be cleared by a doctor before they can return to the game.”
  • “A person who doesn’t watch the news has only themself to blame if they are caught in the rain without an umbrella.”
  • “A person can’t help their birth” (Vanity Fair, William Makepeace Thackeray, 1848)
  • “There’s not a man I meet but doth salute me As if I were their well-acquainted friend” (Shakespeare, The Comedy of Errors, Act IV, 1594)

The examples, which include Thackeray and Shakespeare, show that this construction has been in the language for many centuries, yet few advocate its use in Modern English.

Impersonal One

Another classic impersonal pronoun is one as in “One must be careful to watch the news on a regular basis.” (Thanks Linguistics Girl for this Reminder). And yet one rarely sees this pronoun mentioned.

I believe there are some reasons why these pronouns are often forgotten, but I will address that more next week.