A fairly old accessibility recommendation is that you insert a tag to indicate a language change. For instance, if I were to write a Welsh sentence Hanner paint o gwrw os gwelwch yn dda (or ‘Half a pint of beer please’), it would be tagged something like the code below.
Sample Lang Tag
<cite lang="cy">Hanner paint o gwrw os gwelwch yn dda</cite>
Note that this assumes that you’ve included an
<html lang="en-us"> tag in your code…which many systems do these days!
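To see how the two pieces fit together, here is a minimal sketch of a complete page. The title and the surrounding English sentence are placeholders of mine, not part of any real site; only the two lang attributes come from the discussion above.

```html
<!DOCTYPE html>
<html lang="en-us">
  <head>
    <meta charset="utf-8">
    <title>Ordering at the pub</title>
  </head>
  <body>
    <!-- This paragraph inherits lang="en-us" from the html element -->
    <p>
      To order in Welsh, say
      <!-- lang="cy" overrides the inherited language for this phrase only -->
      <cite lang="cy">Hanner paint o gwrw os gwelwch yn dda</cite>.
    </p>
  </body>
</html>
```

Everything outside the tagged phrase falls back to the language declared on the html element, which is why the page-level tag matters.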
But someone asked an interesting question – do pop culture phrases like ¡Hasta la vista….baby! have to be tagged as “Spanish”? Hmm. This isn’t really an accessibility question so much as a linguistics question: when does a word stop being a “borrowing” and become part of the English language? It actually happens in stages. We all realize that taco, señor, and jalapeño are Spanish words, but so are lasso, canyon, and rodeo, not to mention Arizona and Colorado.
But we don’t realize that lasso (cognate with lace from French) is Spanish partly because we have “nativized” the pronunciation (there is no “short a” in Spanish). The more recent borrowings tend to resemble Spanish a little bit more.
Is it in the Dictionary Yet?
In theory, one reason to tag a switch in language is to switch pronunciation dictionaries. Clearly this would be ridiculous for “lasso” (or “lahso” in Spanish), and might be overkill for “taco”, which does have an entry in the English dictionary (I mean the official ones published as reference materials)…and therefore is likely to be in the screen reader’s list of words.
Unfortunately, I don’t think Hasta la vista is in the official dictionary…partly because dictionaries generally don’t include phrases. The only word that will be in the English dictionary is “vista”, but with its own English pronunciation, which is different from the Spanish one. And since the English word “baby” has now intruded, you can end up with nested lang attributes, as in:
<cite lang="es">¡Hasta la vista....<span lang="en">baby!</span></cite>
Isn’t compliance fun?
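Putting the dictionary test into practice might look something like the sketch below. The sentence is invented for illustration, and the choice of which words to leave untagged is my own reading of the rule of thumb above rather than anything an accessibility standard spells out word by word.

```html
<!-- "taco" and "lasso" have English dictionary entries, so they stay untagged -->
<p>
  He dropped his taco, grabbed a lasso, and said
  <!-- The Spanish phrase is tagged; "baby" switches back to English -->
  <span lang="es">¡Hasta la vista... <span lang="en">baby!</span></span>
</p>
```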
But let’s go on a cynical side trip here and ask…if you tag it, will the tool recognize it? Screen readers…sort of…if you want them to (and know how to enable automatic language detection). Some search engines may have a better record (or they may be relying on ISP). The most “robust” use is in the Word spell checker. If you have an extended text in Spanish, every word will be marked as a spelling/grammatical error until you “mark” the text as Spanish (so that Microsoft can switch checkers). It’s under the Tools » Language » Set Language menu (except for Office 2007, where it’s under the Review tab).
Muy bien, but…which languages really count? We all have access to Spanish and French spelling dictionaries/pronunciation files (and German, Dutch, Italian….), but what about Welsh and Basque? There may be dictionaries, but they do not come standard. You have to hunt these out and install them. Still, at least they exist.
However, there are those languages without any dictionaries (the ones with about 10,000 speakers or fewer). There the tagging is really just metadata, but is it good metadata? From a linguistic point of view, the standard language codes are not very useful for detailed linguistic description.
The ISO-639 codes are normally really NOT language codes in a linguistic sense, but just codes correlating to a spelling system. It matters whether it is “en-US” (USA) or “en-GB” (Britain) because the two countries spell differently (and we do have minor grammatical differences). The distinction between “en-US” (USA) and “en-PR” (English spelling in Puerto Rico) is technically there, but in practice non-existent. Most English writing in Puerto Rico is probably aimed at the U.S. standard.
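As a small illustration of the spelling point, the same sentence could be tagged for either standard. The sentences are my own invented examples; the tag itself is just a language part (“en”) followed by a country part (“US” or “GB”).

```html
<!-- Language part "en" plus country part "US": American spelling expected -->
<p lang="en-US">The color of the theater program</p>

<!-- Same language part, country part "GB": British spelling expected -->
<p lang="en-GB">The colour of the theatre programme</p>
```

A spell checker that honors the tag would, in principle, flag “colour” in the first paragraph and “color” in the second.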
As the last example shows, the original method of specifying codes just by country has some problems. Fortunately, the standards groups are working on it. Still, the new codes, which may be better, are rarely recognized by the vendors (or at least there is a MAJOR time lag).
So what are we tagging and does it matter? For “major languages” yes it does matter. For lots of other uses…probably not. I could tag (and often I do), but at the end of the day, it’s the visible text identifying the language/dialect that matters the most.