Encoding on the Internet
4: Latin 1 (ISO-8859-1)
The Latin-1 Internet Standard
"Latin 1" or ISO-8859-1 was the standard 8-bit
encoding for Western European languages on the Internet until the advent of Unicode. The name comes from:
- "Latin" – For the Latin or Roman alphabet which is
what English and other Western European languages use
- ISO – International Organization for Standardization (Geneva)
- 8859 – The number of the ISO standard for 8-bit character encodings (as opposed to
camera film speeds or other standards)
- 1 – The first part of the 8859 series of encodings
Charts of Latin 1 are available at:
NOTE: Some Web sites displaying encoding charts mistakenly refer to the entire Latin-1 character set as "ASCII" because characters #0-127 of Latin 1 are the same as in ASCII. However, characters #128-255, which include
all the accented letters, are NOT in ASCII. In a text file saved as pure ASCII,
accented letters are generally missing.
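The relationship between ASCII and Latin 1 described in the note can be checked with a short Python sketch. It relies only on Python's built-in "ascii" and "latin-1" codecs; the variable names are illustrative and do not come from any standard.

```python
# Characters #0-127 decode to the same text under ASCII and Latin 1.
low_bytes = bytes(range(128))
same_low_range = low_bytes.decode("ascii") == low_bytes.decode("latin-1")

# An accented letter such as "é" is character #233 in Latin 1 ...
e_acute_latin1 = "é".encode("latin-1")  # a single byte with value 233

# ... but it cannot be encoded in ASCII at all.
try:
    "é".encode("ascii")
    ascii_can_encode = True
except UnicodeEncodeError:
    ascii_can_encode = False

print(same_low_range, list(e_acute_latin1), ascii_can_encode)
# prints: True [233] False
```

Any character numbered above 127 behaves the same way as "é" here: it has a Latin-1 byte but no ASCII representation.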
Windows-1252 (ANSI) vs. Latin 1
In the past, Windows computers in the U.S. were based on the Windows-1252 encoding standard (also known as CP-1252 and ANSI). The Windows-1252 encoding system is almost (but not quite) identical to Latin 1.
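Specifically, Windows-1252 assigns printable characters (such as curly quotes and the euro sign) to character numbers #128-159, which have no printable characters in Latin 1; the remaining character numbers are the same in both encodings. Some entity codes (e.g. ) for the euro (€) sign were actually referring to Windows-1252 code points.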
Note: The Unicode code point for the euro (€) sign is 8364 (U+20AC).
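As a rough illustration of the difference, the Python sketch below decodes the same byte value (128) under both encodings and looks up the euro sign's Unicode number. It assumes Python's built-in "cp1252" and "latin-1" codecs; Python's "latin-1" codec treats #128-159 as invisible control characters, and the variable names are illustrative only.

```python
euro_byte = bytes([128])  # character number 128 (hex 80)

# Windows-1252 puts the euro sign at #128.
as_cp1252 = euro_byte.decode("cp1252")

# Python's Latin-1 codec maps #128 to the control character U+0080.
as_latin1 = euro_byte.decode("latin-1")

# The euro sign's Unicode code point is unrelated to 128.
euro_code_point = ord("€")

# Latin 1 has no euro sign, so encoding it fails.
try:
    "€".encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False

# Another character in the #128-159 range: #147 is a left curly quote.
left_quote = bytes([147]).decode("cp1252")

print(as_cp1252, repr(as_latin1), euro_code_point, hex(euro_code_point), latin1_has_euro)
# prints: € '\x80' 8364 0x20ac False
```

This is why text containing curly quotes or euro signs can display incorrectly when a Windows-1252 file is labeled as ISO-8859-1 and the reading software takes the label literally.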
Modern Windows technology is based on Unicode, but some older components and software may still use Windows-1252 encoding. Some people also treat Windows-1252 (ANSI) and Latin-1/ISO-8859-1 as exact synonyms, which is not correct.