Encoding on the Internet

4: Latin 1 (ISO-8859-1)

The Latin-1 Internet Standard

"Latin 1" or ISO-8859-1 was the standard 8-bit
encoding for Western European languages on the Internet until the advent of Unicode. The name comes
from:

  • "Latin" – For the Latin or Roman alphabet which is
    what English and other Western European languages use
  • I.S.O. – International Organization for Standardization (Geneva),
    www.iso.ch
  • 8859 – Designating encoding standards (as opposed to camera speeds
    or other standards)
  • 1 – The first part of the 8859 standard (later parts cover other alphabets and language groups)

Charts of Latin 1 are available at:

NOTE: Some Web sites displaying encoding charts mistakenly refer to the entire Latin-1 character set as "ASCII" because characters #0-127 of Latin 1 are the same as ASCII. However, characters #128-255, which include
all the accented letters, are NOT in ASCII. If you open a text file as ASCII,
accented letters are generally missing or garbled.
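The relationship between ASCII and Latin 1 can be checked with a short Python sketch. It relies only on Python's built-in "ascii" and "latin-1" codecs; the sample word and the variable names are illustrative choices, not part of the original tutorial.

```python
# Characters #0-127 produce identical bytes in ASCII and Latin 1.
plain = "cafe"
same_low_range = plain.encode("ascii") == plain.encode("latin-1")
print(same_low_range)  # True

# An accented letter fits in a single Latin-1 byte (233 for e-acute).
accented = "caf\u00e9"
latin1_bytes = accented.encode("latin-1")
print(list(latin1_bytes))  # [99, 97, 102, 233]

# ASCII has no position for it, so a strict ASCII encode fails.
try:
    accented.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False

# Dropping unencodable characters shows how accents "go missing".
ascii_lossy = accented.encode("ascii", errors="ignore").decode("ascii")
print(ascii_lossy)  # caf
```

The last step uses errors="ignore" only to make the loss visible; real software may instead substitute a question mark or raise an error.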

Windows-1252 (ANSI) vs. Latin 1

In the past, Windows computers in the U.S. were based on the Windows-1252 encoding standard (also known as CP-1252 and, loosely, "ANSI"). Windows-1252 is almost (but not quite) identical to Latin 1.

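Specifically, Windows-1252 assigns printable characters (such as curly quotes, dashes and the euro sign) to positions #128-159, which Latin 1 leaves without printable characters (they are reserved for control codes). The rest of the two character sets is the same. Some numeric entity codes for the euro (€) sign actually referred to Windows-1252 code points rather than Unicode ones.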
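As a rough check of where the two encodings diverge, the sketch below decodes every byte value with Python's "latin-1" and "cp1252" codecs and records the positions where the results differ. The entity &#128; is an illustrative choice (the tutorial does not name a specific entity), and the result reflects Python's codec tables and its HTML5-style entity handling, not necessarily every implementation.

```python
import html

# Collect every byte value that the two codecs interpret differently.
differing = []
for value in range(256):
    raw = bytes([value])
    latin1_char = raw.decode("latin-1")
    # Python's cp1252 codec leaves a few positions undefined;
    # treat those as "no character" (None).
    try:
        cp1252_char = raw.decode("cp1252")
    except UnicodeDecodeError:
        cp1252_char = None
    if latin1_char != cp1252_char:
        differing.append(value)

print(min(differing), max(differing))  # 128 159

# Byte 128 is the euro sign in Windows-1252 ...
euro_from_cp1252 = b"\x80".decode("cp1252")
print(euro_from_cp1252 == "\u20ac")  # True

# ... so HTML5-style parsers read the legacy entity &#128; as a euro
# sign, even though Unicode code point 128 is a control character.
euro_from_entity = html.unescape("&#128;")
print(euro_from_entity == "\u20ac")  # True
```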

Note: The Unicode code point for the euro (€) sign is 8364 (U+20AC).
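The code point above can be confirmed, and compared with how each encoding stores the character, in a few lines of Python. This is a minimal sketch using built-in codecs; the variable names are illustrative.

```python
euro = "\u20ac"

# Decimal and hexadecimal forms of the Unicode code point.
code_point = ord(euro)
print(code_point, hex(code_point))  # 8364 0x20ac

# Windows-1252 stores the euro sign in one byte; UTF-8 needs three.
cp1252_bytes = euro.encode("cp1252")
utf8_bytes = euro.encode("utf-8")
print(cp1252_bytes, utf8_bytes)  # b'\x80' b'\xe2\x82\xac'

# Latin 1 has no euro sign at all, so encoding fails.
try:
    euro.encode("latin-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False
print(latin1_has_euro)  # False
```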

Modern Windows technology is based on Unicode, but some older components and software may still use Windows-1252 encoding. Some people also treat Windows-1252 and Latin 1 (ISO-8859-1) as exact synonyms, which is not correct.
