Script Basics

The Chinese script is a logographic script structured so that each character
represents a single concept; characters are then combined
to form compound words.
Note: The script does also have a phonetic component.

Although several distinct varieties (or "dialects") are spoken in China, including Mandarin and Cantonese (Hong Kong), their speakers can all read many of the same "written words" because the script is based more on meaning than on sound.

Chinese Simplified, Chinese Traditional, Pinyin

There are several variants of the Chinese script used in different contexts.

  1. Chinese Traditional is the older form of the script and is used in Taiwan, Hong Kong and other locations outside of China, including various "Chinatowns" in the West. Chinese Traditional characters are more complex and more numerous.
  2. Chinese Simplified was developed in Mainland China (and adopted in Singapore) as a way of simplifying the older system in order to increase literacy. As part of the simplification, several Traditional characters were collapsed into one Simplified character. Although it is relatively easy to convert from Chinese Traditional to Chinese Simplified, the reverse is not always true, because one Simplified character may correspond to several Traditional characters. As a result, most systems support both Traditional and Simplified Chinese in parallel.
  3. Pinyin is the system of writing Chinese words in the Latin (English) alphabet. It was developed in the 1950s in Mainland China to help increase literacy.
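
The one-way nature of the conversion can be sketched in a few lines of Python. This is only an illustration, assuming a tiny hand-picked mapping (the names TRAD_TO_SIMP, to_simplified and traditional_candidates are hypothetical); a real converter needs a complete table and context-aware disambiguation.

```python
# Tiny illustrative mapping (an assumption for this sketch, not a full table).
TRAD_TO_SIMP = {
    "發": "发",  # "to send out, to emit"
    "髮": "发",  # "hair"
    "國": "国",
    "語": "语",
}


def to_simplified(text: str) -> str:
    """Replace each Traditional character with its Simplified form."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def traditional_candidates(ch: str) -> list[str]:
    """List every Traditional character that maps to the given Simplified one."""
    return sorted(t for t, s in TRAD_TO_SIMP.items() if s == ch)


print(to_simplified("國語"))         # 国语
print(traditional_candidates("发"))  # two candidates, so the reverse is ambiguous
```

Going from Traditional to Simplified is a plain lookup, while the reverse direction returns more than one candidate for 发 and so needs context to choose between them.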

Example Traditional vs. Simplified Chinese

The table below shows how the name for Mandarin Chinese changes between scripts and even nationalities. Note, though, that the first two characters of the form used in China (普通) are the same in both Traditional and Simplified Chinese.

Phrase "Spoken Mandarin Chinese" in Different Forms

National Variant | Literal Meaning | Trad. | Simpl. | Pinyin
Singapore | ‘Chinese language’ | 華語 | 华语 | Huáyǔ
Taiwan | ‘national language’ | 國語 | 国语 | Guóyǔ
China | ‘common speech’ | 普通話 | 普通话 | Pǔtōnghuà

Language/Dialects

See the Other Language/Dialects section for information on forms like Cantonese and Wu.

Test Sites

If your browser is configured correctly, the test sites should display the correct characters. If they are not displaying correctly, see the Browser Setup page for font and browser configuration instructions.

Font Recommendations

Both Windows and Mac (and mobile platforms) provide a set of Chinese fonts, but more decorative versions may be found through font vendors or font download sites.

Traditional Chinese Fonts by Platform

  • Windows – MingLiU, PMingLiU, Microsoft JhengHei
  • Mac OS X – Apple LiGothic Medium, LiHei Pro, Apple LiSung, BiauKai, LiSong Pro
  • Mac System 9 – Taipei, others

Simplified Chinese Fonts by Platform

  • Windows – SimSun, NSimSun, SimHei, Microsoft YaHei, others
  • Mac OS X – Hei, STHeiti Light and Regular, STFangsong, STKaiti, STSong, Kai
  • Mac System 9 – Beijing, others

Activate Input/Typing Utilities

Different Input Options

In Windows, Macintosh/iOS and Android, input options for both Simplified and Traditional Chinese are available.

You can also activate different input options for each script. Typical options include:

  • Phonetic/Pinyin – Users can type a syllable in pinyin and then select the correct character.
  • By Radical/Stroke – This allows a user to search and enter characters by radical or stroke forms.
  • Handwriting – Some systems allow users to write a character on a trackpad.
  • Additional standards may be supported.

Activate Input Utilities (Windows and Mac)

The Yabla guide "How to type Chinese using Pinyin" gives detailed instructions for activating Chinese pinyin input on Windows and Macintosh as well as iPhone and Android.

You can also view generic documentation for

Tone Marks in Pinyin

Macintosh

If you activate the Extended (ABC) Keyboard on the Macintosh, the following codes allow you to type different accented letters.

Mac Accent Codes (X = any letter)

Accent | Sample | Template
Macron | Ā, ā | Option+A, X
Circumflex | Â, â | Option+6, X
Acute | Á, á | Option+E, X
Grave | À, à | Option+`, X
Umlaut | Ü, ü | Option+U, X

Windows

A more limited set of accent codes is available if the Windows International keyboard is activated. The long mark (macron) is not available there.
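
Where a keyboard layout cannot produce a tone mark directly, the accented letter can also be built programmatically. The Python sketch below is one possible approach, not a documented workflow: it assumes the usual pinyin convention of macron, acute, caron and grave for tones 1 to 4, and the names TONE_MARKS and add_tone are hypothetical. It appends a Unicode combining mark to a vowel and lets NFC normalization compose the pair into a single character.

```python
import unicodedata

# Combining marks for the four Mandarin tones (the neutral tone has no mark).
TONE_MARKS = {
    1: "\u0304",  # combining macron
    2: "\u0301",  # combining acute accent
    3: "\u030C",  # combining caron
    4: "\u0300",  # combining grave accent
}


def add_tone(vowel: str, tone: int) -> str:
    """Attach a tone mark to a vowel and compose the result with NFC."""
    mark = TONE_MARKS.get(tone, "")  # unknown or neutral tone: no mark added
    return unicodedata.normalize("NFC", vowel + mark)


print(add_tone("a", 1))  # ā
print(add_tone("ü", 3))  # ǚ
```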

Web Development

This section presents information specific to Chinese. For general information about developing non-English Web sites, see the Encoding Tutorial or the Web Layout sections.

Historical Encodings

Unicode (UTF-8) is the preferred encoding for Web sites. GB18030, the encoding mandated in the People’s Republic of China, covers the same character repertoire as Unicode. The following older encodings may still be encountered.

  • Use Unicode (utf-8) whenever possible
  • Simplified Chinese Historic Encodings: gb18030, gb2312, gbk, others
  • Traditional Chinese Historic Encodings: big5, euc-tw, others
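
As a minimal sketch of moving legacy content to UTF-8, the Python example below decodes bytes from an older encoding and re-encodes them. The helper name to_utf8 is hypothetical, and the sample byte strings are generated in place as stand-ins for data read from an old page. It uses the gb18030 and big5 codecs from the Python standard library; euc-tw is left out because the standard library does not appear to ship a codec for it.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a legacy encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# Stand-ins for bytes that would normally come from a legacy file or page.
simplified_bytes = "普通话".encode("gb18030")
traditional_bytes = "國語".encode("big5")

print(to_utf8(simplified_bytes, "gb18030").decode("utf-8"))  # 普通话
print(to_utf8(traditional_bytes, "big5").decode("utf-8"))    # 國語
```

The source encoding has to be known or detected beforehand; decoding with the wrong codec either raises UnicodeDecodeError or silently produces the wrong characters.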

Language Tags

Language tags allow browsers and other software to process Chinese text more efficiently. Below are the recommended codes for different scripts:

  • Chinese: zh (the most generic tag, but rarely used)
  • Mandarin Chinese, Simplified Script: zh-Hans is preferred, but zh-CN may be found on older sites.
  • Mandarin Chinese, Traditional Script: zh-Hant or zh-Hant-TW (Taiwan) is preferred, but zh-TW may be found on older sites.
  • Pinyin (Mandarin): zh-Latn-pinyin for Mandarin. If the text is not Mandarin, use one of the dialect codes below.
  • Cantonese (Hong Kong): zh-HK
  • Additional tags for dialects
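
When updating older pages, the legacy region-based tags can be swapped for the preferred script-based ones. The Python sketch below illustrates the idea under stated assumptions: the mapping covers only the two legacy tags named above, and the names LEGACY_TO_PREFERRED, preferred_tag and lang_span are hypothetical helpers, not part of any library.

```python
import html

# Legacy region-based tags mapped to the script-based tags recommended above.
LEGACY_TO_PREFERRED = {
    "zh-cn": "zh-Hans",
    "zh-tw": "zh-Hant-TW",
}


def preferred_tag(tag: str) -> str:
    """Return the preferred tag for a legacy one; other tags pass through unchanged."""
    return LEGACY_TO_PREFERRED.get(tag.lower(), tag)


def lang_span(text: str, tag: str) -> str:
    """Wrap text in an HTML span that carries the preferred lang attribute."""
    return f'<span lang="{preferred_tag(tag)}">{html.escape(text)}</span>'


print(preferred_tag("zh-CN"))      # zh-Hans
print(lang_span("國語", "zh-TW"))  # <span lang="zh-Hant-TW">國語</span>
```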

Vertical Text

See the Vertical Text page for information on vertical Chinese text.

Other Chinese Languages/Dialects

About Chinese Dialects/Sinitic Languages

Different regions of China speak varieties which are traditionally called "dialects", but they are so far apart that speakers from different regions may not understand each other. Linguists usually consider these dialects to be separate but related languages and sometimes use the term "Sinitic languages".

The standard form of modern spoken Chinese is called Mandarin Chinese, but other forms include Cantonese/Yue (Hong Kong), Wu (Shanghai) and Hakka.

Language Codes

For these varieties, there are currently two standards available: the IANA standard, which adds "variety" tags to the base zh tag, and the SIL ISO-639-3 standard, which treats dialects as separate languages.

Note: A blank indicates no IANA or ISO-639-3 code registered.

Regional Chinese Codes

Variety | IANA | ISO-639-3
"Chinese" | zh | zho
Mandarin | zh-guoyu or zh-cmn | cmn
Cantonese | zh-yue or zh-HK | yue
Gan | zh-gan | gan
Hakka | zh-hakka | hak
Huizhou | | czh
Jinyu | | cjy
Min* | zh-min |
Min Bei | | mnp
Min Dong | | cdo
Min Zhong | | czo
Min-Nan | zh-min-nan | nan
Pu-Xian | | cpx
Wu | zh-wuu | wuu
Xiang | zh-xiang | hsn

* Min includes Fuzhou, Hokkien, Amoy, Taiwanese

Script and Language Tag

Most non-Mandarin Chinese documents are written in Traditional Chinese (or Simplified Chinese with additional characters), pinyin or some other Western phonetic form. To distinguish the forms, one can use script tags like wuu-Latn-pinyin (Wu Chinese in pinyin) or wuu-Hant (Wu Chinese in Traditional Chinese).
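
Tags of this shape can be assembled from their parts. The Python sketch below is only an illustration: it assumes a small subset of the ISO-639-3 codes from the table above, and the names ISO_639_3 and make_tag are hypothetical. It joins a language code, a script code and an optional variant with hyphens.

```python
# Subset of the ISO-639-3 codes listed in the Regional Chinese Codes table.
ISO_639_3 = {
    "Mandarin": "cmn",
    "Cantonese": "yue",
    "Hakka": "hak",
    "Wu": "wuu",
    "Xiang": "hsn",
}


def make_tag(variety: str, script: str, variant: str = "") -> str:
    """Join language, script and optional variant subtags with hyphens."""
    parts = [ISO_639_3[variety], script]
    if variant:
        parts.append(variant)
    return "-".join(parts)


print(make_tag("Wu", "Hant"))            # wuu-Hant
print(make_tag("Wu", "Latn", "pinyin"))  # wuu-Latn-pinyin
```

The function only concatenates subtags; it does not check that the resulting combination is registered or valid.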

