Switch Keyboards or Use Codes
Most modern Web editing tools support Unicode natively, so all you need to do is activate and use foreign-language input tools, as in the procedure below:
- Open a new blank editing window in WYSIWYG or HTML code view.
- Switch your keyboards to the appropriate script or input appropriate symbol codes.
- Input the content and save it. It will be encoded as Unicode.
The above procedure works for Sites at Penn State (WordPress), Wikispaces, Drupal and others.
Note: Some systems may need adjustments, so it’s important to test each system for non-English content.
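One way to run such a test is to paste a short multilingual sample into the HTML code view, save the page, and check that every character survives. The snippet below is only a sketch; the sample words and the choice of scripts are assumptions chosen for illustration, not a required test set.

```html
<!-- Hypothetical test content: one short phrase per script -->
<p>French: <span lang="fr">Ceci n'est pas une pipe</span></p>
<p>Russian: <span lang="ru">Привет</span></p>
<p>Chinese: <span lang="zh">你好</span></p>
<p>Arabic: <span lang="ar" dir="rtl">مرحبا</span></p>
```

If any phrase comes back as question marks or empty boxes after saving, that system may need adjustment.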
Language Tags
By default, pages in these systems are marked as either English or no language. If you insert non-English words, phrases or paragraphs, tag them as being in another language, as in the example below.
Why Add Language Tags
One reason to tag content is screen reader accessibility, so that pronunciation engines switch from English to another language. Another is to help search engines find content from different languages. Finally, you can use tags to facilitate CSS formatting for specific languages.
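As a rough illustration of the CSS point, a stylesheet can target tagged text with the :lang() selector. The italic styling below is an assumption chosen for demonstration, and whether you can add a style block at all depends on how your system is configured.

```html
<style>
  /* Applies to any element whose language is French */
  :lang(fr) { font-style: italic; }
</style>
<p>The caption reads <span lang="fr">Ceci n'est pas une pipe</span>.</p>
```

Only the French phrase would be italicized; the surrounding English text keeps the default style.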
Example Tag
For example, if you include the French sentence Ceci n’est pas une pipe (lit: "This is not a pipe"), the HTML code would be:
HTML
<span lang="fr">Ceci n'est pas une pipe</span>
Note that "fr" is a predefined ISO-639 code for French.
In current systems, language tagging must be done in HTML code view. See the Language Tag in HTML page for more information.
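The same attribute can also go on a block element when a whole paragraph is in another language. The sketch below assumes a Spanish paragraph ("es" is the ISO-639 code for Spanish); the sentence is only sample text.

```html
<!-- The lang attribute on the paragraph covers all text inside it -->
<p lang="es">Esta página está escrita en español.</p>
```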
Escape Codes to Circumvent Formatting
Because some Web site WYSIWYG editors may insert their own code automatically, you may need to use escape codes to circumvent some issues.
Circumventing Auto Linking
In some systems like Drupal, typing a full URL with the "http://" prefix (http://www.psu.edu/) will cause that URL to become an active link. If you want to display a URL without turning it into a link, you can try replacing each "/" character with its numeric escape code (&#47;).
HTML
http:&#47;&#47;www.psu.edu
Displays
http://www.psu.edu
Circumventing Entity Code Conversion
A more specialized need is to display an entity code such as &ccedil; (French ç) or &#47; (forward slash /) without it being converted to the character.
If your CMS converts entity codes to characters, try inserting the zero-width non-joiner character (&zwnj;) between the & or &# section and the rest of the code. This adds an extra character of zero width. This character is normally used in scripts like Arabic to prevent characters from joining, but it also works here.
HTML
&&zwnj;ccedil;
Displays
&ccedil;
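The same trick should carry over to numeric codes, with the zero-width non-joiner placed between the &# section and the digits. This is a sketch based on the description above and assumes your CMS leaves &zwnj; itself intact, so test it in your own system.

```html
<!-- &zwnj; sits between "&#" and the digits, so the code is not read as an entity -->
&#&zwnj;47;
```

If it works, the page should show the literal text &#47; rather than a slash.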