Bazsites.com Character Encoding
On the Web
- RFC 1842 - ASCII Printable Characters-Based Chinese Character Encoding for Internet Messages - Describes HZ-GB-2312, a 7-bit Chinese character encoding designed for use in electronic mail and network news messages over the Internet.
- RFC 1922 - Chinese Character Encoding for Internet Messages - Describes the ISO-2022-CN and ISO-2022-CN-EXT encodings for transporting Chinese characters in Internet services such as electronic mail, network news, telnet and the World Wide Web. It also discusses the common CN-GB, CN-Big5 and ISO-10646 encodings.
- RFC 2237 - Japanese Character Encoding for Internet Messages - Describes the ISO-2022-JP-1 encoding scheme, which is used in electronic mail and network news. Also provides a listing of the Japanese character sets that can be used in this encoding scheme.
- Tutorial: Shady Characters - A Webreference.com tutorial that explains HTML character sets, character encodings and character references.
- RFC 1557 - Korean Character Encoding for Internet Messages - Describes the ISO-2022-KR encoding method used to represent Korean characters in both the header and body of Internet mail messages.
- HTML Validation: Using Character Encodings - How to validate HTML documents in various character encodings.
- RFC 1468 - Japanese Character Encoding for Internet Messages - Describes the ISO-2022-JP encoding used in electronic mail and network news messages in several Japanese networks. It was first specified by and used in JUNET. The encoding is now also widely used in Japanese IP communities.
- A Brief History of Character Codes - A concise history of the development of character encoding in Western and East Asian languages, including ASCII, EBCDIC, Unicode and TRON.
- UTF-8 Encoding Table and Unicode characters - Unicode code points and their UTF-8 encoding.
- Cham encoding proposal - Draft of the proposal submitted to ISO for encoding Cham. Contains the proposed names and shapes of the characters, along with other details about the script.
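
Several of the RFCs listed above describe 7-bit encodings meant to pass safely through mail and news systems. The following is a minimal sketch, assuming a Python 3 interpreter whose standard codec set includes `hz`, `iso2022_jp` and `iso2022_kr`; the sample strings and the helper name `is_seven_bit` are illustrative choices, not taken from the RFCs. It encodes a short string with each codec and checks that every output byte fits in 7 bits and that decoding restores the original text.

```python
# Sample text for each codec; the strings are arbitrary illustrations.
samples = {
    "hz": "中文",          # HZ-GB-2312 (RFC 1842 / RFC 1843)
    "iso2022_jp": "日本語",  # ISO-2022-JP (RFC 1468)
    "iso2022_kr": "한국어",  # ISO-2022-KR (RFC 1557)
}


def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte has its high bit clear."""
    return all(b < 0x80 for b in data)


results = {}
for codec, text in samples.items():
    encoded = text.encode(codec)
    round_trip_ok = encoded.decode(codec) == text
    results[codec] = (encoded, is_seven_bit(encoded), round_trip_ok)
    print(f"{codec:12} {encoded!r} 7-bit={is_seven_bit(encoded)} "
          f"round-trip={round_trip_ok}")
```

Python's standard library does not ship an ISO-2022-CN codec, so the RFC 1922 encodings are left out of this sketch.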
Wikipedia Articles
- Character encoding - A character encoding consists of a code that pairs a sequence of characters from a given character set (sometimes referred to as code page) with something else, such as a sequence of natural numbers, octets or electrical pulses, in order to facilitate the storage of text in computers and the ...
- Legacy encoding - In computing, a legacy encoding is a character encoding that continues to be used despite being obsoleted by another encoding. An encoding considered legacy in one context may remain the preferred encoding in another.
- Chinese character encoding - In computing, Chinese character encodings can be used to represent text written in the CJK languages — Chinese, Japanese, Korean — and (rarely) obsolete Vietnamese, all of which use Chinese characters. Several general-purpose character encodings accommodate Chinese characters, and some of them were developed specifically for Chinese.
- Variable-width encoding - A variable-width encoding is a type of character encoding scheme in which codes of differing lengths are used to encode a character set (a repertoire of symbols) for representation in a computer. Most common variable-width encodings are multibyte encodings, which use varying numbers of bytes (octets) to encode different characters.
- HZ (character encoding) - The HZ character encoding is an encoding of GB2312 that was formerly commonly used in email and USENET postings. It was designed in 1989 by Fung Fung Lee (李楓峰) of Stanford University, and subsequently codified in 1995 into RFC 1843.
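
To illustrate the variable-width encoding entry above, here is a small sketch in Python 3 that uses UTF-8 as the example encoding. The function name `utf8_table` and the sample characters are hypothetical choices for illustration. It shows that different characters take different numbers of bytes.

```python
def utf8_table(chars):
    """Return (code point, hex bytes, byte count) for each character."""
    rows = []
    for ch in chars:
        encoded = ch.encode("utf-8")
        hex_bytes = " ".join(f"{b:02x}" for b in encoded)
        rows.append((f"U+{ord(ch):04X}", hex_bytes, len(encoded)))
    return rows


for code_point, hex_bytes, length in utf8_table("Aé中😀"):
    print(f"{code_point:8} {hex_bytes:12} {length} byte(s)")

# Prints:
# U+0041   41           1 byte(s)
# U+00E9   c3 a9        2 byte(s)
# U+4E2D   e4 b8 ad     3 byte(s)
# U+1F600  f0 9f 98 80  4 byte(s)
```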