Chinese character

Chinese characters (漢字) are used in the written forms of the Chinese language and, to varying degrees, in Japanese and Korean (in the latter case only in South Korea). They have fallen out of use in Vietnamese, where they were used until the 20th century, and in North Korea, where they have been completely replaced by Hangul.

Chinese characters are called hànzì in Mandarin Chinese, kanji in Japanese, hanja or hanmun in Korean, and hán tự (also used in the chữ nôm script) in Vietnamese. However, the last is considered an extremely Sinicized form, and Chinese characters are normally called chữ nho (字儒). (Note that the morphemes are reversed, as is common in Vietnamese borrowings from Chinese.)

In Chinese, a word or phrase (詞 cí), that is, a unit of meaning, is composed of one or more characters (字 zì), as in hànzì (漢字), which has two characters. In all varieties of spoken Chinese, each character is read as a single syllable.

Japanese, Korean, and Vietnamese are not linguistically related to Chinese, and many adaptations had to be made for Chinese characters to work in languages with radically different grammar. In many cases, these languages use characters different from those used in Chinese for words or ideas of the same meaning. A frequently cited example is 愛人, which means spouse (àirén) in Mainland China but lover (aijin) in Japanese, and 情人, which means lover (qíngrén) in China but spouse (seijin) in Japanese. Also, many similar characters with identical meanings are written with slight differences. One example is the character for black, which is written as 黒 (kuro) in Japanese but as 黑 (hēi) in Chinese. For these reasons, particularly in China and Japan, where Chinese characters are used most often, it is frequently necessary to distinguish between Chinese Chinese characters and Japanese Chinese characters (though in English the distinction can often be made well enough by using the respective words hanzi and kanji).



The earliest Chinese characters are the so-called "oracle script" or 甲骨文 jiǎgǔwén of the Shang Dynasty, followed by the "bronzeware script" or 金文 jīnwén of the Zhou Dynasty. These scripts no longer serve as anything but a source for scholars.

The first script that is still in (restricted) use today is the "seal script" or 篆書[篆书] zhuànshū. It resulted from the efforts of the first emperor of China, Qin Shi Huang Di, to standardize the Chinese script. The seal script, as the name suggests, is now used only in artistic seals. Few people are still able to read it, although the art of carving a traditional seal in the seal script remains alive in China today.

Scripts that are still used regularly for print are the "clerk script" or 隸書[隶书] lìshū, the "Wei Monumental" or 魏碑 wèibēi, the "regular script" or 楷書[楷书] kǎishū, the "Song Style" or 宋體[宋体] sòngtǐ (only in printing), and the "running script" or 行書[行书] xíngshū. Modern Chinese handwriting is usually modeled on the running script.

Finally, there is the "draft script" (also called "grass script"), or 草書[草书] cǎoshū. The draft script is an idealized calligraphic style, where characters are suggested rather than realized. Despite being cursive to the point where individual strokes are no longer differentiable, the draft script is highly revered for the beauty and freedom that it embodies. Many simplified Chinese characters are based on this style.


Main article: radical

Each character has a fundamental component, or radical (部首 Chinese: bùshǒu, Japanese: bushu, literally "initial portion"), and this design principle is used in Chinese dictionaries to logically order characters in sets.

Full characters are ordered according to their radical; the radicals fall into roughly 200 types. Within each radical, characters are then subcategorised by their total number of strokes.

This principle of categorisation is exploited by everybody who must learn to write Chinese characters: the vast number of Chinese characters can be much more easily memorized if they are mentally broken down into their constituent radicals.


Chinese scholars classify Han characters in several groups. The first type, and the type most often associated with Chinese writing, consists of pictograms, which are pictorial representations of the morpheme represented. There are also ideograms, which attempt to depict abstract concepts graphically, such as "up" (上) and "down" (下). However, these pictograms and ideograms make up only a small proportion of Chinese logograms.

Figure: Excerpt from a 1436 primer on Chinese characters

Most Chinese characters, however, are radical-radical compounds, in which each element (radical) of the character hints at the meaning, and radical-phonetic compounds, in which one component (the radical) indicates the kind of concept the character describes, and the other hints at the pronunciation. This last type accounts for the majority of Chinese logograms. Note that despite being called "compounds", these logograms are single entities in themselves; they are written so that they take up the same amount of space as any other logogram.

Note that due to the long period of language evolution, such component "hints" within characters are often useless and sometimes quite misleading in modern usage. This is particularly true in non-Chinese languages.

Classification has its own problems, as the origins of characters are often obscure. For example, the character for "East" (東; Chinese: dōng, Japanese: higashi), which combines the "tree" radical (木) and the "sun" radical (日), is usually considered a radical-radical compound. Though it appears to represent a sun rising through trees, and this is both an evocative image and a useful mnemonic, the origin and classification of the character are disputed among scholars. While some agree with the radical-radical classification, others see it as a unique character in and of itself, and some claim that it derives from an early pictograph of bundled sticks.

As another example, the character for "mother" (媽 mā in Chinese) consists of one component meaning "female" (女) and another meaning "horse" (馬 mǎ). The first component denotes a female entity, whereas the second suggests the pronunciation by referring to the word for "horse". The reason that "horse" was chosen to represent mother may be that horses, in a historical context, were often used to represent "steadfastness". The majority of Chinese characters, like this example, have one component that suggests the meaning and another that suggests pronunciation. In many cases, even the component intended to suggest pronunciation has an abstract semantic relation to the idea expressed by the character. This is possible because the phonetic system of Chinese allows for many words to have the same pronunciation (homonymy), and because the consideration of phonetic similarity used in a character generally ignores its tone and the manner of articulation of its initial consonant (but not the place of articulation).


The design and use of a dictionary of Chinese characters present interesting problems. Dozens of indexing schemes have been created for Chinese characters. The great majority of these schemes (beloved by their inventors but by nobody else) have appeared in only a single dictionary; only one system has achieved truly widespread use: the system of radicals.

Chinese character dictionaries often allow users to locate entries in several different ways. Many Chinese, Japanese, and Korean dictionaries of Chinese characters list characters in radical order: characters are grouped together by radical, and radicals containing fewer strokes come before radicals containing more strokes. Under each radical, characters are listed by their total number of strokes. In Japanese and Korean dictionaries, it is usually possible to search for characters by sound, using Kana and Hangul respectively. Most dictionaries also allow searches by total number of strokes, and individual dictionaries often offer further search methods.
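
The radical ordering described above can be sketched as a three-part sort key. This is only an illustration: the `Entry` type, the `radical_order` function and the small hand-entered sample are assumptions made for this example, not part of any real dictionary. Radicals with the same stroke count are separated here simply by comparing the radical characters themselves, whereas printed dictionaries follow a fixed traditional sequence.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    char: str             # the character itself
    radical: str          # its radical (部首)
    radical_strokes: int  # number of strokes in the radical
    total_strokes: int    # number of strokes in the whole character


# Hand-entered sample data, for illustration only.
ENTRIES = [
    Entry("松", "木", 4, 8),
    Entry("媽", "女", 3, 13),
    Entry("明", "日", 4, 8),
    Entry("下", "一", 1, 3),
    Entry("木", "木", 4, 4),
    Entry("好", "女", 3, 6),
]


def radical_order(entries):
    """Sort by radical stroke count, then radical, then total strokes."""
    return sorted(
        entries,
        key=lambda e: (e.radical_strokes, e.radical, e.total_strokes),
    )


ordered = "".join(e.char for e in radical_order(ENTRIES))
print(ordered)  # 下好媽明木松
```

Characters sharing a radical end up adjacent (好 and 媽 under 女; 木 and 松 under 木), with the simpler character first.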

For instance, to look up the character 松 (pine tree) in a typical dictionary, the user first determines which part of the character is the radical, then counts the number of strokes in the radical (in this case four), and turns to the radical index (usually located on the inside front or back cover of the dictionary). Under the number 4, the user locates the radical 木, then turns to the page number listed, which is the start of the listing of all the characters containing this radical. This page will have a sub-index giving stroke numbers and page numbers. The right half of the character also contains four strokes, so the user locates the number 4, and turns to the page number given. From there, the user must scan the entries to locate the character he or she is seeking. Some dictionaries have a sub-index which lists every character containing each radical, so that if the user knows the number of strokes in the non-radical portion of the character, he or she can locate the correct page number directly.
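
The lookup of 松 walked through above can be modelled as a nested index: radical stroke count, then radical, then stroke count of the non-radical part. The sketch below is a toy model under stated assumptions; `SAMPLE`, `build_index` and `candidates` are hypothetical names, and the four entries were entered by hand. A printed dictionary stores page numbers at each level rather than the characters themselves.

```python
from collections import defaultdict

# Hypothetical sample entries:
# (character, radical, strokes in radical, strokes in the remainder).
SAMPLE = [
    ("松", "木", 4, 4),
    ("林", "木", 4, 4),
    ("村", "木", 4, 3),
    ("好", "女", 3, 3),
]


def build_index(entries):
    """Map radical strokes -> radical -> remainder strokes -> characters."""
    index = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for char, radical, rad_strokes, rest_strokes in entries:
        index[rad_strokes][radical][rest_strokes].append(char)
    return index


def candidates(index, radical, rad_strokes, rest_strokes):
    """Return the entries the user would still have to scan by eye."""
    return index.get(rad_strokes, {}).get(radical, {}).get(rest_strokes, [])


INDEX = build_index(SAMPLE)
# Radical 木 has four strokes; the right half of 松 also has four.
print(candidates(INDEX, "木", 4, 4))  # ['松', '林']
```

The final list still contains more than one character, which mirrors the last step of the manual procedure: the user scans the remaining entries to find the one being sought.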

In Korean, character dictionaries are usually called Okpyeon (옥편; 玉篇), which literally means "Jewel Book", rather like the Latin word thesaurus ("treasure"). 玉篇 is also the name of a sixth-century Chinese dictionary from the Liang Dynasty.

Other dictionary systems include the four-corner method.

Number of Chinese characters

The question of how many characters there are is still the subject of debate. In the 18th century, European scholars claimed the total tally to be about 80,000. This number, however, is thought to be exaggerated, as the character count varies with the dictionary and its comprehensiveness. For example, the Kangxi Dictionary lists about 40,000 characters, while the modern Zhonghua Zihai lists in excess of 80,000. One reason for the overwhelming number of characters is the existence of rarely occurring variant and obscure characters (many of which are unused even in Classical Chinese). Note, however, that no two characters are ever contextually identical.


It is usually said that about 3,000 characters are needed for basic literacy in Chinese (for example, to read a Chinese newspaper), and a well-educated person will know well in excess of 4,000 to 5,000 characters. Note that it is not necessary to know a character for every known word of Chinese, as the majority of modern Chinese words are compounds made of two or more morphemes, and are thus written not with a single unique character, but with multiple, usually common, characters.
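
The point about compounds can be illustrated with a very small count. The word list below is a hand-picked sample chosen for this example (not a frequency list), and `distinct_characters` is a hypothetical helper; the sketch only shows that several two-character words can be written with fewer distinct characters than there are words.

```python
def distinct_characters(words):
    """Return the set of distinct characters used to write `words`."""
    return set("".join(words))


# Six two-character words that share a handful of common characters.
WORDS = ["漢字", "漢語", "文字", "語文", "字體", "文體"]
CHARS = distinct_characters(WORDS)

print(len(WORDS), "words,", len(CHARS), "distinct characters")
# 6 words, 5 distinct characters
```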


In Japan there are 1945 "daily use kanji" (常用漢字 jōyō kanji) designated by the Ministry of Education. These are taught during primary and secondary school. Publications which include characters which fall outside this list are required to print furigana or rubi over the characters as a phonetic guide.

There are also 2232 government-designated "name kanji" (jinmeiyō kanji 人名用漢字) used in personal and geographical names, with plans to increase this list by 578 kanji in the near future. This would be the largest increase since World War II. The plan has not been without controversy, however. For example, the Chinese characters for "cancer," "hemorrhoids," "corpse" and "excrement," as well as parts of compound words (words created from two or more Chinese characters) meaning "curse," "prostitute," and "rape," are among the proposed additions to the list. This is because no measures were taken to determine the appropriateness of the kanji proposed, with the committee deciding that parents could make such decisions themselves. However, the government will seek input from the public before approving the list. For further information, see the Names section of the main Kanji article.

A well-educated Japanese person may know upwards of 3500 kanji. The Kanji kentei (日本漢字能力検定試験 Nihon kanji nōryoku kentei shiken or Test of Japanese Kanji Aptitude) tests the ability to read and write kanji. The highest level of the Kanji kentei tests the ability to read and write 6000 kanji, though in practice few people attain this level as Japanese generally uses fewer Chinese characters than Chinese does, and literacy in Japanese requires knowledge of fewer Chinese characters than literacy in Chinese.


In South Korea, middle and high school students learn 1,800 to 2,000 basic characters (Hanja), but most people use Hangul exclusively in their day-to-day lives. Chinese characters are still used to some extent, particularly in newspapers, place names and calligraphy.


Although Chinese characters are now nearly extinct in Vietnamese, the language was formerly written in varying scripts based on them, with their use becoming limited to ceremonial purposes beginning in the 19th century. As in Japan and Korea, Chinese was used by the ruling classes, and the characters were eventually adopted to write Vietnamese. To express native Vietnamese words, which had different pronunciations from the Chinese, Vietnamese developed the chữ nôm script, which added diacritical marks to distinguish native (Vietnamese) words from Chinese.

Rare characters

Characters that are not commonly used (termed "rare" or "variant" characters) often appear in Chinese, Japanese, and Korean personal and place names (see Chinese name, Japanese name, and Korean name respectively). This has caused problems, as many computer encoding systems include only the 5,000 or so most common characters and exclude the less often used ones. It is especially a problem for personal names, which often contain rare or classical characters.
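
One way to see this encoding problem in practice is to ask a codec which characters of a name it cannot represent. The sketch below is an illustration only: `unencodable` is a hypothetical helper, and it relies solely on Python's built-in `str.encode`, which raises `UnicodeEncodeError` for characters outside a codec's repertoire. Which characters fail depends entirely on the codec chosen.

```python
def unencodable(text, encoding):
    """Return the characters of `text` that `encoding` cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # The codec has no code for this character.
            missing.append(ch)
    return missing


# A 7-bit encoding cannot hold any Chinese character; UTF-8 holds them all.
print(unencodable("漢字", "ascii"))  # ['漢', '字']
print(unencodable("漢字", "utf-8"))  # []
```

Running the same check on a personal name against a legacy regional codec, for example `unencodable("陶喆", "big5")`, would show whether that codec covers every character of the name; no result is asserted here because it depends on the codec's character table.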

People who have run into this problem include Taiwanese politicians Wang Chien-shien and Yu Shyi-kun and Taiwanese singer David Tao (陶喆). Newspapers have dealt with this problem in varying ways, including trying to create a character from two characters, including a picture, or, especially as is the case with Yu Shyi-kun, simply omitting the rare character with the hope that the reader will be able to infer who it refers to. Japanese newspapers may render such names and words in katakana instead of kanji, and it is common practice for people to write names for which they are unsure of the correct kanji in katakana instead.

