Cantonese (linguistics)

This article is on all of the Yue dialects. For the dialect of Guangzhou and Hong Kong, see Standard Cantonese.

Cantonese (廣東話/广东话, lit. "Guangdong speech", colloquial; 粵語/粤语, lit. "Yue language", formal) is one of the major dialects or languages of the Chinese language or language family. It is mainly spoken in the south-eastern part of Mainland China, in Hong Kong and Macau, by Chinese minorities in Southeast Asia, and by many overseas Chinese of Cantonese origin worldwide. Its name is derived from Canton, the former English name for Guangzhou, the capital city of Guangdong Province. It is a tonal language.

It is the lingua franca of the overseas Cantonese diaspora and is spoken by about 70 million people worldwide. Although that is far fewer than the nearly one billion speakers of Mandarin, overseas it is rivalled only by the 40 million speakers of Hokkien, or Southern Fujianese dialects, many of whom live throughout Southeast Asia. Cantonese is most commonly spoken in Hong Kong, a financial and cultural capital of southern China, and in one form or another in many if not most Chinatowns around the world with Cantonese communities. For instance, the sei yap or siyi (四邑) dialect, from the Guangdong counties from which a majority of Exclusion-era Cantonese-Chinese immigrants emigrated, continues to be spoken both by recent immigrants from Southern China and by third-generation Chinese Americans of Cantonese ancestry.

Like other major varieties of Chinese, Cantonese is often considered a dialect of a single Chinese language for cultural or nationalistic reasons. Most linguists consider Cantonese a separate language in the sense in which they use the term, with notable exceptions in the People's Republic (see Is Chinese a language or a family of languages?).


Dialects of Cantonese

There are at least four major dialect groups of Cantonese: Yuehai, which includes the dialect spoken in Guangzhou, Hong Kong and Macau as well as the dialects of Zhongshan and Dongguan; Siyi (sei yap), exemplified by the Taishan (台山 Toisaan, Hoisaan) dialect, which used to be ubiquitous in American Chinatowns before 1970; Gaoyang, as spoken in Yangjiang; and Guinan (Nanning dialect), spoken widely in Guangxi. However, "Cantonese" generally refers to the Yuehai dialect.

For the last 150 years, Guangdong Province has been the home of most Chinese emigrants. One county near its center, Taishan (where the siyi or sei yap dialect of Cantonese is spoken), may alone have been the home of more than 60% of Chinese immigrants to the US before 1965. As a result, Guangdong dialects such as sei yap (the dialects of Taishan, Enping, Kaiping and Xinhui Counties) and what we understand to be mainstream Cantonese (with a heavy Hong Kong influence) have been the major spoken dialects abroad. As more and different kinds of Chinese emigrate, however, the situation is changing: Min (Hokkien, or Fujianese dialect) and Wu dialect speakers are now also heard, as is Mandarin, in increasing numbers, from Taiwanese and Northern mainland immigrants.

In addition, there are at least three other major Chinese languages spoken in Guangdong Province: Putonghua, the official standard Mandarin, which is spoken on official occasions, used in education, and spoken among the many internal migrants from the north seeking work in the developing south; Min-nan (Southern Min), spoken in the eastern regions bordering Fujian, such as Chaozhou and Shantou; and Hakka, the language of the Hakka minority. Hanyu or Mandarin is mandatory throughout the state education system, but in the Southern household, the popularization of Cantonese-language media (most notably Hong Kong films, television serials, and Cantopop), isolation from the other regions of China, and the healthy economy of the Cantonese diaspora ensure that the language has a life of its own. Most wuxia films from Canton are filmed originally in Cantonese and then dubbed in Mandarin or English or both.


See Standard Cantonese for a discussion of the sounds of Standard Cantonese and pages on individual dialects for their phonologies.

Cantonese versus Mandarin

In some ways, Cantonese is a more conservative dialect than Mandarin. This can be seen, for example, by comparing the words for "I/me" (我) and "hunger" (餓). They are written using very similar characters, but in Mandarin their pronunciation is quite different ("wǒ" vs. "è"), whereas in Cantonese they are pronounced identically except for their tones (ngo5 vs ngo6 respectively). Since the characters hint at a similar pronunciation, it can be assumed that their ancient pronunciation was indeed similar (as preserved in Cantonese), but in Mandarin the two syllables acquired different pronunciations in the course of time.

Cantonese sounds quite different from Mandarin, mainly because it has a different set of syllables. The rules for syllable formation are different; for example, there are syllables ending in non-nasal consonants (e.g. "lak"). It also has a different set of tones. Cantonese is generally considered to have 6 or 7 tones, depending on who is doing the counting, whereas Mandarin has 4 plus a "neutral tone."

Cantonese preserves many syllable-final sounds that Mandarin has lost or merged. For example, the characters 裔, 屹, 藝, 艾, 憶, 譯, 懿, 誼, 肄, 翳, 邑 and 佚 are all pronounced yi4 in Mandarin, but they are all different in Cantonese (jeoi6, ngat6, ngai6, ngaai6, yik1, yik6, yi3, yi4, si3, ai3, yap1, and yat6, respectively). However, Mandarin's vowel system is somewhat more conservative than Cantonese's, in that many diphthongs preserved in Mandarin have merged or been lost in Cantonese. Also, Mandarin makes a three-way distinction among alveolar, alveopalatal, and retroflex fricatives, distinctions that are not made by Cantonese.
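The merger described above can be checked mechanically. The following Python sketch is only an illustration: the readings are copied from the list above exactly as romanized there, while the names READINGS, has_stop_final and summarize are hypothetical helpers introduced here.

```python
# Readings copied from the paragraph above: character -> (Mandarin, Cantonese).
# The romanization is reproduced as given there, with tone numbers attached.
READINGS = {
    "裔": ("yi4", "jeoi6"),
    "屹": ("yi4", "ngat6"),
    "藝": ("yi4", "ngai6"),
    "艾": ("yi4", "ngaai6"),
    "憶": ("yi4", "yik1"),
    "譯": ("yi4", "yik6"),
    "懿": ("yi4", "yi3"),
    "誼": ("yi4", "yi4"),
    "肄": ("yi4", "si3"),
    "翳": ("yi4", "ai3"),
    "邑": ("yi4", "yap1"),
    "佚": ("yi4", "yat6"),
}


def has_stop_final(syllable):
    """Return True if the syllable (tone digit stripped) ends in -p, -t or -k."""
    return syllable.rstrip("0123456789")[-1] in "ptk"


def summarize(readings):
    """Count distinct Mandarin readings, distinct Cantonese readings,
    and Cantonese readings that end in a stop consonant."""
    mandarin = {m for m, _ in readings.values()}
    cantonese = {c for _, c in readings.values()}
    stops = sum(has_stop_final(c) for _, c in readings.values())
    return len(mandarin), len(cantonese), stops


print(summarize(READINGS))
```

On this data the twelve characters share a single Mandarin reading but have twelve distinct Cantonese readings, five of which end in the stops -p, -t or -k.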

There is another obvious difference between Cantonese and Mandarin. Mandarin lacks the syllable-final sound "m"; final "m" and final "n" in Cantonese have merged into "n" in Mandarin, as in Cantonese "taam4" (譚) and "taan4" (壇) versus Mandarin tán, Cantonese "yim4" (鹽) and "jin4" (言) versus Mandarin yán, Cantonese "tim1" (添) and "tin1" (天) versus Mandarin tiān, and Cantonese "ham4" (含) and "hon4" (寒) versus Mandarin hán. The examples are too numerous to list. Furthermore, nasals can be independent syllables in Cantonese words, like "ng5" (五) "five," and "m4" (唔) "not."
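The m/n merger can be illustrated with the four pairs given above. This Python sketch is a minimal illustration, not a phonological model: tone numbers are omitted because only the final consonant matters here, and the names PAIRS and finals_by_mandarin are hypothetical.

```python
# (character, Cantonese syllable without tone number, Mandarin syllable),
# taken from the examples in the paragraph above.
PAIRS = [
    ("譚", "taam", "tán"), ("壇", "taan", "tán"),
    ("鹽", "yim", "yán"), ("言", "jin", "yán"),
    ("添", "tim", "tiān"), ("天", "tin", "tiān"),
    ("含", "ham", "hán"), ("寒", "hon", "hán"),
]


def finals_by_mandarin(pairs):
    """Group the Cantonese final consonants under each Mandarin syllable."""
    groups = {}
    for _char, cantonese, mandarin in pairs:
        groups.setdefault(mandarin, set()).add(cantonese[-1])
    return groups


for mandarin, finals in finals_by_mandarin(PAIRS).items():
    print(mandarin, sorted(finals))
```

Each Mandarin syllable ends in "n", while the Cantonese syllables grouped under it end in both "m" and "n".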

There are clear sound correspondences in, for instance, the tones. For example, a fourth-tone word in Cantonese is usually second tone in Mandarin.

Despite the broad area over which Cantonese is spoken, most universities in the US do not teach Cantonese and have not done so historically. Most offer Chinese classes only in Mandarin because of Mandarin's status as the official dialect of both the People's Republic of China and the Republic of China. In addition, Mandarin was the court dialect formerly used in Imperial China. Cantonese courses can nevertheless be found at some US universities; the University of Hawaii is one example.

Written Cantonese

Cantonese, like many Chinese languages, is often written formally based on Standard Mandarin syntax and grammar. However, the written form of spoken Cantonese is common informally among Cantonese speakers, and in its subculture.

Written Cantonese is used in instant-messenger conversations, in subtitles for Hong Kong movies, and in advertisements. This is because written Cantonese reflects speech more closely, is more expressive, and is better received by speakers of Cantonese.

Legal records in Hong Kong also sometimes use written Cantonese, in order to record exactly what a witness has said.

Cantonese contains some unique characters that are not found in standard written Chinese.

Colloquial Cantonese is rarely used in formal forms of writing; formal written communication is almost always in standardized Mandarin or hanyu, albeit still pronounced in Cantonese. However, written colloquial Cantonese does exist; it is used mostly for transcription of speech in tabloids, in some broadsheets, for some subtitles, and in other informal forms of communication. It is not uncommon to see the front page of a Cantonese paper written in hanyu, while the entertainment sections are, at least partly, in Cantonese. The vernacular writing system has evolved over time from a process of modifying characters to express lexical and syntactic elements found in Cantonese but not the standard written language. In spite of their vernacular origin and informal use, these characters have become so important in the Canton region for communication that the Hong Kong Government has incorporated them into a special Supplementary Character Set (HKSCS).

A problem for the student of Cantonese is the lack of a widely accepted, standardized transcription system. Another problem is with Chinese characters: Cantonese uses the same system of characters as Mandarin, but it often uses different words, which have to be written with different characters. At least this is the case in Hong Kong, but in the Canton area of mainland China, Cantonese is written with the exact same characters as Mandarin, though the characters stand for words not actually used in Cantonese. An example may help to clarify this:

The word for "to be" is 是 in spoken Mandarin (pronounced shì) but 係 in spoken Cantonese (pronounced hai6). In formal written Chinese, only 是 is used; 係 is only used in classical literature. However, in Hong Kong, 係 is sometimes used in colloquial written Cantonese.

Many characters used in colloquial Cantonese writing are made up by putting a mouth radical (口) on the left-hand side of another, better-known character. This indicates that the character is read like the right-hand side but is used only phonetically in the Cantonese context. The characters 叻, 吓, 吔, 呃, 咁, 咗, 咩, 哂, 哋, 唔, 唥, 唧, 啱, 啲, 喐, 喥, 喺, 嗰, 嘅, 嘜, 嘞, 嘢, 嘥, 嚟, 嚡, 嚿, 囖 etc. are commonly used in Cantonese writing. Because not all Cantonese words can be found in current encoding systems, or because users simply do not know how to enter such characters on a computer, very informal written Cantonese tends to use extremely simple romanization (e.g. D for 啲), symbols (an English letter "o" added in front of another Chinese character; e.g. 㗎 is defined in Unicode but will not display in Microsoft Internet Explorer 6.0, hence the proxy o架 is often used), homophones (e.g. 果 for 嗰), and Chinese characters that have a different meaning in Mandarin (e.g. 乜, 係, 俾 etc.) to compose a message. For example, "你喺嗰喥好喇, 千祈咪搞佢啲嘢。" is often written in the easier form "你o係果度好喇, 千祈咪搞佢D野。" (character by character, approximately 'you, being, there (two characters), good, (final particle), thousand, pray, don't, mess with, him/her, (genitive particle), things'; translation: 'You'd better stay there, and please don't mess with his/her stuff.')
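The substitutions in the example above can be expressed as a simple character-for-character table. This Python sketch is an illustration only: the table covers just the five characters that differ between the two versions of the example sentence (the pairs 喥/度 and 嘢/野 are inferred from that sentence rather than stated separately), and the names EASY_FORMS and to_easy_form are hypothetical.

```python
# Substitutions taken from the example sentence above; not a complete list.
EASY_FORMS = {
    "喺": "o係",  # letter "o" stands in for the mouth radical
    "嗰": "果",   # homophone
    "喥": "度",   # inferred from the example sentence
    "啲": "D",    # simple romanization
    "嘢": "野",   # inferred from the example sentence
}


def to_easy_form(text, table=EASY_FORMS):
    """Replace each hard-to-type character with its informal stand-in."""
    return text.translate(str.maketrans(table))


original = "你喺嗰喥好喇, 千祈咪搞佢啲嘢。"
print(to_easy_form(original))
```

Applied to the first version of the example sentence, the table yields the easier form quoted above. The mapping works in one direction only, since 果, 度, 野 and 係 are also ordinary characters in their own right.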

Other common characters are unique to Cantonese or have deviated from their Mandarin usage. They include 乜, 冇, 仔, 佢, 佬, 係, 俾, 靚 etc.

The words represented by these characters are sometimes cognates with pre-existing Chinese words. However, their colloquial Cantonese pronunciations have diverged from formal Cantonese pronunciations. For example, in formal written Chinese, 無 (mou4) is the character used for "without". In spoken Cantonese, 冇 (mou2) has the same usage, meaning, and pronunciation as 無, differing only by tone. 冇 represents the spoken Cantonese form of the word "without", while 無 represents the word used in Mandarin (pinyin: wú) and formal Chinese writing. However, 無 is still used in some instances in spoken Cantonese, as in 無論如何 ("no matter what happens"). Another example is the doublet 來/嚟, which means "to come". 來 (loi4) is used in formal writing; 嚟 (lei4) is the spoken Cantonese form.

