A beginner's guide to learning Cantonese: What is "Chinese" and how do you even learn a language?
- Alexandre

- Sep 19, 2022
- 19 min read

Hong Kong cinema stars such as Bruce Lee - a household name in the United States - did much to popularise the Cantonese language worldwide
Recently, I moved to Hong Kong, and as a part of this new venture, I've taken up the challenge of familiarising myself with the local language. As somebody who already has experience in learning languages and can conversationally get by in about ten languages, I welcomed this challenge with open arms, and have found myself pushing the boundaries of my comfort zone by throwing myself into conversations with locals at shops or in restaurants in the three or so weeks since I arrived here.
I was first introduced to Cantonese as a kid by Hong Kong film stars such as Jackie Chan and Bruce Lee, and soon was watching films in what I thought was "Chinese". Charmed by the colourful and expressive sonority of the language, I soon found myself eager to learn Chinese and started by buying myself some resources. However, the Chinese I was studying didn't sound anything like the Chinese from those films. Unfortunately, I was unaware at that time that "Chinese" does not in fact exist - at least not in the way that it's commonly understood.
In this post, I will give some background to the Cantonese language and share some of the methods and resources that I've used for learning it, which are just as easily applicable to any other language. It must however be stressed that everyone learns best in their own way. What works for me might not work for others, but I hope that some of the information I give in this post will help you with your own language-learning goals, even if it's not for Cantonese itself!

Introduction: What is "Chinese" and what is "Cantonese"?
Many people assume that "Chinese" is a unified, standard language, and they're not exactly wrong. In recent years, there have been efforts to make a reformed version of Mandarin into the national language of the People's Republic of China, called "Standard Chinese". However, this obscures the true wealth of linguistic diversity that is the Chinese (Sinitic) language family, and people looking to practise their Mandarin in Hong Kong will soon experience a rude awakening. China is roughly as large as Europe, and "Chinese" is better understood as a language family than as a single language.

A map of the languages of the Sino-Tibetan language family
According to the 2021 Hong Kong census, 88.2% of the population speak Cantonese (written as 廣東話 in traditional Chinese characters; Yale: Gwóngdūng wá, Jyutping: Gwong2 Dung1 waa2), the name given to the regional language of Guangzhou (called "Canton" in English), the biggest city of Guangdong province, of which Hong Kong used to be a part before its colonisation by the British after the First Opium War (1839-1842). Cantonese was also widely popularised worldwide by the Hong Kong cinema industry, which saw the likes of Bruce Lee, Jackie Chan and many more rise to international acclaim.

The Sinitic (Chinese) languages on a map

A family tree of the Sino-Tibetan languages with extinct languages in red and existing languages in green
Presentation and history
Cantonese is part of the Chinese branch of the Sino-Tibetan language family and is a cousin of other Chinese languages. It belongs to the "Yue" (粤) branch of the Chinese languages, which scholars consider to be descended from Middle Chinese, the form of Chinese spoken during the Sui (581 - 618 CE) and Tang (618 - 907 CE) dynasties. The name "Yue" comes from the autochthonous Yue people who lived in the region prior to the immigration of "Han" Chinese people during the Han Dynasty (202 BCE - 220 CE). These people were perhaps related to modern-day Vietnamese people, with "Yuè" (越) being a Mandarin parallel of the Vietnamese "Việt", which is used to refer to ethnic Vietnamese, and the name of the Nanyue (Vietnamese: Nam Việt, "Southern Yue") Kingdom is considered the base of the name of Vietnam itself (Việt Nam).

A statue of one of the later-Sinicised Yue people with tattoos and a moustache, the former of which were viewed very poorly in traditional "Chinese" culture
Following these migrations, the Han people from colder Northern climates entered a new tropical region with a new geography and food items, where they mixed with the local inhabitants. This is thought to have had an influence on the Cantonese language itself. Chinese Yue languages are widely spoken across Hong Kong, Macau, Guangdong province, Hainan and the east of Guangxi, as well as by overseas Chinese communities. Cantonese itself is recognised as the main language of Guangzhou, Hong Kong and Macau, each of which has its own dialect. The region in which the Yue languages are widely spoken in China is sometimes referred to as Lingnan (south of the Nanling mountains), which lent its name to a cultural sphere shared between Cantonese, Hakka, Teochew, Taishanese and Hainanese peoples, who are distinctly Southern Chinese in culture. The resulting Cantonese people are the ones who would begin speaking the language we now call Cantonese, and their culture and identity came to a head during the Tang Dynasty (618 - 907 CE), which saw the beginning of Southern China's cultural, economic and political autonomy and supremacy during the succeeding Song Dynasty (960 - 1279 CE).

The Han-dynasty (202 BCE - 220 CE) expansion into Yue territories and elsewhere

The Chan clan temple built in Guangzhou in 1894 is an example of the distinct architectural style that developed in the Lingnan region
Phonology
Cantonese is a fairly conservative descendant of Middle Chinese in terms of pronunciation and tones compared to Mandarin, with the former preserving finals (especially the -k, -t, and -p endings) while the latter preserved initials better. Like other Chinese languages, Cantonese is a tonal language, which means that the inflexion or intonation of a word conveys some of its meaning, and these tones were inherited from Middle Chinese. In fact, if you read Tang Dynasty poetry or any Middle Chinese text in Cantonese, it will actually sound much more authentic than it would in Mandarin, and Cantonese tones are closer to Middle Chinese tones.
A phonetic reconstruction of basic Early Middle Chinese vocabulary from the Tang Dynasty (618 - 907 CE), which would become the ancestor of almost all modern Chinese readings of Chinese characters as well as the Sino-Xenic readings abroad of Vietnamese, Korean and Japanese
This is why I have included Cantonese and Middle Chinese readings alongside Mandarin ones in my meta-translations of Tang- (618 - 907 CE) and Song-dynasty (960 - 1279 CE) Zen texts...

The Cantonese tonal system, consisting of six tones. For more information on tones, please click here
It would appear that Southern Chinese languages such as Cantonese and Hakka have better preserved some of the features of Middle Chinese than Northern languages such as Mandarin. On the other hand, the Wu languages - such as Shanghainese - and Min languages - such as Taiwanese - appear to show more direct ancestry from an even more ancient form of Chinese often called Old Chinese. A dictionary of Chinese characters with side-by-side comparison of some different Chinese dialects alongside Middle Chinese and Sino-Xenic readings (Vietnamese, Korean, Japanese) can be found here:


Maps of the Yue (Cantonese) dialect families
Writing
Like other Chinese languages, Cantonese is written in standard Chinese script, using Chinese characters. In Hong Kong, people write in traditional Chinese characters, whereas in the Mainland, people write with the reformed (and, in my opinion, less aesthetically pleasing and etymologically profound) simplified Chinese characters, which were created by the People's Republic of China in order to promote literacy.

In light green are places where simplified characters are used; in medium green are places where simplified characters are used officially but traditional characters are still common; in dark green are places where traditional characters are used; in cyan are places where Chinese characters are used alongside other scripts; in yellow are places which no longer use Chinese characters but used to
Contrary to popular belief, these are not simply pictograms, and the etymology of Chinese characters is in fact very rich and gives a unique understanding of the language beyond its spoken aspects. The earliest known examples of Chinese script are from the oracle bone inscriptions of the Shang dynasty (1600 - 1045 BCE), where inscriptions were carved onto oracle bones cast into fire for divination purposes. This script is now known as oracle bone script, and later evolved into modern-day characters. These early characters were in fact very pictographic, but later became part of other characters which used them either for their meaning, their sound or their form.

An example of one of the retrieved oracle bones of the Bronze-Age Shang period
If a character is used for its meaning, what is implied by the use of the character (now a sub-part or radical in another character) is the semantic meaning; for example, a character with the "heart" radical will likely be related to thoughts or feelings. If a radical is used for its sound, it means that the character sounds a lot like the word which is being pointed to with the use of the new character, kind of like if you used a sketch of an eye to convey the word "I". However, this is not always obvious in modern Chinese dialects and was more obvious when Old Chinese was spoken. If a radical is used for its form, this means that the character fits into a graphic representation of the word the new character represents, like how if you align many radicals for "tree" together, you get "forest". As you can see, it isn't as simple as just suggesting that a character is pictographic.



The origin of the character 心 and its usage as a radical in other characters; provided by Outlier Linguistics' add-on dictionary on Pleco

A sample of modern-day Chinese characters and how they evolved in form and in style throughout history, complete with their definitions and readings in modern-day Standard Chinese and Old Chinese
The Chinese writing system is a unique script and has been the vehicle for much Chinese cultural preservation over the years. Remarkably, it is one of the few Eurasian scripts to have emerged independently from the Ancient Egyptian and Mesopotamian scripts which went on to influence Europe, the Middle East, South Asia and Central Asia.
Prior to a reform in the 20th century, Classical or Literary Chinese was the preferred written standard, inherited from far-off ancient times, but it was so removed from how people spoke that it was instead decided to adopt a written standard that mimicked the way in which Mandarin is spoken. Therefore, what Cantonese speakers read and write is not actually the way that they speak.
Classical Chinese: 孔子曰,生而知之者,上也。
Standard Chinese (traditional): 孔子說:「生來就知道的,是上等人」
Standard Chinese (simplified): 孔子说:「生来就知道的,是上等人」
English: Confucius said, "It is better to be born knowing [the Way; rather than to have to learn it]."
However, there has been an effort to find a way to write Cantonese vernacular in Chinese characters, and this has led to the rise of written Cantonese. That said, not everybody can understand it (especially non-Hongkongers), and anybody literate who speaks Cantonese should be able to read and write Standard Chinese.
For comparison, consider the following sentences:
Standard Chinese (traditional): 是不是他們的?
Standard Chinese (simplified): 是不是他们的?
Pinyin: Shì bùshì tāmen de?
Written Cantonese (traditional): 係唔係佢哋嘅?
Jyutping: Hai6 m4 hai6 keoi5-dei6 ge3?
English: Is this theirs?
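The two sentences above happen to line up character for character, which makes the difference between the written standards easy to see. The following is a small Python sketch that pairs them up position by position. It only illustrates this one example; the variable names are made up for illustration, and character-by-character substitution is not a general way of converting Standard Chinese into written Cantonese.

```python
# The two example sentences from above, without the question mark.
standard = "是不是他們的"
cantonese = "係唔係佢哋嘅"

# Pair the characters position by position. This only works here because the
# two example sentences have the same length and the same word order.
pairs = list(zip(standard, cantonese))

# Collapse repeated pairs (是/係 appears twice) while keeping first-seen order.
correspondences = dict(pairs)

for standard_char, cantonese_char in correspondences.items():
    print(f"{standard_char} -> {cantonese_char}")
```

Every character in the Standard Chinese sentence is replaced by a different one in the Cantonese sentence, which is a large part of why written Cantonese can be hard to follow for readers who only know Standard Chinese.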
Helpful resources:
This video is a good introduction to understanding how "Chinese" is actually quite an ambiguous term:
This video is an introduction to the Chinese languages, how they work, how they differ and how they are related to one another:
This video will introduce different aspects of the Chinese language system, including phonology, grammar, the writing system and more:
Why learn Cantonese instead of Standard Chinese (Mandarin)?
I'll be very frank right off the bat. Unless you are living in or moving to Hong Kong or Macau, or are in frequent contact with people whose primary language is Cantonese, there are practically no advantages to learning Cantonese compared to Mandarin. There are more people worldwide who speak and understand Mandarin, and many people who speak Cantonese as their mother tongue can also speak and understand Mandarin (but this is not universally true). So if business opportunities internationally and specifically outside of Hong Kong and Macau are your interest, Mandarin comes out on top. If you wish to speak to as many Chinese people across the world as possible, Mandarin is still your best bet.
In fact, the languages are relatively similar compared to foreign languages but are mutually unintelligible. So it's not even as though you can learn one in order to pick up the other too.
However, a surprising benefit to learning Cantonese is when trying to learn other languages influenced by Chinese. During the Tang dynasty, Chinese culture experienced a golden age of sorts and there was much cross-cultural exchange along the Silk Road which brought Persian, South Asian, Central Asian and many other influences over into the empire. China (although it didn't exist as such) was a regional hegemon, and neighbouring countries eager to trade with this country and form political alliances all took on a degree of Tang-dynasty culture, including elements of the language after borrowing what might have been the most revolutionary tool the Chinese had: their writing system.
The elite of these countries adopted and adapted Chinese characters to write in Literary Chinese, and eventually their own languages, meaning that Chinese characters now came with the imported Chinese way of reading them and took on the equivalent native word associated with the character's meaning as well. As a result, roughly 60% of the words in a modern-day Japanese dictionary are words of Chinese origin, up to 70% of Vietnamese is of Chinese origin, and about 75% of Korean words are of Chinese origin. These Sino-Xenic readings exported into these foreign languages from Middle Chinese are not perfect reconstructions of this ancient language (aside from Vietnamese, which is impressively conservative in its pronunciation of Sino-Vietnamese characters), but they retain many features such as the finals (-p, -t, -k) that Mandarin lost and which Cantonese kept, making Cantonese a much more faithful reference point if you are trying to learn Chinese alongside Japanese, Vietnamese or Korean.

An example of different readings of this Chinese character in various Chinese languages and their Sino-Xenic readings alongside the Middle Chinese
Learning Cantonese can also be beneficial when trying to learn other tonal languages, such as Vietnamese. Mandarin only has four tones whereas Cantonese has six. This means that a Cantonese speaker will likely find it a lot easier to navigate the intricacies of pitch alteration and speech contours than a Mandarin speaker when it comes to a language like Vietnamese, which also has six tones.
So now that I'm familiar with its general layout and background, how do I actually learn Cantonese?

Step one: Learn to read the language without learning Chinese characters
The first thing you need to do is make sure you can read and pronounce the language without having to read Chinese characters. There are thousands of Chinese characters which are necessary to learn in order to become a fluent reader of a Chinese language and to understand it at a deeper level, but this is a major deterrent for people wanting to learn one. Seeing as Chinese languages aren't traditionally written in Latin script, you'll then have to learn how to read the two primary romanisation systems that are used for writing romanised Cantonese (Cantonese in Latin script): Jyutping and the Yale system.
This means that you won't be left wondering how to read what might at first seem like an abstract Chinese character, and will be more capable in terms of nailing the pronunciation of a word when reading it for the first time.
This video demonstrates how the phonology of Cantonese and its tones can lead to some funny situations:
This kind of subtlety between tones might seem like a confirmation of the fears many Cantonese beginners have about being able to speak to others, but it's really not something to get that worked up about. People will understand if you, as a foreigner, make mistakes, and it's best not to take oneself too seriously in these kinds of situations. Kids make mistakes all the time when trying to learn their native language, and it's no different for those approaching a language for the first time.

A table of Jyutping syllables made up of initials, finals and tones offered by CantonTalk
One of the things you'll notice is that Cantonese commonly uses sounds that don't exist in English, and these need to be written in some form. An example is the "jyu" sound in "Jyutping", which is pronounced similarly to a French "u" with a "y" before it. Another is the repurposing of the letter "c" in Jyutping to write a sound sort of midway between "ch" and "ts".
Neither romanisation system therefore abides by English spelling norms, and it's important to adapt to the new ways of reading and writing Latin letters by familiarising yourself with the pronunciation.
Out of the two romanisation systems, I've personally noticed that Jyutping is more commonly used overall, while the Yale system is more common in books. The lack of a single universal romanisation system can be frustrating at first, but I guarantee you that it's not that difficult to get accustomed to.
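To make the structure of a Jyutping syllable (initial, final and tone) more concrete, here is a minimal Python sketch that splits a syllable into those three parts. The function name `split_jyutping` and the `Syllable` tuple are made up for illustration, and the sketch assumes the usual Jyutping inventory of initials and tone digits 1 to 6. It does not check finals against the full chart, so treat it as a rough aid rather than a validator.

```python
import re
from typing import NamedTuple

# Jyutping initials, longest first so that "gw", "kw" and "ng" are tried
# before "g", "k" and "n".
INITIALS = ["gw", "kw", "ng", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "w", "z", "c", "s", "j"]

# One syllable: some Latin letters followed by a single tone digit (1-6).
SYLLABLE = re.compile(r"^([a-z]+)([1-6])$")


class Syllable(NamedTuple):
    initial: str  # empty string when the syllable has no initial
    final: str
    tone: int


def split_jyutping(syllable: str) -> Syllable:
    """Split one Jyutping syllable such as 'gwong2' into its parts."""
    match = SYLLABLE.match(syllable.lower())
    if match is None:
        raise ValueError(f"expected letters plus a tone digit 1-6: {syllable!r}")
    letters, tone = match.group(1), int(match.group(2))
    # "m" and "ng" can form a whole syllable by themselves, as in m4 (唔).
    if letters in ("m", "ng"):
        return Syllable("", letters, tone)
    for initial in INITIALS:
        if letters.startswith(initial) and len(letters) > len(initial):
            return Syllable(initial, letters[len(initial):], tone)
    # No initial matched, e.g. a syllable that starts with a vowel.
    return Syllable("", letters, tone)


# Usage: break the example sentence from earlier into syllables.
sentence = "Hai6 m4 hai6 keoi5-dei6 ge3?"
for token in re.findall(r"[A-Za-z]+[1-6]", sentence):
    print(token, split_jyutping(token))
```

Note how "j" in "jyut6" and "c" in a syllable like "caa4" are treated as ordinary initials here; the code only separates the letters, and how they are pronounced still has to be learned from a pronunciation chart.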
For a full Jyutping pronunciation chart, please click here.
Regarding tones, they are often seen as a daunting task to master, but I find that they are overhyped. You will hear nightmare stories where the wrong tone means calling somebody's mother a horse, or saying "chicken shit" instead of whatever other phrase you were trying to pronounce, but tones are actually fairly simple. We might not think about it often, but English also puts a lot of emphasis on intonation and tone, just in a different way to tonal languages such as Cantonese.
Another important point is to not think of tones in terms of frequencies. This is in fact really unconstructive, and I will let the following video explain why:
Step two: Get a good textbook and resources to fall back on for learning vocabulary and getting some guidance with grammar and general self-expression
Textbooks can be really helpful in learning vocabulary and the theory of a language through grammar. They set things out in a methodical way and pace your learning section by section, introducing new concepts, vocabulary, and other tidbits of information along the way. It also means you can read phrases and learn a bit more about the self-expression of a language. Cantonese is a very different language from English, and so taking the time to mull over how people think in a new language is always a helpful use of your time. You'll find that Cantonese-speakers express themselves very differently from English-speakers, and getting used to this can in fact be easier than many imagine.
For example:
Cantonese:
English: That is the book that you bought me.
If there is one take-away message I would give to ambitious language-learners reading this, it would be:
If you imagine words and all other lexical bits of information like affixes as being the building blocks of a sentence, grammar is just the way you arrange those blocks and how the blocks relate to one another. Learning grammar doesn't have to be a sluggishly tedious task of learning different conjugations and so forth; it just takes enough input to be able to make a breakthrough in comprehension through simple correlation and pattern recognition.
That said, I wouldn't recommend learning through a textbook alone. In fact, I would strongly advise that a textbook become only one part of the learning experience. In order to actually speak a language, you need speaking and listening practice, and unfortunately, only reading from a textbook and learning the theory of a language doesn't mean you learn how to use it in practice.
Over the years, I've developed something of a sixth sense for which resources are useful to me and which aren't. I like to be able to pick up a book and delve straight into it, finding it to be rich in information and not just touching upon the basics - allowing the reader to go further than the basics if they wish. As a main textbook for Cantonese alongside an English-Cantonese and Cantonese-English dictionary, I use this one, which uses the Yale system:

In fact, a generous somebody created a free course on Memrise which drills the vocabulary used in this course - confusingly, in Jyutping instead of the Yale system. These exercises can be done in five minutes and are gamified, so there is a streak system involved which incentivises you to play on.
Beyond that, complementary resources are always helpful. There is a wealth of e-book materials to be found online, as well as websites which could be really helpful, such as this one. If you're looking for something else, do your own research! Google is your friend in language learning and there are hundreds and thousands of potential resources to use if you do your own research.
Step three: Start to actually speak and listen to the language
This video allows viewers to replicate the speech patterns of locals, which is useful in comparing your own speech to that of a native speaker and makes tones less daunting; note the Jyutping transliteration:
For me, the most important part of learning a language is being able to speak it. After all, it's not for nothing that you're called a language "speaker". It isn't much good if you can only read a language but can't have a conversation with somebody. In fact, I would argue that it's actually much more important to be able to speak a language than to be able to read and write it, especially when learning a language such as Cantonese, which poses a formidable challenge to learners in terms of its writing and its pronunciation. In order to cover more ground more quickly, I would advise that beginners spend more time on the latter than on the former and get to learning characters at an intermediate stage of their learning journey. This isn't how my own learning journey went, which took a convoluted route via Japanese and Mandarin, but it is what I would advise those of you who want to get to grips with the language quickly. After all, as said earlier, written Cantonese is often little more than just
In that sense, everything up until this point has only been laying the groundwork. This is where it gets serious.
By far, the best way to learn a language is with speaking and listening - the traditional way that anybody learns a language. As children, we learn by listening to our elders and mimicking them, making mistakes and correcting them progressively as we get older. The human brain has a tremendous ability for languages, and I would argue that a lot of people who are self-confessed as being awful at learning languages are just not giving themselves enough regular auditory input and spoken practice for their brains to chew on and grow accustomed to, instead opting for Duolingo or textbooks.
The best resource I've found by far to help with speaking Cantonese is Glossika, a Taiwan-based service which is perfect for speaking and learning a whole range of languages not limited to Cantonese (their selection is very impressive). I've actually made most of my progress through using this resource and would stress this one as the most important because it also encourages you to get into a regular habit of learning, which is easily the most important factor when learning a language. What's more, you can learn or practise as many languages as you want side-by-side. The system also doesn't teach you grammar per se but expects that you will implicitly pick up on grammatical structures as you listen and notice patterns - a way of approaching grammar that I often find more successful than the abstract learning of formulae, which should act as a support rather than the leading practice itself. The only down-side is that it's not free and you need to pay either a monthly or annual subscription, which isn't steep.

A screenshot showing a sample of the full range of languages available on Glossika, which is truly astonishing - not my original picture, go to @LearnwithJohn
For listening (aside from the audio tracks for the textbook above), I'd recommend YouTube videos because they're free and give you a visual context too. You can find many listening resources on the CantoneseClass101 YouTube channel, as well as the EasyCantonese street interview playlist, although the latter is more advanced.
However, learning set phrases without understanding that phrase's components and the way they work together can end up being little more than an illusion of progress. Without the understanding of how and why the building blocks of a sentence are arranged a certain way in a given phrase, you do not actually know why you're saying the things you are the way that you are and in fact are just parroting what somebody else might say with no understanding of what others might answer.
This is where a textbook and listening resources come in handy. In order to make progress in learning a language, you have to have some kind of understanding of how to assemble sentences and express yourself in that language.
There are two main paths for learning grammar. It can either be learned implicitly by subconscious pattern recognition after listening and repeating phrases like above - the way most of us learn our own mother tongues from our elders - or explicitly, with the aid of a grammar book. Using both side-by-side is usually the best way of making progress, but you ought to put a lot of emphasis on the first one. Theory and practice are far removed, and it is no use knowing the theory of how a language works thanks to a grammar book if you don't practise your implicit understanding.
On top of this, listening will habituate you to the sounds of a language, and once you've got a grasp of that, it will be easier for you to pick out words which you recognise among the phonetic flurry being launched at you. Later, you'll be able to make out strings of words and phrases. Take everything one step at a time.
Step four: Once you're comfortable enough with the basics, learn to read and write
As mentioned in my introduction, the Chinese writing system is complex and fascinatingly rich in terms of its history and etymology. This often means that once you gain a better lexical command of any Chinese language, it becomes more and more important to know how to read and write Chinese too, because many of the words you will be learning are verbal expressions of terms most commonly found in the written language which are better understood when you analyse them character-by-character. I haven't placed as much emphasis on reading and writing in this guide because it's for beginners trying to make leaps and not find themselves stuck at an early stage. For this reason, I recommend placing more emphasis on writing when you get to an intermediate to advanced level of ease in speaking Cantonese. After all, Cantonese is a primarily spoken language, as Standard Chinese is the common written language between all Chinese languages.
For learning to read and write, there unfortunately aren't many resources that teach you how to read characters in Cantonese seeing as it's considered something of a fringe language and most resources only give you Mandarin Pinyin pronunciation. This combined with the fact that you'd be learning to read and write Standard Chinese means that you'd effectively be learning to read and write Mandarin, and not Cantonese.
That said, if you still would like to go forward with it, the best app for writing is Skritter, and the best one for reading is Du Chinese. The reason I like apps like these is that they're time-efficient while not being too general and non-specific like Duolingo. You can take a moment out of your day during your commute to practise and create a habit.
If one thing will help you more than anything, it is repetition. Trying to find the time every day or every few days to regularly revise and practise new content will make a massive difference in your learning journey. Most people I know fail to learn a language either because they aren't using the right resources or because they simply don't take - or more importantly, make - enough time to practise. Five minutes a day is not a lot, and even the minimum is better than nothing.

Skritter

Du Chinese
For learning more about the theory behind Chinese characters, there are some useful videos but the best course by far is offered by Outlier Linguistics, who do a fantastic job of explaining the history, development, usage and logic of traditional and simplified Chinese characters. They offer a masterclass in Chinese characters, as well as others in Mandarin pronunciation, Japanese Kanji and Classical Chinese. I have enrolled for all of their courses and am far from disappointed. You can see an example of their Chinese character etymology dictionary above, which takes users beyond arbitrary mnemonic tools and deep into the actual etymology of the characters, which might sound complicated but is actually a far more efficient way to learn Chinese characters. Unfortunately for Cantonese-learners, they only provide Mandarin Pinyin pronunciations, but they do a great job of explaining why Chinese is written as it is, which will enrich your experience of learning any Chinese language. They've also partnered with Skritter and Pleco, meaning that you can revise the course content on either of the apps. The courses need to be paid for, but they are well worth your while.
An essential download for any of you who want to better understand written Chinese around you is Pleco, the best Chinese dictionary app that there is, which by default displays characters alongside their readings in Mandarin Pinyin, but you can go into the settings and add Cantonese Jyutping readings too. They also offer add-on dictionaries for Cantonese which give you definitions more suited to Cantonese vernacular speech, and have a camera function which automatically reads characters. You can see an example of their Outlier Linguistics etymology entries above.
All in all, learning a language is a very individual experience and should be gauged by what works for you. I might get some criticism for some of the tips I've given in this post, but this is just a compilation of things that have worked for me on my own and I'm sharing them now.
If you have any corrections, questions or modifications to suggest, please reach out and I'll be in touch.


