delicious languages and highly inflected meals

Odia, oh dear — August 15, 2021


I don’t usually like/dislike languages for reasons as superficial as their writing systems, but it was the Odia writing system that really made me fall in love with this language.

It’s closely related to neighbouring Bengali, with which it arguably constitutes a dialect continuum stretching up India’s northeast coast. Like all languages with a Brahmic script, Odia consonants have an inherent vowel. However, like Bengali’s, Odia’s inherent vowel varies between /o/ and /ɔ/, rather than between /a/ and /ə/, as is the case in most other Indo-Aryan languages. Other distinctive features that Odia shares with Bengali include a definite article that is suffixed to the end of nouns, a lack of grammatical gender and pluralisation and the use of classifiers when counting nouns.

The Odia writing system though is the real treat for me. Most Brahmic scripts have developed a common feature, the shirorekha. This is the name of the horizontal bar at the top of all writing from which all the letters are suspended. Odia, due to the fact that it was first written on palm leaves rather than in stone or wood, could not be written with a shirorekha, as this would tear the fibre of the leaves. As a result, the letters of Odia have, as their universal feature, a curved outer shell rather than a horizontal line.

These are the consonants:

And these the vowels (note that the first Odia character is the stand-alone form of the vowel, the second character shows the diacritic form that is added to a consonant when the vowel occurs as part of a complex syllable):

One feature of Odia that is making me optimistic that it won’t be such a hassle to learn is that, like Bengali, it has a very standard and consistent conjugation paradigm in which individual morphemes are very independent. What I mean by this is that, whilst in English the past tense ending –ed often affects the sound that precedes it or vice versa, this is very rarely the case in either Odia or Bengali. There are also no internal ablaut transformations (for example the common English three-way verbal system i-a-u, swim/swam/swum, drink/drank/drunk, sing/sang/sung, etc.).

The verb system is based on a small number of stems which rarely change, and they are appended with suffixes which are in fact auxiliary verbs that have become fossilised over time and work like conjugations. For example, the present continuous (I am doing, you are going, etc.) of Odia is expressed using the continuous stem and the auxiliary verb to be. The past perfect consists of the perfective stem (done, gone, said, eaten) and the past tense of the verb to be. These building blocks accumulate tidily and consistently, such that even though I’m quite early in my Odia learning journey I can tell you that a complicated phrase like I will have eaten is ଖାଇଥିବି (khaithibi), made up of the perfective stem of the verb to eat (ଖାଇ khai) and the future tense of the verb to be (ଥିବି thibi).
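To make the building-block idea concrete, here is a minimal Python sketch. It only uses the two pieces quoted above (ଖାଇ khai and ଥିବି thibi). The table names, the "1sg" label (inferred from "I will have eaten") and the compose() helper are my own assumptions for illustration, not a full model of Odia conjugation.

```python
# Pieces quoted in the post; the dictionaries and helper are hypothetical.
PERFECTIVE_STEMS = {
    "eat": ("ଖାଇ", "khai"),
}
BE_FORMS = {
    ("future", "1sg"): ("ଥିବି", "thibi"),
}


def compose(verb, tense, person):
    """Join a perfective stem and a form of 'to be' with no sound changes.

    Returns a (script, romanisation) pair.
    """
    stem_script, stem_roman = PERFECTIVE_STEMS[verb]
    aux_script, aux_roman = BE_FORMS[(tense, person)]
    # The two morphemes are simply concatenated: neither affects the other.
    return stem_script + aux_script, stem_roman + aux_roman


script, roman = compose("eat", "future", "1sg")
print(script, roman)
```

The point of the sketch is that plain concatenation is enough here, which is exactly what would not work for English pairs like swim/swam.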

For reference, this is a pretty typical Odia verb table. See how regular it is? The best patterns are observable horizontally by tense.

Double the flavour, double the fun — October 9, 2020


As you may know, I began an exciting project this year to learn 15 South-Asian languages to proficiency in order to visit India, Pakistan, Bangladesh, Sri Lanka, Nepal and The Maldives when the whole Covid lifemare is over. It’s going really well, and I’ve in fact added to the list so it will now be 17.

I’ve decided to write this post about a feature that all Indo-Aryan languages I’ve encountered seem to have, a feature which has very ancient roots and appears to have not lasted very long in the other branches of the Indo-European family. I’m talking about correlating conjunctions- relative pronouns that work in pairs, rather than as single terms as they do in English. For example, the English sentence “I like when it rains” is perfectly grammatical. However, in Bengali this would not be allowed, nor in Hindi. The clauses “I like” and “it rains” cannot be separated by a single conjunction, instead the two clauses each take their own conjunction, so that we must say “when it rains, then I like it” (যখন বৃষ্টি হয় তখন আমার ভালো লাগে, jokhon brishti hoy tokhon amar bhalo lage).

The same sentence structure must also be followed in Hindi and Urdu, where the sentence “I like when it rains” becomes “जब बारिश होती है तब मुझे अच्छा लगता है।/جب بارش ہوتی ہے تب مجھے اچھا لگتا ہے۔” (jab bārish hotī hai tab mujhe acchā lagtā hai).

These sentences are far from naturally obvious to English speakers, and I have struggled a fair bit to get used to them. The fact that literally every single sentence with a relative clause in it uses this structure helps somewhat- it makes the phenomenon of these paired conjunctions inevitable.

Let’s take a look at some more examples:

Hindi has the pair जैसा (jaisā) and वैसा (vaisā), which mean “how” and “that way”. They are used for making relative clauses pertaining to adverbials of manner. For example “he cooks just like my mother” would have to be said as “just like my mother, he cooks that way” (jaisā merī mã vaisā voh pakātā hai, जैसा मेरी मां वैसा वह पकाता है।).

Urdu has the locative pairing جہاں (jahã) and وہاں (vahã) which translate to “where” and “there”. This pair is the only one that we actually see sometimes used in English, due to the existence of the phrase “there is/are”. For example the English sentence “where I live there are very few shops” is almost the same in Urdu- “جہاں میں رہتا ہوں وہاں بہت کم دکان ہیں۔” (jahã main rahtā hun vahã bahut kam dukān hain).
But English doesn’t always follow this structure. For example, we would say “I eat in the restaurant where my sister works”, whereas Urdu keeps the pairing by adding in an oblique correlative to produce “where my sister works, in that restaurant I eat”, “جہاں میری بہن کام کرتی ہے میں اُس ریستوراں میں کھاتا ہوں۔” (jahã merī bahin kām kartī hai main us restorã men khātā hun). The word “there” is not used directly, because an actual noun (the restaurant) replaces the pronoun.

Bengali uses the pair যারা (jara) and তারা (tara) to cover the meanings “who” or “what” and “they”. These forms can also be inflected for case and number as they are pronouns referring to people (sometimes). Whilst English can comfortably say “those who have money go to study abroad”, Bengali has to use the pair of correlating conjunctions and say “those who have lots of money, they go abroad to study”, or “যাদের অনেক টাকা আছে তারা বিদেশে মহাবিদ্যালয় যান।” (jader onek taka ache tara videshe mahavidhyaloy jan).
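If it helps to see the pattern laid out mechanically, here is a rough Python sketch of the "relative word ... correlative word ..." frame. The pairs are the ones quoted in this post. The table layout, the relation labels and the correlative_frame() helper are made up for illustration, and the sketch ignores inflection entirely (for example যাদের jader as the inflected form of jara).

```python
# Relative/correlative pairs quoted in the post (romanised).
# The relation labels and the helper below are illustrative assumptions.
CORRELATIVES = {
    "when": {"Bengali": ("jokhon", "tokhon"), "Hindi/Urdu": ("jab", "tab")},
    "manner": {"Hindi": ("jaisā", "vaisā")},
    "where": {"Urdu": ("jahã", "vahã")},
    "who": {"Bengali": ("jara", "tara")},
}


def correlative_frame(relation, language, relative_clause, main_clause):
    """Build 'REL <relative clause> COR <main clause>' for a given pair."""
    rel, cor = CORRELATIVES[relation][language]
    return f"{rel} {relative_clause} {cor} {main_clause}"


# "When it rains, then I like it"
print(correlative_frame("when", "Bengali", "brishti hoy", "amar bhalo lage"))
print(correlative_frame("when", "Hindi/Urdu", "bārish hotī hai", "mujhe acchā lagtā hai"))
```

Each clause gets its own word from the pair, which is the whole trick: there is no single conjunction sitting between the two clauses.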

That’s all for this instalment folks, I’m going to write up a post next week about the Sanskritic origins of this feature of Indo-Aryan languages, as well as looking at which other Indo-European languages have and have had it in the past. Ciao for now, Cow!

Sin-holla at your boy — August 8, 2020


In the space of a week I have gone from really struggling to master even a handful of the Sinhala alphabet’s letters, to diving head first into learning the language’s grammar and constructing some basic sentences.
My success with learning the alphabet I can pretty assuredly attribute to a change in my study style. My usual way of making language notes is to create elaborate and aesthetically pleasing diagrams and representations in my fancy notebooks. I can now conclude that, whilst pretty, these notes are not practical.

Because I am not only learning Sinhala at the moment but also another eleven languages from South Asia, I decided to forgo the posh notebooks and grab some massive pads of writing paper. Rather than writing out a beautiful chart of phonetics and characters, I simply wrote out every letter of the alphabet fifty times, then tested myself and tested myself until I got every single one right. It took under a day, but was probably the most intense this lazy boy has studied since high school.
What can I say about the writing system? Well for starters it is stunning. Even though it’s ultimately a Brahmi script that can be traced back to Sanskrit Devanagari, the characters have all lost the bar that is used in Devanagari to link letters in a word. For the most part there aren’t many similarities between Sinhala letters and the Sanskrit ones they’re derived from. This is because Sinhala developed thousands of miles away from the other languages related to it, such as Hindi and Gujarati. This distance and isolation is also the reason that Sinhala has undergone extensive phonological changes which mean that around half of the original alphabet is completely out of commission!

Not much similarity to be found between Sinhala and Sanskrit writing systems.

The alphabet is divided into two sections. One is called shudda, meaning “pure”, because these are the only letters that ever occur in truly Sinhala words. The remainder are called mishra, meaning “mixed”, because they only appear in words that have entered Sinhala directly from Sanskrit and Pali. This division is rather helpful, as it means that half of the alphabet can be pretty much ignored by the beginner.
To give you an example of what I mean by Sinhala’s changes over time, Sanskrit strongly differentiates between aspirated and unaspirated consonants, whereas Sinhala no longer does. That means that Sanskrit words in Sinhala beginning with a “k” now sound the same as those beginning with a “kh”. In Sanskrit the verb “eat” is khadati (खादति) which has become khana in Urdu (کھانا), Hindi (खाना), Punjabi (ਖਾਣਾ) and Nepali (खाना) and khaoya in Bengali (খাওয়া). We can see that these languages have maintained the aspiration of the “k” in the verb’s stem. Sinhala on the other hand has not, and the verb meaning “eat” is kana (කන).

The importance of Buddhism and Hinduism in South and South-East Asia means that once you have learnt a number of the region’s languages, you do start to notice some similarities. One that’s pretty easy to get used to is in the vowel system. The vowels /i/ and /e/ are very similar across Tamil, Sinhala, Thai, Burmese and Khmer.

I’ve taken quite naturally to Sinhala for another reason- in the colloquial language verbs are only conjugated for tense and not for person, which means that the verb “කනවා” (kanavaa) can mean “I eat”, “we eat”, “she eats” or “you eat”! It does mean that in order to avoid confusion subject pronouns must be used, but this is the case in the majority of Indo-Aryan languages anyway so I’m not too bothered.

The present tense in Sinhala, formal and colloquial compared.

Now you’ll notice I mentioned this was the case in colloquial language- I make this distinction because in the written language of Sinhala, verbs DO have to agree with their subject. As a result, if a learner wants to be able to access literature in Sinhala (which I do) then it’s essential to learn the full conjugations of the verb, which are surprisingly complex, perhaps an indication of why the spoken language did away with them.

Oh sandhi what a pity you don’t understand — April 3, 2020


Well this week has been an utter joy. Ever since I more or less accepted that my South-East Asia trip will have to be put on hold indefinitely, I have put down my Burmese, Lao, Thai and Vietnamese books and have inadvertently started a little ancient languages binge.

I have had a fascination with the Sanskrit language from a young age. It started when I was five because I was fascinated with the Hindu religion. At a Hindu wedding in Madrid I once boldly informed a group of our Indian neighbours that I was going to learn Sanskrit and be able to read the Bhagavad Gita. They laughed because they thought it was adorable, and looking back I may have been a touch overconfident (not really, I just think at that age everything seems possible once you get older).

I’ve tried to give Sanskrit a go on numerous occasions, and every time the barrier has been the same. The learning materials for Sanskrit have never really been up to much- the heyday of Sanskrit study was about a century ago, and the textbooks show that. Even the Teach Yourself series by Hodder & Stoughton, which I believe has produced many excellent foreign language titles, has a very poor offering in Sanskrit.

But joy of joys, I recently took a gamble on a book online by Oxford Sanskrit professor Antonia Ruppel and all that has changed. In a week (a frigging week!!) I have learnt four declensions, four conjugations and three tenses as well as really getting to grips, FINALLY, with the most troublesome feature of Sanskrit- the sandhi.

Sandhi in Sanskrit is such an infamous phenomenon that even in other languages, the process of changing consonants and vowels internally is known as sandhi.

In short, it refers to the fact that certain sounds are articulated by stimulating certain areas of the mouth, throat and nose. As a result, once you make a certain sound, your mouth is already contorted in a position which makes it easier to make certain sounds than others.

The easiest example of sandhi in English is the pluralising -s. Sometimes it is pronounced like the ‘s’ at the beginning of the word ‘sound’, but sometimes it sounds more like the ‘z’ of ‘zoo’. This isn’t random- certain sounds at the end of words require the letter to be pronounced one way, some another. ‘Cats’ and ‘dogs’ both end with an ‘s’, but one of them sounds like ‘s’ and the other like ‘z’. OMG right? You may never have even thought of this, but it is actually a pretty cool feature of English with a set of very rigid and well understood rules behind it. The word ‘cat’ ends with a ‘t’, which is an unvoiced consonant (that means when you pronounce that ‘t’, your vocal cords don’t vibrate), BUT ‘dog’ ends with a ‘g’, which is voiced (say ‘g’ now with your fingers on your voice box, you’ll feel a buzzing as the cords vibrate). That ‘z’ sound you add to the end of ‘dog’ is the voiced, or vibrating, counterpart of the ‘s’ you add to ‘cat’!

If you want to understand why this is done- try and switch the sounds up. Try to add the voiced ‘z’ to the end of ‘cat’. It ends up either pretty awkward, or you inadvertently change one of the sounds don’t you? The same is true vice versa.
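For anyone who likes to see a rule written down, here is a small Python sketch of the cats/dogs pattern. It is deliberately partial: it takes the final sound of the word (not the spelling) as input, and it leaves out words ending in hissing sounds like ‘bus’ or ‘dish’, which take an extra vowel before the ending. The set of sounds and the plural_suffix() name are my own choices for illustration.

```python
# Voiceless final sounds that take the /s/ ending (sibilants are excluded
# on purpose: they take an extra vowel, as in "buses", not covered here).
VOICELESS_FINALS = {"p", "t", "k", "f", "θ"}


def plural_suffix(final_sound):
    """Return the sound of the plural ending after a given final sound."""
    # The ending copies the voicing of the sound before it.
    return "s" if final_sound in VOICELESS_FINALS else "z"


print("cat +", plural_suffix("t"))
print("dog +", plural_suffix("g"))
```

The ending simply agrees in voicing with whatever comes before it, which is the same principle Sanskrit sandhi applies far more systematically.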

Sanskrit is a really fantastic language for getting used to this, because its alphabet is arranged according to the phonetic qualities of each letter. Wanna see?


Now, what I’m talking about are all the internal sandhi rules of Sanskrit. They look kind of nightmarish, but actually it’s really just because they’re so codified and well documented. English easily has as many sandhi rules as Sanskrit, they are just not really well known by either native speakers or learners of the language.

So why do we care so much about them in Sanskrit? WELL funny you should ask because it is the bane of my life. Sanskrit writers, for millennia, have had a really strong aversion to ending words in a consonant sound. It’s because the way to end a Sanskrit word with a consonant is to use a rather unattractive diacritic called a virama. It is considered pretty classless and lazy to use a virama when writing either poetry or prose in Sanskrit, so whenever a word naturally does end with a consonant, the Sanskrit writers would always combine it with the next word in order to end up with a word that ends in a vowel, which is a much more aesthetically pleasing form to the Sanskritist.

Let’s have a look at an example of how this pans out in practice.

The way you say “from the city” in Sanskrit is नगरात् (nagaraat). It ends with a /t/, horror of horrors, and the little mark below the word’s final consonant is that virama I mentioned earlier that Sanskrit poets find so hideous. The /t/ at the end of the word is UNVOICED.

Then, the way you would say “towards the village” is ग्रामं (graamã). It begins with a /g/, which is VOICED.

So now…IF you wanted to say “I am going from the city to the village” (“I am going” is गच्छामि/gacchaami, which is placed at the end of the sentence, as Sanskrit verbs almost always are)…we have to change something in order to have that word-final /t/ and word-initial /g/ occur next to each other without there being any unpleasantness.

The way Sanskrit addresses this is by changing the word-final /t/ to a /d/, which means that all the consonants in that cluster are VOICED.

Although the sentence would initially have appeared as:

नगरात् ग्रामं गच्छामि

Nagaraat graamã gacchaami

after sandhi it is written as:

नगराद्ग्रामं गच्छामि

Nagaraadgraamã gacchaami
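The single rule used in this example can be sketched in a few lines of Python. This is only a toy under loud assumptions: it works on plain ASCII romanisation, it covers just the voicing of a word-final k, t or p before a small set of voiced initial sounds, and it keeps a space between the words for readability even though the Devanagari writes them together. The names and the sets of letters are mine, and real Sanskrit sandhi has many more cases.

```python
# Word-final voiceless stops and their voiced counterparts.
VOICED_COUNTERPART = {"k": "g", "t": "d", "p": "b"}

# Initial sounds treated as voiced in this sketch (vowels plus a few
# consonants). Other voiced initials follow different rules and are
# intentionally left out.
VOICED_INITIALS = set("aiueo") | set("gdbyrv")


def join_with_voicing(first, second):
    """Voice a final k/t/p when the next word starts with a voiced sound."""
    last = first[-1]
    if last in VOICED_COUNTERPART and second[0] in VOICED_INITIALS:
        first = first[:-1] + VOICED_COUNTERPART[last]
    return first + " " + second


print(join_with_voicing("nagaraat", "gacchaami"))  # final t meets voiced g
print(join_with_voicing("nagaraat", "khaadati"))   # final t meets voiceless kh
```

In the first call the final /t/ becomes /d/. In the second nothing changes, because the following sound is voiceless too.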

There’s much more to sandhi than this one example can possibly convey, but hopefully it gives you an idea of just how enormously this principle influences pronunciation and, in particular, the reading of Sanskrit literature.

Lao much is that doggy in the window? — March 17, 2020


Laos is one of my favourite countries on earth. It’s one of the few countries I’ve visited where I can’t name a particular sight- Ukraine has the Pechersk Lavra, Italy has The Pantheon, Korea has The Gyeongbokgung. Laos though doesn’t really have one of these standout sights- there’s a handful of temples all over the country, but the people for the most part practise Buddhism actively, so the temples are all functioning and not really for show.

What I really love about Laos is just being there.

One of the things I most loved in Laos was sitting in the street chatting with local people. Even though at the time I had never studied the Lao language, its proximity to standard Thai meant that whatever I said in Thai was understood by almost everyone I encountered (unless, it turned out, they were not ethnically Lao but Karen, or one of a number of other minority ethnic groups who live in various parts of what is now Laos).

On the flip side, although spoken Lao is not a million miles away from Thai, I did find myself saying พูดช้านิดหน่อยได้ไหมครับ (could you please speak a teensy bit slower?) an awful lot, as it was only when Lao people spoke their language slowly that I was able to make out their meaning.

A couple of months ago I booked my ticket to Singapore and decided that a stop on my trip would be Laos, most likely after Thailand, and excited at the thought of getting to see some of my favourite places and people, I began learning Lao!

I love Lao- it doesn’t sound tremendously different from Thai (unsurprising as to Lao people Thai is intelligible) and the writing system is not too different from Thai either.

Whereas Thai’s inventory of characters was designed to completely mirror that of Sanskrit thereby enabling letter for letter transcription of Buddhist scripture, Lao only really uses the bare minimum required for notation of sounds.

For example, the Thai alphabet has three consonants which are all pronounced /s/ in the modern language (ส, ศ and ษ). This is because Sanskrit has three sibilant consonants, which are transcribed in the Latin alphabet as ś (श- palatal), s (स- alveolar) and ṣ (ष- retroflex), and in order to transliterate Sanskrit texts into Thai, there needed to be a three-way distinction. Similarly where Thai has several letters that represent the sound /kʰ/ (ค, ข, ฅ, ฆ and ฃ), Lao only has two (ຂ and ຄ).

So, reading and writing Lao is a fair bit easier than Thai. An alphabet of almost 70 letters is reduced to one of around 40, a significant difference.
The script of Lao is also a bit more light hearted than that of Thai. I can’t quite explain in words what I mean, so let me show you the following pairs which demonstrate, I think, the small differences that make writing Lao considerably easier and more fun than writing Thai. If I were less of a highbrow academic I would call Lao script curlywurly and Thai script uptight and glamorous. But I’m not so I won’t.

Thai also has some of the distinctions of being a prestige register. What I mean by this is that both Thai and Lao are effectively different registers, modern incarnations, of an older language that serves as their common ancestor. Thai was the language of the Buddhist clergy and the Siamese royalty, whereas Lao has been perceived throughout its history as a farmer language, an uncouth and rough-round-the-edges poor relation. That means that Thai has been infiltrated considerably by other prestigious languages such as Khmer, Sanskrit, Chinese and Pali, whereas Lao has remained comparatively untouched, with many of its indigenous morphemes still intact.

All in all, I wouldn’t say I’m enjoying one more than the other. In Thai’s favour is the fact that, as a more widely spoken language with more prestige attached, there is a greater variety and depth of learning materials. Meanwhile, while learning Lao from admittedly limited resources, I am enjoying the script a lot more, as well as not finding the tones as challenging, along with being more excited about using the language on my upcoming travels. I really relish the thought of turning up in Luang Prabang and being able to talk to some very old friends in THEIR language, rather than through the cumbersome medium of my clunky Thai and their non-standard-heavily-Lao-infused Thai.

Teaching a young dog old tricks — March 9, 2020


Ancient languages get a bad rap. One kind of person I have little respect for claims that they are useless and impractical and that children should learn real things in school, instead of learning to access texts and societies that formed the basis of our own. Another type moans at how difficult they are and what a stressful endeavour it is to become a proficient reader of an ancient tongue. This opinion I have a bit more empathy with, although I would throw in one additional qualification.

I would say that learning ancient languages is tough, not inherently, but because of the learning styles that teaching materials adopt.

As learning ancient languages has become less popular over time, there is unsurprisingly not much impetus to produce more cutting edge learning materials. That’s why if you learned Latin at school, it was most likely taught following the, still excellent, but rather dated Cambridge Latin Course. The materials are updated once in a while, but normally in superficial ways like being reillustrated; complicated grammatical explanations are seldom reworded to make them more accessible to 21st century kiddoes.

And that’s why I was so thrilled the other day to discover A New Practical Primer of Literary Chinese. It caught my eye in an academic bookshop and I was spellbound when I opened the cover- it was the textbook for Ancient Chinese that my heart had been yearning for for years.

So a bit of background- the official language of The People’s Republic of China is Chinese, and the official spoken variant of the nation is Mandarin. Regardless of what variant of Chinese you may speak with your family and friends (Cantonese, Hokkien, Hakka…), the written form is largely the same, and that is what we mean when we say Chinese. Due to a long and unbroken history of literature in the Chinese language, there has been much less mutation in Chinese than one would expect over c. five millennia. A literate Chinese speaker can pick up a text in Chinese from four thousand years ago and understand it much more clearly than an English speaker could understand an English text from 900 years ago. The comparison is flawed because of the complex history of the English language, particularly following the Norman invasion, but it serves to illustrate my point that learning Ancient Chinese, one is hardly a beginner if one already knows the modern language, whereas English speakers wishing to learn Old English are really not much further from the start line than Russians and Spaniards.

The reason I added the previous paragraph is to explain my situation- basically, I can already understand Ancient Chinese, and have little to no difficulty reading texts from any era. What I have never had though is a consistent and deliberate education in the Classical language and how it works, which is what this textbook provides, and it is giving me no end of pleasure.

1) Blurring of lexical categories

As intimidating as this sounds, all it means is that nouns, verbs, adjectives and other parts of speech (lexical categories) can be switched round with ease. It can either make things more complicated or simpler depending on what kind of reader you are.

In the two sentences:
“The people want retribution”
“Carnal desires lead the righteous man to ruin”

The Chinese character 欲 can mean either the verb want or the noun phrase carnal desires. This is because a) Classical Chinese does not strictly demarcate between the lexical categories of noun and verb and b) The cultural context of words is so loaded with meaning and various connotations that a tremendous profundity can be encoded into even the smallest and most innocuous morphemes. For example, it is the long shared history of Chinese and Buddhism that makes the simple word “want” also have connotations of gluttony and earthly pleasures.

2) It is monosyllabic

This may not sound super exciting, but I’ll attempt to explain my feelings.
A morpheme is the name given to a unit of language that has a meaning. So for example “shpl” is not a morpheme, but “apple” is, as it contains the meaning of that particular fruit. Morphemes aren’t always words though. In the word “cardiologist”, “cardio” is a morpheme as it means heart.

As a general rule, Modern Chinese morphemes have one syllable, but words have TWO. If a word has two conceivable morphemes that can make it up, it will be a bisyllabic word made up of two morphemes. An example is 課本, which means textbook and is made up of the morphemic units “class” and “book”. However, if the word is pretty basic and bare in meaning, it will quite often use two morphemes with similar meanings. An example of this is the most common word for “friend”, which in the modern language is 朋友. Both of these characters are morphemes meaning friend…but for a number of reasons they are combined to form a vocabulary item. Whilst a reader of Chinese has no difficulty in identifying the characters in isolation as relating to friendship, in speech only together and in this order is the meaning of “friend” expressed.

The fact that Ancient Chinese was monosyllabic rather than bisyllabic means that, on average, meanings can be conveyed using half as many characters! For example it can’t have escaped your notice that the sentence above – 欲導聖人於禍 – means something as verbose in English as “earthly desires lead the righteous man to disaster”.

Burmese slays — February 24, 2020


If you’d told me a few years ago that I would be learning Burmese in 2020, loving it and planning on becoming conversationally fluent within the next month and totally proficient and highly literate by year end I’d have been perplexed. There’s nothing wrong with Burmese, quite the opposite in fact- I have gone almost my entire thirty years, including over a decade as a language fanatic and a linguistics savant, without having thought about Burmese once. The nearest I had ever come to thinking about this language before the past few weeks was thinking that the script looks like a mass of bottoms (just call me Tina Belcher). For real though, this is one THICC writing system:



Anyway, I digress. Having decided that my upcoming trip to South-East Asia would incorporate Burma, I found myself wondering if it was worth learning the language before going. I had already decided that I would learn Thai, Lao and Vietnamese to proficiency before setting off, but this was not really a hard decision. I was already pretty fluent in Thai (spoken Thai at least), Lao is very closely related to Thai, and is even to an extent mutually intelligible, and learning Vietnamese had been on my to-do list for quite some time.


Burmese though was a bit of a wildcard. I had never felt any inclination to learn it, it seemed tough (the writing system stumped me a fair bit), and of all the countries I’ll be visiting, Burmese people are the most likely to understand English, and even be able to converse.


But I don’t shy away from a challenge, and I like to know at least the pronunciation system of a people’s language before visiting their country/land, so I bought a couple of Burmese textbooks and grammars and dove into them over the past couple of weeks.

Mes amours…I have fallen rapidly for this language and I am stunned.


Burmese is a Sino-Tibetan language, which puts it in the same family as Chinese and Tibetan. However, as a Chinese speaker I must say there is almost no commonality to be found with Chinese. This is because very early on in its history, the Sino-Tibetan family split into two branches, one branch migrating North and becoming the Chinese macrolanguage, and the other remaining further south and developing into the Tibeto-Burman branch, which then gave the world the languages of the Tibetan and Bamar people. I haven’t studied enough Tibetan to be able to confirm this myself, but there is a greater deal of semblance between Burmese and Tibetan than to be found between either of them and Chinese, although I am told that even these similarities are sparse, arbitrary and isolated.

Like Chinese, but unlike Tibetan, Burmese is a tonal language. There are arguably four tones, and I say arguably because this is yet another example of mainstream linguists conflating the categories of tone and register. There are at least two tones in Burmese, that much is beyond doubt.

But the other two modes of articulation really aren’t tonal in my opinion, or rather their defining feature is not pitch. One of them is called the creaky tone in most literature on Burmese phonology. There’s a reason it’s called the creaky tone rather than simply being described by the pitch of voice- the creaky tone is unmistakable, regardless of pitch. So too is the fourth, so called “checked” tone. The checked tone occurs in syllables that end with an unreleased consonant. This “tone” is distinct from the others as it ends in a glottal stop…wtf has that got to do with tonality? Anyhoo ignore my little rant, even though I disagree with calling them tones and believe they more accurately fit the descriptor “(vocal) register”, whatever you wanna call them, Burmese has four.


Also, like Vietnamese, Burmese marks tones on its syllables in a much more straightforward way than Thai.

The low tone is generally unmarked: the consonant and the vowel together without any tonal diacritic produces a low tone syllable: ဘေ be, လို lo, etc.

The high tone has a nice little sign after the syllable that looks like an outline colon. I can’t verify this, but I suspect it is related to a Sanskrit sign which aspirates a vowel, due to the fact that the high tone is quite often very breathy: ဘေး be, လိုး lo, etc.

And the creaky tone generally has a small hollow circle below the final letter of the syllable: ဘေ့ be, လို့ lo, etc.
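The three patterns above boil down to "add nothing, add one sign, or add another sign", which a short Python sketch can show. It uses the Unicode code points for the two signs (U+1038 for the colon-like mark, U+1037 for the mark below) and the two example syllables from this post. The mark_tone() helper and the tone labels are my own, and the sketch assumes the simple "generally" case only: it says nothing about the checked tone or about rhymes that mark tone differently.

```python
# Tone signs appended to a low-tone (unmarked) syllable.
TONE_MARKS = {
    "low": "",            # unmarked
    "high": "\u1038",     # colon-like sign written after the syllable
    "creaky": "\u1037",   # small mark written below the syllable
}

# The two example syllables from the post, written as code points:
# consonant U+1018 + vowel sign U+1031, and U+101C + U+102D + U+102F.
EXAMPLES = ["\u1018\u1031", "\u101C\u102D\u102F"]


def mark_tone(syllable, tone):
    """Append the sign for the requested tone to an unmarked syllable."""
    return syllable + TONE_MARKS[tone]


for base in EXAMPLES:
    print([mark_tone(base, tone) for tone in ("low", "high", "creaky")])
```

Because the sign always goes at the end of the syllable, the marking is a plain append, which is what makes it feel so much more straightforward than Thai.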


One other feature Burmese has which has absolutely blown my mind is that a diacritic exists which, when placed below a letter, removes its voicing. Voicing, for those who don’t know, is basically the linguistics term for when your vocal cords vibrate during phonation. Place your fingers over your voicebox and make the sounds /k/ and /g/- when you make /g/ you should probably feel a kind of buzzing. That’s what it means to say that /g/ is the voiced equivalent of /k/. The symbol that devoices a consonant is a small hook that hangs below a letter and points left. In theory this symbol hanging below a /g/ (ဂ) could turn it into a /k/ (က). However we don’t see this symbol used in any circumstance as mundane as that, instead we see it used with the following consonants exclusively: င, ဝ, မ and န (which are /ng/, /w/, /m/ and /n/ respectively). Try just for a second to make any of those sounds withOUT your vocal cords vibrating. It is very very hard! And it is exactly what is required whenever you see the signs ငှ, ဝှ, မှ, or နှ.

These are not fringe words either- the word for “in” is မှာ, so even in some fairly simple sentences early on in the learning journey, you have to master this sound, which constitutes a challenge not just for English speakers but for most humans, as it really does only occur in a very limited number of languages, and even then only under quite specific circumstances.
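
A small Python sketch of how that hook behaves in typed text, assuming Unicode, where it is the single combining character U+103E placed right after the consonant. The names `HA_SIGN`, `SONORANTS` and `devoiced` are hypothetical, chosen just for this illustration.

```python
import unicodedata

HA_SIGN = "\u103e"  # the hook that hangs below a consonant

# The four sonorants discussed above, keyed by their rough sound.
SONORANTS = {"ng": "\u1004", "w": "\u101d", "m": "\u1019", "n": "\u1014"}


def devoiced(consonant: str) -> str:
    """Attach the devoicing hook to a consonant letter."""
    return consonant + HA_SIGN


for sound, letter in SONORANTS.items():
    print(sound, letter, "->", devoiced(letter))

# The word for "in": devoiced m followed by a long vowel sign.
word = devoiced(SONORANTS["m"]) + "\u102c"
for ch in word:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The last loop simply lists the three code points that make up the word, which is a handy way of checking what a Burmese syllable is really built from when the glyphs are hard to tell apart.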

Tonguethaied- สาวแสนสวยใส่เสื้อสีแสดสวมส้นสูงสีส้ม — January 14, 2020

A few of my language learning goals for this year revolve around a (still hypothetical) South-East Asia trip I plan to make later in the year that will, ideally, cover Vietnam, Cambodia, Laos, Burma and Thailand. Last week I posted an article about Vietnamese and its slightly tricky means of forming 2nd and 3rd person pronouns, and this week I’m inclined to write about the sudden progress I have been making with Thai in the past few weeks.


My experience learning Thai is similar to that of a few other languages, in that I first attempted to learn it and struggled enormously, and years later have returned and made effortless strides.


One of the reasons I had such a hard time with Thai initially is that I dived straight into learning it without properly briefing myself on the fundamentals. Similarly, the reason I am now progressing so rapidly is that I invested some time to really get to grips with the minute details of the language, to the point that I can now look at Thai, see past the complications, and approach it with the same clear head that I have when thinking in Arabic, Dutch or Japanese.


In the next few posts let’s dive into a few of the relatively minor issues for Thai learners and see how we can resolve them by understanding them a tad better.


Commonly cited issue #1: the alphabet has a fuckload of different ways of writing the same sound.


So this is technically true, but less so than a lot of people believe. Thai textbooks tend to avoid using the International Phonetic Alphabet to transcribe words because it overcomplicates things a bit. However, if the IPA were used, I think far fewer Thai learners would believe that there are 6 ways of writing the sound ‘k’, or 5 ways of writing ‘p’. Don’t get me wrong, there are 5 ways of writing /kʰ/, so it’s hardly a walk in the park! And the ‘p’ sounds in Thai are not one and the same: there are actually three slightly different ‘p’ sounds, one with 3 letters used to spell it and the other 2 with one each.

Screenshot (9)

The problem here is that English speakers are not raised to know what things like aspiration or voicelessness mean in linguistics.

Aspiration is when a small puff of breathy air is released alongside the consonant. For reference, the ‘t’ in the word toe in English is aspirated, whereas the ‘t’ in steal is generally not. Srsly, try it now, and pay attention to the t’s, it blew my mind the first time I realised they weren’t the same sound!

And when we say a consonant is voiced, what we mean is that the vocal cords vibrate and buzz as it’s pronounced. Bearing that in mind, pronounce the g and k of English to yourself- they’re very much the same sound, but when you say g you can feel a slight hum in your throat.

Generally speaking, in English we tend to aspirate our voiceless consonants, and voice our unaspirated consonants, and as a result it confuses our little colonizer minds (jk, none of my gene pool was on these islands 50 years ago, that’s your mess) when we come across a language that has a four-way distinction:

  • voiceless unaspirated
  • voiceless aspirated
  • voiced unaspirated
  • voiced aspirated.

Thai has the first three, but does not ever (I don’t think…?) aspirate its voiced consonants.

SO, let’s look at the alveolar consonants of Thai (alveolar means that these consonants are pronounced with your tongue pressing against the alveolar ridge, the gum ridge that houses the sockets of your upper front teeth):

  • For voiceless unaspirated we have /t/ represented by ต and ฏ
  • For voiceless aspirated we have /tʰ/ (we represent aspiration with a small superscript ‘h’) represented by ท, ธ, ฑ, ฐ, ถ and ฒ
  • And lastly for voiced unaspirated we have /d/ represented by ด and ฎ
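
The three groups above are really just a many-to-one lookup from letter to sound, which is easy to sketch in Python. The table is copied from the list above; the names `ALVEOLAR_STOPS` and `letters_by_sound` are made up for this illustration, and plain "th" stands in for the aspirated t.

```python
from collections import defaultdict

# Letter -> sound, copied from the three groups listed above.
# "th" here means aspirated t, not the English th of "think".
ALVEOLAR_STOPS = {
    "ต": "t", "ฏ": "t",
    "ท": "th", "ธ": "th", "ฑ": "th", "ฐ": "th", "ถ": "th", "ฒ": "th",
    "ด": "d", "ฎ": "d",
}


def letters_by_sound(table: dict) -> dict:
    """Invert a letter -> sound table into sound -> list of letters."""
    groups = defaultdict(list)
    for letter, sound in table.items():
        groups[sound].append(letter)
    return dict(groups)


for sound, letters in letters_by_sound(ALVEOLAR_STOPS).items():
    print(f"/{sound}/ has {len(letters)} spellings: {' '.join(letters)}")
```

Seen this way, ten letters collapse into three sounds, rather than ten ways of writing one ‘t’.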

Okurr, I’ll admit, the six different ways of writing /tʰ/ take some getting used to, and as a native English speaker I unfortunately struggle to distinguish all the time between /tʰ/ and /t/…so for me in a way there are 8 ways of writing the same sound. It’s not fun.

But there is one further feature of the Thai writing system that somewhat explains this intriguing system.

You see, the Thai writing system is a Brahmic script; it descends (by way of Old Khmer and the scripts of southern India) from the same Brahmi ancestor as the Devanagari script now used to write Classical Sanskrit.

Classical Sanskrit used even more consonant sounds than Thai, and as a result when the alphabet was adopted for the use of Thai speakers, there was a glut of consonants. For example as well as alveolar consonants, Sanskrit has retroflex consonants (if you have ever heard Urdu or Hindi spoken you may have noticed a very distinctive and hard ‘r’, ‘d’ or ‘t’ sound- these are retroflex consonants and are produced when the tongue is placed even higher than the alveolar ridge, practically against the palate of your mouth). And to add to this, Sanskrit’s retroflex consonants can also be aspirated! So a tonne of consonants representing non-extant sounds in Thai were adopted, and they were put to very good use.

They were separated into three classes based on their sound qualities (bear in mind, those are qualities in Old Thai that do not all hold true in the modern language). These classes would then, along with the following vowel, determine the tone of the syllable being written.

Screenshot (8)

This isn’t exactly simple, but I can only assume it made perfect sense at the time.

For example, if a syllable ending in a long vowel begins with a low- or middle-class consonant, then the syllable is pronounced in the level mid tone.

I’ve created a table below that shows the full extent of these rules. They don’t come naturally to me, and I cannot imagine they come naturally to Thai speakers. In fact I have been told that Thai spelling is often learnt independently of speaking, as spoken tones do not always represent the written standard (unsurprisingly).

Screenshot (12)
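
For those who think better in code, here is a rough Python sketch of the rules for syllables with no tone mark. It follows the class lists and rules as standard Thai textbooks give them, which I am assuming match the table above. The function names are hypothetical, and "live" and "dead" are the usual textbook labels for syllables that end in a long vowel or sonorant versus a short vowel or stop.

```python
# Consonant classes as given in standard Thai textbooks (an assumption here,
# since the table in the post is an image). Obsolete ฃ and ฅ are included.
MID_CLASS = set("กจฎฏดตบปอ")
HIGH_CLASS = set("ขฃฉฐถผฝศษสห")
LOW_CLASS = set("คฅฆงชซฌญฑฒณทธนพฟภมยรลวฬฮ")


def consonant_class(letter: str) -> str:
    """Return 'mid', 'high' or 'low' for a Thai initial consonant."""
    for name, members in (("mid", MID_CLASS), ("high", HIGH_CLASS), ("low", LOW_CLASS)):
        if letter in members:
            return name
    raise ValueError(f"not a Thai consonant: {letter!r}")


def unmarked_tone(initial: str, live: bool, long_vowel: bool) -> str:
    """Tone of a syllable that carries no tone mark.

    live=True  -> the syllable ends in a long vowel or a sonorant (m, n, ng, y, w)
    live=False -> the syllable ends in a short vowel or a stop ("dead" syllable)
    """
    cls = consonant_class(initial)
    if live:
        return "rising" if cls == "high" else "mid"
    if cls in ("mid", "high"):
        return "low"
    # Low-class dead syllables are the only place vowel length matters.
    return "falling" if long_vowel else "high"


# The rule quoted above: long vowel + low or middle class initial.
print(unmarked_tone("ม", live=True, long_vowel=True))
print(unmarked_tone("ก", live=True, long_vowel=True))
```

This deliberately ignores tone marks, silent leading ห and Sanskrit spellings that break the pattern, so treat it as a map of the regular cases only.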

The problem here, though, is that 1) a lot of spellings come from Sanskrit words that have been transcribed exactly, so these words often do not fit the tonal regulations in the table above, and 2) since the time Old Thai was spoken, many of the language’s old norms have changed, and as such the tone implied by the original spelling no longer reflects the vernacular reality.

So four tone marks have also made their way into the Thai script in order to tidy up the grey areas (fun trivia, the diacritics are each called ไม้, followed by a Sanskrit number, which Hindi/Urdu/Persian/Panjabi/Gujarati/Bengali speakers may recognise!)

Screenshot (11)
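
Here is a hedged Python sketch of the numbered tone marks. The Unicode character names conveniently preserve the "mai + number" naming (ek, tho, tri, chattawa). The tone each mark produces is taken from the standard textbook rules, which I am assuming agree with the table above, and `MARKED_TONE`, `mark_name` and `marked_tone` are names invented for this example.

```python
import unicodedata

# The four numbered tone marks, as Unicode combining characters.
TONE_MARKS = ["\u0e48", "\u0e49", "\u0e4a", "\u0e4b"]

# Tone produced by each mark, keyed by the class of the initial consonant.
# Standard textbook values; the third and fourth marks go with mid-class initials.
MARKED_TONE = {
    "MAI EK": {"low": "falling", "mid": "low", "high": "low"},
    "MAI THO": {"low": "high", "mid": "falling", "high": "falling"},
    "MAI TRI": {"mid": "high"},
    "MAI CHATTAWA": {"mid": "rising"},
}


def mark_name(mark: str) -> str:
    """Name of a tone mark, taken from its Unicode character name."""
    return unicodedata.name(mark).replace("THAI CHARACTER ", "")


def marked_tone(mark: str, consonant_class: str):
    """Tone for a marked syllable, or None if the combination is not used."""
    return MARKED_TONE[mark_name(mark)].get(consonant_class)


for mark in TONE_MARKS:
    print(mark_name(mark), MARKED_TONE[mark_name(mark)])
```

Note that a mark overrides whatever tone the consonant class and vowel would otherwise give, which is exactly how it tidies up the grey areas.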

So there you have it. I don’t know if I am representative of most Thai learners, although I suspect not for a bunch of reasons, chief among them being that I am hoping to eventually become fully literate in Thai, whereas the vast majority of Thai learners I’ve come across are learning in Romanised script for the sole purpose of tourism or work in Thailand. However, appreciating these little quirks and features of Thai and the history of its writing system has helped me come on in leaps and bounds in the past few weeks.

A family affair: anh/chị/ông/bà/cháu/cố là ai? — January 10, 2020

Standard story here- a few years ago I began learning x and found it really difficult so I quit. Recently I picked it back up and found that 1) it was nowhere near as hard as I remembered and 2) it was a really enjoyable language to learn, and I have since been acquiring it at a rapid pace.

On this occasion said language is Vietnamese, so this article is a little potted summary of the main stumbling block I’ve discovered in the past few days.

Pronouns are a bit of a nightmare and constitute a mountain I need to overcome so that it doesn’t continue to impede my studies.

Pronouns are not the universal phenomenon Europeans often believe them to be.

Those of you who are native English speakers may remember briefly struggling to get your heads around a two-way distinction in the second person. Because the most commonly studied language in the country is French, this phenomenon is nicknamed with French in mind, the tu-vous distinction, wherein both pronouns are used in the singular of the second person, but they differ in formality and politeness. Whether it be tu/vous, tú/Usted or du/Sie, the difference between these pronouns is pretty similar across borders.

English however only has the pronoun ‘you‘, and as a result learners of French probably struggle a tad when getting accustomed to using two different pronouns to serve the same purpose.

This struggle is a mini version of the challenge that faces Europeans when they begin learning, for example, Korean or Japanese or, heaven forfend, Thai or Vietnamese.

Vietnamese has two different ways of saying ‘I’ that are in constant use- tôi and mình.

This is a tu/vous distinction in a way, but instead of the second person pronoun you choose being different, it’s the first person one.

To your close friend you may be inclined to respond when asked how you are, “Cảm ơn chị, mình khỏe, còn chị thế nào?” whereas to your teacher the same sentence would necessarily be “Cảm ơn thầy, tôi khỏe, còn thầy thế nào?”

A greater confusion though is present in the second and third person. These pronouns are all lumped together, which is quite common for East Asian and South-East Asian languages.

In Korean for example, I may address an older male speaker using the word 형 hyeong as a second person pronoun, and in his absence I would use the same word pronominally to refer to him. In that respect, the word 형 can mean ‘you’ or ‘he’, although the real meaning of the word is ‘elder brother’.

You can see now perhaps why this way of doing things has caused me a touch of difficulty.

Vietnamese, like Korean, uses kinship terms such as ‘uncle’, ‘grandma’ or ‘child’ to address people AND to talk about them.

It is tricky because it means that in order to be ready for whatever situation may arise, you need to learn a myriad of family vocabulary items, rather than just one or two ways of saying ‘you’. It also piles on the social pressure of having to figure out people’s ages relative to you, as well as their gender. This is because many Asian languages distinguish both age and gender in their words for siblings or aunts/uncles.

I tend to play it safe- I address almost everybody in China as either 阿姨 (ayi, auntie) or 叔叔 (shushu, uncle), unless 1) they are notably elderly, in which case I go for 奶奶/爺爺 (nainai/yeye, grandma/pa) or 2) they are a professional dealing with me in the course of their business, in which case I address them using their professional title such as 師傅 (shifu, master), 經理 (jingli, manager) or 教授 (jiaoshou, professor).

If I weren’t so lazy I would bother learning the multitude of kinship terms that various kinds of people around China use to address people in the course of communication and building relationships. But I’ve done so well in China with my very limited kinship vocab that I ended up using the exact same tactic when speaking Korean and Japanese, and now am inclined to adopt a similar approach in Vietnamese.

Cô – aunt
Bà – grandma
Chị – sister
Anh – brother
Ông – grandpa

These are the most common terms in Vietnamese used as pronouns. The age range seems to be pulled down by about 20 or 30 years compared to Chinese, which I say because I have encountered many, many shop owners addressed as ông or bà (grandpa, ma) when the same people in China would be addressed as 阿姨 or 叔叔 (aunt, uncle). Regardless though, the logic is very much the same.

These five terms are the ones I have decided to take on board myself, however Vietnamese people have around 25 of these terms on the go in daily use, and I have decided that for the time being a receptive knowledge of these will suffice.

These second person pronouns (I am not sure if pronoun is even a correct description for these, as even though they do stand in for a noun, they are nouns themselves) are turned into third person pronouns by pairing them with a demonstrative.

In Vietnamese, we add ấy after a noun in order to indicate the demonstrative ‘that’.

Đại học = university
Trường đại học ấy = that university


Chị = you
Chị ấy = she
Ông = you
Ông ấy = he
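
The pattern is regular enough to write as a one-line rule. Here is a tiny Python sketch, using the five kinship terms from the list further up with their glosses. The names `KINSHIP_TERMS` and `third_person` are invented for the example, and the spelling cô for the ‘aunt’ term is my assumption.

```python
# Kinship terms used as "you", with the glosses given earlier in the post.
# The spelling "cô" for 'aunt' is an assumption made for this sketch.
KINSHIP_TERMS = {
    "cô": "aunt",
    "bà": "grandma",
    "chị": "sister",
    "anh": "brother",
    "ông": "grandpa",
}

DEMONSTRATIVE = "ấy"  # 'that', placed after the noun


def third_person(term: str) -> str:
    """Turn a kinship term used as 'you' into its 'he/she' form."""
    return f"{term} {DEMONSTRATIVE}"


for term, gloss in KINSHIP_TERMS.items():
    print(f"{term} ({gloss}) = you; {third_person(term)} = he/she")
```

Which English pronoun you end up with just follows from the gender built into the kinship term, so the rule itself never has to mention gender at all.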

Third person pronouns are omitted so often in Japanese that I can’t remember the last time I had to use one, including in over six months spent working in a solely Japanese-speaking office. However on some occasions, primarily for emphasis, some kind of subject is needed and in those circumstances one uses あの人 (ano hito – that person), and from there it isn’t such a huge logical leap to land at the pronouns cô ấy, anh ấy, ông ấy, bà ấy etc.

Gnarly Bengali- Chapter I — December 4, 2019

So, I’ve been here before. I try to learn a South-Asian language and really struggle with a few key aspects, and then summarily give up. This happened to me on a couple of occasions when trying to learn Hindi, but finally one day the mental block began to dissipate; the issues that were stopping me from moving forward – difficulty distinguishing between retroflex consonants and their dental equivalents for example, or remembering what sometimes feels like thousands of conjunct consonants- just ceased to bother me, and as a result I learnt Hindi rapidly.

I have no clue why this didn’t happen the first time I tried to learn Bengali. The writing system is similar, the phonological system is actually simpler than that of Hindi-Urdu, and the conjunct characters form in a similar manner. Whatever the issue was, it went to shit.

But not this time honey, oh no. Daddy is absolutely burning through his Bengali textbooks and all of a sudden is not just reciting dialogues between a fruit-seller and fruit-buyer, oh no, he is reading literature written in shadhu bhasha, the prestige register of formal written Bengali.

Imma give you a rundown of what’s going on in Bengali so you can get a taste of how it works, and maybe also get a taste of how it is now tied in first place with Russian as my favourite language.

Today, the writing system- Bengali uses a Brahmic script, inherited from the tradition of writing Sanskrit, that divides the alphabet into groups of consonants based on phonological principles.

For example, there is a group of labial consonants. In this category we have a nasal (m ম), an unvoiced stop (p প), an unvoiced aspirated stop (ph ফ), a voiced stop (b ব) and a voiced aspirated stop (bh ভ). Each set of consonants has one of each! It’s a very cool system and a testament to just how into phonetics the ancient Vedic grammarians were.

bengali orthography
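
As a side note for the computer-minded: the Unicode block for Bengali lists the stops in this same traditional order, so the whole grid can be rebuilt from code points. This is a minimal Python sketch under that assumption. The row labels and the names `COLUMNS`, `ROW_STARTS` and `varga_grid` are mine, and the nasal sits last in each row, as in the traditional ordering.

```python
import unicodedata

# Column order within each row of the traditional grid.
COLUMNS = ["unvoiced", "unvoiced aspirated", "voiced", "voiced aspirated", "nasal"]

# Code point of the first letter in each row of the Bengali block.
ROW_STARTS = {
    "velar": 0x0995,
    "palatal": 0x099A,
    "retroflex": 0x099F,
    "dental": 0x09A4,
    "labial": 0x09AA,  # U+09A9 is unassigned, so this row starts one later
}


def varga_grid() -> dict:
    """Build {row: {column: letter}} from consecutive code points."""
    return {
        row: {col: chr(start + i) for i, col in enumerate(COLUMNS)}
        for row, start in ROW_STARTS.items()
    }


for row, cells in varga_grid().items():
    print(f"{row:10}", " ".join(cells.values()))

# The labial row, spelled out by Unicode name.
for col, letter in varga_grid()["labial"].items():
    print(col, letter, unicodedata.name(letter))
```

Five rows by five columns: every place of articulation gets one of each kind of stop plus a nasal, which is the whole point of the system.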

Sadly, not many language texts teach learners to acquire the alphabet this way, which I do think would pay dividends. It’s a bit tricky and involves learning a bit about phonetics theory, but I do think that I have a much easier time learning South-Asian languages because I have been through this little process.

One other feature of Bengali that it shares with neighbouring Indo-Aryan languages to the west is that its script is an abugida. That means that each consonant letter has an inherent vowel sound attached to it. In Hindi for example, the letter क is not simply k, but rather ka; in Gujarati જ is not just j, but ja.

Bengali bucks this trend though, and instead of a, the inherent vowel is a soft o. Now this may sound like a small difference, but for someone who is coming to Bengali having learned Punjabi, Hindi, Gujarati and Nepali- Indo-Aryan languages which all have an inherent a in their consonants- it took me a very long time to internalise that inherent o.

Bengali actually has a few other peculiarities that I took my time in overcoming. For example, Sanskrit distinguishes between three sibilant fricatives: स (dental), श (palatal) and ष (retroflex). In Bengali these three consonants still exist as স, শ and ষ, BUT they are all pronounced the same, palatally, like श in Hindi, and sh in sherbet. This of course in reality makes things much easier, but it was still an annoying habit that I had to purge from myself.

One other problem of this nature, which serves as a great jumping off point to start discussing conjunct characters, is the fact that one of the most common conjuncts in Sanskrit, ksh क्ष, which has become ক্ষ in Bengali, is pronounced simply as kh. Another pretty common consonant cluster is ক্ষ্ম which should in theory be pronounced kshm, as in Lakshmi, but in Bengali the sh is absorbed by the k, which becomes geminated, and the m disappears altogether. This, along with the fact that Bengali’s inherent vowel is o rather than a, means that the Hindu deity called Lakshmi in Sanskrit is called Lokkhi in Bengali (লক্ষ্মী).

Now the conjunct characters- the aforementioned system whereby each consonant has an inherent vowel attached to it means that, by and large, when two consonants appear together with no vowel separating them, a special reduced form of one of the letters needs to be drawn, fusing it with the other consonant. Conjuncts can contain up to five consonants in Sanskrit, but the most common ones in Bengali have just two, with a minority of triples and quadruples existing, particularly in tatsama (words integrated into Bengali from Classical Sanskrit).
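
On a computer the fusing is left to the font. In Unicode text a conjunct is stored as the plain consonants with a vowel-killing sign (the hasanta, U+09CD) between them, and the rendering engine picks the fused glyph. Here is a minimal Python sketch under that assumption; the helper names `conjunct` and `consonants_of` are made up for illustration.

```python
import unicodedata

HASANTA = "\u09cd"  # suppresses the inherent vowel of the preceding consonant


def conjunct(*consonants: str) -> str:
    """Join consonants into one conjunct by putting a hasanta between them."""
    return HASANTA.join(consonants)


def consonants_of(cluster: str) -> list:
    """Split a conjunct back into its underlying consonants."""
    return cluster.split(HASANTA)


double_d = conjunct("\u09a6", "\u09a6")           # d + d
kshm = conjunct("\u0995", "\u09b7", "\u09ae")     # k + sh + m
lakshmi = "\u09b2" + kshm + "\u09c0"              # l + kshm + long i sign

print(double_d, [unicodedata.name(ch) for ch in double_d])
print(kshm, consonants_of(kshm))
print(lakshmi, len(lakshmi), "code points")
```

So however elaborate the glyph looks on the page, underneath it is still an ordinary sequence of letters, which makes unfamiliar conjuncts much easier to look up.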

Doubled consonants are common in many languages, and they appear in Bengali too. Let’s take a look at how different shapes of consonant meld together to form geminates (doubles):

bengali 2

First we see the letter d দ, which when followed immediately by another দ with no vowel in between tags the second দ right onto the end, becoming দ্দ! I am not a fan of the letter দ as I feel it lacks the cursive elegance of other Bengali letters, but I can’t deny my fondness of the double দ্দ.

The letter ম (mo) has a miniature form which attaches to the second consonant in a conjunct. This is the case not just with double m (ম্ম), but also when ম m precedes other letters, like in ম্প (mp), ম্ব (mb), ম্ল (ml) or ম্ভ (mbh).

Some letters, among them l ল and n ন, attach two slightly miniaturised versions of the letter to the same stem in order to indicate gemination: ল্ল, ন্ন. Other letters which do this are k ক, g গ, and p প, which become ক্ক, গ্গ and প্প.

Lastly we see two more unusual and unpredictable conjuncts. When the letter t ত is doubled, the two attach to one another and one of them flips to form the conjunct ত্ত. Equally unpredictable is what happens when the retroflex ট is doubled: all that appears to change is that a small tail is added to the single letter, giving ট্ট.


Now let’s look at the other conjuncts, which don’t involve doubling.

The letter b ব has a rather cute mini-form that can be stuck onto the side of the second letter in the conjunct like an adorable little barnacle: ব্জ (bj), ব্দ (bd), ব্ধ (bdh).

The letter r র, as it is one of the most common letters to appear in a conjunct in all Indo-Aryan languages, has a special curtailed form. When it is the first letter, it appears as a diagonal dash over the top of the second letter, as we see when r র precedes sh ষ: র্ষ.

That letter sh ষ also takes on a rather different form, almost mutating back to the Sanskrit letter ष from which it originates when it forms conjuncts: ষ্ট (shṭ), ষ্ঠ (shṭh), ষ্ফ (shph), ষ্ক (shk).

The letter y য় is also relatively common to see in a reduced form, which is a wavy line falling from the top bar to the bottom, and its inclusion in a conjunct often has an impact on the subsequent vowel. Its form is always the same when it is the second element in a conjunct: ত্য (ty), চ্য (cy), প্য (py).

I included the conjunct ঞ্ছ just to show that sometimes the conjunct form does not necessarily follow any logic, but simply consists of mushing together the two or more letters to form a recognisable glyph.

beng 3

Lastly this list includes some triple conjuncts just to show how complex they can end up. There are relatively few of these, and they are not common, so learning them is not a huge ball-ache. Note the small wiggle attached to the bottom of a letter when r র is the second letter in the conjunct, like in ন্দ্র (ndr).