Cyrillic-Using Countries - Plain or Extended?

hrant's picture

I have a client who has asked me to figure out which countries in a list use "plain" Cyrillic and which use an extended alphabet/encoding. I told the client I'd need help to do that reliably, so here I am. Below is the list, showing what I know or think (the latter marked with a question mark), where P = plain and E = extended. (Note that some of these countries have a completely non-Cyrillic primary alphabet, but still use some flavor of Cyrillic as a secondary system.)

Armenia: P

Azerbaijan: E?

Belarus: P

Georgia: P

Kazakhstan: E

Kyrgyzstan: E

Moldova: P?

Russia: P

Tajikistan: E?

Turkmenistan: E?

Ukraine: P

Uzbekistan: E?

Many thanks for any confirmations/corrections.

hhp

nina's picture

I'm no expert (just somebody with a couple of Ukrainian friends), but I believe Ukrainian (the language) is written with a localized variant of the Cyrillic script. They have tittles. :-)


[Screenshotted from Wikipedia, which also says Ukrainian is sometimes Romanized – don't know if that matters in your context.]

nina's picture

Argh, editing doesn't work. Just wanted to add – if you're only going to cater to Russian-speaking Ukrainians, what I said before is obviously a moot point.

Jongseong's picture

This depends on what is meant by "plain Cyrillic", which is a nebulous concept considering the history of the script and various reforms that took place for different languages. (Trying to define "plain Latin" runs into similar problems, believe it or not.)

For simplicity's sake I'll consider the letters encoded in ISO 8859-5, which never caught on but whose Cyrillic letters form the basic Cyrillic block of Unicode, to be "plain Cyrillic". Then Bulgarian, Belarusian, and Russian are covered by plain Cyrillic.

Here are the rest, and the letters not supported by ISO 8859-5.

Azerbaijani: ғ, ә, ҝ, ө, ү, һ, ҹ
Kazakh: ә, ғ, қ, ң, ө, ұ, ү, һ
Kyrgyz: ң, ө, ү
Moldovan: ӂ
Tajik: ғ, ӣ, қ, ӯ, ҳ, ҷ
Turkmen: ә, җ, ң, ө, ү
Ukrainian: ґ
Uzbek: қ, ғ, ҳ
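
A list like this can be checked mechanically. Here is a minimal Python sketch, assuming Python's built-in `iso8859_5` codec is an acceptable stand-in for the standard itself; the function name `missing_from` and the sample strings are mine, made up for illustration, and only lowercase letters are tested.

```python
def missing_from(letters, codec="iso8859_5"):
    """Return the characters in `letters` that `codec` cannot encode.

    Order of first appearance is kept and duplicates are dropped.
    """
    missing = []
    for ch in letters:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# Illustrative samples only: letters each language adds to the Russian set.
extras = {
    "Belarusian": "іў",
    "Ukrainian": "ґєії",
    "Tajik": "ғӣқӯҳҷ",
}

for language, letters in extras.items():
    print(language, missing_from(letters) or "fully covered")
```

Passing a different codec name as the second argument runs the same check against another definition of "plain".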

I haven't been able to find much data on Cyrillic as used for Armenian and Georgian.

Caveats: I've listed the above in terms of languages, but of course in reality Russian dominates Belarus and some parts of Ukraine, Russia also hosts a lot of minority languages that use their own extended versions of Cyrillic, etc. Also, pre-reform Russian or Bulgarian had letters that wouldn't be considered "plain Cyrillic".

ETA: It's interesting how Unicode organizes Cyrillic characters. Within the Cyrillic block, the code chart has a range headed "Basic Russian alphabet" and ranges headed "Cyrillic extensions"; these together roughly make up what I've called "plain Cyrillic" above. If we restrict ourselves to the "Basic Russian alphabet", then even Russian has a letter that is not supported, ё, which, to be fair, can be replaced with е in most circumstances. Belarusian ў is also not supported. Bulgarian ѝ, while not considered part of the alphabet, is used in the orthography, and also is not supported in the "Basic Russian alphabet".
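
To see which range a given letter falls into, something like the following would do. It is only a sketch: the code point boundaries are how I read the headings in the Unicode code chart for the Cyrillic block (U+0410 to U+044F for the "Basic Russian alphabet", U+0400 to U+040F and U+0450 to U+045F for "Cyrillic extensions"), and the function name `cyrillic_range` is made up.

```python
def cyrillic_range(ch):
    """Name the part of the Unicode Cyrillic block that `ch` falls in."""
    cp = ord(ch)
    if 0x0410 <= cp <= 0x044F:
        return "Basic Russian alphabet"
    if 0x0400 <= cp <= 0x040F or 0x0450 <= cp <= 0x045F:
        return "Cyrillic extensions"
    if 0x0460 <= cp <= 0x04FF:
        return "rest of the Cyrillic block"
    return "outside the Cyrillic block"


for letter in "жёўѝґ":
    print(letter, f"U+{ord(letter):04X}", cyrillic_range(letter))
```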

Maxim Zhukov's picture

I find this Wikipedia listing useful. The Ethnologue collection of links is not bad either. I don’t know how reliable or up-to-date all that data is, though. Also, note that that information is on Cyrillic-using languages, not countries.

Maxim Zhukov's picture
  • This depends on what is meant by “plain Cyrillic”

I take it, that is the character set covered by the ‘standard’ Cyrillic (Windows 1251 / Mac Cyrillic 10007) codepage. It supports the typesetting of many Slavic languages: Belarusian, Bulgarian, Macedonian, Russian, Rusyn, Serbian, and Ukrainian. Maybe more.
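
For anyone who wants to compare these definitions of "plain" side by side, here is a rough Python sketch. It assumes that Python's bundled `iso8859_5`, `cp1251` and `mac_cyrillic` codecs match the codepages in question; the helper names and the sample letters are illustrative, not taken from any standard.

```python
CODECS = ("iso8859_5", "cp1251", "mac_cyrillic")


def is_covered(letters, codec):
    """True if every character in `letters` can be encoded with `codec`."""
    try:
        letters.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# Illustrative samples: letters beyond the Russian alphabet, lowercase only.
SAMPLES = {
    "Ukrainian": "ґєії",
    "Macedonian": "ѓѕјљњќџ",
    "Kyrgyz": "ңөү",
}


def coverage_table(samples, codecs=CODECS):
    """Map each language to {codec: covered?} for the given samples."""
    return {
        language: {codec: is_covered(letters, codec) for codec in codecs}
        for language, letters in samples.items()
    }


for language, row in coverage_table(SAMPLES).items():
    print(language, row)
```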

Si_Daniels's picture

The Paratype catalog (Fontbook format) has a nice reference section in the back that lists these out.

John Hudson's picture

I usually make a distinction between Slavic Cyrillic ('plain') and non-Slavic Cyrillic ('extended'). This seems to work quite nicely because, so far as I recall, all the non-Slavic orthographies require at least one letter that is not found in any Slavic orthographies. This, in turn, allows you to quickly identify the two sets based on linguistic group.

John Hudson's picture

PS. It is better to base the differences on language rather than country, since languages tend to spill across borders.

Jos Buivenga's picture

tracking

hrant's picture

Just wanted to say thank-you for everybody's help.
It's mostly clear now, although it's difficult to keep
one's head wrapped around all of it!

--

> It is better to base the differences on language rather
> than country, since languages tend to spill across borders.

Academically, yes. But clients tend to have geographic
domains; they sell their products based on agreements
with individual countries, so fonts have to work for
for all (or virtually all) writing in a given country.

hhp

apankrat's picture

You may want to add Mongolia to the list.

John Hudson's picture

Hrant: clients tend to have geographic domains; they sell their products based on agreements with individual countries, so fonts have to work for all (or virtually all) writing in a given country.

Right, but a mapping of characters -> countries is actually a mapping of characters -> languages -> countries, even if you hide the middle step from the client.
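
As a rough illustration of that two-step mapping, here is a small Python sketch. The data is a deliberately tiny, hypothetical sample: the extra letters come from the lists earlier in this thread, the country-to-language assignments are assumptions for the sake of the example, and the names `LANGUAGE_EXTRAS`, `COUNTRY_LANGUAGES` and `extras_for_country` are made up.

```python
# Step 1: language -> letters beyond the basic Cyrillic set (sample data).
LANGUAGE_EXTRAS = {
    "Russian": set(),
    "Ukrainian": {"ґ"},
    "Kyrgyz": {"ң", "ө", "ү"},
}

# Step 2: country -> languages written there (illustrative, not exhaustive).
COUNTRY_LANGUAGES = {
    "Russia": ["Russian"],
    "Ukraine": ["Ukrainian", "Russian"],
    "Kyrgyzstan": ["Kyrgyz", "Russian"],
}


def extras_for_country(country):
    """Union of the extra letters of every language listed for `country`."""
    letters = set()
    for language in COUNTRY_LANGUAGES[country]:
        letters |= LANGUAGE_EXTRAS[language]
    return letters


for country in COUNTRY_LANGUAGES:
    extras = extras_for_country(country)
    label = "extended" if extras else "plain"
    print(country, label, sorted(extras))
```

The client only ever sees the per-country result, but adding a minority language to a country's list changes the answer without touching the character data.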
