Upper and lowercase relationship for non-Western glyphs

Igor Freiberger's picture

There is no need to include in a font any instruction about upper and lowercase for the usual A–Z/a–z characters.

But what about these uncommon glyphs for African languages, phonetic support and transliterations? Must I include some OT instruction to say that uni0195 is the lowercase of uni01F6? Or is this known by the client application thanks to its Unicode support?

A side question: if the relationship between upper and lowercase is defined by Unicode, how do I handle pairs of glyphs with no code point? Say, Gtilde and gtilde, or Eacutedotbelow and eacutedotbelow. As they are outside the Unicode specification, will the client program or OS understand their upper/lowercase relation?

This question does not refer to small caps/all caps/petite caps control. I know these are handled by the smcp, case, c2sc and pcap features (or, for petite caps, a Stylistic Set, as pcap is still not widely supported).

Theunis de Jong's picture

Must I include some OT instruction to say that uni0195 is the lowercase of uni01F6? Or is this known by the client application thanks to its Unicode support?

Yes -- I mean, yes, the client application has to know what's what. InDesign, for example, got updated to the most recent Unicode tables somewhere between CS and CS4. In the older version, using a trick like "All Caps" (as text attribute) or "Change case to Uppercase" didn't work for all characters one would expect. In CS4, at least, it does.
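To illustrate what "the client application has to know" means in practice, here is a minimal sketch in Python, whose string methods use the Unicode case-mapping data bundled with the interpreter. It only shows that the mapping for this pair lives in the software's Unicode tables, not in the font; how InDesign implements it internally is not something this example claims to reproduce.

```python
import unicodedata

# The pair from the question: uni0195 (lowercase) and uni01F6 (uppercase).
lower_hv = "\u0195"
upper_hwair = "\u01F6"

# Character names come from the Unicode database shipped with Python.
print(unicodedata.name(lower_hv))
print(unicodedata.name(upper_hwair))

# Case conversion is done by the software's Unicode tables; no font is involved.
uppercased = lower_hv.upper()
lowercased = upper_hwair.lower()

print(uppercased == upper_hwair)
print(lowercased == lower_hv)
```

An application built on older Unicode tables could leave such a character unchanged, which matches the behaviour described for the older InDesign version.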

A side question: if the relationship between upper and lowercase is defined by Unicode, how do I handle pairs of glyphs with no code point? Say, Gtilde and gtilde, or Eacutedotbelow and eacutedotbelow. As they are outside the Unicode specification, will the client program or OS understand their upper/lowercase relation?

Nope. If you go over a glyph list in Unicode order, you will see there is no apparent relation between the code points for lowercase and uppercase. As an aside, since the case mappings are defined in the Unicode tables, un-encoded (private) characters cannot have such a relation with each other, by definition.
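A small Python sketch of both points, using the interpreter's built-in Unicode data. The helper name `case_offset` and the choice of sample characters are illustrative assumptions, and U+E000 simply stands in for any Private Use Area code point.

```python
def case_offset(lower: str) -> int:
    """Return the code point distance between a lowercase letter and its uppercase."""
    upper = lower.upper()
    return ord(lower) - ord(upper)

# Three pairs, three different distances: the mapping is a table lookup,
# not an arithmetic rule.
offsets = {ch: case_offset(ch) for ch in ("a", "\u0101", "\u0195")}
for ch, offset in offsets.items():
    print(f"U+{ord(ch):04X}: offset {offset}")

# A Private Use Area code point has no case mapping in the Unicode tables,
# so case conversion leaves it unchanged.
pua = "\uE000"
pua_has_case_mapping = pua.upper() != pua or pua.lower() != pua
print(pua_has_case_mapping)
```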

I think "All Caps" (the text attribute) may be using the 'case' OpenType feature when present, because it also converts regular accents to 'uppercase' accents, which are definitely *not* in the Unicode list.

... Okay, just tested it with Minion Pro: the acute accent gets translated to an uppercase variant when using "All Caps" (the 'case' case), but not when using a manual "Change to Uppercase".

Igor Freiberger's picture

Thanks, Theunis. But how can I associate two unencoded glyphs through OT code? As you pointed out, "All Caps" commands will work thanks to OT classes and features. But simple conversions to/from UC/lc will not.

Mark Simonson's picture

OpenType features do not depend on Unicode. In FontLab (and the AFDKO), feature code is based on glyph names. (I'm sure this is true in VOLT, too, but I've never used it.) Ultimately, in the actual font, after it's been compiled, it's based on the glyph index. If we couldn't reference unencoded glyphs in OT code, we'd be in big trouble.

Igor Freiberger's picture

Thanks, Mark. But how do I handle plain upper and lower case using OT features? This does not fit smcp, c2sc or pcap. Am I missing something about the case feature?

Mark Simonson's picture

Well, the case feature is intended for something called "case-sensitive forms". The most common use is for punctuation (such as hyphens and bullets), which is normally centered on the x-height. When the case feature is enabled, such as when "change case to all caps" is invoked, the punctuation is centered on the caps instead (usually using glyph substitution). It certainly seems possible to use this to substitute unencoded uppercase glyphs for unencoded lowercase glyphs, but you may need to educate users to select "change case to all caps" for the proper result. Some common contexts (such as the caps-lock key and the shift key) will not, unfortunately, invoke the case feature.
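As a rough sketch of that idea, the Python snippet below writes out AFDKO-style feature code that substitutes the unencoded lowercase glyphs with their uppercase partners inside the case feature. The glyph names are the ones from the question; the helper name `build_case_feature` is hypothetical, and whether a given application actually applies case for "All Caps" is an assumption that has to be tested per application.

```python
def build_case_feature(pairs):
    """Build AFDKO-style feature code mapping lowercase glyph names to uppercase ones."""
    lines = ["feature case {"]
    for lower_name, upper_name in pairs:
        # Substitutions reference glyph names, so unencoded glyphs work too.
        lines.append(f"    sub {lower_name} by {upper_name};")
    lines.append("} case;")
    return "\n".join(lines)

# Unencoded pairs named in the question.
pairs = [
    ("gtilde", "Gtilde"),
    ("eacutedotbelow", "Eacutedotbelow"),
]

fea_source = build_case_feature(pairs)
print(fea_source)
```

The printed text can be pasted into a font editor's feature panel. Note that this only swaps the displayed glyphs while the feature is active; the underlying text is not converted to uppercase.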
