Unicode Capitalization and Small Capitals

JCSalomon's picture

How do people deal with letters that have no upper-case form, or whose upper-case form is composed of multiple glyphs? Two cases I’ve seen on this board are the Greenlandic Kra (ĸ maps to K‘) and the German Eszett (ß usually maps to SS, but a true capital ß has been proposed; see the Wikipedia entry and the 2004 Unicode proposal and its 2007 resubmission).

Where should software handle these distinctions? Are small capitals graphical variations of the minuscule letters or of the majuscule letters? And, accordingly, where should the capitalization happen: in the typesetting program or in the font?
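To make the two problem cases concrete, here is a minimal sketch using Python 3's built-in `str.upper()`, which applies the default Unicode case mappings. The helper name `describe_upper` is made up for this illustration; the sketch only reports what the default mapping does and does not settle where the mapping ought to live.

```python
import unicodedata

def describe_upper(ch: str) -> str:
    """Classify what the default Unicode uppercase mapping does to one character."""
    upper = ch.upper()
    if upper == ch:
        # Nothing changed: the character has no uppercase mapping.
        return "no uppercase mapping"
    if len(upper) > len(ch):
        # One character turned into several (e.g. sharp s, fi ligature).
        return f"expands to {upper!r}"
    return f"maps to {upper!r}"

# Kra, sharp s, and the fi ligature, written as escapes to avoid encoding issues.
for ch in ("\u0138", "\u00df", "\ufb01"):
    print(unicodedata.name(ch), "->", describe_upper(ch))
```

Note that the default mapping leaves Kra untouched; a replacement such as K‘ would have to come from language-specific tailoring in the application.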

dan_reynolds's picture

Many Linotype OpenType fonts that include small caps also include small-cap glyphs for ß, fi, fl, and ij. In the case of ß, the glyph is two small-cap S forms (SS). The same goes for the ligatures (FI, FL, IJ).

I have heard that, even if a cap ß comes into use, the recommended capitalization will still be SS. For one thing, you don't want pre-existing documents to reflow! Old documents would have to be edited to replace the SS with the capital ß, and new documents would need the capital ß entered explicitly; that is, selecting text containing an ß and capitalizing it will probably still swap that ß for SS. I don't know yet how the user will actually type the capital ß. There is always the glyph palette…
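The asymmetry described here can be sketched in Python 3. This assumes a Unicode database newer than this thread, one in which the capital ß is encoded as U+1E9E (LATIN CAPITAL LETTER SHARP S); the variable names are only illustrative.

```python
CAPITAL_SHARP_S = "\u1e9e"  # assumption: the runtime's Unicode data includes this character

word = "Straße"

# Default capitalization: ß expands to SS, so the text gets one character longer.
shouted = word.upper()

# The capital sharp s has to be entered explicitly; upper() never produces it.
explicit = "STRA" + CAPITAL_SHARP_S + "E"

print(shouted, len(word), len(shouted))
print(explicit, explicit.lower())

# Lowercasing the SS form does not restore the original spelling.
print(shouted.lower() == word.lower())
```

In other words, the capital form lowercases back to ß, but the reverse mapping still goes to SS, so existing text keeps capitalizing the way it always did.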

eigi's picture

For small caps you don't need capitalization. In OpenType fonts you have two different features: one to turn lowercase letters into small caps [smcp] and one to turn uppercase letters into small caps [c2sc]. You may also have two different sets of small-cap glyphs in an OpenType font, one to represent the lowercase as small caps, another for the uppercase.

JCSalomon's picture

The trouble (as I saw it) with small-caps de-ligaturization, where the capital or small-caps form is not a ligature (as in ß ↔ SS, fi ↔ FI, &c.), is that the letter spacing becomes hard to adjust. I wasn't aware of the [c2sc] feature in OpenType, though; the standard Unicode capitalization routines followed by [c2sc] are what I was hoping existed. Thanks.

Thomas Phinney's picture

Some engines do it exactly the way JC describes. This is probably the most reliable approach overall.
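As a rough sketch of that approach (Unicode capitalization first, then [c2sc] on the result), something like the following could sit in front of a shaping engine. `ShapingRun` and both function names are hypothetical, invented for this illustration; they are not the API of any real layout engine, and a real one would hand the text and feature tags on to its OpenType shaper.

```python
from typing import NamedTuple, Tuple

class ShapingRun(NamedTuple):
    """Hypothetical container: text plus the OpenType feature tags to enable."""
    text: str
    features: Tuple[str, ...]

def all_small_caps(text: str) -> ShapingRun:
    # Step 1: Unicode full case mapping (ß becomes SS, the fi ligature becomes FI).
    # Step 2: ask the font to render the resulting capitals as small caps.
    return ShapingRun(text.upper(), ("c2sc",))

def lowercase_as_small_caps(text: str) -> ShapingRun:
    # Alternative: leave the text alone and let the font map lowercase glyphs.
    return ShapingRun(text, ("smcp",))

print(all_small_caps("Straße"))
print(lowercase_as_small_caps("Straße"))
```

With the first function the font only ever sees ordinary capitals, so it needs no special small-cap glyph for ß or for the ligatures; with the second, the font itself has to supply them.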


