Vertical Metrics with an eye to an expanding charset - what are the extremes?

Richard Fink's picture

When you set up Vertical Metrics for a font with the Windows 1252 charset, let's say, life is fairly easy.
The outer limits of descenders and ascenders and diacritics above and below are readily seen.
However, if the font is expanded in the future there may be glyphs that extend much higher and/or lower than anything in that first group of glyphs. In fact, it's a certainty.


In light of the vast experience of some of the folks here at Typophile, I was wondering if someone could share what glyphs are the tallest and what glyphs are the lowest (and the languages to which they belong) once you venture past, say, the Adobe Latin 2 set or Windows 1252.

Any advice on what glyphs to use for the upper and lower outer limits?

All thoughts appreciated.


John Hudson's picture

Are you thinking just in terms of Latin script extensions, or extensions to other scripts?

Richard Fink's picture

I saw that question coming but didn't have a good answer so I just waited for someone to pose it.
And there it is.

Latin script extensions would definitely be included. After all, if the glyph set was expanded to include glyphs from Adobe Latin 3, which is a kind of "next step" I suppose, then that's a good part of what we'd encounter, correct?
From what I've seen, though, the next big step in enlarging a Latin-based font for other language communities is usually adding Cyrillic chars. Do you agree?
If so, let's include Cyrillic.
On the flip side of the globe, let's NOT include Asian glyphs.
(And as an aside - I know Hebrew well enough to know that it would fit in within the upper and lower limits of a font with an existing Latin 3 set. Would that also be the case with a similar language like Arabic? Huge language community, Arabic. It would be nice to know it could be added without recalculating the verticals.)

Still thinking out loud - where would Vietnamese with its stacked diacritics fit into this question?

What I'm looking to avoid most of all, is this:
An encoding like Win 1252 for a particular font has glyphs which call for a relatively narrow range between the lowest descender and the cap height. And so you set your vertical metrics based on that.
A week later, you get a call to expand to Latin 3 and to cover Vietnamese while you're at it and who knows what other language and there go your original vertical metrics right out the window.
Better to calculate enough space to accommodate these additional glyphs should they be added.
Charles Kettering famously said, "A problem well-stated is a problem half-solved" so here's my problem statement:

** I'm looking for a formula for calculating vertical metrics sufficient to accommodate other glyph sets that are most likely to be added later to the font file and/or requested in the data stream - whether a part of the same font file or not - to be delivered to the web page along with the core glyphs. **
That's what I'm after.

(Here's an actual real-life example of the problem:
  If somebody runs a font through the Font Squirrel Generator and asks for a small subset, the Generator will calculate the vertical metrics based on the glyphs in that subset. However, if you ask it for a larger subset, it's quite likely the font will come back with expanded vertical metrics because taller glyphs are part of the larger subset. I've seen this happen and, needless to say, it can screw up the line-height consistency within a font family pretty nicely if you don't catch it. All of a sudden descenders start getting clipped, which never happened before.)
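
A minimal Python sketch of that effect, assuming a tool that simply takes the highest and lowest points of whatever glyphs survive the subset. That is an assumption about the behavior, not Font Squirrel's documented algorithm. The glyph bounds and the helper name `metrics_for` are hypothetical.

```python
# Hypothetical glyph bounds in font units as (yMin, yMax), 1000 units per em.
# The numbers are invented for illustration, not measured from a real font.
GLYPH_BOUNDS = {
    "H": (0, 700),
    "x": (0, 500),
    "g": (-210, 510),
    "Aacute": (0, 920),
    "Aringacute": (0, 1130),
}


def metrics_for(subset):
    """Return (ascent, descent) as the highest yMax and lowest yMin in the subset."""
    ascent = max(GLYPH_BOUNDS[name][1] for name in subset)
    descent = min(GLYPH_BOUNDS[name][0] for name in subset)
    return ascent, descent


small = metrics_for(["H", "x", "g"])
large = metrics_for(["H", "x", "g", "Aacute", "Aringacute"])

print(small)  # (700, -210)
print(large)  # (1130, -210)
```

Same font, same descent, but the ascent jumps from 700 to 1130 units once the accented capitals are included, which is the kind of drift that breaks line-height consistency across a family.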

Now, since it's you who's answered this phone when it rang and your knowledge is vast, you got a recommendation for me?
As YOU begin a font with the basic ASCII charset, do YOU have a formula that leaves enough room vertically for them long and tall glyphs coming down the road?

Eager to hear your thoughts.


Richard Fink's picture

Perusing Windows system fonts using MainType and reading up on the Windows Glyph List - as far as its supposed Pan-European language coverage - Capital Letter A With Ring Above And Acute from the Latin Extended-B set is the tallest glyph I see.
Below the baseline - it seems like the lc 'j' or 'g' dips as far down as anything else.
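
That kind of survey can be sketched in a few lines of Python. This is only an illustration: the bounds below are made-up numbers, the helper name `extremes` is hypothetical, and a real check would read the bounding boxes from the font file itself.

```python
# Hypothetical glyph bounds in font units as (yMin, yMax).
# Invented values for illustration; a real survey would read them from the font.
GLYPH_BOUNDS = {
    "H": (0, 700),
    "j": (-215, 690),
    "g": (-210, 510),
    "Acircumflexacute": (0, 1010),
    "Aringacute": (0, 1130),
}


def extremes(bounds):
    """Return the names of the glyph with the highest yMax and the lowest yMin."""
    tallest = max(bounds, key=lambda name: bounds[name][1])
    lowest = min(bounds, key=lambda name: bounds[name][0])
    return tallest, lowest


tallest, lowest = extremes(GLYPH_BOUNDS)
print(tallest, lowest)  # Aringacute j
```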

John Hudson's picture

This whole issue is why I've been campaigning for script- and language-specific vertical metrics in fonts for many years.

Vietnamese is the commonest case for stacked Latin diacritic marks. The Danish Ǻ is usually taller, but is very rare in actual use, being mostly limited to use in grammars, dictionaries, and discussions like this one. There's not a lot of use of subscript diacritics in Latin orthographies, and when they do occur they tend not to sit lower than the descenders. This is also true for Cyrillic and Greek.

When you get beyond European scripts, however, you'll find a lot of use of descender space. To be properly proportioned, Arabic needs deeper descender space than Latin, plus extra if text is going to be vocalised. Many Indian and Southeast Asian scripts vertically stack conjuncts downward, and can require massive descender space. Javanese, for example, is an extreme case.

Trouble is, if you start setting font metrics to anticipate non-Latin extensions, then your default linespacing is going to be too large for English and other European language text; whereas if you set metrics for European languages it will be disastrously too tight for many others. Script- and language-specific metrics settings are the only sensible way forward, but I've yet to get any software makers to commit to it.

Richard Fink's picture

Thank you.

I get the picture now.

Language-specific vertical metrics do, indeed, seem sensible. (I can also see why it's a tough sell.)

That said - Cascading Style Sheets gives web authors the ability to manipulate line-height. Further, the characters requiring the extra height can always be spun off into a separate font which can either be applied to sections that require it or as a 'fallback' font which the browser uses when it does not find the characters it needs in the font or fonts listed previously in the "font stack".

As remedies go, that seems to be the sum total of what can be done. (Although I'm wondering if browsers will override the CSS settings. A good question I should test out.)

For now, all my work is with Latin-based and/or European languages.
As a precaution, I'm just going to include a glyph with no Unicode code point - maybe just a solid vertical bar or something - to mark the upper limit in cases where, for example, an ASCII subset of the font slices away anything taller than, say, a capital H.
And see how it tests out and if it's worth the trouble.
It seems unnecessary to mark a lower boundary.
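
A rough Python sketch of the marker-glyph idea, under two assumptions that would need testing: that the subsetting tool derives the ascent from the tallest surviving glyph, and that it can be told to keep an unencoded glyph (many tools drop glyphs that have no code point). The glyph name `ascent.sentinel`, the helper `ascent_for`, and all numbers are hypothetical.

```python
# Hypothetical glyph bounds in font units as (yMin, yMax); invented for illustration.
GLYPH_BOUNDS = {
    "H": (0, 700),
    "g": (-210, 510),
    "Aringacute": (0, 1130),
    # Unencoded marker glyph: a bar drawn up to the tallest height expected later.
    "ascent.sentinel": (0, 1130),
}
SENTINEL = "ascent.sentinel"  # hypothetical glyph name


def ascent_for(subset, keep_sentinel=True):
    """Return the highest yMax in the subset, optionally forcing the marker in."""
    names = set(subset)
    if keep_sentinel:
        names.add(SENTINEL)
    return max(GLYPH_BOUNDS[name][1] for name in names)


ascii_only = ["H", "g"]
without_marker = ascent_for(ascii_only, keep_sentinel=False)
with_marker = ascent_for(ascii_only)

print(without_marker)  # 700
print(with_marker)     # 1130
```

With the marker kept, the ASCII-only subset reports the same ascent as the full set would, so the line height stays put when taller glyphs are added later.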

Thanks again.

