## EM???

can anyone point me toward any reading material that will help me better understand the Em square concept and related concepts like the en, etc.?

thanks for such a quick response, tiff! actually, i'm trying to figure out how to calculate the en space to set my diacritics on. according to Microsoft, the en space should be half of the em. I guess that if the UPM is 1000, that should be 500, right? is that all there is to it? or is there more to it than that? i guess i should have been more specific...

Em is a unit to measure type. It is not really related to the letter "M" in any manner.

In digital type design, the em is the size of the em square. The em square is a hypothetical square that your type design relates to. The em always corresponds to the point size of the type in the design application. This means that whenever you set 10 point type, it is the em that will actually be 10 points. In a digital font, the em is divided into units -- typically 1000 units per em. If you set 10 point type, the 1000 units will correspond to 10 points. If your caps are 700 units high, this means that in 10 point type, the caps will be 7 points high. If your descenders are 200 units, they will be 2 points long in 10 point type (and 3 points in 15 point type, obviously).
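For anyone who wants to check the arithmetic, here is a minimal sketch of the unit-to-point conversion described above. The function name `units_to_points` is made up for illustration, and the default of 1000 units per em is just the typical value mentioned in this thread.

```python
def units_to_points(units, point_size, upm=1000):
    """Convert a measurement in font units to points at a given type size.

    `upm` is the number of units per em; 1000 is only the typical value.
    """
    return units * point_size / upm


# The figures used in the post above:
cap_height_10pt = units_to_points(700, 10)   # caps 700 units high, 10 pt type
descender_10pt = units_to_points(200, 10)    # descenders 200 units, 10 pt type
descender_15pt = units_to_points(200, 15)    # same descenders, 15 pt type

print(cap_height_10pt, descender_10pt, descender_15pt)
```

The same function works for any UPM, e.g. `units_to_points(1434, 10, upm=2048)` for a font built on a 2048-unit em.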

An en space should be 1/2 em, i.e. (typically) 500 units wide.

It is advisable that the distance between the ascender line and the descender line in your font equals the em size, i.e. typically 1000 units. For example, if your ascender is 750 units, the descender should be 250 units. By doing so, the user will have better chances to accurately measure the type size of printed text set in your font.
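A quick way to sanity-check that recommendation is sketched below. The helper `fills_em` is a hypothetical name, and it assumes the descender may be given either as a positive depth (250) or as a negative coordinate (-250).

```python
def fills_em(ascender, descender, upm=1000):
    """Return True if the ascender-to-descender distance equals the em size.

    `descender` may be a positive depth or a negative y coordinate;
    only its magnitude is used.
    """
    return ascender + abs(descender) == upm


print(fills_em(750, 250))    # the example from the post
print(fills_em(800, 250))    # 1050 units, taller than the em
```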

Thank you for that clean, concise explanation, Adam. And thank you Tiff for being so willing to help and quick to respond. I appreciate all your help.

Yes. In most PostScript applications, an em is 1,000 units, and an en, being 1/2 an em, is 500 units.

If you want more, in current usage, the *EM* is a unit of measurement that equals the point size of the type. Note that in the past there have been other definitions of an em (space).

When you express this in units, the width of the em (& maybe the height) becomes whatever the "unit basis" is for a font or composition system. The old Monotype metal systems used an 18-unit em. Linotype linecasters didn

> PostScript uses a 1,000 unit em.
> I've seen fonts (always TrueType?)
> with a 2,000 unit em.

In Type 1 and OpenType PS fonts, an em square of 1000 units is recommended, but other sizes are possible. In OpenType TT / TrueType fonts any number of units per em square up to 16384 is allowed. Typically you will see OpenType TT / TrueType fonts with 1000, 2000 or 2048 UPM (units per em).
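Since the en is simply half the em, its width in units follows directly from whatever UPM the font uses. A minimal sketch (the function name `en_width` is hypothetical, and integer division is an assumption on my part, which would round down for an odd UPM):

```python
def en_width(upm):
    """Width of an en space in font units: half the em."""
    return upm // 2


# The typical UPM values mentioned above.
for upm in (1000, 2000, 2048):
    print(upm, en_width(upm))
```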

> Why accents should center within an en space
> is unclear to me. The use of floating accents
> depends on the composition system.

In font formats that use Unicode (TrueType and OpenType), you generally include two sets of "stand-alone" accents: spacing and non-spacing. It is recommended that the spacing accents have an advance width equal to 1/2 em, while the non-spacing accents should obviously have an advance width of 0. For example, the spacing acute (U+00B4, glyphname "acute") in an OpenType PS font with 1000 UPM should have an advance width of 500, while the non-spacing acute (U+0301, "acutecomb") should have an advance width of 0. Both diacritics should be centered on their respective advance widths.
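To make "centered on the advance width" concrete, here is a rough sketch of the shift involved. The function name and the outline extents (100 to 300 units) are invented for illustration; they do not come from any real font.

```python
def centering_shift(x_min, x_max, advance):
    """Horizontal shift that centres an outline spanning x_min..x_max
    on a glyph with the given advance width."""
    outline_centre = (x_min + x_max) / 2
    return advance / 2 - outline_centre


# Hypothetical acute whose outline currently spans x = 100..300.
spacing_shift = centering_shift(100, 300, advance=500)   # "acute", 1/2 em at 1000 UPM
combining_shift = centering_shift(100, 300, advance=0)   # "acutecomb", zero width

print(spacing_shift)     # move the outline this far to centre it on 500 units
print(combining_shift)   # move the outline this far to centre it on x = 0
```

With these made-up numbers the spacing accent ends up spanning 150..350, and the non-spacing accent ends up spanning -100..100, i.e. straddling the origin.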

> If I may be a little unkind to MS Word,
> I don

Interesting.

When I went & opened up Adobe Minion Pro in FontLab, the COMBINING DIACRITICAL MARKS section was empty; most of the

Adobe, to date, has not made much use of combining diacritical marks, because the languages they are addressing have not required them for encoding purposes. All the fonts I've made for Microsoft, on the other hand, contain combining marks, and some contain OTL anchor attachment positioning for these.

> an accent will not center over a letter it
> follows. Software will be needed to position
> the accent. I suppose it helps if the
> software can make certain assumptions?

There are two possible scenarios. First, it is possible to specify precise coordinates by implementing anchor positions and an OpenType Layout "mark" feature. Applications such as Microsoft Word 2003 for Windows that support the "mark" feature will position the accents precisely as specified. Second, if the "mark" feature is not included in the font, a layout engine such as that in Microsoft Word XP (or Publisher XP, for that matter) will center the accents. Of course, there still will be layout engines that will not do anything reasonable and just put the accent right after the preceding glyph.

Note that OpenType Layout also allows for different glyphs to be substituted in certain contexts. Different stylistic variants of a glyph (also accent) can be substituted automatically if the glyph follows or precedes a certain glyph, or when the user activates a certain feature.

By the way, I never follow the recommendation to put spacing marks on an en space. I usually put them on the width of the lowercase o, positioned so that they would be in correct relationship to the o if superimposed. Then I can make the combining versions as composites and simply shift the left sidebearing onto the right sidebearing. This gives me reasonable rough positioning for the combining mark over a preceding a e or u, and perfect positioning over o, in applications that do not support GPOS mark positioning.
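As I read that description, the combining mark is the spacing mark's outline moved left by the full advance width of the o, with the advance set to zero. A small sketch of that reading follows; the function name and all the numbers (an o with a 500-unit advance, an acute spanning 180..340) are hypothetical.

```python
def combining_from_spacing(x_min, x_max, o_advance):
    """Derive a zero-width combining mark from a spacing mark that sits on
    the advance width of the lowercase o.

    Returns (new_x_min, new_x_max, new_advance): the outline is shifted
    left by the whole advance, so it overhangs the preceding glyph.
    """
    return x_min - o_advance, x_max - o_advance, 0


o_advance = 500
comb_min, comb_max, comb_adv = combining_from_spacing(180, 340, o_advance)

# After an o drawn at x = 0, the pen sits at x = o_advance, so the
# combining mark lands back where the spacing mark was drawn over the o.
print(comb_min + o_advance, comb_max + o_advance, comb_adv)
```

Over a preceding a, e or u of slightly different width the mark will be off by a small amount, which matches the "reasonable rough positioning" described above.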

John,

so your combining accents have a zero advance width but they are centered at -50% of the "o" width? (I.e. if "o" has a width of 400, then the centre of the combining accent is at -200?)

Not to break an interesting discussion, but in passing, could you talk about how one would handle the scansion marks *ictus* (a sort of large *acute*) or the *longum* (a sort of *macron*)? With scansion marks, the mark is usually positioned in the middle of the syllable, so the longum and breve are much wider than the diacritics which go over a single letter. When we had to set a scansion breve, I'd use one of the phonetic symbols -- a tie, I believe -- then draw up a macron to match it. In classical prosody, you use the longum & breve with Greek & Latin; the ictus only is used in English. But as with most things, modern writers on prosody have adopted more symbols to show more things, so the ictus & "reverse ictus" (*grave*) are now used with the longum & breve.

This becomes more of an issue as the *file* rather than the printed book becomes the end product of publishing.

C

> so your combining accents have a zero advance width but they are centered at -50% of the "o" width?

'Centered' is a bit misleading: they are positioned so that they will visually be in the correct place relative to a preceding lowercase o, as in this example.

Charles, the syllable-level scansion marks seem to be similar to the challenges of Hebrew masoretic mark positioning. This is a circle above the text that is properly placed as close as possible to the centre of a word (taking into account that there may be other above-letter marks that will displace it slightly). Unicode has no mechanism for encoding marks relative to words, and smart font formats like OpenType have no mechanism for positioning marks relative to words. In both Unicode and OpenType, a combining mark belongs to the immediately preceding base character. So this means that a manual mechanism is required to allow the user to encode and position the masoretic mark as close as possible to the desired position. In effect, this means that the mark has to be positioned either above a base character or between two base characters. A greater level of refinement than that is difficult to obtain without user-controlled adjustment of mark positioning (such as provided in the Middle East version of InDesign).

This is the mechanism I worked out for the Hebrew masoretic mark, which perhaps will help you with your scansion marks.

The mark is centered on a zero width (this is actually common for Hebrew marks, due to the method of mark positioning used by some older Hebrew layout engines, and so differs from the typical left-offset of Latin marks or right-offset of Arabic marks). This means that when the mark is not positioned relative to a base using GPOS mark positioning, it will naturally rest between two base characters (blind positioning). GPOS anchor attachment information is provided to position the mark above a preceding base character. This anchor attachment positioning becomes the default positioning. This positioning can then be inhibited by insertion of the zero-width non-joiner character between the preceding base and the mark, restoring the blind positioning between the two base characters.
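On the text-encoding side, the two cases differ only in whether a zero-width non-joiner sits between the base and the mark. A minimal sketch, with hypothetical helper names, using ALEF as an arbitrary base and assuming the masoretic circle is the character at U+05AF; whether the ZWNJ actually suppresses the anchor attachment depends on the font and layout engine.

```python
ZWNJ = "\u200c"           # ZERO WIDTH NON-JOINER
MASORA_CIRCLE = "\u05af"  # assumed code point for the masoretic circle
ALEF = "\u05d0"           # arbitrary base letter for the example


def mark_above(base, mark):
    """Default encoding: the mark follows the base it should sit above."""
    return base + mark


def mark_between(base, mark):
    """Encoding meant to leave the mark between two bases: a ZWNJ
    separates the mark from the preceding base."""
    return base + ZWNJ + mark


for text in (mark_above(ALEF, MASORA_CIRCLE), mark_between(ALEF, MASORA_CIRCLE)):
    print([f"U+{ord(ch):04X}" for ch in text])
```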

[By the way, I'm familiar with the term ictus in a different context: in Gregorian chant notation, it is the small vertical line below a note (or in some note arrangements above) indicating the first beat of the rhythm in that part of the melody.]

**blushing** i didn't notice that this was in the BUILD area. i wondered why you need to know. **laughing** sorry. as you were.