How do Illustrator and InDesign find the alternate glyphs in the font files?

glyphiac's picture

Hi everybody,

Do you know how Illustrator and InDesign "find" the OTF glyphs in the font files (the alternates that can only be accessed through the glyphs palette that both programs have)? I tried to find a technical specification, but wasn't successful.

Cheers
glyphiac (from Germany)

George Thomas's picture

As explained to me just today, InDesign finds characters based on Unicode and whether the .alt glyph(s) appear in an OTF feature within the font.

My issue is that my .alt glyphs are showing up in the InDesign glyphs palette with the name "NULL", which I don't like since they do actually have names. I'm looking for a workaround to that today.

charles ellertson's picture

> ...and whether the .alt glyph(s) appear in an OTF feature within the font.

That's the key. Ultimately, InDesign seems to have to associate a character name with a Unicode number. If a glyph itself does not have a Unicode number, InDesign needs a feature definition to make that association.

We've had trouble with this issue. Take a font where the lining numbers are not the default -- that is, don't have any Unicode number. If they are not hooked up to the numbers with a feature -- say lnum -- they wind up with the name NULL in the InDesign palette.

Worse, if you enter one via the glyph palette, it will give you the right character on that machine only. Put the file on another machine, and all NULLs being equal, there is no telling what you'll get, but it likely won't be what you want.

We helped a customer track down a problem with a jacket they had prepared that had just this problem. Took me a while. Actually, all I really know is that as soon as I associated the glyph with another glyph that had a Unicode number via an OT feature, the problem went away. Take out the feature, it came back. QED...

Igor Freiberger's picture

Glyphiac, a font contains a lot of information besides the glyphs. One part of it is a set of OpenType features: code that says exactly what you asked about, namely alternate glyphs, substitutions and specific styles.

Features are predefined OpenType groups of commands. Take the hist feature, for example. It is used to replace regular glyphs with their historical, archaic versions. A quite common piece of code is:

feature hist {
    sub s by longs;
} hist;

The substitution above replaces the regular s with the old long s (the glyph named longs).

Another feature is smcp, which means "small caps". It defines all the substitutions the application should apply when the user sets this style. A sample:

feature smcp {
    sub a by A.sc;
    sub b by B.sc;
    sub c by C.sc;
} smcp;

Here, you have the A.sc, B.sc, and C.sc glyphs in your font, and you say they will replace a, b, and c when small caps are used. Of course, proper code will define this change for the whole alphabet, not just a-b-c. And the code must actually include some additional information, like the script used (Latin, Greek, etc.) and the language context.
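As a rough sketch of that additional information, in feature-file syntax the script and language are usually declared once at the top with languagesystem statements. The DFLT and latn tags are standard OpenType script tags; the glyph names are the same illustrative ones as above, and a real font would declare whatever scripts it actually covers.

```fea
# Declare the script/language systems the features below are registered under.
# DFLT dflt must come first when it is present.
languagesystem DFLT dflt;
languagesystem latn dflt;

feature smcp {
    # These rules are registered under every languagesystem declared above.
    sub a by A.sc;
    sub b by B.sc;
    sub c by C.sc;
} smcp;
```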

To make things easier, there is another piece of code included in fonts: OpenType classes. A class is just a group of several glyphs gathered under one name. For example, you can create the class "lowercase" with the whole a-z alphabet and "smallcaps" with the corresponding A-Z small caps. Then you can shorten the smcp feature by just referencing the classes:

feature smcp {
    sub @lowercase by @smallcaps;
} smcp;
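For completeness, here is a minimal sketch of how the two classes themselves might be defined. The class contents are shortened to four letters for illustration; the two classes must have the same number of glyphs, listed in matching order, because the substitution pairs them up one by one.

```fea
# Class definitions (shortened; a real font would list the whole alphabet).
@lowercase = [a b c d];
@smallcaps = [A.sc B.sc C.sc D.sc];

feature smcp {
    # a -> A.sc, b -> B.sc, and so on, pairing the classes by position.
    sub @lowercase by @smallcaps;
} smcp;
```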

Font editors like FontLab Studio, Glyphs and RoboFont offer a panel or window that lets the designer edit this code, which is embedded into the .otf file when the font is generated. Type designers must carefully choose the features the font will offer and include all the needed glyphs in it.

This is just a short explanation of OpenType code. There are fonts with very few instructions like these and others with tons of replacements. You can find a lot of material regarding this on the web, especially here on Typophile. To find it, use "site:typophile.com" as an argument in a Google search:

OpenType features site:typophile.com

Hope this helps.

George Thomas's picture

Charles, you wrote: "Actually, all I really know is that as soon as I associated the glyph with another glyph that had a Unicode number via an OT feature, the problem went away."

Can you give an example of how to accomplish this?

charles ellertson's picture

George: Sure.

Let's suppose you're using the open source Alegreya bold (not the HT-Pro). You are using it for the chapter number in a book, just the numeral itself. But you want the lining figures, not the default, which are old style.

So you click on it in the glyph palette, notice that the lining figure has a GID number but is otherwise given as NULL. No matter, up it pops in the file, you screen it back, place it where you want the chapter number, and everything seems to be working.

Now open the file on another machine. No telling just *what* NULL character you'll get.

On the other hand, if you build an lnum feature, it works perfectly. I surmise that at some level, InDesign needs a Unicode number to function properly. By having the lining glyph "1" associated with U+0031 via an OT feature (lnum, in this case), that is done. Without the feature, it's just another NULL character, and they're all the same -- NULL.
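To make that concrete, an lnum feature for this case might look something like the sketch below. It assumes the default (encoded) old-style figures use the standard names zero through nine, and that the unencoded lining figures carry a hypothetical .lf suffix; the actual glyph names in Alegreya may well differ.

```fea
# Encoded default figures (old style in this font).
@figs_default = [zero one two three four five six seven eight nine];
# Unencoded lining figures; the .lf suffix is an assumed naming convention.
@figs_lining  = [zero.lf one.lf two.lf three.lf four.lf
                 five.lf six.lf seven.lf eight.lf nine.lf];

feature lnum {
    # Ties each lining glyph to an encoded figure, e.g. one.lf to U+0031.
    sub @figs_default by @figs_lining;
} lnum;
```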

You could use a stylistic set, or calt, or ccmp, whatever is appropriate. For something that is purely an ornament -- has no legitimate relationship to any character as given by Unicode -- we assign a private-use Unicode number. At least that alerts anyone making further use of the file that some investigation is needed.

Some of the old Type 1 fonts are particularly problematic in this -- they'll have ornaments in Latin character positions. Of course, as soon as you change fonts, or create an XML file, etc., you get that Latin character. And sometimes adding a character results in a different word still in the language, which can be an oops. (Say, for example, "P" is taken up by an ornament. So, you have "ornament" then "oops"... Indeed.)

The rule in our shop is that any glyph in a font either (1) has a Unicode number or (2) is associated, via a valid OpenType lookup, with a character that has a Unicode number. Further, if it's an ornament, that will be a Private Use number, which has no syntactic meaning. In no case will any Unicode assignment be a "lie."

George Thomas's picture

Thanks Charles. I figured out how to do it using salt and it seems to work fine now -- at least for my ten or so .alt glyphs. The only remaining problem is that InDesign CS6 still lists dieresis and dotaccent as NULL. Given that they have their own Unicode number, that is very odd.
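For anyone following along, a salt hookup of this kind can be as simple as the sketch below. The glyph names are placeholders, not the ones in my font; the point is only that each .alt glyph is tied to an encoded base glyph.

```fea
feature salt {
    # Each unencoded .alt glyph is reached from its encoded base glyph.
    # a.alt and g.alt are hypothetical names used for illustration.
    sub a by a.alt;
    sub g by g.alt;
} salt;
```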

charles ellertson's picture

George, I suspect the font you're using is to blame for that. I opened up CS6, created a "new" document, and picked Minion Pro - regular as the font for the text block.

In the glyph palette, the dieresis shows as Unicode 00A8 (not what I'd use, but it is valid...). In the glyph palette, even the case and small cap diereses showed, and not as NULL -- they were "associated" through the case and the c2sc & smcp features in the font.

As far as accents go, when I remake a font, I take out this legacy encoding and have two diereses, one a spacing modifier, one a combining diacritic. (Our fonts aren't for sale, so "legacy" issues don't come up.) Again, both have the proper Unicode number assigned. I don't have the "case" associated, as I always make up a component glyph in FontLab and access it either directly (if in the Unicode index) or through ccmp. If you use mark and mkmk, you won't be making up precomposed characters, so you'd probably need to include any .case or .smcp accents in the appropriate OT features -- case, and smcp & c2sc.
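A rough sketch of what that last step could look like for the combining dieresis follows. The names uni0308.case and uni0308.sc are assumptions about how the variants might be named, and whether a font needs separate cap and small-cap accents at all is a design decision.

```fea
# uni0308 is the combining dieresis (U+0308); the suffixed names are hypothetical.
feature case {
    sub uni0308 by uni0308.case;
} case;

feature smcp {
    sub uni0308 by uni0308.sc;
} smcp;

feature c2sc {
    sub uni0308 by uni0308.sc;
} c2sc;
```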

In short, I still think that if in the font the glyphs have either the proper Unicode assigned, or are associated with such a glyph, all will work. Not had this approach fail me yet, and we do a lot of scholarly books that get into some strange languages, including Native American ones.

blokland's picture

Igor: ‘Font editors, like FontLab Studio, Glyphs and Robofont, offer a panel or window to let the designer edit these codes, which are embedded into the .otf file when the font is generated. Type designers must carefully choose the features the font will offer and include all needed glyphs into it.’

At the risk of repeating myself on this subject (since 2002, when the first edition of FM was released), I would like to point out here (again) that FM and OTM subset the OT Layout features (stored in a file like for instance this one) based on the characters available in the font.

FEB

Igor Freiberger's picture

Frank, I'm sorry for the omission. As I have never used DTL FontMaster, I am unfamiliar with its features, so I usually do not mention it. But I think your repetition is, and always will be, welcome, especially considering the elegant procedure you pointed out.

Thomas Phinney's picture

"We've have trouble with this issue. Take a font where the lining numbers are not the deault -- that is, don't have any Unicode number. If they are not hooked up to the numbers with a feature -- say lnum, they wind up with the name NULL in the InDesign pallet."

Yes, that's right. InDesign uses the Unicode value, or the Unicode value plus the features, required to access the glyph via the glyph panel. If the glyph is unencoded AND not accessible via OpenType features, then InDesign stores just the glyph ID.

Of course, that's not consistent across unrelated fonts, and unless the designer takes care, it may even change across different builds of the same font. But then, what would one expect?

charles ellertson's picture

> Of course, that's not consistent across unrelated fonts, and unless the designer takes care, it may even change across different builds of the same font.

Or, if someone exports a PDF of a 4-color book jacket, & sends that off to a printer, guess how all the nulls will likely wind up...

I could make an argument, based on the level of knowledge one might reasonably expect amongst InDesign users -- e.g., think graphic artists who design book jackets -- that no null characters should show up in the glyph palette. That way, they're not tempted. I'm aware an opposite argument could be made.

Before you call that "user error," where would anyone be without those graphic artists who don't want to, and don't think they should have to, understand the bowels of OpenType?

It has to be on the people who offer type products. Perhaps we need yet another distinction, between "type designer" and "font designer," remembering it is the font that gets sold under license.

Thomas Phinney's picture

> Or, if someone exports a pfd of a 4-color book jacket, & sends that off to a printer, guess how all the nulls will likely wind up...

If the fonts are embedded in the PDF, as is almost always the case in professional work, then everything will be just fine.

> Before you call that "user error,"

All glyphs in a font should either be encoded, or accessible via features. Any error is on the part of the font maker. My comment was not intended to put any onus on the end user to deal with this stuff.

> It has to be on the people who offer type products.

Agreed—and even more specifically, on the people who produce the fonts.

charles ellertson's picture

> If the fonts are embedded in the PDF, as is almost always the case in professional work, then everything will be just fine.

I suspect I'm getting too far off topic, but this happened to a customer of ours, who asked for help diagnosing a problem with a book jacket she'd prepared. She had wanted a prime -- usually not included in fonts these days -- and had sort of found one, but it was neither encoded nor linked to an encoded character. It was caught in the printer's preflight as a strange character. Certainly not a "primey thing." When I looked at her files, it was there, as a NULL character. I assumed that was the problem.

Now with a four-color jacket, the printer had almost certainly taken things apart to make the separations. What was sent to the printer was a (single) PDF. I assumed it occurred at that step, even though the printer was using the same fonts, but on a different system.

We had a similar instance occur within our shop, no color work involved. I don't remember the character, but it was one glyph on one machine, a different one on another. We just made a rule that no NULL characters could appear in any final file.

Of course, I had some tidying up to do as well, as I'd put things in the font file with no linkage, such as the .case accents I use only to make up glyphs within FontLab. It's the sort of thing that can happen to anyone if they're not thinking about consequences. I fully understand it is hard for a small-shop type designer to anticipate all this, esp. as I've done it myself.

Thomas Phinney's picture

I certainly have a few unencoded glyphs in my current fonts in development. For example, I have a variant ff that is only used as a component in assembling triple ligs starting with ff. (ffh, ffl, ffk, etc.)

So I sympathize as well. It is not crazy to have some detritus in font files.

That said, the prime and double prime have legit unicodes, so I encode them. :)

T

charles ellertson's picture

> That said, the prime and double prime have legit unicodes, so I encode them

Bless you. The prime appears in many Dewey classification numbers, found on the copyright page of most books, e.g.

181′.4--dc22
2011006922

Another character used frequently in books on the copyright page, almost always overlooked, is the archival paper symbol, U+267E.

http://upload.wikimedia.org/wikipedia/commons/thumb/c/cc/Acid-free_paper...

It is usually found in close proximity to the copyright symbol, and should match in size, weight, etc.

Igor Freiberger's picture

Charles, I find it extremely important to learn from your experience. I was not aware of the editorial need for the prime or the acid-free paper symbol. The prime is already standard in my projects, but not the acid-free symbol, until now.

Are there other less-used-but-needed characters like these? Thanks!

charles ellertson's picture

Igor, those two happen on a lot of copyright pages, and copyright pages are included in most books.

^ ^ ^
Moving on from there, I'd say the biggest lack in fonts today (though likely not yours) is a full set of combining diacritics, 0300 through 036F, and spacing modifiers, 02B0 through 02FF.

It doesn't matter what a book is about, the author may want, say, to thank some Vietnamese guy in the acknowledgments. Or there may be an author or title in the bibliography where an accent is required. The comp can set these words, borrowing from another font if needed. But it's nice to have the characters in the fonts used, to avoid losing a lot of time investigating size, weight, and to some extent, position.

Precomposed characters aren't necessary if there are just two or three occurrences, but having the right bits & pieces is a real help. The files sent off to the printer for a book become the basis for any eventual XML files for archival or research purposes. They may well be the basis for any digital editions. As such, they should have the correct Unicode characters -- for example, not use U+2019 for a glottal stop, but preferably, U+02BC. (Or U+02BB in Hawaiian [Polynesian].) A base character plus a combining diacritic is syntactically correct, even if there is also a Unicode assignment for a precomposed character.

So, you can have a trade book on flower-growing and still run into a name or two needing such, and the author wants it right -- after all, they can get things "right" on their word processor, can't they.

Beyond that, things can get pretty specialized, and may not be worth the type designer's time.

Igor Freiberger's picture

Thank you very much!

Yes, my "encoding" includes these blocks (combining diacritics, spacing modifiers) besides others I considered useful for complex, demanding editorial projects. But there is always space to learn and improve, like with the permanent paper symbol.
