Language-specific contextual substitution

guifa's picture

The only way I can think of to do this seems a bit complex, so maybe there's an easier way.

In Spanish (including modern-day usage), the word "de" is often set as a ligature when it appears in all capitals. However, a standard ligature isn't enough because "de" appears inside many words. So I was going to use a contextual substitution to make sure I only nab DE as a standalone word.

BUT this is of course unacceptable in most other languages. So here's my solution; I'm just wondering whether I'm approaching it properly or whether there is a more efficient way. (I'm using FontForge, so I'll use pseudocode.)

1: If language is "es", switch "D" with "D.es"
2: If language is "es", switch "E" with "E.es"
3: If sequence "[any letter] D.es E.es" found, match and do nothing
4: If sequence "D.es E.es [any letter]" found, match and do nothing
5: If sequence "D.es E.es" found, replace with D_E

This requires making an extra glyph that is identical to a normal D and that, in a Spanish document, would constantly be replaced. I'm not sure whether that in itself might cause problems as well.
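
For illustration, the intent of the five numbered steps above can be sketched outside the font in Python. This is only a model of the desired text behavior, not FontForge or OpenType code: the `<D_E>` marker string, the function name `apply_de_ligature`, and the bare "es" language code are all assumptions made for the example.

```python
import re

# Hypothetical stand-in for the D_E ligature glyph; any marker string works.
LIGATURE = "<D_E>"

# "DE" with no letter immediately before or after it (steps 3 to 5).
# [^\W\d_] matches any letter, so the lookarounds reject DE inside a word.
STANDALONE_DE = re.compile(r"(?<![^\W\d_])DE(?![^\W\d_])")

def apply_de_ligature(text: str, language: str) -> str:
    """Replace standalone capital DE with the ligature marker, for Spanish only."""
    if language != "es":  # steps 1 and 2: the rule is gated on the language
        return text
    return STANDALONE_DE.sub(LIGATURE, text)
```

With this sketch, `apply_de_ligature("CASA DE PAPEL", "es")` gives `"CASA <D_E> PAPEL"`, while words such as DEDO or DONDE, and any text tagged with another language, pass through unchanged.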

Stephen Rapp's picture

House Industries did a whole language-specific contextual substitution thing with their new script, Sable. It has local versions of some of the diacritic marks.

http://www.houseind.com/index.php?page=showfont&id=665&subpage=studio_ho...

Randy's picture

Seems like this is something for locl or salt or ss1 (stylistic sets) or a combo:
(taking a stab as I'm trying to improve my OT mind)

@UC = uppercase;

feature locl { # Localized Forms
language ESP exclude_dflt; # Spanish;
sub @UC D' E @UC by D.es; #D.es is a duplicate D marker glyph
sub D E by D_E;
} locl;

feature salt { # Spanish DE ligature
sub @UC D' E @UC by D.es; #D.es is a duplicate D marker glyph
sub D E by D_E;
} salt;

feature ss01 { # Spanish DE ligature (stylistic set tags are ss01 to ss20)
sub @UC D' E @UC by D.es; #D.es is a duplicate D marker glyph
sub D E by D_E;
} ss01;

This could well be garbage :-) but it's what I'd try.
Whoops, that's FontLab, not FontForge.

It'd be nice to get rid of both marker glyphs, not just one. The only other way I figured out required an illegal one-to-many substitution. You could try to define when you do want the sub, rather than when you don't, but it's trickier.

Miguel Sousa's picture

I don't think the glyphs D.es and E.es are necessary. I believe it can be done in one shot. The FDK syntax should be something like this:

languagesystem latn dflt;
languagesystem latn ESP;

@ALL_LETTERS = [A-Z a-z];

feature liga {
script latn;
language ESP;
ignore sub @ALL_LETTERS D' E';
ignore sub D' E' @ALL_LETTERS;
sub D' E' by D_E;
} liga;
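
As a rough model of how the three rules above interact, here is a Python sketch that walks a list of glyph names. It is a simplification, assuming the rules are tried in order at each glyph position and the first match wins, so an `ignore sub` match blocks the later `sub`. The function name `liga_esp` and the list-of-glyph-names representation are assumptions for the example, not part of the FDK.

```python
import string

# Mirrors @ALL_LETTERS = [A-Z a-z]; glyph names are assumed to equal the letters.
ALL_LETTERS = set(string.ascii_letters)

def liga_esp(glyphs):
    """Apply the three ESP rules to a list of glyph names (simplified model)."""
    out = []
    i = 0
    while i < len(glyphs):
        if glyphs[i:i + 2] == ["D", "E"]:
            letter_before = i > 0 and glyphs[i - 1] in ALL_LETTERS
            letter_after = i + 2 < len(glyphs) and glyphs[i + 2] in ALL_LETTERS
            if letter_before or letter_after:
                # An "ignore sub" rule matched first: D and E pass through unchanged.
                out.extend(glyphs[i:i + 2])
            else:
                # Neither ignore rule matched, so "sub D' E' by D_E" applies.
                out.append("D_E")
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out
```

For example, `liga_esp(list("DE"))` returns `["D_E"]`, whereas `liga_esp(list("DONDE"))` returns the glyphs unchanged because the D E pair is preceded by a letter.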

Randy's picture

Hi Miguel,

Thank you! I didn't know about "ignore sub" .. that will be handy!

Side questions:

1. Is languagesystem toggled by the Language dropdown in the character palette of InD CS2?
2. Is that the preferred way to handle things like Polish diacritics in apps without locl support, or are stylistic sets better?
3. Can it be done in FontLab's OpenType editor, or just in the FDK?
4. In your liga feature, how do you signal the end of the ESP-specific subs, say if you wanted to add French-specific subs or the aforementioned Polish diacritics?

Miguel Sousa's picture

Hey Randy,

1. Yes, it is toggled by the Language dropdown in the character palette, but it will only work in InDesign CS3 and up. Photoshop CS4 and Illustrator CS4 support it as well.

2. If the app does not have support for language-specific lookups, things won't work. For example, Adobe has been putting language-specific code in its fonts (e.g. Turkish) for quite some time, but only with InDesign CS3 have we started seeing the results of that code.

3. You'll be able to compile the features and see the results in FontLab, but you should also include the line languagesystem DFLT dflt; in the features. However, the current FontLab 5 will throw an error message because of that. You'll have no problems if you use the FDK.

4. The end of ESP would be signaled by the end of the feature, the start of another language, or the start of another script. Matthew's case had to do with a language-specific ligature, so the best way to do it is through 'liga'. In the case of Polish diacritics I'd say 'locl' is more appropriate. In the case of French, it depends on what you're trying to do. Below is some code for language-specific ligatures and diacritics.


languagesystem DFLT dflt;
languagesystem latn dflt;
languagesystem latn ESP;
languagesystem latn FRA;
languagesystem latn PLK;

feature locl {
script latn;
language FRA;
sub eacute by eacute.FRA;

language PLK;
sub cacute by cacute.PLK;
} locl;

@ALL_LETTERS = [A-Z a-z];

feature liga {
script DFLT;
language dflt;
lookup LIGA {
sub f i by f_i;
} LIGA;

script latn;
language dflt;
lookup LIGA;

language PLK include_dflt;

language ESP include_dflt;
ignore sub @ALL_LETTERS D' E';
ignore sub D' E' @ALL_LETTERS;
sub D' E' by D_E;

language FRA include_dflt;
sub e t by e_t;
} liga;

The code in 'locl' is straightforward; glyphs are replaced by other glyphs depending on the language. The code in 'liga' is a little more tricky because there are ligatures that you'll want to have for all the languages and/or scripts in addition to the language-specific ones. In this case, all Latin languages (including ESP, FRA and PLK) will have the 'f_i' ligature. In addition, the 'D_E' ligature will be available to Spanish, and the 'e_t' ligature will be available to French.

I could have been more concise with the 'liga' code but when it comes to coding language-specific behavior I like to be as explicit as I can to avoid unexpected behaviors. For example, the keyword 'include_dflt' could be omitted (because the lookups under 'language dflt;' are inherited by default by other languages within the same script). However, 'language PLK;' needs to be included in 'liga' otherwise 'f_i' will NOT be available for Polish.
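
The point about 'language PLK;' can be illustrated with a small Python model of how a layout engine might pick lookups for a language. It assumes a simplification of the OpenType tables: a language that has its own record under a script uses only the features listed in that record, and a language without a record falls back to the script's default. The helper names `build_langsys` and `active_lookups` and the `DEU` example are hypothetical, and 'include_dflt' is modeled by simply listing `f_i` again under each language.

```python
def build_langsys(features):
    """Turn {feature: {language: [lookups]}} into {language: {feature: [lookups]}}."""
    table = {}
    for feature, per_language in features.items():
        for language, lookups in per_language.items():
            table.setdefault(language, {})[feature] = list(lookups)
    return table

def active_lookups(table, language, feature):
    """Return the lookups a given language gets for a feature."""
    # A language with its own record uses only that record;
    # a language without one falls back to the dflt record.
    record = table[language] if language in table else table.get("dflt", {})
    return record.get(feature, [])

# locl creates FRA and PLK records, as in the feature code above.
locl = {"FRA": ["eacute.FRA"], "PLK": ["cacute.PLK"]}
liga = {"dflt": ["f_i"], "PLK": ["f_i"], "ESP": ["f_i", "D_E"], "FRA": ["f_i", "e_t"]}
liga_no_plk = {lang: subs for lang, subs in liga.items() if lang != "PLK"}

with_plk = build_langsys({"locl": locl, "liga": liga})
without_plk = build_langsys({"locl": locl, "liga": liga_no_plk})

print(active_lookups(with_plk, "PLK", "liga"))     # ['f_i']
print(active_lookups(without_plk, "PLK", "liga"))  # []
print(active_lookups(without_plk, "DEU", "liga"))  # ['f_i'] (no DEU record, so dflt applies)
```

In this model Polish loses `f_i` only because 'locl' already gave PLK its own record; a language with no record at all (the hypothetical `DEU` here) still inherits the default ligatures.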

Randy's picture

Thanks for your comprehensive reply. Next up: FDK bootcamp for me.

Randy
