Questions regarding lookups

Arno Enslin's picture

When I was coding features for Frode (automatic indexed numerals), I noticed that they only work if I put the substitution rules into lookups. In the case of the ss01 feature, I need at least one lookup, because I wanted to reuse the same rules in the ss02 feature. But I have no idea why it was necessary to create more than one lookup in the ss01 feature. And in the case of the calt feature, I assumed that I would not need lookups at all. So I would like to know why the ss01 and calt features don’t work without lookups, and how I can determine how many lookups I need and how many rules I can put into one lookup. As far as the lookups are concerned, the three feature files are the result of trial and error, but that is not an efficient way of working: it costs time, and I am also not sure whether my lookups are the best solution with regard to the compilation or compression of the font (the aesthetics of the code).

Another thing I don’t understand is why I cannot call a lookup more than once in a feature (where the first call is the definition itself). For that question, have a look at 2_Features [alternative ss01 and ss02].fea_.txt.

Here are all three files (they work exactly as intended; tested in FLS):

Arno Enslin's picture

Okay, I am simplifying my questions.

Why is the lookup needed in the following code?:

feature calt {
lookup alpha {
sub bracketleft one one' bracketright by A;
} alpha;
sub one' A by A;
} calt;

[11] is replaced by [AA] as intended.

feature calt {
#lookup alpha {
sub bracketleft one one' bracketright by A;
#} alpha;
sub one' A by A;
} calt;

But with the code above [11] results in [1A].

Second question: Why can’t I call a lookup a second time under the same languagesystem statement?

I cannot write …

feature calt {
lookup alpha {
sub bracketleft one' bracketright by A;
sub bracketleft one one one' bracketright by A;
sub bracketleft one one' bracketright by A;
} alpha;
lookup beta {
sub one' A by A;
} beta;
lookup beta;
} calt;

… but I have to code

feature calt {
lookup alpha {
sub bracketleft one' bracketright by A;
sub bracketleft one one one' bracketright by A; #The first one will be substituted in lookup gamma, the second one in beta.
sub bracketleft one one' bracketright by A; #The first one will be substituted in lookup beta.
} alpha;
lookup beta {
sub one' A by A;
} beta;
lookup gamma {
sub one' A by A;
} gamma;
} calt;

How can I make out, that I manually have to add a lookup without testing?

twardoch's picture

Arno,

I think what you're really asking for is when to put substitutions into one lookup vs. separate lookups. After all,

feature ss01 {
sub a by a.ss01;
} ss01;

really is equivalent to

feature ss01 {
lookup ss01_01 {
sub a by a.ss01;
} ss01_01;
} ss01;

On the other hand,

feature ss01 {
lookup ss01_01 {
sub a by a.ss01;
} ss01_01;
sub b by b.ss01;
} ss01;

is equivalent to:

feature ss01 {
lookup ss01_01 {
sub a by a.ss01;
} ss01_01;
lookup ss01_02 {
sub b by b.ss01;
} ss01_02;
} ss01;

The statement lookup in the AFDKO syntax can be omitted for brevity — then all substitutions within one feature definition (or those before an explicit lookup opening or after an explicit closing of a lookup) are treated as one lookup. However, I recommend always at least thinking in terms of lookups, i.e. as if the lookup statements were always there.

> How can I make out, that I manually have to add a lookup without testing?

That's very easy.

1. The OpenType Layout engine splits the line into "runs" of glyphs of the same font formatting (family, style, size, color etc.), script (writing system) and directionality (LTR, RTL). For continuous Latin-script text set in the same font, one line is one run.

2. Within each run of glyphs, all lookups associated with the features that are being applied to the run, are executed in the order defined within the font (the ordering of the lookups, or in the AFDKO syntax, the order of the feature definitions and the lookup definitions within them).

3. Each lookup is executed in one pass through the current run of glyphs.

4. Each pass proceeds through the run glyph by glyph. The substitutions defined in the lookup are searched for the "best match" for the current glyph, and if found, that best match is executed. This modifies the run and proceeds to the next unprocessed glyph. If no best match is found, it proceeds to the next glyph (without modifying the run).

5. After one lookup has been executed, the next lookup is executed in the next pass on the *output* run of the previous lookup.

So if you have the code

feature calt {
lookup calt_01 {
sub A by B;
sub B by C;
sub C by D;
} calt_01;
lookup calt_02 {
sub B by E;
} calt_02;
} calt;

and you have the glyph run

A B A B A B A B

and the calt feature is being applied, then first the calt_01 lookup is executed on the input glyph run, so the output glyph run is:

B C B C B C B C

and then that output glyph run becomes the input glyph run of the calt_02 lookup, which is executed and the final output glyph run is:

E C E C E C E C

In short: Each lookup is one “pass” through the text. Within one lookup, the “best match” is found for each glyph, and then the next glyph is processed, until the end of the text. The next lookup is executed on the text that comes out of the previous lookup.
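
The pass model above can be sketched in a few lines of Python. This is only a rough illustration under simplifying assumptions, not how any real layout engine is implemented: each lookup is modelled as a plain dictionary of single substitutions, and the names apply_lookup and apply_feature are made up for this example.

```python
def apply_lookup(run, rules):
    """One pass over the run: each glyph is substituted at most once."""
    # rules maps an input glyph name to its replacement (hypothetical model).
    return [rules.get(glyph, glyph) for glyph in run]


def apply_feature(run, lookups):
    """Execute the lookups in order; each one sees the previous output."""
    for rules in lookups:
        run = apply_lookup(run, rules)
    return run


# The two lookups of the calt example above.
calt_01 = {"A": "B", "B": "C", "C": "D"}
calt_02 = {"B": "E"}

run = "A B A B A B A B".split()
after_first = apply_lookup(run, calt_01)
final = apply_feature(run, [calt_01, calt_02])

print(" ".join(after_first))  # B C B C B C B C
print(" ".join(final))        # E C E C E C E C
```

Note that within calt_01 an A becomes B but is not turned into C in the same pass, because the pass has already moved on to the next glyph.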

To make a very rough analogy, substitutions within one lookup are mutually exclusive when stepping through each glyph, while substitutions in separate lookups are executed one after another, so that the next lookup takes into account the modifications made by the previous one.

I hope this helps,
Adam

twardoch's picture

> feature calt {
> sub bracketleft one one' bracketright by A;
> sub one' A by A;
> } calt;

> But with the code above [11] results in [1A].

There is one lookup defined in your calt feature (implicitly). Your feature definition is equivalent to:

feature calt {
lookup delta {
sub bracketleft one one' bracketright by A;
sub one' A by A;
} delta;
} calt;

You have the input glyph run [11].

The lookup delta steps through the glyphs one by one:

1. First, the glyph [ is "current". The lookup is searched for the best match, and finds none.

2. Then, the first glyph 1 is current. The lookup is searched for the best match, and finds none.

3. Then, the second glyph 1 is current. The lookup is searched for the best match, and finds the match in the rule sub bracketleft one one' bracketright by A;, so it executes it. The current glyph is substituted by A.

4. Finally, the glyph ] is current, and no match is found.

So the output run is [1A].

The code

> feature calt {
> lookup alpha {
> sub bracketleft one one' bracketright by A;
> } alpha;
> sub one' A by A;
> } calt;

defines two lookups, so it's equivalent to

feature calt {
lookup alpha {
sub bracketleft one one' bracketright by A;
} alpha;
lookup beta {
sub one' A by A;
} beta;
} calt;

You have the text run [11].

First the lookup alpha is executed, and after skipping the first two glyphs, the second 1 gets a match for the rule sub bracketleft one one' bracketright by A;, so it's executed. At the end of the lookup execution, the output run is [1A].

Then the lookup beta is executed on the output run of the previous lookup, i.e. on the glyph run [1A]. After skipping the glyph [, the glyph 1 hits a match for the rule sub one' A by A;, so the rule is executed, and the glyph gets substituted by A. So the final output run is [AA].
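
The two walkthroughs can be reproduced with a small Python sketch. It is a simplified model with assumptions: a contextual rule is written as a tuple (backtrack, marked glyph, lookahead, replacement), the first matching rule wins at each position, and the helper names matches and apply_lookup are hypothetical.

```python
def matches(run, i, rule):
    """Check whether a contextual rule applies to the glyph at position i."""
    backtrack, target, lookahead, _replacement = rule
    start = i - len(backtrack)
    end = i + 1 + len(lookahead)
    if start < 0 or end > len(run):
        return False
    return (run[start:i] == backtrack
            and run[i] == target
            and run[i + 1:end] == lookahead)


def apply_lookup(run, rules):
    """One pass: at each position apply at most one rule, then move on."""
    run = list(run)
    for i in range(len(run)):
        for rule in rules:
            if matches(run, i, rule):
                run[i] = rule[3]
                break
    return run


# sub bracketleft one one' bracketright by A;
rule_1 = (["bracketleft", "one"], "one", ["bracketright"], "A")
# sub one' A by A;
rule_2 = ([], "one", ["A"], "A")

text = ["bracketleft", "one", "one", "bracketright"]

# Both rules in one (implicit) lookup: a single pass.
one_lookup = apply_lookup(text, [rule_1, rule_2])

# Two lookups: the second pass runs on the output of the first.
two_lookups = apply_lookup(apply_lookup(text, [rule_1]), [rule_2])

print(one_lookup)   # ['bracketleft', 'one', 'A', 'bracketright']
print(two_lookups)  # ['bracketleft', 'A', 'A', 'bracketright']
```

In the single pass, the first 1 is examined before the A exists, so the second rule never gets a chance to match it.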

Very simple and logical. But only if you remember that the lookup statement in the AFDKO syntax is always implied if it is not defined. In fact, I recommend using it explicitly, especially in contextual substitutions, to keep track of what's going on.

It's kind of similar to our previous discussion about languagesystems, when I recommended to always spell out script latn, language dflt and include_dflt in feature definitions that define different behaviors for different languagesystems, rather than relying on the implied use of those statements.

The general rule is: the implicit definitions are useful for simple situations, when you mean to say "all languagesystems" (in case of languagesystems) or "one lookup" (in case of lookup definitions within feature definitions). But mixing explicit and implicit definitions is a road to hell when it comes to debugging, so it's always better to spell everything out explicitly if your code becomes more complex.

Best,
Adam

twardoch's picture

To answer your original question:

In AFDKO 1.6 (and 2.5) syntax, which works in FLS5, you can write:

feature ss01 {
lookup ss01a {
sub a by a.ss01;
} ss01a;
} ss01;

feature ss02 {
lookup ss02b {
sub b by b.ss02;
} ss02b;
} ss02;

feature salt {
lookup ss01a;
lookup ss02b;
} salt;

but you cannot write

feature ss01 {
lookup ss01a;
} ss01;

feature ss02 {
lookup ss02b;
} ss02;

feature salt {
lookup ss01a {
sub a by a.ss01;
} ss01a;
lookup ss02b {
sub b by b.ss02;
} ss02b;
} salt;

i.e. at the first occurrence of a lookup, you need to define and use it, and in later occurrences, you can just use it by referring to it by its name. But you cannot just "use" a lookup before defining it.

In AFDKO 2.5 syntax (but not in 1.6), which will work in FOG5 and FLS6 but not in FLS5, you can also write:

lookup ss01a {
sub a by a.ss01;
} ss01a;

lookup ss02b {
sub b by b.ss02;
} ss02b;

feature ss01 {
lookup ss01a;
} ss01;

feature ss02 {
lookup ss02b;
} ss02;

feature salt {
lookup ss01a;
lookup ss02b;
} salt;

so you "just define" the lookups first — outside of any feature definitions —, and then you "just use" them within the feature definitions. Using this form, you don't have to worry about the ordering of your feature definitions — because the ordering of the lookups is always the only thing that matters.

In the AFDKO 1.6 syntax, the ordering of lookups was implied by the ordering of the feature definitions, but in some cases this was impractical. Imagine that you have three lookups, alpha, beta and gamma, and that the lookups alpha and gamma are associated with the smcp feature, while the lookup beta is associated with the ss01 feature. When only smcp is applied, the lookups alpha and gamma should be executed; when only ss01 is applied, the lookup beta should be executed; but if both features are applied at the same time, the lookups should be executed in the order alpha (associated with the smcp feature), then beta (associated with the ss01 feature), and then gamma (associated with the smcp feature). With the AFDKO 1.6 syntax, where you can only define lookups inside feature definitions, it was impossible to order the feature definitions in a way that achieves this. This is why I asked Adobe to allow lookup definitions to be written outside of feature definitions in AFDKO 2, and they did so, which is handy.
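
The alpha/beta/gamma scenario can be illustrated with a short Python sketch. It assumes a simplified model in which the font holds one global, ordered list of lookups and each feature merely points at some of them; the names LOOKUP_ORDER, FEATURE_LOOKUPS and lookups_to_run are invented for this example.

```python
# Global lookup order, as it would result from defining the lookups
# outside of any feature, in this sequence (assumed for illustration).
LOOKUP_ORDER = ["alpha", "beta", "gamma"]

# Which lookups each feature refers to.
FEATURE_LOOKUPS = {
    "smcp": {"alpha", "gamma"},
    "ss01": {"beta"},
}


def lookups_to_run(active_features):
    """Collect the lookups of all active features, in global lookup order."""
    wanted = set()
    for feature in active_features:
        wanted |= FEATURE_LOOKUPS[feature]
    return [name for name in LOOKUP_ORDER if name in wanted]


print(lookups_to_run(["smcp"]))          # ['alpha', 'gamma']
print(lookups_to_run(["ss01"]))          # ['beta']
print(lookups_to_run(["ss01", "smcp"]))  # ['alpha', 'beta', 'gamma']
```

The order in which the features are listed does not matter in this model; only the order of the lookups does.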

twardoch's picture

If you're paying close attention, you might ask why I wrote

# Variant 1

feature ss01 {
lookup ss01a {
sub a by a.ss01;
} ss01a;
} ss01;

feature ss02 {
lookup ss02b {
sub b by b.ss02;
} ss02b;
} ss02;

feature salt {
lookup ss01a;
lookup ss02b;
} salt;

when I could have just written:

# Variant 2

feature ss01 {
lookup ss01a {
sub a by a.ss01;
} ss01a;
} ss01;

feature ss02 {
lookup ss02b {
sub b by b.ss02;
} ss02b;
} ss02;

feature salt {
sub a by a.ss01;
sub b by b.ss02;
} salt;

which would be equivalent to

# Variant 2 explicit

feature ss01 {
lookup ss01a {
sub a by a.ss01;
} ss01a;
} ss01;

feature ss02 {
lookup ss02b {
sub b by b.ss02;
} ss02b;
} ss02;

feature salt {
lookup saltab {
sub a by a.ss01;
sub b by b.ss02;
} saltab;
} salt;

In other words: why did I choose to use two lookups in salt (in Variant 1), when the output run of the first lookup is not related to the input run of the second lookup? Indeed, I could have used Variant 2.

Basically, Variant 1 means that I save a little bit of space in the font (because I only define two lookups instead of three), but it means that I lose a tiny bit of performance, because the salt feature executes two lookups one after another, rather than just one. On the other hand, Variant 2 means that I speed up the performance a tiny bit, but I lose a few bytes of space in the font size.

The differences are so minute that I wouldn't worry. It's just a matter of code optimization vs. code brevity. If I want code that is simpler to read and edit, I choose Variant 1. If I want perfect speed optimization, I choose Variant 2.
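
As a rough check that both variants of salt give the same result, here is a small Python sketch. It assumes the same simplified model as before, with each lookup as a dictionary of single substitutions; the helper names and the test run are hypothetical.

```python
def apply_lookup(run, rules):
    """One pass over the run: each glyph is substituted at most once."""
    return [rules.get(glyph, glyph) for glyph in run]


def apply_feature(run, lookups):
    """Execute the lookups in order; each one sees the previous output."""
    for rules in lookups:
        run = apply_lookup(run, rules)
    return run


ss01a = {"a": "a.ss01"}
ss02b = {"b": "b.ss02"}
saltab = {"a": "a.ss01", "b": "b.ss02"}

salt_variant_1 = [ss01a, ss02b]  # reuses the two existing lookups: two passes
salt_variant_2 = [saltab]        # one extra lookup in the font: one pass

run = ["a", "b", "c", "a"]
out_1 = apply_feature(run, salt_variant_1)
out_2 = apply_feature(run, salt_variant_2)

print(out_1 == out_2)                             # True
print(len(salt_variant_1), len(salt_variant_2))   # 2 1
```

The outputs agree here only because neither lookup's output is an input of the other; with rules that feed into each other, as in the calt examples, the two arrangements would differ.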

Hope this helps,
Adam

twardoch's picture

Arno,

I should bill you for 1 hour of work now :)

A.

Frode Bo Helland's picture

I guess I’m the one you should bill :p

Arno Enslin's picture

Thank you, Adam! I want to avoid asking questions that you (or anybody else) have already answered. So I need a bit of time to translate your text into German, to understand it, and to check whether your explanations account for the whole behavior of the code with regard to the number and content of the lookups. (There is something in the behavior of explicit lookups, compared with the absence of explicit lookups, that is still enigmatic to me.) But I want to try to work that out by myself first.

With regard to the need to define a lookup before it is called: that is clear. But in my code it was defined (and implicitly called) in the feature, and yet it could not be called explicitly a second time. But maybe I can work that out too, or you have already explained it and I did not see it.

Arno Enslin's picture

@ Adam

Well, the principle does indeed seem simple, but it can be hard not to lose the overview. In the case of the lookups "Index 1" and "Index 2" from the calt feature that I coded for Frode, it was much harder to follow. I really had to work out step by step why the one and only rule of lookup Index 2 cannot be put at the end of lookup Index 1, although the last rule of lookup Index 1 can be put at the beginning of lookup Index 2.

(Note to myself: Check whether only single lines can be commented out in code [with the number sign], or also whole blocks with begin and end markers.)

I think that I now understand most of your explanations and additional information, except for the problem I was talking about in the second paragraph of my previous message.

The first call of the lookup beta is the definition of the lookup beta, and it cannot be called more than once under the same languagesystem statement.

feature calt {
lookup beta { #Definition of lookup beta AND call of lookup beta.
sub one' A by A;
} beta;
lookup beta; #unsuccessful call of lookup beta, that was defined above.
} calt;

The substitution is not well chosen, but that is unimportant in this context. There are cases in which it would be useful to call a lookup a second time in the same feature and under the same languagesystem statement without defining it a second time. Not only with regard to the comprehensibility of one’s own code, but probably also with regard to the file size.
