Tibetan font in VOLT and FontLab: need some advice

free_tibet's picture

Good afternoon.
I read on your site that you can help convert a font.
Could you convert an OpenType font made for InDesign into a format that works in Word? The font uses 250 substitutions.
The font was made in FontLab. We tried to bring the font file into VOLT, but none of the ligatures work. We have tried all the help files and examples, and nothing helps. The font is Tibetan, and all of us need it very much.
We would be grateful for any help.

about font
http://www.buddism.ru/DHARMA_text/_Yagpo/
font file
http://www.buddism.ru//DHARMA_text/_Yagpo/Yagpo!_5.3.ttf
www.buddism.ru
free_tibet@mail.ru
Alexander

Si_Daniels's picture

Your font appears to have a hack encoding - i.e. you map characters to the positions of basic Latin code points. I suppose you need this hack for InDesign, but this isn't going to work in Word.

Your font compared to the Microsoft Himalaya in Vista...

I know the letter d, I work with the letter d, letter d is a friend of mine, but you Yagpo!_Wylie letter d are no letter d ;-)

Miguel Sousa's picture

Alexander,
Here are a couple of resources that might be useful.
http://www.xenotypetech.com/
http://www.thdl.org/

Simon,
> I suppose you need this hack for InDesign

Don't know what makes you believe that, but I'm glad to say your supposition is far from the truth. InDesign is fully Unicode compliant, so there's no reason for hacks. It's true that its text engine might not have all the shaping rules necessary for Tibetan (does Uniscribe?), but having a Unicode encoded Tibetan font and a proper keyboard layout, is all you need to do some Tibetan typesetting in InDesign.

sergeym's picture

This font is indeed a hack, because it is using Latin characters to represent Tibetan. It will work in NotePad and InDesign. But why is it done this way? Because there were legacy texts from pre-Unicode era or because InDesign does not support non-simple scripts?

> InDesign is fully Unicode compliant, so there’s no reason for hacks. It’s true that its text engine might not have all the shaping rules necessary for Tibetan (does Uniscribe?),

What exactly "Unicode compliance" is in your understanding? It is not the same as OpenType compliance, if people need hacks like that. I just do not know, do you execute OpenType features for Tibetan characters with Tibetan script tag?

Thanks,
Sergey

Miguel Sousa's picture

> What exactly “Unicode compliance” is in your understanding?

It can mean a lot of things. In this case I was alluding to the fact that InDesign is aware of all the necessary Unicode codepoints. This means that the font can use the codepoints that Unicode defined for Tibetan. There's no need for hacking it to use the first 256 codepoints that are assigned to Latin. People had to do that in the pre-Unicode era.

> It is not the same as OpenType compliance

Yes, I know that. I didn't mention OpenType, you did.

> , if people need hacks like that.

Alexander never explicitly said that he's hacking his font to overcome an InDesign flaw. And what I said in my previous post was that, if he definitely needs to hack his font, that shouldn't be because of InDesign (since InDesign can perfectly use a properly encoded Tibetan font).

> I just do not know, do you execute OpenType features for Tibetan characters with Tibetan script tag?

I don't think so. That's what I meant when I said "It’s true that its text engine might not have all the shaping rules necessary for Tibetan"

Sergey, have you tried to use InDesign? If so, which version?

twardoch's picture

> InDesign is fully Unicode compliant

Miguel,

I think this is a statement that is a bit overly optimistic, or at least imprecise. InDesign supports the Unicode character repertoire but there are many aspects of the Unicode Standard that InDesign does not entirely support, for example the bidi algorithm (UAX #9), the normalization forms (UAX #15) or some aspects of the case conversion. While InDesign is an excellent product, I do hope that its Unicode support will continue to improve.

A.

Si_Daniels's picture

>Don’t know what makes you believe that, but I’m glad to say your supposition is far from the truth.

Oh a challenge! Bring it on! Show me a paragraph of properly formed Unicode Tibetan text in InDesign and I'll wear an I Love Adobe T-shirt for two days at TypeCon - fail and you wear the I love Microsoft shirt - deal? :-)

Miguel Sousa's picture

> I think this is a statement that is a bit overly optimistic, or at least imprecise.

Adam,
Yes, point well taken.

Simon,
Which part of my posts didn't you understand? At this point it should be clear to everyone that it is NOT necessary to hack a font in order to access the Tibetan Unicode block in InDesign, as you seem to be alluding to when you say:

> Your font appears to have a hack encoding - ie you map characters to the positions of basic latin code-points. I suppose you need this hack for InDesign, but this isn’t going to work in Word.

Got it?

Si_Daniels's picture

If you can't do Tibetan in InDesign just admit it. It's not a big deal. Tibetan is hard.

I did say 'suppose' and I'd love you to prove me wrong. We have packaging guys using InDesign who have to resort to hacks like this to produce Hindi and other complex scripts.

twardoch's picture

> We have packaging guys using InDesign
> who have to resort to hacks like this
> to produce Hindi and other complex
> scripts.

Why doesn’t Microsoft produce a plugin for InDesign that renders some text into a desired column width using RichEdit (i.e. Uniscribe), and then converts the result into outlines and passes it back to Indy as a graphic object? Something a la the DecoType OLE server that ships with the Arabic versions of Office, but allowing the user to set some lines or even a few paragraphs of text?

Or, why doesn’t someone else do it? It should actually be relatively trivial ;)

Best,
Adam

Si_Daniels's picture

The workflow is quite complex, but that might be worth looking into. Know any good InDesign plugin writers who might be able to pull this off? The other suggestion which seems to work well for one-offs is to do the text in Publisher and export it as a PDF that can be placed in the Indy doc.

jodb's picture

The main issue with typesetting a Tibetan font is that there does not exist a uniform keyboard layout (palette) that can be installed on your computer. I suggest that this should get the first priority in the digital development of typesetting Tibetan.

Each Tibetan font uses its own methodology in mapping the Tibetan characters to the Latin ones on the keyboard (phonetically or QWERTY-method), and special software (such as Wylie) was developed to convert the transliterated text into Tibetan. This means that the user of this software (i.e. Wylie) first inputs the text in Latin characters. The whole text is afterwards converted into the Tibetan font embedded in the software.

The Tibetan Unicode chart only relates to the base characters, vowel signs, diacritical marks, some of the subscripted characters and punctuation signs (and a few unnecessary symbols).
However, the Tibetan conjuncts (the 'ligatures' that Alexander was referring to) do not have an individual Unicode number (although some of the 'half forms' do). Each font developer has to use his/her own creativity in organizing these stackings of the Tibetan consonants to form the 'ligatures' within their fonts and the OpenType features provide reasonable solutions for achieving this.

I have not yet studied Vista, nor Microsoft Himalaya, so I can not provide info on which keyboard layout or input system is used with this font.

Alexander, I was wondering whether you created the OpenType features in Fontlab or in VOLT. When you import your .vfb file into VOLT, then all embedded OpenType features will be lost and you need to create them all over again in VOLT.

cfynn's picture

In order to support Tibetan OpenType fonts a rendering engine or application needs to support and apply the following OpenType features for Tibetan script: ccmp, blws, abvs, blwm, abvm, calt, and kern.

(These are the features used in Microsoft's "Himalaya" font and in several other OpenType fonts for Tibetan script).
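The feature list above can be turned into a quick checklist. The sketch below is only an illustration: the function name and the sample "engine" are made up for this example, and the only data taken from the post is the list of seven feature tags.

```python
# Feature tags listed in the post above as needed for Tibetan shaping.
REQUIRED_TIBETAN_FEATURES = ("ccmp", "blws", "abvs", "blwm", "abvm", "calt", "kern")


def missing_tibetan_features(applied_features):
    """Return the required tags that are not in applied_features, in list order."""
    applied = set(applied_features)
    return [tag for tag in REQUIRED_TIBETAN_FEATURES if tag not in applied]


# Hypothetical engine that only applies generic Latin-style features.
latin_only_engine = ["liga", "kern", "mark", "mkmk"]
print(missing_tibetan_features(latin_only_engine))
```

An engine in that position can still place individual Tibetan glyphs, but every feature the function reports as missing is a set of lookups in the font that will never run.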

InDesign and other Adobe applications currently *do not* support these features for Tibetan script. Without that support you may be able to enter glyphs for basic Tibetan characters but the complex conjunct stacks (ligatures) absolutely necessary for proper rendering of Tibetan will not be formed.

Microsoft's Uniscribe does support these features - and Unicode Tibetan has worked pretty well for several years in MS Word & MS Publisher - and in IE, Notepad, OpenOffice 2.x, Firefox, Thunderbird and a number of other applications if you properly install the Uniscribe (usp10.dll) that comes with the most recent service pack for Office 2003 in the \Windows\System32\ directory.

The main drawback in Word was that Tibetan line wrapping was not implemented. This meant that you sometimes had to insert manual line breaks in order to get lines of Tibetan text to wrap at the proper place. I expect this has been properly implemented in Word 2007.

Tibetan line wrapping and even Tibetan collation (sorting) work fine in OpenOffice 2.x
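To make the wrapping problem concrete, here is a minimal sketch of breaking only after the tsheg (U+0F0B, the intersyllabic mark), which is where Tibetan lines are normally allowed to break. It is not how Word or OpenOffice implement it: the function name is invented, and counting code points stands in for measuring real glyph widths.

```python
TSHEG = "\u0F0B"  # TIBETAN MARK INTERSYLLABIC TSHEG


def wrap_tibetan(text, max_chars):
    """Greedy wrap that only breaks after a tsheg; max_chars counts code points."""
    # Split into syllables, keeping each tsheg attached to the syllable it ends.
    syllables = []
    current = ""
    for ch in text:
        current += ch
        if ch == TSHEG:
            syllables.append(current)
            current = ""
    if current:
        syllables.append(current)

    # Fill lines syllable by syllable, never splitting inside a syllable.
    lines = []
    line = ""
    for syl in syllables:
        if line and len(line) + len(syl) > max_chars:
            lines.append(line)
            line = syl
        else:
            line += syl
    if line:
        lines.append(line)
    return lines


# "bkra shis bde legs" followed by a shad (U+0F0D), written as escapes.
sample = (
    "\u0F56\u0F40\u0FB2\u0F0B"
    "\u0F64\u0F72\u0F66\u0F0B"
    "\u0F56\u0F51\u0F7A\u0F0B"
    "\u0F63\u0F7A\u0F42\u0F66\u0F0D"
)
for row in wrap_tibetan(sample, 8):
    print(row)
```

An application without this logic sees no spaces in the text and has nowhere to break, which is why manual line breaks were needed.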

On Windows XP I have very successfully entered and formatted several large volumes of pecha (traditionally formatted Tibetan books) using both Word 2003 and OpenOffice with Unicode Tibetan and OpenType fonts.

On Linux, Pango (used by GTK) and ICU have partial support for OpenType Tibetan script fonts. The main thing missing is proper support for the ccmp feature. Lack of support for this feature causes problems if characters U+0F73, U+0F76, U+0F77, U+0F78, U+0F79, and U+0F81 are encountered, as these need decomposing before any other OpenType lookups are applied. Lack of support for ccmp may also cause problems with characters U+0F43, U+0F4D, U+0F52, U+0F57, U+0F5C, U+0F69, U+0F93, U+0F9D, U+0FA2, U+0FA7, U+0FAC, and U+0FB9, as most OT Tibetan fonts either decompose or compose these in a lookup under ccmp in order to simplify later lookups.

(Without support for ccmp but support for the rest of the features everyday Tibetan and Dzongkha renders OK since all of the above characters are not normally found in common everyday words. These characters are mostly used in Tibetan transliteration of Sanskrit words.)
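For readers who want to see what "decomposing" means for the characters listed above, the sketch below prints the decomposition mapping that the Unicode character database records for each one, using Python's standard `unicodedata` module. This is only a stand-in: a font's ccmp lookup maps glyphs, not code points, and a given font may compose rather than decompose some of these.

```python
import unicodedata

# Precomposed Tibetan characters named in the post above.
PRECOMPOSED = [
    0x0F73, 0x0F76, 0x0F77, 0x0F78, 0x0F79, 0x0F81,
    0x0F43, 0x0F4D, 0x0F52, 0x0F57, 0x0F5C, 0x0F69,
    0x0F93, 0x0F9D, 0x0FA2, 0x0FA7, 0x0FAC, 0x0FB9,
]


def decompose(cp):
    """Return the code points in the Unicode decomposition mapping of cp."""
    mapping = unicodedata.decomposition(chr(cp))
    # Compatibility mappings carry a tag such as "<compat>"; skip the tag.
    return [int(part, 16) for part in mapping.split() if not part.startswith("<")]


for cp in PRECOMPOSED:
    parts = " + ".join(f"U+{p:04X}" for p in decompose(cp))
    print(f"U+{cp:04X} -> {parts}")
```

Each line of output is a one-to-many mapping, which is the kind of substitution the thread says FontLab could not create at the time.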

There *are* standard keyboard layouts for Tibetan script. One is the layout approved by the Government of Bhutan. An example of this layout can be found at: http://www.thdl.org/tools/dzkeylayout.html. This layout is implemented in XFree86 and can easily be implemented in Windows using MSKLC.

I understand Windows Vista also comes with a Tibetan keyboard layout approved by the Chinese Government.

- Chris

cfynn's picture

Jo

You can create all OT features needed for Tibetan in VOLT. Fontlab currently does not support creation of one to many lookups (decomposition) required under ccmp feature necessary to handle characters U+0F73, U+0F76, U+0F77, U+0F78, U+0F79, and U+0F81.

Unlike with previous "un-intelligent" Tibetan fonts with non-standard glyph based encodings which did require "smart" input methods designed to work with particular fonts - a keyboard or input method for Unicode Tibetan only has to generate the simple base Unicode Tibetan characters as the complex conjuncts or ligatures are formed by the OpenType rendering engine applying the OpenType lookups in the font to get from the underlying simple basic characters to the conjunct glyphs. In other words, since the "smarts" are now built into the font (giving the type designer far more flexibility) they are no longer required in the input method.
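A small illustration of the point about "simple base characters": the sketch below takes the word bsgrubs as a plain sequence of Unicode code points and groups them into stacks (a base letter plus the subjoined letters and vowel signs that follow it). The grouping rule, based on the Unicode general category, is a simplification chosen for this example and is not what a shaping engine actually runs.

```python
import unicodedata


def tibetan_stacks(text):
    """Group code points into stacks: a base character plus following marks."""
    stacks = []
    for ch in text:
        # Subjoined letters and vowel signs are combining marks (Mn or Mc).
        is_mark = unicodedata.category(ch) in ("Mn", "Mc")
        if is_mark and stacks:
            stacks[-1].append(ch)
        else:
            stacks.append([ch])
    return stacks


# bsgrubs: ba, sa + subjoined ga + subjoined ra + vowel u, ba, sa
word = "\u0F56\u0F66\u0F92\u0FB2\u0F74\u0F56\u0F66"
for stack in tibetan_stacks(word):
    print(" + ".join(f"U+{ord(c):04X} {unicodedata.name(c)}" for c in stack))
```

The keyboard only has to produce those seven code points. Turning the four-code-point group into a single stacked glyph is the job of the lookups in the font.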

If you want to use a Wylie (transliterated Tibetan) input method to enter Unicode Tibetan in Windows see: TISE: Tibetan Wylie Input Utility - also available at: http://byak.sinp.msu.ru/tise/. This type of input method does still require some built in "smarts" - since you are going from one script to another and there is not a straightforward one-to-one relationship between the characters typed and the Unicode Tibetan characters that need to be generated.
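To show why a Wylie input method needs some "smarts", here is a toy converter for single, unstacked syllables. It is not TISE's algorithm: the table covers only a handful of consonants, there is no stacking, prefix or tsheg handling, and the function name is invented for this example. It only demonstrates that several Latin letters can map to one Tibetan character and that the inherent vowel "a" maps to nothing.

```python
# Tiny, illustrative subset of Wylie consonants; a real converter covers far more.
WYLIE_CONSONANTS = {
    "k": "\u0F40", "kh": "\u0F41", "g": "\u0F42", "ng": "\u0F44",
    "t": "\u0F4F", "th": "\u0F50", "d": "\u0F51", "n": "\u0F53",
    "ts": "\u0F59", "tsh": "\u0F5A", "s": "\u0F66", "h": "\u0F67",
}
# The inherent vowel "a" is not written in Tibetan script.
WYLIE_VOWELS = {"a": "", "i": "\u0F72", "u": "\u0F74", "e": "\u0F7A", "o": "\u0F7C"}


def toy_wylie_syllable(syllable):
    """Convert one unstacked syllable using longest-match on the tables above."""
    out = []
    i = 0
    while i < len(syllable):
        for length in (3, 2, 1):
            chunk = syllable[i:i + length]
            if chunk in WYLIE_CONSONANTS:
                out.append(WYLIE_CONSONANTS[chunk])
                break
            if length == 1 and chunk in WYLIE_VOWELS:
                out.append(WYLIE_VOWELS[chunk])
                break
        else:
            raise ValueError(f"unsupported input at position {i}: {syllable[i]!r}")
        i += len(chunk)
    return "".join(out)


for latin in ("tsha", "tshe", "ngo"):
    tibetan = toy_wylie_syllable(latin)
    print(latin, "->", " ".join(f"U+{ord(c):04X}" for c in tibetan))
```

Here the four keystrokes "tsha" yield a single code point, so the input method has to look ahead before it can decide what to emit.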

An example of a simple one-to-one keyboard for Tibetan script is the Dzongkha Keyboard layout.

Any properly made OpenType font for Tibetan script should work fine to render Unicode Tibetan text entered using either of these input methods.

- Chris

Si_Daniels's picture

Thanks Chris, great write up!

>InDesign and other Adobe applications currently *do not* support these features for Tibetan script ... the complex conjunct stacks (ligatures) absolutely necessary for proper rendering of Tibetan will not be formed.

Miguel, the "I love Microsoft" shirt is in the mail. If you're not coming to TypeCon you can post a picture of yourself wearing it here. ;-)

Miguel Sousa's picture

Si, you surgically removed an important part of the quote:

> InDesign and other Adobe applications currently *do not* support these features for Tibetan script. Without that support you may be able to enter glyphs for basic Tibetan characters but the complex conjunct stacks (ligatures) absolutely necessary for proper rendering of Tibetan will not be formed.

Therefore, what I said before still holds true: the encoding of the font doesn't need to be hacked, and InDesign doesn't yet support the necessary OpenType instructions to fully handle Tibetan.

> the “I love Microsoft” shirt is in the mail

I'm waiting to see that... It better NOT be set in Comic Sans!!

Si_Daniels's picture

So according to Adobe you support a language even if the parts that are "absolutely necessary" are not supported? In all seriousness I think this is the reason why customers get frustrated by companies' claims that they fully support Unicode when that "full support" does not allow them to set their language.

twardoch's picture

Si,

this is why I prefer to say that an application supports the Unicode character set, rather than saying it supports “Unicode”. The latter is an ambiguous expression, people may read different things under that label.

An even more difficult question is when it comes to “OpenType support”. Being able to render fonts in both OpenType flavors at all, being able to enter and display the full encoded character sets, or the full glyph repertoires in OpenType fonts, supporting some or all advanced typographic layout features for some or all scripts and languages, supporting the layout features necessary to correctly render some languages including directional support, supporting different GSUB and GPOS lookup types -- all these are different “shades” of possible OpenType support.

Unfortunately, the makers of the OpenType font format do not seem to be interested in documenting this publicly, which leaves us with obviously incomplete resources maintained by interested 3rd parties, such as the MyFonts OpenType page (authored by Laurence and myself) or the Typotheque OpenType created by Peter Biľak.

Regards,
Adam

twardoch's picture

BTW, a Microsoft veteran Raymond Chen wrote an excellent posting about the ambiguity of the word “support”:

http://blogs.msdn.com/oldnewthing/archive/2005/11/18/494442.aspx

My favorite incarnation of the blurry meaning of the word “support” in the OpenType domain is the quotation from the OpenType specification: “OpenType™ fonts containing CFF outlines are not supported by the ‘kern’ table”. It is a linguistic equivalent to the acrobatic splits, trying to say that while the makers of the OpenType specification do not “like” people putting the “kern” table into fonts that use CFF outlines, there is really no penalty for including that table anyway.

Adam

Si_Daniels's picture

Thanks Adam, I prefer Michael Kaplan's response (which I can't find) which basically says "Do you support Unicode is a non-question, so deserves a non answer" - more Kaplanisms here...

http://blogs.msdn.com/michkap/archive/2005/12/23/506887.aspx

>Unfortunately, the makers of the OpenType font format do not seem to be interested in documenting this publicly

I'm interested, just don't have the time. It’s more the maintenance than the initial set-up that worries me. However I'd be happy to contribute to a wiki page if someone were to set one up, I'm sure Tom or someone else from Adobe would contribute too.

twardoch's picture

> I prefer Michael Kaplan's response

Ah, everyone seems to have *his own* favorite Microsoft blogger ;)

A.

Harbs's picture

Chris wrote:

In order to support Tibetan OpenType fonts a rendering engine or application needs to support and apply the following OpenType features for Tibetan script: ccmp, blws, abvs, blwm, abvm, calt, and kern.

Forgive my ignorance of Indic scripts, but why can't you use ccmp instead of abvs and blws, and mark/mkmk instead of abvm and blwm?

Thanks,
Harbs

John Hudson's picture

The order in which features are applied in Indic scripts is very important, due to staggered character and glyph processing, which is why some functions that one might expect to be handled in a single feature, e.g. ccmp, are split across multiple, Indic-specific features. Also, when Microsoft first spec'd the Indic features, the generic ccmp feature had not been devised.

John Hudson's picture

Si wrote: The other suggestion which seems to work well for one-offs is to do the text in Publisher and export it as a PDF that can be placed in the Indy doc.

This is similar to what I do when I need to put e.g. Hindi or Thai in InDesign documents, although usually I'm working with simple text blocks on a white background, so Word is usually sufficient (and considerably easier to use than Publisher). I'm not sure how useful this approach would be for more complex layout, e.g. packaging.

Harbs's picture

> Also, when Microsoft first spec’d the Indic features, the generic ccmp feature had not been devised.

The order in which they're applied is set by the order of the lookups, not the feature set. So you're basically saying that there's no reason why one can't use ccmp and mark. If the general features are used, InDesign ME should support these fonts!

DavidD's picture

Now that CS3 is out, does anyone know if it supports the GSUB and GPOS tables needed for Tibetan or Indic languages?
Also, for cfynn: your sample of Tibetan pecha layout in OpenOffice is quite intriguing. Are you able to lay out in OpenOffice in a way that prints two sides which flip vertically as opposed to the usual horizontal? We currently use a legacy font to print 3 folios per legal page in InDesign CS2. We thread from the top folio in front to the bottom folio on the back, then to the middle folio in front, middle back, bottom front, top back. Working from templates, we can thread a long text this way in little time. The resulting double-sided pages then just need to be cut in thirds. How does your OpenOffice method work?

Si_Daniels's picture

>Now that CS3 is out, does anyone know if it supports the GSUB and GPOS tables needed for Tibetan or Indic languages?

;-)

cfynn's picture

OpenOffice.org Writer works well for Tibetan script.

  • Word-breaks / line wraps occur in the proper place - not in the middle of a syllable
  • Lists of Dzongkha or Tibetan words can be sorted correctly (no mean feat as the collation rules for these languages are exceedingly complex)
  • Switching between Tibetan script and Roman fonts is automatic - depending on characters typed.
  • Tibetan digits can be used in calculations, page numbering, numbered lists & headings, dates, etc.

None of these things worked in Office 2003 - though I'm told they are fixed in Office 2007

see: How to use Unicode Tibetan in OpenOffice.org Writer

cfynn's picture

Harbs wrote:

Forgive my ignorance of Indic scripts, but why can’t you use ccmp instead of abvs and blws, and mark/mkmk instead of abvm and blwm?

You can use ccmp - though it's generally cleaner to use all three substitution lookups: ccmp, blws, abvs - and maybe calt if you need to substitute contextual forms.

mark/mkmk will not be applied to Tibetan text by most OpenType shaping engines. Generally only a specific sub-set of OpenType features is applied for each particular script.

- Chris

cfynn's picture

DavidD wrote:

Are you able to layout in OpenOffice in a way that prints two sides which flip vertically as opposed to the usual horizontal? We currently use a legacy font to print 3 folios per legal page in InDesign CS2. We thread from the top folio in front to the bottom folio on the back, then to the middle folio in front, middle back, bottom front, top back. Working from templates, we can thread a long text this way in little time. The resulting double-sided pages then just need to be cut in thirds. How does your OpenOffice method work?

In OpenOffice the best way is to make a template formatted for the final cut size - and there is a way of printing these folio sides so that they will come out two up or three up on a4 or legal paper in the correct order back to front. You can "print" like this to a PDF file so that the folio sides are pre-assembled to make printing at any time easier.

I intend to write a document describing how to do this in detail which I will post on my website.

cfynn's picture

There is now an article Platform Independent Tibetan Unicode Font posted on the Dharma Dictionary site.

The method described essentially entails reduplicating the lookups under blws and abvs features under the rlig feature as well.

- Chris

DavidD's picture

Chris,
Just want to say THANKS! You are a wealth of valuable and accurate information!
I look forward to seeing the article on pecha layout in OpenOffice on your website.
David

John Hudson's picture

> The order in which they’re applied is set by the order of the lookups, not the feature set.

That was the original intention of OpenType, but complex script handling led Microsoft to modify its approach and the Uniscribe script engines apply certain features in a fixed order, regardless of the lookup order in the font. So far as I know, the first two features to be applied for any script are 'locl' and 'ccmp'; after that, the order is determined by the layout needs of the script.
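The difference between the two models can be sketched in a few lines. Everything here is hypothetical: the three-lookup "font", the function name, and in particular the stage list, which is made up to show the mechanism and is not Uniscribe's documented order for Tibetan or any other script.

```python
def application_order(lookups, stages):
    """Return lookup indices in the order a staged engine would apply them.

    lookups: list of (lookup_index, feature_tag) pairs from the font.
    stages:  list of lists of feature tags; each stage is processed in turn,
             and within a stage lookups run in lookup-list order.
    """
    order = []
    for stage in stages:
        order.extend(sorted(idx for idx, tag in lookups if tag in stage))
    return order


# Hypothetical font: the blws lookup was placed before the ccmp lookup.
font_lookups = [(0, "blws"), (1, "ccmp"), (2, "abvs")]

# One stage containing every feature: pure lookup-list order.
print(application_order(font_lookups, [["blws", "ccmp", "abvs"]]))

# Assumed staged engine: ccmp first, then the other substitution features.
print(application_order(font_lookups, [["ccmp"], ["blws", "abvs"]]))
```

With a single stage the font's own lookup order wins. With stages, lookup 1 runs before lookup 0 even though the font lists it second, which is why a font tested in one engine can behave differently in another.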

k.l.'s picture

It still seems that the relation between OT specs and layout engines needs improved documentation.
(1) One part would describe how this or that current application or layout engine works. For backward compatibility, if desired. To be done by applications' developers.
(2) The more important part would be more instructive/imperative: How should layout engines do it? (This seems most important for complex scripts, but even Latin raises questions regarding typographic features like, which features to be called by an application's 'All Caps' option -- with or without 'lnum'?) This would establish a stable ground for both font and application developers, and make sure users can expect the same font behavior across platforms and applications.

Great thread.

P.S. -- The method described essentially entails reduplicating the lookups under blws and abvs features under the rlig feature as well. -- Reminds me of the 'ssXX'/'salt' thread ...
