Help! Devanagari "locl" Feature doesn't work properly

satya's picture

Hello Everyone!

I am having some issues with the localized forms ("locl") feature for Devanagari. It doesn't seem to work properly in my case. I remember Dan Reynolds asking for similar help a while back, but I think that had to do with spacing.

Okay, I have two sets of punctuation marks, one for Latin and one for Devanagari, and only the Latin punctuation marks have Unicode codepoints assigned to them. I have created a "locl" feature in VOLT for Devanagari so that the appropriate punctuation marks are used when Hindi is typed.
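
In FEA terms, what I have set up is roughly the following sketch (my actual lookups are in VOLT, and the exact script/language tags there may differ):

# Rough FEA equivalent of the VOLT "locl" setup described above.
# Glyph names follow this post; the script/language tags are assumptions.

languagesystem DFLT dflt;
languagesystem latn dflt;
languagesystem dev2 dflt;   # Devanagari (new shaping model)
languagesystem dev2 HIN;    # Hindi

feature locl {
    script dev2;
    language HIN;
    # In Hindi (Devanagari) runs, swap the encoded Latin punctuation
    # for the unencoded Devanagari designs.
    sub comma by dev_comma;
    sub period by dev_period;
} locl;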

Now, the issue I am facing is this:
When I type a comma or a period while in Devanagari mode, it doesn't get replaced with "dev_comma" or "dev_period". Because of that, the application thinks I have switched to Latin and converts every Devanagari mark typed after the comma or period to its Latin form.

Has anyone encountered this issue before? If so, is there a way I can get rid of this? Hello John? Dan?

Here is an image, just to show you what I meant:

Thanks!

sergeym's picture

What application and which version of Windows are you using?

satya's picture

I am using Adobe InDesign CS4 on a Mac with Adobe World-Ready Composer on.

dezcom's picture

Do you have your script and language declarations in the code?

ChrisL

satya's picture

Yes, Chris!

Miguel Sousa's picture

Is the substitution contextual? (Meaning, does it try to find out whether the characters surrounding the punctuation mark are Latin or Devanagari?)
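
For example, a contextual rule, sketched here in FEA with made-up glyph names, would only make the swap when a Devanagari glyph precedes the mark:

@DEVA_BASES = [dvKA dvKHA dvGA];   # placeholder class; a real font would list all Devanagari bases

lookup DEVA_PUNCT {
    sub comma by dev_comma;
    sub period by dev_period;
} DEVA_PUNCT;

feature calt {
    script dev2;
    # Substitute comma/period only when preceded by a Devanagari base glyph.
    sub @DEVA_BASES [comma period]' lookup DEVA_PUNCT;
} calt;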

Miguel Sousa's picture

> I remember Dan Reynolds asking for similar help a while back

This thread? http://typophile.com/node/40602

dan_reynolds's picture

In my case, I was working with Word. If I remember correctly, there wasn't any way to convince Uniscribe that the space character could be anything other than the default Latin space glyph. Oddly enough, I could write some jerry-rigged OT code to get my Deva space to come up as a substitution some of the time in CS3 (I know, I know…). It is theoretically possible that you simply can't have alternate language-specific versions of punctuation yet.

My solution in the end (with the space) was to make it optically slightly bigger than I had wanted for the Latin text, and slightly tighter than I had wanted for the Devanagari text. Worst of both worlds, but probably no one will ever complain about it.

dan_reynolds's picture

Alternatively, have you tried making your Deva punctuation the default and having the locl feature go the other way around? I mean, build the Latin punctuation for Latin text as an alternate in the Latin script's locl feature.
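
Something like this, in FEA terms (the glyph names are just placeholders for whatever your Latin alternates would be):

# Reversed setup: the encoded comma/period default to the Devanagari
# designs, and Latin text swaps in the Latin forms via "locl".
# Depending on the layout engine, the same rules may also need to be
# registered under specific Latin language tags.
feature locl {
    script latn;
    sub comma by comma.latn;
    sub period by period.latn;
} locl;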

satya's picture

I have tried every possible way to do this - but no, it's not happening. :(

The problem is that the substitution takes place for a few marks but not for all, especially the commas and periods. That is why, even if I am not using any Latin text while in Hindi mode, Uniscribe thinks the commas and periods I have used in Hindi are Latin characters, and it changes every subsequent punctuation mark to a Latin mark.

In CS4, at least a few marks are getting replaced, but in MS Word and other Windows applications it's not happening at all.
Oh yes, it works fine in the VOLT Proofing Tool, though.

satya's picture

Okay, is there a way I can make an alternate for the comma only? It doesn't seem to work even when I create a Stylistic Set.
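
For reference, the stylistic-set fallback I mean is essentially just this kind of thing in FEA terms (the set number is arbitrary):

feature ss01 {
    # Manual fallback: let the user swap in the Devanagari punctuation
    # via a stylistic set, bypassing script itemization entirely.
    sub comma by dev_comma;
    sub period by dev_period;
} ss01;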

John Hudson's picture

Satya, the problems you are facing are most likely due to layout engine decisions about how and where to divide glyph runs. It would take a while to figure out exactly what the issues are in different applications. I'm about to leave for TypeCon in Atlanta, so I can't offer more detailed advice right now.

I suspect Sergey Malkin at MS can explain what is happening in the case of Word. Note also that this might depend on what version of Windows you are using: the latest Uniscribe rolls punctuation into glyph runs with adjacent text, but it relies on a system-level wrapper, which means this won't work in pre-Vista versions of Windows even if you copy usp10.dll from Vista to an older OS.

sergeym's picture

There are two decisions an application has to make when shaping multilingual text. One is to break the text into runs based on script, for contextual shaping. The other is to choose the appropriate font to use. They look very similar, but in fact they are different enough. For example, font matching would prefer to use the same font for all punctuation characters across a paragraph or even a document, while script itemization would add a comma to the preceding word for better kerning. These algorithms may also have cross-dependencies and require compromises, since the results of one may influence the other.

Decisions made by both algorithms may be influenced by many factors: the base font set on a piece of text, the document language, the language of a particular piece of text, user settings, and so on. I am sure there is no single "right" algorithm for doing this; it depends on many factors, including the target audience of the application. What typographers using InDesign expect and are willing to control manually is radically different from what a general audience using Word expects.

Word uses the language of the text and would probably do a better job with this, but it has odd algorithms for determining what that language is and how to break the text into script runs. They may be based on the keyboard you used when typing, or on the languages enabled in your copy of Office. It may break text into runs in arbitrary places, as we recently learned with the broken contextual shaping of Latin text in Office 2010. It may even refuse to shape Arabic typed inside Word, but shape it when you copy-paste it from Notepad.

Word has made lots of compromises, consciously or unconsciously, over the last 20 years, based on user scenarios and heavily dependent on its layout algorithms. When the situation changes or new features are added, e.g. OpenType shaping for simple scripts, such logic should be changed accordingly, but that is very hard to do for compatibility reasons. In short, we will probably have to live with this code in Word for a while.

Thanks,
Sergey

sergeym's picture

Uniscribe itself is a lower-level library and does not have enough control over itemization and font-matching decisions. At the level at which it is used by Word, Publisher, or Internet Explorer, it does not provide font-matching functionality, so the application is on its own to implement whatever algorithms it chooses. Since Vista/Office 2007 it can help the application to some extent by merging neutral items into adjacent strong characters. I've tried to make this logic work in the most general cases, but it still has its own problems. It could do better, but the parameters available to Uniscribe at this stage lack the context in which the text is being set. Uniscribe also has higher-level APIs used by the Windows UI, such as menus or common controls. As you may see by experimenting with Notepad, text may show inconsistency in shaping the same punctuation or space glyphs, because they may be itemized to different scripts.

Thanks,
Sergey
