Hebrew Unicode - Importance Tiers

hrant's picture

Of the characters defined in Unicode Hebrew,
http://www.unicode.org/charts/PDF/U0590.pdf
what are the "tiers" of desired support?

I know that 05D0 to 05EA are critical, but can the rest be grouped in terms of desirability?

hhp

William Berkson's picture

Hopefully you will get some Israeli comment, but it really depends on the purpose to which the font will be put. I am not an expert on this, but here is what I know.

Israelis generally don't use the points or 'nikud', though they will use the 'additional' punctuation, hash marks which are attached to Hebrew letters to indicate the soft g and ch sounds in foreign names (Charlie, George). But once in a while Israelis will throw in a vowel point to resolve ambiguity, so probably it is best to have the full set of vowels.

The cantillation marks are only for the editions of the Bible used as a guide for chanting the text - a very specialized use.

The Yiddish combinations are not required in regular Hebrew [Yiddish is a German dialect written in Hebrew characters]. Yiddish is still alive, but has few speakers; you can get a keyboard mapping just for Yiddish which uses the combination letters, I believe.

For outside Israel you would definitely need the regular vowel points, but for inside you might get away without, though you'd have to ask Israelis. I would guess that few Hebrew fonts have cantillation marks available.

There is a problem with spacing the vowel points in a fully voweled text. Some of the Hebrew composition programs for use outside Israel advertise that they have special software to deal with this.

hrant's picture

> they will use the 'additional' punctuation, hash marks

Could you please list those?

> special software to deal with this.

Wouldn't OpenType and/or kerning handle them OK?

hhp

William Berkson's picture

Those are listed as 05F3 and 05F4. If I remember right the single hash mark is put after a gimmel to get the soft g and the double hash after a tzadi to get a CH.

How well OpenType and kerning deal with the problem, I don't know. I believe that Israelis can't be bothered with the problem, as they rarely use the vowel points. John Hudson did an OpenType Hebrew font with both vowels and cantillation marks, so he will know all this very well. I think Davka (davka.com) claims their vowelling is better placed, but I don't know. You could always call them.

hrant's picture

Thanks!

I think John is traveling/off-line at the moment - he'll probably help out soon.

hhp

piccic's picture

I've opened a weight of Oded's Alchemy in Fontlab and (as I recalled), it contains just the whole alphabet, the Israeli currency mark, and basic punctuation, so I guess you could stick to William's suggestion to add just the 05F3 and 05F4 hash marks.
I recall Oded told me that nowadays cantillation marks are of very limited use within Israel (proof is that his fonts do not contain them).

hrant's picture

> basic punctuation

What's that exactly?
And what about vocalization marks?

hhp

John Hudson's picture

There are basically four typical ways to write Hebrew today (I say 'typical' because one sometimes finds partially vocalised texts that include occasional nikud):

1. Totally unmarked: letters only.

2. Only consonant markers indicated: shin dot U+05C1; sin dot U+05C2; dagesh U+05BC; rafe U+05BF. Note that not all consonant marks may be used, or may be used only when necessary to distinguish ambiguous words. For example, one might see documents that only mark the sin dot.

3. Full vocalisation: all nikud (vowels) and consonant markers as above. Most of the nikud are centered under the consonant or, in the case of the right-leg consonants like dalet and resh, under the right side. An important exception is the holam, which sits at the top left of the consonant in most cases. Note that the position and behaviour of holam varies in modern Hebrew and many Bible editions. In modern Hebrew typesetting the holam sits slightly to the left of the letter, between it and the next letter. In many Bible editions the holam sits above the left edge of the letter but, importantly, may contextually move further to the left in some circumstances, e.g. when followed by an unvocalised alef. The OTL lookups for Biblical holam are a pain in the neck.

4. Full vocalisation and accentuation: all nikud and consonant markers plus all te'amin (cantillation marks). This is used only for the Biblical text; to my knowledge, te'amin never occur in any text except the Bible. Te'amin interact contextually with the nikud and with each other and this, among other things, drastically slows the rendering time of a Biblical Hebrew font. Although many Hebrew-speakers will tell you there is no significant difference between the Biblical and modern Hebrew language, there are some typographic differences (e.g. the position of holam mentioned above) and I'm inclined to ship a stripped-down version of my Biblical Hebrew font for modern Hebrew users simply so they will have something that doesn't have to go through dozens of OTL lookups for things they don't need, like te'amin interaction.
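As a rough illustration of the four levels above, here is a minimal Python sketch that reports which level a piece of text reaches. The function name and the exact partition of code points are assumptions made for the example: the consonant markers are the four listed in item 2, the vowel points are taken as U+05B0 to U+05BB, and the cantillation marks as U+0591 to U+05AF. Other marks, such as meteg (U+05BD), are simply ignored here.

```python
# Hypothetical helper; the partition of marks below is an assumption.
CONSONANT_MARKS = {0x05C1, 0x05C2, 0x05BC, 0x05BF}  # shin dot, sin dot, dagesh, rafe
VOWEL_POINTS = set(range(0x05B0, 0x05BC))           # U+05B0..U+05BB (assumed nikud range)
CANTILLATION = set(range(0x0591, 0x05B0))           # U+0591..U+05AF (accents)


def marking_level(text):
    """Return 1-4 according to the richest kind of mark found in text."""
    code_points = {ord(ch) for ch in text}
    if code_points & CANTILLATION:
        return 4  # full vocalisation and accentuation
    if code_points & VOWEL_POINTS:
        return 3  # full (or partial) vocalisation
    if code_points & CONSONANT_MARKS:
        return 2  # consonant markers only
    return 1      # letters only


print(marking_level("\u05e9\u05dc\u05d5\u05dd"))        # letters only
print(marking_level("\u05e9\u05c1\u05dc\u05d5\u05dd"))  # shin dot added
```

A check like this only says which marks occur in a text; it says nothing about whether a given font positions them well.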

In addition to the consonant markers, nikud and te'amin, there are a few odds and ends in the Unicode Hebrew block. The paseq mark (U+05C0) is used as part of accentuation in the Biblical text, but it also appears in the Microsoft 8-bit Hebrew codepage, which suggests that it might have some use in modern Hebrew. The Yiddish digraphs that William mentioned may be encoded using the precomposed characters or sequences of regular Hebrew letters; I understand that there is actually divergent practice between different Yiddish user communities. Also, some of the precomposed alef+nikud combinations in the Unicode Alphabetic Presentation Forms block are used as vowels in Yiddish, so should probably be included in any font intended to support that language, even though they may be encoded as sequences of letter and mark.
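The point about precomposed forms versus sequences can be seen with Python's standard `unicodedata` module. This is only a sketch: it takes U+FB2E (alef with patah) as one example of the precomposed alef+nikud forms, and U+05F2 (Yiddish double yod) as one example of the digraphs. The variable names are made up for the illustration.

```python
import unicodedata

ALEF_PATAH = "\ufb2e"    # HEBREW LETTER ALEF WITH PATAH (presentation form)
DOUBLE_YOD = "\u05f2"    # HEBREW LIGATURE YIDDISH DOUBLE YOD

# The presentation form is canonically equivalent to letter + mark.
decomposed = unicodedata.normalize("NFD", ALEF_PATAH)
print([f"U+{ord(ch):04X}" for ch in decomposed])

# The digraph has no decomposition mapping, so normalisation will not
# turn it into two separate yods (or the other way round).
digraph_mapping = unicodedata.decomposition(DOUBLE_YOD)
print(repr(digraph_mapping))
```

In practice this suggests a font aimed at Yiddish may meet either spelling of the same text, so both the precomposed glyphs and the letter-plus-mark sequences are worth testing.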

Modern Hebrew punctuation uses most of the typical Latin set: comma, period, semicolon, etc. including parentheses, brackets and braces. I'm not 100% sure what the standard for quotation marks is. Note that unlike Arabic, Hebrew does not use a horizontally-flipped, right-to-left question mark; I know one Israeli type designer who thinks they should, but he has not convinced his colleagues or typical users.

My recommendation is that a font for modern Hebrew should support everything that is in the Microsoft 1255 Hebrew codepage.
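One way to see what that recommendation covers within the Unicode Hebrew block is to ask Python's built-in `cp1255` codec which code points it can encode. This is a sketch under an assumption: Python's codec table is treated as a stand-in for the Microsoft codepage, and the two may differ in small details. The function name is hypothetical.

```python
def hebrew_block_in_cp1255():
    """List code points in U+0590..U+05FF that Python's cp1255 codec can encode."""
    covered = []
    for cp in range(0x0590, 0x0600):
        try:
            chr(cp).encode("cp1255")
        except UnicodeEncodeError:
            continue  # not representable in this codepage
        covered.append(cp)
    return covered


for cp in hebrew_block_in_cp1255():
    print(f"U+{cp:04X}")
```

The codepage also contains Latin punctuation, digits and a few symbols outside the Hebrew block, which this loop deliberately does not list.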

meir's picture

I believe John's post has cleared up the matter almost fully, so I would simply like to answer Hrant's query from an Israeli point of view.

Nikud (vocalization marks) is indeed rare in modern Hebrew, so fonts that are not considered "text" faces, meaning they would not accommodate long passages of text, usually include the Hebrew letters only (05D0-05EA), with no Nikud.

A good Hebrew text face, though, has a full set of Nikud (starting 05B0) as well as the letters.

I think new Hebrew foundries (like Oded Ezer's) add Nikud to almost all their faces, both text and display - a blessed habit.

The Yiddish vowel pairs double-Vav, Vav-Yod and Double-Yod (05F0-05F2) are rarely included.

Quotation marks are almost always "typewriter" style (05F3-05F4) and should be included in all Hebrew faces (there is very wide use of them, especially since the Geresh ' mark is used as an abbreviation mark for the likes of Prof. and Capt. and the Gershayim " is used for the likes of Dr and Mr).

The Hebrew hyphen used for word-connection is called Maqaf (05BE) and should also be included with every Hebrew font.

Sof pasuq (05C3) is literally a colon (it's used in biblical publications to note where verses end), so it can be a direct link to the colon glyph.

The Rafe (05BF) and Paseq (05C0) aren't normally used in everyday Hebrew. Also, I have no idea what the Upper Dot (05C4) means, maybe it's a version of the Holam (05B9) mark? (John? William? Anyone?)

If you add Nikud to a Hebrew font, you should also add the Hebrew portion of the Alphabetic Presentation Forms table (the wide letters and the Yiddish and Ladino ligatures are only an option, though). Notice there's also a special Jewish "plus" sign for Hebrew (to counter the "christian", cross-like plus).

Basic punctuation like commas, periods, and exclamation and question marks are generally the same as in Latin faces.
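The groupings above can be turned into a simple coverage checklist. The sketch below is only an illustration: the tier names, their order and the exact code points in the nikud tier are assumptions drawn loosely from this post, and `supported` stands for whatever set of code points you have read from a font's character map by other means.

```python
# Hypothetical tiers; the membership of each tier is an assumption.
TIERS = [
    ("letters", list(range(0x05D0, 0x05EB))),                 # alef..tav
    ("maqaf, geresh, gershayim", [0x05BE, 0x05F3, 0x05F4]),
    ("nikud", list(range(0x05B0, 0x05BA)) + [0x05BB, 0x05BC, 0x05C1, 0x05C2]),
    ("Yiddish digraphs", list(range(0x05F0, 0x05F3))),
]


def missing_by_tier(supported):
    """Map each tier name to the code points that `supported` lacks."""
    supported = set(supported)
    return {
        name: [cp for cp in code_points if cp not in supported]
        for name, code_points in TIERS
    }


# Example: a display face that contains the 27 letters and nothing else.
letters_only = range(0x05D0, 0x05EB)
for tier, missing in missing_by_tier(letters_only).items():
    print(tier, [f"U+{cp:04X}" for cp in missing])
```

Keeping the tiers in a plain list makes it easy to reorder or extend them, for example with the Alphabetic Presentation Forms, once a designer has decided what the font is for.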

Some classic designs favor the shape of a diamond for the periods and other points, but I think you have that in some English faces as well, don't you?

I'm curious as to the identity of the Israeli type designer who thinks Hebrew should use Arabic question marks.

Former president Yitzhak Ben-Tzvi once proposed adding two Hebrew letters (Alef-like and He-like) to replace vocalization marks. Debates continue to this very day, in various forums.

hrant's picture

Yes! I was hoping for such a full reply from you. Thanks.
(This is good enough to print. :-)

hhp

hrant's picture

Meir, a warm thank you to you as well for that highly practical elaboration.
(Sorry, trees.)

All this makes me much more comfortable - if no more confident! :-/

And a question: what do you (plural) think of Rosenberg's Papaya?

hhp

William Berkson's picture

>Microsoft 1255 Hebrew codepage

Hrant, it took me a while to find the file John refers to on my PC: it is CP1255.TXT. It doesn't include the cantillation marks.

Incidentally, I believe the cantillation marks are referred to in modern Hebrew as "Te'amim", not "Te'amin". The 'in' plural is characteristic of Aramaic, rather than Hebrew, which uses 'im'.

>many Hebrew-speakers will tell you there is no significant difference between the Biblical and modern Hebrew language

As I understand, the tense structure is significantly different. Modern Hebrew is influenced by Mishnaic Hebrew (many centuries after Biblical Hebrew), which already had a different tense structure.

William Berkson's picture

>I have no idea what the Upper Dot (05C4) means, maybe it's a version of the Holam (05B9) mark?

Guessing: perhaps it goes on the vav to make a Holam male?

John Hudson's picture

> Also, I have no idea what the Upper Dot (05C4) means, maybe it's a version of the Holam (05B9) mark?

This I can answer authoritatively, having discussed the matter at length on the Unicode Hebrew list with a member of the Israeli standardisation body. I'd originally guessed that U+05C4 was the Hebrew number mark, but it is actually intended as the upper punctum extraordinarium -- sorry, I don't know what the Jewish name is -- which appears only in the Biblical text and only a very small number of times. Scholars disagree about its meaning, but it is clearly very important because it even occurs in some Torah scrolls, which do not include nikud or te'amin (excuse my Aramaic: it's what I'm used to). There is also a lower punctum, which occurs only three times in the Bible text, all on one word, in Psalm 27:13, but the Israeli standards body didn't bother to propose this for encoding because it was so rare. I'm not sure what they thought scholars were supposed to do without it! It has now been proposed for addition to the Unicode Hebrew block (along with the almost equally rare nun hafukah). Here is an image of the first word in the 13th verse of Psalm 27. The large dots above and below are the puncta; the form shown for U+05C4 in the Unicode book is misleading.
Psalm 27:13:1

> If you add Nikud to a Hebrew font, you should also add the Hebrew portion of the Alphabetic Presentation Forms table (the wide letters and the Yiddish and Ladino ligatures are only an option, though). Notice there's also a special Jewish "plus" sign for Hebrew (to counter the "christian", cross-like plus).

The alphabetic presentation forms also include the alef-lamed ligature. This is not used for the Biblical text*, but is sometimes encountered in other religious books such as prayer books, and in a variety of other Hebrew texts from the 7th century on (the earliest dated manuscript containing this ligature was found in the Cairo Genizah; sorry, I can't remember the date). I recently did some research on this ligature, and discovered its early history as recorded by Solomon Birnbaum in The Hebrew Scripts, Brill 1971, p.226. It originated among speakers of Judaeo-Arabic following the Moslem conquests. The sequence alef+lamed is incredibly common in Arabic because it is the definite article 'al, and the ligature was originally developed specifically to write this word. Its use was later extended to every occurrence of the sequence in Judaeo-Arabic, and gradually found its way into use in Hebrew. I'd be interested to know to what extent it is used in modern Israel and in what kinds of texts.

* Although the alef-lamed ligature is not normally used for the Biblical text, whether in scroll or book form, Moses Gaster notes that one book manuscript in his collection does include 'ligatures formed by the joining of two letters', which I'm guessing refers to the alef-lamed ligature. This is part of a discussion of the relative liberality shown in the writing of Bibles (books) as opposed to the sacred Torah scrolls. [The Tittled Bible. Maggs Bros 1929. p.7]

piccic's picture

This is too good to be true.
Anyway, Meir, how come Oded's Alchemy, which he sent me and which is his best-selling "text face", includes just basic letters, punctuation and the currency symbol?
Maybe because, being pretty "modern", it is used mostly in newspapers (Oded sent me examples from a pair of mags) and not in poetry, children's or biblical texts?
Does Yontef's Erika Sans have the complete accents and glyphs you and John mentioned? I'd like to buy it from him.

piccic's picture

Hey Meir,
I just realized I used to download some of your early Latin "experiments" when you were a teenager. Tfu Tfu was one of my favorites (I love cats), and knowing such a blackletter "revisitation" comes from Israel is just delightful. Your comments on MTV and timelessness are enlightening.

I'm currently reading a book about the exoteric facet of Nazism (and the SS Ahnenerbe) which ties in with my work-in-progress on Nazi imagery in a modern context. I'll surely write you in the future.

On a slightly different matter (here the audience seems in tune), the last issue of Eye magazine (n.50) includes a fantastic article by Paul Khera on Abbar, an amazing Arabic typeface which attempts, for the first time as far as I know, to detach itself from the dominating and pretty abused calligraphic models, without forgetting them.
The experiments Huda Abifares did with her students in 2002 were more display-oriented, and the work of synthesis Yassar Abbar has done feels really "ahead" in the text-setting field.

meir's picture

Claudio, thanks for all your heartwarming compliments! I'm truly honored, and glad you remembered my early font experiments...

Perhaps Oded has upgraded Alchemy, because I think I recall seeing it used with Nikud. I'm sorry if my memory misleads me... From a little examination I just did, I noticed all the "Guttman" faces come with Nikud marks, as well as some other Monotype and Kivun (Dagesh's publisher) fonts that come with MS Windows and Office.

Now for some more presumptions: I believe Masterfont has all of Zvi Narkis' work with Nikud, and Fontbit also claim to have full Nikud on most of their fonts (as well as kerned, hinted OpenType versions). Another worthy mention is Shmuel Sela, owner of some classic Hebrew letter designs, especially the wonderful Salit, which was put to wide use in Neville Brody's project for an Israeli newspaper site -
http://www.researchstudios.com/ARCHIVE_ynet_RS.html
I believe most of his fonts come with Nikud.

Finally, I'm almost certain I have seen Yanek's Erika Sans with Nikud once, but I guess you will have to contact him to know for sure...

I would love to see a sample of Abbar's face. I've desperately tried to Google for it, but I only found an excerpt of the article -
http://www.eyemagazine.com/feature.php?id=98&fid=482

Also, I found Tarek Atrissi's "AT Arabic" family, especially the bitmap version, a very exciting piece of work (arabictypography.com).

"Westernization" of semitic typefaces is inevitable, much like the spread of international brands such as Coca-Cola and McDonald's. It's interesting, though, to see how the local folk of each culture take those incredibly imperialistic elements and make them their own. Seldom, the result of such fusion can even do as much as redefine aesthetics, both locally and globally.

hrant's picture

I found Abbar's work to be problematic (and not nearly as original as it seems), and Khera's article superficial. The main problem with Abbar's font is that it matches the vertical proportions of Univers. As if Univers weren't inappropriate for a lot of text to begin with, Arabic adapted to those proportions can only be a display face, and a servile one. True typographic harmony isn't so shallow.

> "Westernization" of semitic typefaces is inevitable

I can't be so pessimistic. I think a culture can make a successful effort to resist such dilution, not 100%, but enough to maintain the pride necessary for cultural survival. You might be interested in checking out this page of mine:

http://www.themicrofoundry.com/ss_rome3.html

hhp

meir's picture

I feel I must comment on the subject of romanized non-Latin type, and I agree with you, Hrant, on the importance of this problematic issue.

However, I believe cautiousness, rather than resistance, is the right approach. Especially in the field of language, which is so socially oriented and therefore dynamic and evolving.

In an interview I once had with Yanek Iontef, who designed Hebrew faces which resemble the work of Spiekermann (Officina) and Gill (Gill Sans), in a very professional manner IMHO, I asked him what is the first aspect he takes into consideration when "importing" Latin for Hebrew type. He said that the first thing would be the shape of the Hebrew letter, and that it should be constantly addressed during the work process, in order for the letters to maintain their Hebrew properties when confronted with the foreign shapes (a good counter-example is in the link you posted - the shockingly Latinized Arabic logotype image).

I don't really know how to end this post, so I'll just include this shameless link to the aforementioned interview; it has some samples of Yanek's incredible work... Sorry in advance, it's in Hebrew... Peace out.

http://www.exego.net/specials/type/1078.asp

William Berkson's picture

Some of the most creative advances in civilization have come from culture clash - or so argued my late teacher Karl Popper in 'The Open Society and Its Enemies'.

Of course, some of the most stupid and ugly things can also come of cultural clash! So the issue is not whether there is influence, but how creatively it is done. Do you get something new and good, or worse or just ridiculous?

In the case of Hebrew script, it was already 'Easternized' once. The 'Hebrew' characters are not Hebrew at all, but Aramaic. In the Talmud I believe they are called 'Chaldean'. The ancient Hebrew script was gradually replaced by the Aramaic script after the Babylonian exile in the fifth century B.C.E. [Aramaic was the language of Babylon and became the common language of the whole near east.]

By the way, Meir, do they ever use the ancient Hebrew script on stamps or coins today in Israel?

meir's picture

Yes, they do!

10 Israeli New Shekels Coin

hrant's picture

> the first thing would be the shape of the Hebrew letter

But there's something else.
It's not just the abstract structure of individual letters that must be taken into account, it's how they come together to form optimal boumas (word shapes). This is why there's a difference between display and text type. A lack of sensitivity to this is the reason 99% of non-Latin type performs worse than its Latin counterparts. You cannot enforce formal congruence (for example in modularity and vertical proportions) without a loss in functionality.

If you look at the Cyrillic Meta sample on that page of mine, you will see that it's way too similar to the Latin. Imagine it without those (nominally extraneous) parentheses, and you'll see how it seriously confuses reading.

> http://www.exego.net/specials/type/1078.asp

Translation, please! :-)

--

BTW, there's a nice article by abi-Fares in the current issue of Baseline magazine (#43). But to me the main thing it shows is how far non-chirographic Arabic type has to go...

hhp

meir's picture

Addressing John's question, I don't think there is any use for the Alef-Lamed ligature in modern Hebrew. I've only seen it in very few prayer books.

Hrant - I'd love to have that interview translated; I think it has many useful points of information for beginner as well as intermediate type designers and enthusiasts. I'll keep you in mind... :>

I'm going to show off some Hebrew bitmap work of mine in the Critique section, with regards to John's post here about including consonant marks only (which is what I'm doing in that font).

piccic's picture

Hi Meir,
I'll ask Oded about Alchemy, and I promise (promise? don't trust me!) to post later on a scan of the eye page with the Abbar typeface sample. That coin is so cool!

piccic's picture

Meir, please write me at thought (at) nettuno.it, otherwise I'm sure I'll forget to scan the Abbar sample for you. And, yes, Oded added a limited set of nikud to updated versions of Alchemy and his other faces.

raphaelfreeman's picture

Fontbit's text fonts do indeed include full nikud with correct OpenType positioning (provided the software supports it, i.e. Word and the Adobe suite to date).

Masterfont now has the technology (read: the programmer) to do nikud correctly and is currently converting all the fonts to have correct nikud placement.

Fontbit has two fonts with teamim inside, Livorna and Hadassa, however, the problem of collisions cannot be solved within the font in a sensible way (although Adobe seems to think it can) and has to be solved by an external script. I have solved this problem with both Livorna and Hadassa and have successfully typeset perfectly positioned biblical texts in Adobe InDesign.

raphaelfreeman's picture

I agree. The Alef-Lamed ligature is all but useless.

david h's picture

Two years......
