Perhaps this quandary has no real answer, but discussion's still helpful.
A publisher recently asked me about a character they needed: an "a" with a macron above, and a tilde above that. Their inclination was to start with an amacron (U+0101) and add a combining tilde (name: uni01010303). My inclination was that, since we are dealing with a character that has no Unicode code point of its own, it should be expressed completely decomposed, that is, as a plus combining macron plus combining tilde (name: uni006103040303).
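For what it's worth, the two spellings can be compared with Python's standard `unicodedata` module. This is only a sketch: `glyph_name` is a helper made up for illustration, and it assumes every code point fits in four hex digits. It suggests the publisher's form is what NFC produces and the fully decomposed form is what NFD produces, so the two are canonically equivalent.

```python
import unicodedata

def glyph_name(s):
    """Build a 'uni'-style glyph name from the code points in s (BMP only)."""
    return "uni" + "".join(f"{ord(ch):04X}" for ch in s)

publisher = "\u0101\u0303"      # amacron + combining tilde
decomposed = "a\u0304\u0303"    # a + combining macron + combining tilde

# NFD fully decomposes; NFC recomposes as far as precomposed characters exist.
nfd_of_publisher = unicodedata.normalize("NFD", publisher)
nfc_of_decomposed = unicodedata.normalize("NFC", decomposed)

print(glyph_name(publisher))             # uni01010303
print(glyph_name(decomposed))            # uni006103040303
print(nfd_of_publisher == decomposed)    # True
print(nfc_of_decomposed == publisher)    # True
```

Since a manuscript might arrive in either form, this does not settle which glyph name to use. It only shows that both spellings describe the same character.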
Why does it matter? Well, prosaically, since we use FontLab, ccmp is the only OT feature available to form such characters. Listing all possible combinations in the ccmp feature is exceedingly time-consuming, since one should also include all the characters in Latin Extended-A, Extended-B, and Extended Additional. Why? Once you start providing for character strings in canonically correct Unicode, you're sort of obligated to finish. And we have had manuscripts come in where authors expressed, say, aacute with an *a* and an *acutecomb* diacritical. What's wrong with that?
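One way to cut down the typing might be to generate the ccmp lines from Unicode's own decomposition data rather than writing them by hand. The sketch below is an assumption-laden illustration, not a FontLab workflow: `ccmp_lines` is a hypothetical helper, the `uniXXXX` glyph names are placeholders (a real font probably uses names like aacute and acutecomb), and the output is meant to be pasted inside a ccmp feature block.

```python
import unicodedata

def ccmp_lines(first, last):
    """Emit 'sub base mark by composed;' lines for a code point range."""
    lines = []
    for cp in range(first, last + 1):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only canonical two-part decompositions; compatibility
        # decompositions start with a tag such as '<compat>'.
        if len(parts) != 2 or parts[0].startswith("<"):
            continue
        base, mark = parts
        lines.append(f"sub uni{base} uni{mark} by uni{cp:04X};")
    return lines

latin_1 = ccmp_lines(0x00C0, 0x00FF)   # Latin-1 Supplement letters
latin_a = ccmp_lines(0x0100, 0x017F)   # Latin Extended-A

for line in latin_a[:3]:
    print(line)
```

Note that `unicodedata.decomposition` gives only one level of decomposition, so a character with two marks comes out as a precomposed base plus one mark. Whether the fully decomposed string is also caught depends on how the substitutions chain in the font.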
Order is just as bad. Within a plane, Unicode requires inside-out order. But Unicode doesn't specify the order of the planes. Following a suggestion in Unicode 3.0, I prefer bottom to top. Once again, with ccmp, order counts. If you provide for every possible order, the number of lines grows uncomfortably large.
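If "plane" here means marks above the base versus marks below it, it may help to look at what normalization does with order. The sketch below uses Python's `unicodedata` and a made-up helper, `nfd_points`. As far as I can tell, NFD sorts marks by canonical combining class, which puts a below mark (class 220) before an above mark (class 230), while two marks of the same class keep the order they were typed in.

```python
import unicodedata

MACRON, TILDE, DOT_BELOW = "\u0304", "\u0303", "\u0323"

def nfd_points(s):
    """Return the NFD form of s as a list of hex code points."""
    return [f"{ord(ch):04X}" for ch in unicodedata.normalize("NFD", s)]

# Canonical combining classes: above marks are 230, below marks are 220.
print(unicodedata.combining(MACRON), unicodedata.combining(DOT_BELOW))

# Different classes: NFD moves the below mark ahead of the above mark.
print(nfd_points("a" + MACRON + DOT_BELOW))

# Same class: the typed order is kept, so both orders remain distinct.
print(nfd_points("a" + MACRON + TILDE))
print(nfd_points("a" + TILDE + MACRON))
```

So if incoming text is normalized before it reaches the font, the bottom-to-top order seems to come for free, and only the order within each plane still has to be listed.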
I haven't used them, but I suspect mark and mkmk have at least some of these issues.
Moving beyond OT fonts, how do people search for such characters in a text file? Is a reader just supposed to search for all possible combinations?
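One possible answer on the search side is to normalize both the text and the search string to the same form before comparing, so the reader does not have to try every combination. A minimal sketch, with `find_normalized` as a hypothetical helper name:

```python
import unicodedata

def nfd(s):
    """Return the canonical decomposition (NFD) of s."""
    return unicodedata.normalize("NFD", s)

def find_normalized(text, needle):
    """True if needle occurs in text once both are normalized to NFD."""
    return nfd(needle) in nfd(text)

text = "m\u0101\u0303n"          # stored as amacron + combining tilde
needle = "a\u0304\u0303"         # typed as a + macron + tilde

print(needle in text)                  # False: raw code points differ
print(find_normalized(text, needle))   # True: equal after normalization
```

One caveat: a plain substring test on NFD text will also report a bare "a" as found inside the accented character, so a real search tool would probably need to check for trailing combining marks as well.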
Has anyone else grappled with this?