<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://typophile.com" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>Typophile - Getting a string of all defined characters? - Comments</title>
 <link>http://typophile.com/node/45395</link>
 <description>Comments for &quot;Getting a string of all defined characters?&quot;</description>
 <language>en</language>
<item>
 <title>If you have FontLab Studio:</title>
 <link>http://typophile.com/node/45395#comment-279682</link>
 <description>&lt;p&gt;If you have FontLab Studio: &lt;/p&gt;
&lt;p&gt;1. Open the font.&lt;br /&gt;
2. Choose Tools / Quick Test As / OpenType TT (.ttf)&lt;br /&gt;
3. In the Quick Test window choose Content / All Characters.&lt;br /&gt;
4. Copy and paste the contents of the window into your favorite text editor. &lt;/p&gt;
&lt;p&gt;Note that only encoded glyphs (Unicode and PUA) are shown. &lt;/p&gt;
&lt;p&gt;Adam&lt;/p&gt;
</description>
 <pubDate>Wed, 21 May 2008 06:07:32 -0700</pubDate>
 <dc:creator>twardoch</dc:creator>
 <guid isPermaLink="false">comment 279682 at http://typophile.com</guid>
</item>
<item>
 <title>For the record, here is the</title>
 <link>http://typophile.com/node/45395#comment-279285</link>
 <description>&lt;p&gt;For the record, here is my final script. You may jazz it up as you want. &lt;/p&gt;
&lt;p&gt;I finally remembered that the Python &lt;code&gt;unicodedata&lt;/code&gt; module can look up the name of a Unicode character (although with the narrow build of Python on the Mac, characters above 0xFFFF cannot be processed that way). That can be quite handy. Here is an example:&lt;br /&gt;
&lt;code&gt;&lt;br /&gt;
&amp;gt;&amp;gt;&amp;gt; from unicodedata import *&lt;br /&gt;
&amp;gt;&amp;gt;&amp;gt; name(unichr(0x05D0)); name(unichr(0x0627)); name(unichr(0x0905))&lt;br /&gt;
&#039;HEBREW LETTER ALEF&#039;&lt;br /&gt;
&#039;ARABIC LETTER ALEF&#039;&lt;br /&gt;
&#039;DEVANAGARI LETTER A&#039;&lt;br /&gt;
&amp;gt;&amp;gt;&amp;gt;&lt;br /&gt;
&lt;/code&gt;&lt;br /&gt;
The script thus simply outputs the Unicode characters in the font whose code points lie in the range &lt;code&gt;0x0020&lt;/code&gt; &amp;#8212; &lt;code&gt;0xFFFF&lt;/code&gt; and that have a name in the Unicode names list (according to the &lt;code&gt;unicodedata&lt;/code&gt; function &amp;#8220;&lt;code&gt;name&lt;/code&gt;&amp;#8221;). Here it is:&lt;br /&gt;
&lt;code&gt;&lt;br /&gt;
----&lt;br /&gt;
#!/usr/bin/python&lt;br /&gt;
import fontforge,sys&lt;br /&gt;
from unicodedata import *&lt;br /&gt;
fnt=fontforge.open(sys.argv[1],1)&lt;/p&gt;
&lt;p&gt;s=&#039;&#039;&lt;br /&gt;
glyphset=fnt.glyphs()&lt;br /&gt;
for g in glyphset:&lt;br /&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;cdg=g.unicode&lt;br /&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;if (0x20 &amp;lt;= cdg &amp;lt;= 0xFFFF):&lt;br /&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;uni=unichr(cdg)&lt;br /&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if (name(uni,&quot;noname&quot;) != &quot;noname&quot;):&lt;br /&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;s=s+uni&lt;br /&gt;
print s.encode(&#039;utf-8&#039;)&lt;br /&gt;
----&lt;br /&gt;
&lt;/code&gt;&lt;/p&gt;
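&lt;p&gt;For readers on Python 3, the filtering step of the script above can be sketched without FontForge. This is only a minimal sketch under stated assumptions: the function name &lt;code&gt;named_bmp_string&lt;/code&gt; is hypothetical, and the code points are assumed to have been collected already as plain integers (for instance from &lt;code&gt;g.unicode&lt;/code&gt;, assumed here to be -1 for unencoded glyphs). In Python 3, &lt;code&gt;chr&lt;/code&gt; replaces &lt;code&gt;unichr&lt;/code&gt; and &lt;code&gt;print&lt;/code&gt; is a function.&lt;/p&gt;

```python
import unicodedata


def named_bmp_string(codepoints):
    """Join the code points in 0x20..0xFFFF that have a Unicode name."""
    chars = []
    for cp in codepoints:
        # range(0x20, 0x10000) covers 0x20 through 0xFFFF inclusive;
        # negative values (unencoded glyphs) fall outside it.
        if cp in range(0x20, 0x10000):
            ch = chr(cp)
            # name() returns the given default when the character is unnamed
            if unicodedata.name(ch, None) is not None:
                chars.append(ch)
    return "".join(chars)


# -1 stands for an unencoded glyph, 0x0A is a control character
print(named_bmp_string([0x41, -1, 0x0A, 0x42]))
```

&lt;p&gt;The upper bound of 0xFFFF is kept only to mirror the script; Python 3.3 and later no longer have a narrow build, so it could be raised if characters beyond that range are wanted.&lt;/p&gt;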
&lt;p&gt;Michel&lt;/p&gt;
</description>
 <pubDate>Mon, 19 May 2008 07:21:02 -0700</pubDate>
 <dc:creator>Michel Boyer</dc:creator>
 <guid isPermaLink="false">comment 279285 at http://typophile.com</guid>
</item>
<item>
 <title>Hmm.  typophile included the</title>
 <link>http://typophile.com/node/45395#comment-279196</link>
 <description>&lt;p&gt;Hmm. Typophile included the semicolon in the link to the NamesList, which is thus also broken. This link works:&lt;/p&gt;
&lt;p&gt;&lt;a href=&quot;http://www.unicode.org/Public/UNIDATA/NamesList.txt&quot; title=&quot;http://www.unicode.org/Public/UNIDATA/NamesList.txt&quot;&gt;http://www.unicode.org/Public/UNIDATA/NamesList.txt&lt;/a&gt;&lt;/p&gt;
</description>
 <pubDate>Sun, 18 May 2008 10:20:42 -0700</pubDate>
 <dc:creator>Michel Boyer</dc:creator>
 <guid isPermaLink="false">comment 279196 at http://typophile.com</guid>
</item>
<item>
 <title>s/00FF/FFFF/    I meant FFFF</title>
 <link>http://typophile.com/node/45395#comment-279181</link>
 <description>&lt;p&gt;s/00FF/FFFF/    I meant FFFF and not 00FF of course (15 minutes over, no way to correct).&lt;/p&gt;
</description>
 <pubDate>Sun, 18 May 2008 07:30:12 -0700</pubDate>
 <dc:creator>Michel Boyer</dc:creator>
 <guid isPermaLink="false">comment 279181 at http://typophile.com</guid>
</item>
<item>
 <title>The ranges above miss Hebrew</title>
 <link>http://typophile.com/node/45395#comment-279179</link>
 <description>&lt;p&gt;The ranges above miss Hebrew and Arabic characters. If you write the code yourself, you must add them. I just modified the little application; instead of using ranges, I generated a list of all the characters from 0020 to 00FF that are not control characters and that are listed in &lt;a href=&quot;http://www.unicode.org/Public/UNIDATA/NamesList.txt;&quot; title=&quot;http://www.unicode.org/Public/UNIDATA/NamesList.txt;&quot;&gt;http://www.unicode.org/Public/UNIDATA/NamesList.txt;&lt;/a&gt; now Hebrew and Arabic characters should be there (and everything else defined from 0020 to 00FF).&lt;/p&gt;
&lt;p&gt;Michel&lt;/p&gt;
</description>
 <pubDate>Sun, 18 May 2008 07:10:00 -0700</pubDate>
 <dc:creator>Michel Boyer</dc:creator>
 <guid isPermaLink="false">comment 279179 at http://typophile.com</guid>
</item>
<item>
 <title>Michel, your link to the</title>
 <link>http://typophile.com/node/45395#comment-279172</link>
 <description>&lt;p&gt;&lt;cite&gt;Michel, your link to the little application appears to be broken&lt;/cite&gt;&lt;/p&gt;
&lt;p&gt;Link unbroken. Weird. I am sure I had checked it.&lt;/p&gt;
</description>
 <pubDate>Sun, 18 May 2008 05:57:01 -0700</pubDate>
 <dc:creator>Michel Boyer</dc:creator>
 <guid isPermaLink="false">comment 279172 at http://typophile.com</guid>
</item>
<item>
 <title>Michel, your link to the</title>
 <link>http://typophile.com/node/45395#comment-279144</link>
 <description>&lt;p&gt;Michel, your link to the little application appears to be broken&lt;/p&gt;
</description>
 <pubDate>Sat, 17 May 2008 22:06:00 -0700</pubDate>
 <dc:creator>cuttlefish</dc:creator>
 <guid isPermaLink="false">comment 279144 at http://typophile.com</guid>
</item>
<item>
 <title>Welcome! By the way, the</title>
 <link>http://typophile.com/node/45395#comment-278971</link>
 <description>&lt;p&gt;Welcome! By the way, the above script should translate easily to one for FontLab; I don&amp;#8217;t use FontLab but that looks obvious from what I could see in Haralambous&amp;#8217; book &amp;#8220;Fonts and Encodings&amp;#8221;.&lt;/p&gt;
&lt;p&gt;Michel&lt;/p&gt;
</description>
 <pubDate>Fri, 16 May 2008 11:01:36 -0700</pubDate>
 <dc:creator>Michel Boyer</dc:creator>
 <guid isPermaLink="false">comment 278971 at http://typophile.com</guid>
</item>
<item>
 <title>Thanks for all your</title>
 <link>http://typophile.com/node/45395#comment-278960</link>
 <description>&lt;p&gt;Thanks for all your answers!&lt;/p&gt;
&lt;p&gt;When investigating jFont I came across a nifty Windows utility called &lt;a class=&quot;freelinking-external&quot; href=&quot;http://www.babelstone.co.uk/Software/BabelMap.html&quot;&gt;BabelMap&lt;/a&gt; that has a font analysis function that does this and more, although the actual output appears in a non-selectable text field. I wrote to the developer about this.&lt;/p&gt;
&lt;p&gt;For now, I settled for the FontForge solution, even though it meant getting my feet wet with Cygwin and Python. Works great. Thanks for that, Michel.&lt;/p&gt;
</description>
 <pubDate>Fri, 16 May 2008 10:21:37 -0700</pubDate>
 <dc:creator>mummla</dc:creator>
 <guid isPermaLink="false">comment 278960 at http://typophile.com</guid>
</item>
<item>
 <title>If you are not afraid of</title>
 <link>http://typophile.com/node/45395#comment-278818</link>
 <description>&lt;p&gt;If you are not afraid of the command line, here is a script that works with &lt;a href=&quot;http://fontforge.sourceforge.net/&quot;&gt;FontForge&lt;/a&gt; (no need to install X-Windows; works on a Mac or on a PC with Cygwin and Python):&lt;br /&gt;
&lt;code&gt;&lt;br /&gt;
#!/usr/bin/python&lt;/p&gt;
&lt;p&gt;import fontforge,sys&lt;br /&gt;
fnt=fontforge.open(sys.argv[1])&lt;/p&gt;
&lt;p&gt;validranges = range(0x20,0x500)+range(0x1E00,0x2700)+range(0xFB00, 0xFB50)&lt;/p&gt;
&lt;p&gt;s=&#039;&#039;&lt;br /&gt;
for g in fnt.glyphs():&lt;br /&gt;
&amp;nbsp;&amp;nbsp;if (g.unicode in validranges):&lt;br /&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;s=s+unichr(g.unicode)&lt;br /&gt;
print s.encode(&#039;utf-8&#039;)&lt;br /&gt;
&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;If you call that script &amp;#8220;listchars&amp;#8221; then&lt;br /&gt;
&lt;code&gt;&lt;br /&gt;
listchars 2&amp;gt;/dev/null font_file&lt;br /&gt;
&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;will print the string to standard output. You can also send the output to a file with &lt;code&gt;listchars 2&amp;gt;/dev/null font_file &amp;gt; string.txt&lt;/code&gt;. &lt;/p&gt;
&lt;p&gt;If you don&amp;#8217;t like the command line, and if you are on a Mac, you can also use &lt;a href=&quot;http://www.iro.umontreal.ca/~boyer/typophile/typophile/listchars.zip&quot;&gt;this little application&lt;/a&gt; which is just the above wrapped in a clickable thing. It should ask you to install FontForge if you do not have it (it takes seconds); then you can select your font file and you get the string in a window. &lt;/p&gt;
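&lt;p&gt;The script above is written for Python 2. Under Python 3, &lt;code&gt;range&lt;/code&gt; objects cannot be joined with &lt;code&gt;+&lt;/code&gt; and &lt;code&gt;unichr&lt;/code&gt; is now &lt;code&gt;chr&lt;/code&gt;, so the range filter needs a small rewrite. The following is a minimal sketch of just that step, assuming the code points have already been read from the font as integers; the names &lt;code&gt;VALID&lt;/code&gt; and &lt;code&gt;string_in_ranges&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
from itertools import chain

# Same three blocks as the script above:
# 0x20-0x4FF, 0x1E00-0x26FF and 0xFB00-0xFB4F (upper bounds are exclusive)
VALID = frozenset(chain(range(0x20, 0x500),
                        range(0x1E00, 0x2700),
                        range(0xFB00, 0xFB50)))


def string_in_ranges(codepoints):
    """Join, in the given order, the code points that fall in VALID."""
    return "".join(chr(cp) for cp in codepoints if cp in VALID)


# 0x05D0 (Hebrew alef) lies outside the three blocks and is dropped
print(string_in_ranges([0x41, 0x42, 0x05D0]))
```

&lt;p&gt;A frozenset keeps each membership test cheap, whereas the list built in the original script is scanned once for every glyph.&lt;/p&gt;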
&lt;p&gt;Michel&lt;/p&gt;
</description>
 <pubDate>Thu, 15 May 2008 17:52:30 -0700</pubDate>
 <dc:creator>Michel Boyer</dc:creator>
 <guid isPermaLink="false">comment 278818 at http://typophile.com</guid>
</item>
<item>
 <title>The character map sample of</title>
 <link>http://typophile.com/node/45395#comment-278795</link>
 <description>&lt;p&gt;The character map sample of Algerian on that JFont site looks a bit suspicious, mixing hex and decimal ...&lt;/p&gt;
&lt;p&gt;It depends on what tools you have at hand. Python and FontLab sure sound like a feasible combination. Maybe the Adobe FDK (also through Python) &amp;#8212; there&amp;#8217;s bound to be something useful in that.&lt;/p&gt;
</description>
 <pubDate>Thu, 15 May 2008 16:00:04 -0700</pubDate>
 <dc:creator>Theunis de Jong</dc:creator>
 <guid isPermaLink="false">comment 278795 at http://typophile.com</guid>
</item>
<item>
 <title>If you use Windows,</title>
 <link>http://typophile.com/node/45395#comment-278773</link>
 <description>&lt;p&gt;If you use Windows, there&amp;#8217;s a program called &lt;a class=&quot;freelinking-external&quot; href=&quot;http://www.jlion.com/docs/jFont.aspx&quot;&gt;JFont&lt;/a&gt; that claims it can print a character map report to an RTF file.&lt;/p&gt;
</description>
 <pubDate>Thu, 15 May 2008 14:28:37 -0700</pubDate>
 <dc:creator>Gus Winterbottom</dc:creator>
 <guid isPermaLink="false">comment 278773 at http://typophile.com</guid>
</item>
<item>
 <title>I don’t think it will be</title>
 <link>http://typophile.com/node/45395#comment-278753</link>
 <description>&lt;p&gt;I don&amp;#8217;t think it will be easy to do this without some programming, or a repetitive motion injury-inducing amount of mouse clicking.&lt;/p&gt;
&lt;p&gt;You basically need to loop through the entries in the cmap and produce a string corresponding to each character code, appending to an output string. It seems like it should be possible to do something like this in FontLab using Python. As I am fairly non-Pythonic I cannot offer any specific advice (I achieve the same thing using tools written in another programming language instead), but I can suggest this non-language-specific pseudo-code:&lt;br /&gt;
&lt;code&gt;&lt;br /&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;for each characterCode in fontUnicodeArray:&lt;br /&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;set myString to myString + unichr(characterCode)&lt;br /&gt;
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;next&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print myString&lt;br /&gt;
&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;&amp;#8220;unichr&amp;#8221; is a function that converts a number (i.e. character code value) into a Unicode string. I believe Python has something like that ;-)  Maybe you or someone else can fill in the other bits in proper Python (and FontLab) form and get what you&amp;#8217;re after...&lt;/p&gt;
</description>
 <pubDate>Thu, 15 May 2008 11:43:35 -0700</pubDate>
 <dc:creator>j.hadley</dc:creator>
 <guid isPermaLink="false">comment 278753 at http://typophile.com</guid>
</item>
<item>
 <title>Hi,
Thanks for your answer</title>
 <link>http://typophile.com/node/45395#comment-278743</link>
 <description>&lt;p&gt;Hi,&lt;/p&gt;
&lt;p&gt;Thanks for your answer, but it was not really what I was looking for. To clarify, I need a string of all characters that are defined in a particular font. For example: &lt;/p&gt;
&lt;p&gt;!&amp;#8221;#$%&amp;amp;&amp;#8217;()*+,-./0123456789:;&amp;lt;=&amp;gt;?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_&amp;#8216;abcdefghijklmnopqrstuvwxyz&lt;br /&gt;
{|}~¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõ&lt;br /&gt;
ö÷øùúûüýþÿĆćČčđĞğİıŁłŒœŞşŠšŸŽžƒˆ˜–—‘’‚“”„†‡•…‰‹›€™&lt;/p&gt;
&lt;p&gt;Maintype for example gives me a good visual overview of all the characters but I have yet to find a way to automatically create a string like the above...&lt;/p&gt;
</description>
 <pubDate>Thu, 15 May 2008 11:07:18 -0700</pubDate>
 <dc:creator>mummla</dc:creator>
 <guid isPermaLink="false">comment 278743 at http://typophile.com</guid>
</item>
<item>
 <title>I’m not sure if this is</title>
 <link>http://typophile.com/node/45395#comment-278729</link>
 <description>&lt;p&gt;I&amp;#8217;m not sure if this is what you&amp;#8217;re asking, but in Illustrator and InDesign, you just have to open the Glyphs palette by going to TYPE&amp;gt;GLYPHS.&lt;/p&gt;
&lt;p&gt;You can click each one to get its Unicode value, or double-click it to put it into a selected text field.&lt;/p&gt;
&lt;p&gt;Also, Apple&amp;#8217;s TextEdit has a similar window under EDIT&amp;gt;SPECIAL CHARACTERS (cmd+opt+T). This will actually give you a lot more information than Illustrator or InDesign do.&lt;/p&gt;
&lt;p&gt;I&amp;#8217;m sure there&amp;#8217;s a way to do it in Word, but I never use it.&lt;/p&gt;
</description>
 <pubDate>Thu, 15 May 2008 10:38:54 -0700</pubDate>
 <dc:creator>Chipman223</dc:creator>
 <guid isPermaLink="false">comment 278729 at http://typophile.com</guid>
</item>
<item>
 <title>Getting a string of all defined characters?</title>
 <link>http://typophile.com/node/45395</link>
 <description>&lt;p&gt;Hi all,&lt;/p&gt;
&lt;p&gt;I&amp;#8217;ve been looking for an easy, fast, automatic way to get all defined characters of a font, preferably just as a unicode string. Does anyone here know a trick?&lt;/p&gt;
&lt;p&gt;Cheers,&lt;/p&gt;
&lt;p&gt;Nick&lt;/p&gt;
</description>
 <comments>http://typophile.com/node/45395#comments</comments>
 <category domain="http://typophile.com/taxonomy/term/6">Build</category>
 <pubDate>Thu, 15 May 2008 10:03:04 -0700</pubDate>
 <dc:creator>mummla</dc:creator>
 <guid isPermaLink="false">45395 at http://typophile.com</guid>
</item>
</channel>
</rss>
