11
votes

From a scripting language (Python or Ruby, say) on a Debian-based system, I would like to find either one of:

  1. All the Unicode codepoints that a particular font has glyphs for
  2. All the fonts that have glyphs for a particular Unicode codepoint

(Obviously either 1 or 2 can be derived from the other, so whichever is easier would be great.) I have done this in the past by running:

fc-list : file charset

... and parsing the charset at the end of each output line, based on this code from fontconfig, but it seems to me that there ought to be a much simpler way of doing this.
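For reference, a minimal Python sketch of that parsing step. It assumes the installed fontconfig prints each line as `path: :charset=...` followed by space-separated hexadecimal codepoints or ranges; some releases may print the charset in a different, encoded form, which this sketch does not handle. The function names are illustrative and not part of fontconfig.

```python
import subprocess

# Assumed separator between the file path and the charset in fc-list output.
MARKER = ": :charset="


def parse_charset(spec):
    """Expand a string such as '20-7e a0 100-17f' into a set of ints."""
    codepoints = set()
    for token in spec.split():
        start, _, end = token.partition("-")
        low = int(start, 16)
        high = int(end, 16) if end else low
        codepoints.update(range(low, high + 1))
    return codepoints


def parse_line(line):
    """Return (path, codepoints) for one output line, or None if it does not match."""
    path, sep, spec = line.partition(MARKER)
    if not sep:
        return None
    return path, parse_charset(spec)


def read_coverage():
    """Run fc-list and map each font file to the set of codepoints it covers."""
    output = subprocess.run(
        ["fc-list", ":", "file", "charset"],
        capture_output=True, text=True, check=True,
    ).stdout
    coverage = {}
    for line in output.splitlines():
        parsed = parse_line(line)
        if parsed:
            path, codepoints = parsed
            coverage.setdefault(path, set()).update(codepoints)
    return coverage


def fonts_with(codepoint, coverage):
    """Invert the mapping: list the font files that cover one codepoint."""
    return sorted(path for path, cps in coverage.items() if codepoint in cps)
```

Calling `fonts_with(0x3B1, read_coverage())` would then answer item 2 for GREEK SMALL LETTER ALPHA, and `read_coverage()` alone answers item 1.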

(I'm not completely sure this is the right StackExchange site for this question, but I am looking for an answer that can be used programmatically.)

2
"There ought to be a simpler way"? Do you know how many font formats there are? And you want to be able to process all of them?! - Kerrek SB
@Kerrek SB: I know (of course!) that there are many different font formats, but we have libraries that deal with that; for example, the fontconfig command I gave in the question does give you the information I'm after for fonts of several different formats. - Mark Longair
This Python script works great: unix.stackexchange.com/a/268286/26952 - Skippy le Grand Gourou

2 Answers

6
votes

I would try any of the FreeType 2 language bindings. Here's a Perl solution to list the Unicode code points of a font using Font::FreeType:

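use Font::FreeType;

# Print every character code the face maps, one per line, as hex.
Font::FreeType->new->face('DejaVuSans.ttf')->foreach_char(sub {
    # foreach_char sets $_ to the glyph object for the current character.
    printf("%04X\n", $_->char_code);
});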
3
votes

I've recently listed the mapping from Unicode codepoints to glyphs in a TTF using TTX/FontTools. That tool is written in Python, so it matches the Python tag in your post. The command

ttx -t cmap foo.ttf

will generate an XML file foo.ttx which describes that mapping, for various environments and encodings. See e.g. this reference for a description of what the platform and encoding identifiers actually mean. I assume that the package can be used as a library as well as a command line tool, but I have no experience there.
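The generated file can be read back with Python's standard library. The following is a rough sketch, assuming the usual TTX layout in which the `cmap` element holds `cmap_format_N` subtables carrying `platformID` and `platEncID` attributes, each with `map` entries that have `code` and `name` attributes. It treats platform 0, and platform 3 with encoding 1 or 10, as the Unicode subtables; the function names are illustrative.

```python
import xml.etree.ElementTree as ET


def is_unicode_subtable(platform_id, encoding_id):
    """Platform 0 is Unicode; platform 3 (Windows) encodings 1 and 10 are Unicode too."""
    return platform_id == 0 or (platform_id == 3 and encoding_id in (1, 10))


def unicode_cmap_from_root(root):
    """Return {codepoint: glyph_name} collected from the Unicode cmap subtables."""
    mapping = {}
    cmap = root.find("cmap")
    if cmap is None:
        return mapping
    for subtable in cmap:
        if not subtable.tag.startswith("cmap_format_"):
            continue  # e.g. the tableVersion element
        platform_id = int(subtable.get("platformID"))
        encoding_id = int(subtable.get("platEncID"))
        if not is_unicode_subtable(platform_id, encoding_id):
            continue
        for entry in subtable.iter("map"):
            code = entry.get("code")
            if code is None:
                continue  # skip entries without a plain code, e.g. variation sequences
            mapping[int(code, 16)] = entry.get("name")
    return mapping


def unicode_cmap_from_file(path):
    """Parse a .ttx file written by 'ttx -t cmap' and return its Unicode mapping."""
    return unicode_cmap_from_root(ET.parse(path).getroot())
```

With the file from the command above, `sorted(unicode_cmap_from_file("foo.ttx"))` would give the covered codepoints in order.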