This question is related to a few questions that were asked a long time ago. I saw comments saying that OpenType fonts were not supported in Java, but that was 11 years ago; nowadays they are. The only problem is that kerning pairs are given only in the GPOS table. I have seen that they are there, but it is hard to make sure the code that reads them is correct.

I am currently dumping the GPOS table, trying to follow the offsets down to the kerning pairs.

The code so far is below, where the GPOS table has already been copied into the array gpos. The function that dumps the table is dumpGPOS(). I need help to determine whether what I am doing is correct and how to code the TODO parts:

byte[] gpos;

char[] hexasc( char[] hex, byte num ) {
    // Mask before shifting: Java bytes are signed, so num >> 4 is negative for values >= 0x80.
    int up = (num & 0xFF) >> 4;
    int down = num & 15;
    hex[0] = (char) ((up < 10)? '0' + up : 'A' + up - 10);
    hex[1] = (char) ((down < 10)? '0' + down : 'A' + down - 10);
    return hex;
}

char[] hex = { '0', '0' };
void printHex(byte b) {
    hexasc(hex, b);
    System.out.print(hex[0]);
    System.out.print(hex[1]);
}

// Reads an unsigned big-endian 16-bit value from gpos. Java bytes are signed,
// so each byte is masked with 0xFF before it is shifted and combined.
int u16(int off) {
    return ((gpos[off] & 0xFF) << 8) | (gpos[off + 1] & 0xFF);
}

void dumpGPOS() {
    int i, j;
    System.out.println("GPOS header");
    System.out.print("Version:        ");
    for ( i = 0; i < 4; i++ ) printHex(gpos[i]);
    System.out.println("    (" + u16(0) + "." + u16(2) + ")" );
    System.out.print("TheScriptList:        ");
    for ( i = 4; i < 6; i++ ) printHex(gpos[i]);
    System.out.println("        (" + u16(4) + ")" );
    System.out.print("TheFeatureList:        ");
    for ( i = 6; i < 8; i++ ) printHex(gpos[i]);
    System.out.println("        (" + u16(6) + ")" );
    System.out.print("TheLookupList:        ");
    for ( i = 8; i < 10; i++ ) printHex(gpos[i]);
    int lookup = u16(8);
    System.out.println("        (" + lookup + ")" );

    // The LookupList starts with a count, followed by that many offsets,
    // each measured from the start of the LookupList itself.
    System.out.println("Lookup List Table");
    System.out.print("lookupCount:        ");
    for ( i = lookup; i <= lookup+1; i++ ) printHex(gpos[i]);
    System.out.print('\n');
    int count = u16(lookup);
    int tab = lookup + 2;
    int[] LookupList = new int[count];
    for ( j = 0; j < count; j++ ) {
        System.out.print("lookup[" + j + "] =         ");
        printHex(gpos[tab]);
        printHex(gpos[tab + 1]);
        System.out.println("        (" + ( LookupList[j] = u16(tab) ) + ")" );
        tab += 2;
    }
    int item, sub, size;
    for ( j = 0; j < count; j++ ) {
        item = lookup + LookupList[j];
        System.out.println("Lookup [" + j + "]");
        System.out.println("Lookup Type:        " + u16(item) );
        System.out.print("Lookup flag:        ");
        printHex(gpos[item + 2]);
        printHex(gpos[item + 3]);
        size = u16(item + 4);
        System.out.println("\nNumber of subtables:    "  + size);
        sub = item + 6;
        int[] subTable = new int[size];
        System.out.println("Subtable offsets");
        for ( i = 0; i < size; i++ ) {
            subTable[i] = u16(sub);
            sub += 2;
            System.out.println( "    " + subTable[i] );
        }
        // Subtable offsets are measured from the start of this Lookup table.
        for ( i = 0; i < size; i++ ) {
            System.out.println("Subtable [" + i + "]");
            sub = item + subTable[i];
            printSubtable(sub);
        }
    }
}

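void printSubtable(int sub) {
    // Mask each byte with 0xFF: Java bytes are signed, so an unmasked byte >= 0x80 corrupts the result.
    int format = (gpos[sub] & 0xFF) << 8 | (gpos[sub + 1] & 0xFF);
    System.out.println("Format:        " + format );
    if (format == 1) {
        /* TODO format 1*/
        return;
    }
    /* TODO format 2*/
}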
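
For the two TODOs, note that the format field only identifies a pair-adjustment (PairPos) subtable when the enclosing lookup has type 2, or type 9, which is an extension wrapper around another type. Below is a minimal sketch of format 1, assuming the field layout given in the OpenType GPOS specification. The class and method names are made up for illustration, the table is passed in as a parameter rather than read from the gpos field, and only the XAdvance of the first value record is extracted.

```java
import java.util.ArrayList;
import java.util.List;

final class PairPosFormat1Sketch {
    // Unsigned and signed big-endian 16-bit reads.
    static int u16(byte[] t, int off) {
        return ((t[off] & 0xFF) << 8) | (t[off + 1] & 0xFF);
    }
    static int s16(byte[] t, int off) {
        return (short) u16(t, off);
    }

    // A ValueRecord holds one 16-bit field per bit set in the low byte of valueFormat.
    static int valueRecordSize(int valueFormat) {
        return 2 * Integer.bitCount(valueFormat & 0xFF);
    }

    // XAdvance (bit 0x0004) comes after XPlacement (0x0001) and YPlacement (0x0002) when present.
    static int xAdvance(byte[] t, int rec, int valueFormat) {
        if ((valueFormat & 0x0004) == 0) return 0;
        return s16(t, rec + 2 * Integer.bitCount(valueFormat & 0x0003));
    }

    // Glyph IDs of a Coverage table, in coverage-index order (formats 1 and 2).
    static int[] coverageGlyphs(byte[] t, int cov) {
        int format = u16(t, cov), count = u16(t, cov + 2);
        List<Integer> glyphs = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            if (format == 1) {
                glyphs.add(u16(t, cov + 4 + 2 * i));
            } else {
                // Range record: startGlyphID, endGlyphID, startCoverageIndex.
                int r = cov + 4 + 6 * i;
                for (int g = u16(t, r); g <= u16(t, r + 2); g++) glyphs.add(g);
            }
        }
        return glyphs.stream().mapToInt(Integer::intValue).toArray();
    }

    // Returns {firstGlyph, secondGlyph, xAdvance} triples of a PairPos format 1 subtable at sub.
    static List<int[]> pairs(byte[] t, int sub) {
        int coverage = sub + u16(t, sub + 2);
        int vf1 = u16(t, sub + 4), vf2 = u16(t, sub + 6);
        int pairSetCount = u16(t, sub + 8);
        int recordSize = 2 + valueRecordSize(vf1) + valueRecordSize(vf2);
        int[] first = coverageGlyphs(t, coverage);
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < pairSetCount; i++) {
            int set = sub + u16(t, sub + 10 + 2 * i);   // PairSet i belongs to coverage index i
            int n = u16(t, set);
            for (int k = 0; k < n; k++) {
                int rec = set + 2 + k * recordSize;
                out.add(new int[] { first[i], u16(t, rec), xAdvance(t, rec + 2, vf1) });
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-built subtable: glyph 36 followed by glyph 55 gets an XAdvance of -80.
        byte[] t = { 0,1, 0,18, 0,4, 0,0, 0,1, 0,12,
                     0,1, 0,55, (byte) 0xFF, (byte) 0xB0,
                     0,1, 0,1, 0,36 };
        for (int[] p : pairs(t, 0)) {
            System.out.println(p[0] + " " + p[1] + " " + p[2]);   // prints "36 55 -80"
        }
    }
}
```

The values are glyph IDs, not characters, so mapping them back to characters still needs the cmap table.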

This problem is also related to the question "How to use kerning pairs extracted from a TTF file to correctly show glyphs as Path2D in Java?".

The context is the same, but it is much harder this time, because Apache FOP doesn't read OpenType fonts and OpenType fonts keep their kerning information only in the GPOS table.

I am using the Microsoft OpenType reference, but it is way over the top (too vague, no drawings, no example code, and not enough examples). Where can I find more hints, drawings, code snippets, and examples (especially for extracting kerning pairs from the GPOS table)?
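
For reference, class-based kerning (PairPos format 2) can be sketched as follows, again assuming the field layout in the OpenType GPOS specification. The class and method names are hypothetical, and the sketch returns only the XAdvance of the first value record for one glyph pair.

```java
final class PairPosFormat2Sketch {
    static int u16(byte[] t, int off) {
        return ((t[off] & 0xFF) << 8) | (t[off + 1] & 0xFF);
    }
    static int s16(byte[] t, int off) {
        return (short) u16(t, off);
    }

    // True if the glyph is listed in the Coverage table at cov (formats 1 and 2).
    static boolean covered(byte[] t, int cov, int glyph) {
        int format = u16(t, cov), count = u16(t, cov + 2);
        for (int i = 0; i < count; i++) {
            if (format == 1) {
                if (u16(t, cov + 4 + 2 * i) == glyph) return true;
            } else {
                int r = cov + 4 + 6 * i;   // startGlyphID, endGlyphID, startCoverageIndex
                if (glyph >= u16(t, r) && glyph <= u16(t, r + 2)) return true;
            }
        }
        return false;
    }

    // Class of a glyph in the ClassDef table at cd; glyphs that are not listed are class 0.
    static int classOf(byte[] t, int cd, int glyph) {
        if (u16(t, cd) == 1) {
            int start = u16(t, cd + 2), count = u16(t, cd + 4);
            int i = glyph - start;
            return (i >= 0 && i < count) ? u16(t, cd + 6 + 2 * i) : 0;
        }
        int rangeCount = u16(t, cd + 2);
        for (int i = 0; i < rangeCount; i++) {
            int r = cd + 4 + 6 * i;        // startGlyphID, endGlyphID, class
            if (glyph >= u16(t, r) && glyph <= u16(t, r + 2)) return u16(t, r + 4);
        }
        return 0;
    }

    // XAdvance adjustment for (left, right) from a PairPos format 2 subtable at sub.
    static int kern(byte[] t, int sub, int left, int right) {
        if (!covered(t, sub + u16(t, sub + 2), left)) return 0;
        int vf1 = u16(t, sub + 4), vf2 = u16(t, sub + 6);
        int classDef1 = sub + u16(t, sub + 8), classDef2 = sub + u16(t, sub + 10);
        int class1Count = u16(t, sub + 12), class2Count = u16(t, sub + 14);
        int c1 = classOf(t, classDef1, left), c2 = classOf(t, classDef2, right);
        if (c1 >= class1Count || c2 >= class2Count) return 0;
        // Records form a class1Count x class2Count matrix of (valueRecord1, valueRecord2).
        int recordSize = 2 * (Integer.bitCount(vf1 & 0xFF) + Integer.bitCount(vf2 & 0xFF));
        int rec = sub + 16 + (c1 * class2Count + c2) * recordSize;
        if ((vf1 & 0x0004) == 0) return 0;
        return s16(t, rec + 2 * Integer.bitCount(vf1 & 0x0003));
    }

    public static void main(String[] args) {
        // Hand-built subtable: glyph 36 is class 1 on the left, glyph 55 is class 1 on the right.
        byte[] t = { 0,2, 0,24, 0,4, 0,0, 0,30, 0,40, 0,2, 0,2,
                     0,0, 0,0, 0,0, (byte) 0xFF, (byte) 0xCE,
                     0,1, 0,1, 0,36,
                     0,2, 0,1, 0,36, 0,36, 0,1,
                     0,1, 0,55, 0,1, 0,1 };
        System.out.println(kern(t, 0, 36, 55));   // prints -50
        System.out.println(kern(t, 0, 36, 40));   // prints 0
    }
}
```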

How can I make 100% sure this code is really doing what it is supposed to do?
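
One check that needs no font file at all is how two bytes are combined into a 16-bit value. Java bytes are signed, so combining them without masking goes wrong as soon as either byte is 0x80 or above. A small sketch with made-up names, run against a hand-built array:

```java
final class U16Check {
    // Unmasked combination: a byte >= 0x80 is sign-extended and corrupts the result.
    static int naive(byte[] b, int off) {
        return b[off] << 8 | b[off + 1];
    }

    // Masked combination: both bytes are treated as unsigned.
    static int u16(byte[] b, int off) {
        return ((b[off] & 0xFF) << 8) | (b[off + 1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] data = { 0x01, (byte) 0x90 };     // big-endian 0x0190 = 400
        System.out.println(naive(data, 0));      // prints -112
        System.out.println(u16(data, 0));        // prints 400
    }
}
```

The same idea extends to whole subtables: build a tiny table by hand, run the parser on it, and compare the output with the values that were put in.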

While it might look way over the top, GPOS processing (rather than the ancient kern table process) is incredibly hard because languages are incredibly hard: GPOS is script, langsys, and feature dependent, and may require multiple passes of various GPOS lookups in specific orders to perform correct shaping. Your best bet is to just look at parser code that others have already written (HarfBuzz, Opentype.js, etc). However, it's also wise to post in the right place: typedrawers.com is by far the better forum to ask about font engineering =) - Mike 'Pomax' Kamermans
Hi @Mike'Pomax'Kamermans! I have been in graphics since the end of the '80s. Your Primer on Bézier Curves is a fantastic piece of work! Congrats! I don't find languages hard. It's the documentation for TrueType and OpenType (especially the latter) that is terrible. I understand you are a JS guy, but that's a language that I just hate. HarfBuzz is in C++, which I programmed in for many years, but C++ code is not particularly clear unless it is well commented. Thanks for the suggestion of typedrawers.com. I will definitely check it out. I had no idea it existed. I have not worked seriously on fonts for a long time. - Jack London
I have no beef with people who hate JS, it works on the web, but the web is just the web (even if it has a deep reach). HarfBuzz is basically the authority when it comes to parsing, and Behdad did a hell of a job on it, but if its source is not enough, most of the OG font folks are on typedrawers, so if you want advice from the people who literally write the OT spec: that's the place to be =) - Mike 'Pomax' Kamermans
@Mike'Pomax'Kamermans, I downloaded Opentype.js and font-inspector.html. I am running it with font-inspector.html locally. I also copied the site.css for formatting. I modified inspector.html to show the GPOS table. It works, but it is strange. I have to load the file using the button. It is unable to load the file I supply in the code. That's why I hate this language, :-))) - Jack London
That is a super weird thing to say though, because that has nothing to do with JavaScript, and everything to do with "how that HTML file was written". Anyone with some JS knowledge could update that file so that you don't need to load the file using that "browse" button. That's a bit like saying you hate C# because you don't like how some kid's CLI utility needs you to pass runtime flags in all caps: that has nothing to do with C# itself =) (Modern JS has not needed HTML to run ever since Node got released years ago; it's a scripting language like PHP, Python, etc, that can also run in the browser) - Mike 'Pomax' Kamermans

1 Answer


Problem solved!

A word of advice: if you are trying to do this in Java, you are wasting your time. I solved this problem simply by using Opentype.js and the site https://opentype.js.org, in particular its glyph inspector and font inspector.

I downloaded the code for both by copying and pasting from the page source. I then modified the glyph inspector to do the job. You need to dig deep into Opentype.js to get the kerning pairs, but it is all there (check the code below).

I ended up porting the entire program to JavaScript. I had not programmed much in JavaScript before, so it must be extremely easy for anyone who already does. The program just generates a Java class with the glyphs, the kerning pairs, and the widths (the advanceWidth of each glyph).

Here is an image showing the result:

[image]

The JavaScript code below dumps the GPOS kerning table into the string text. Each line starts with the first character of the pair, written as a Java character literal, followed by a list of sets, each containing a second character and the kern value for that pair. Note that glyph indexes are avoided by using the ASCII codes of the characters.

This only dumps ' ' (space) to '}', which is what is useful for English. To cover other characters, extend the loops to the corresponding Unicode code points.

// indoc: true -> use HTML line breaks, for writing the result into this page
const indoc = true;
const nl = indoc ? "<br>" : "\n";
// cmap lookup: Unicode code point -> glyph index
const chars = font.tables.cmap.glyphIndexMap;
// Quote a character as a Java char literal, escaping ' (39) and \ (92)
const javaChar = (code) =>
    "'" + ((code === 39 || code === 92) ? "\\" : "") + String.fromCharCode(code) + "'";
let text = "";
for (let i = 32; i < 126; i++) {            // ' ' (32) to '}' (125)
    if (chars[i] === undefined) continue;   // character not present in the font
    const g1 = font.glyphs.get(chars[i]);
    text += javaChar(i);
    for (let j = 32; j < 126; j++) {
        if (chars[j] === undefined) continue;
        const g2 = font.glyphs.get(chars[j]);
        const kern = font.getKerningValue(g1, g2);
        if (kern !== 0) {
            text += ", { " + javaChar(j) + ", " + kern + " }";
        }
    }
    text += nl;
}
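
On the Java side, the dumped pairs could feed a lookup structure such as the one below. This is only a sketch of one possible shape, not the class the program above generates: the class name, the method names, and the numbers in main are all made up for illustration, and widths and kern values are assumed to be in font units.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class KerningTableSketch {
    // first character -> (second character -> kern value)
    private final Map<Character, Map<Character, Integer>> kern = new HashMap<>();
    // character -> advance width
    private final Map<Character, Integer> advance = new HashMap<>();

    void putAdvance(char c, int width) {
        advance.put(c, width);
    }

    void putKern(char first, char second, int value) {
        kern.computeIfAbsent(first, k -> new HashMap<>()).put(second, value);
    }

    // Kern value for a pair, or 0 when the pair is not in the table.
    int kernOf(char first, char second) {
        Map<Character, Integer> row = kern.getOrDefault(first, Collections.emptyMap());
        return row.getOrDefault(second, 0);
    }

    // Sum of the advance widths plus the kern between each adjacent pair.
    int lineWidth(String s) {
        int w = 0;
        for (int i = 0; i < s.length(); i++) {
            w += advance.getOrDefault(s.charAt(i), 0);
            if (i + 1 < s.length()) w += kernOf(s.charAt(i), s.charAt(i + 1));
        }
        return w;
    }

    public static void main(String[] args) {
        KerningTableSketch t = new KerningTableSketch();
        t.putAdvance('A', 600);      // illustrative numbers, not from a real font
        t.putAdvance('V', 580);
        t.putKern('A', 'V', -80);
        System.out.println(t.lineWidth("AV"));   // prints 1100
        System.out.println(t.lineWidth("VA"));   // prints 1180
    }
}
```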