Avoid calculating startIndex and endIndex when creating a document using Google Docs API

Question

I have proven to myself that I can insert text into a Google Docs document using this code:

function appendToDocument() {
    let offset = 12;
    let updateObject = {
        documentId: 'xxxxxxx',
        resource: {
            requests: [{
                "insertText": {
                    "text": "John Doe",
                    "location": {
                        "index": offset,
                    },
                },
            }],
        },
    };
    gapi.client.docs.documents.batchUpdate(updateObject).then(function(response) {
        appendPre('response =  ' + JSON.stringify(response));
    }, function(response) {
        appendPre('Error: ' + response.result.error.message);
    });
}

My next step is to create an entire, complex document using the API. I am stunned to find that I apparently need to maintain locations into the document, like this:

new Location().setIndex(25)

I formed that opinion from reading https://developers.google.com/docs/api/how-tos/move-text

The document I am trying to create is very dynamic and very complex, and handing the coding challenge of keeping track of index values to the API user, rather than the API designer, seems odd.

Is there an approach, or a higher-level API, that allows me to construct a document without this kind of housekeeping?

I have to apologize for my poor English skill. I cannot understand "Is there an approach, or a higher level api, that allows me to construct a document without this kind of house keeping?". Can I ask you about your goal? If I can correctly see your goal, it might help me think of a solution or workaround. — Tanaike
Sure @Tanaike. I want to write a piece of software that will create a new google doc, insert a complex table containing many rows and cells, which may contain some images, some text etc. I do not like the fact that I need to keep track of the index of all the locations as I insert table, tablerow, tablecell, image, text, text, text, tablerow, tablecell, etc. I want to find an api that will do that for me. — Mike Hogan
Thank you for replying. As I understand it, you want to create a table, including its values, in a Google Document with the Docs API, and you want to achieve this without using the index of each element. If my understanding is correct, unfortunately, at the current stage I think there are no methods for directly achieving your goal in the Docs API. So as a workaround, how about using the append methods with Google Apps Script? In this case, Web Apps is used, and the script can be run by a request. If this is not the direction you want, I apologize. — Tanaike
No worries @Tanaike. My initial experiments were with the Google App Script api, and I found it would work. But I need to create my documents on the server, with no human involvement. So I moved to the Google Docs api. And boy have I been disappointed. I am well on the way to building a model of a google document that manages index values for me, but provides an append style api. It's just a pain to have to do that tho. — Mike Hogan
Does this help you? "Simplify matters by writing backwards. To avoid having to precalculate these offset changes, order your insertions to "write backwards": do the insertion at the highest-numbered index first, working your way towards the beginning with each insertion. This ordering ensures that each write's offsets are unaffected by the preceding ones." Btw, you can set up a service account in Apps Script, which might allow you to use it without human interaction. — ziganotschka

andrew_reece · Accepted Answer · 2020-02-04T00:53:15

Unfortunately, the short answer is no, there's no API that lets you bypass the index-tracking required of the base Google Docs API - at least when it comes to building tables.

I recently had to tackle this issue myself - a combination of template updating and document construction - and I basically ended up writing an intermediate API with helper functions to search for and insert by character indices.

For example, one trick I've been using for table creation is to first create a table of a specified size at a given index, and put some text in the first cell. Then I can search the document object for the tableCells element that contains that text, and work back from there to get the table start index.

Another trick is that if you know how many specific kinds of objects (like tables) you have in your document, you can parse through the full document object and keep track of table counts, and stop when you get to the one you want to update/delete (you can use this approach for creating too but the target text approach is easier, I find).
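A minimal sketch of that counting approach, assuming the document JSON has the shape returned by documents().get(). The helper name get_nth_table and the hand-made sample dictionary are hypothetical, purely for illustration, and not part of the Docs API:

```python
def get_nth_table(doc, n):
    """Return the n-th table element (0-based) in the document body,
    or None if the document has fewer than n + 1 tables."""
    count = 0
    for element in doc["body"]["content"]:
        # Structural elements carry exactly one type key: paragraph, table, etc.
        if "table" in element:
            if count == n:
                return element
            count += 1
    return None


# Hand-made stand-in for a documents().get() response (illustrative only).
sample_doc = {"body": {"content": [
    {"startIndex": 1, "endIndex": 10, "paragraph": {}},
    {"startIndex": 10, "endIndex": 30, "table": {"rows": 1}},
    {"startIndex": 30, "endIndex": 31, "paragraph": {}},
    {"startIndex": 31, "endIndex": 60, "table": {"rows": 2}},
]}}

second_table = get_nth_table(sample_doc, 1)
print(second_table["startIndex"])  # 31
```

The returned element carries its own startIndex and endIndex, so it can be used as the anchor for later update or delete requests.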

From there, with some JSON parsing and trial-and-error, you can figure out the start index of each cell in a table, and write functions to programmatically find and create/replace/delete. If there's an easier way to do all this, I haven't found it. (There is one GitHub repo with a Google Docs API wrapper specifically for tables, and it does appear to be active, although I found it after I wrote everything on my own and I haven't used it.)

Here's a bit of code to get you started:

def get_target_table(doc, target_txt):
    """ Given a target string to be matched in the upper left column of a table
        of a Google Docs JSON object, return JSON representing that table. """
    body = doc["body"]["content"]
    for element in body:
        # Check for the key directly rather than relying on key order.
        if "table" in element:
            header_txt = get_header_cell_text(element['table']).lower().strip()
            if target_txt.lower() in header_txt:
                return element
    return None

def get_header_cell_text(table):
    """ Given a table element in Google Docs API JSON, find the text of
        the first cell in the first row, which should be a column header. """
    return table['tableRows'][0]\
        ['tableCells'][0]\
        ['content'][0]\
        ['paragraph']['elements'][0]\
        ['textRun']['content']

Assuming you've already created a table with the target text in it, start by pulling the document JSON object from the API, and then use get_target_table() to find the chunk of JSON related to the table.

from googleapiclient.discovery import build

doc = build("docs", "v1", credentials=creds).documents().get(documentId=doc_id).execute()
table = get_target_table(doc, "my target")

From there you'll see the nested tableRows and tableCells objects, and the content inside each cell has a startIndex. Construct a matrix of table cell start indices, and then, for populating them, work backwards from the bottom right cell to the upper left, to avoid displacing your stored indices (as suggested in the docs and in one of the comments).
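Here is a rough sketch of that last step, assuming the table element has the shape returned by get_target_table() and that each cell's first content item carries a startIndex. The helper names cell_start_indices and build_fill_requests, and the sample table below, are hypothetical and only for illustration:

```python
def cell_start_indices(table_element):
    """Return a row-by-row matrix holding the startIndex of the first
    content item in each cell of the table."""
    return [
        [cell["content"][0]["startIndex"] for cell in row["tableCells"]]
        for row in table_element["table"]["tableRows"]
    ]


def build_fill_requests(table_element, values):
    """Build insertText requests ordered from the bottom-right cell to the
    upper-left one, so earlier inserts never shift the indices used later."""
    indices = cell_start_indices(table_element)
    requests = []
    for r in reversed(range(len(indices))):
        for c in reversed(range(len(indices[r]))):
            text = values[r][c]
            if text:  # skip empty cells; empty text is not worth a request
                requests.append({
                    "insertText": {
                        "location": {"index": indices[r][c]},
                        "text": text,
                    }
                })
    return requests


# Hand-made 2x2 table element (illustrative indices, not from a real document).
sample_table = {"table": {"tableRows": [
    {"tableCells": [{"content": [{"startIndex": 5, "paragraph": {}}]},
                    {"content": [{"startIndex": 7, "paragraph": {}}]}]},
    {"tableCells": [{"content": [{"startIndex": 10, "paragraph": {}}]},
                    {"content": [{"startIndex": 12, "paragraph": {}}]}]},
]}}

requests = build_fill_requests(sample_table, [["Name", "Age"], ["John Doe", "42"]])
print([req["insertText"]["location"]["index"] for req in requests])  # [12, 10, 7, 5]
```

The resulting list can then be sent in a single call, along the lines of documents().batchUpdate(documentId=doc_id, body={"requests": requests}).execute().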

It's definitely a bit of a slog. And styling table cells is a whole 'nother beast, which is a dizzying maze of JSON options. The interactive JSON constructor tool on the Docs API site is useful to get the syntax right.

Hope this helps, good luck!
