Unfortunately, the short answer is no, there's no API that lets you bypass the index-tracking required of the base Google Docs API - at least when it comes to building tables.
I recently had to tackle this issue myself - a combination of template updating and document construction - and I basically ended up writing an intermediate API with helper functions to search for and insert by character indices.
For example, one trick I've been using for table creation is to first create a table of a specified size at a given index, and put some text in the first cell. Then I can search the document object for the tableCells
element that contains that text, and work back from there to get the table start index.
Another trick is that if you know how many specific kinds of objects (like tables) you have in your document, you can parse through the full document object and keep track of table counts, and stop when you get to the one you want to update/delete (you can use this approach for creating too but the target text approach is easier, I find).
From there with some JSON parsing and trial-and-error, you can figure out the start index of each cell in a table, and write functions to programmatically find and create/replace/delete. If there's an easier way to do all this, I haven't found it. There is one Github repo with a Google Docs API wrapper specifically for tables, and it does appear to be active, although I found it after I wrote everything on my own and I haven't used it.)
Here's a bit of code to get you started:
def get_target_table(doc, target_txt):
""" Given a target string to be matched in the upper left column of a table
of a Google Docs JSON object, return JSON representing that table. """
body = doc["body"]["content"]
for element in body:
el_type = list(element.keys())[-1]
if el_type == "table":
header_txt = get_header_cell_text(element['table']).lower().strip()
if target_txt.lower() in header_txt:
return element
return None
def get_header_cell_text(table):
""" Given a table element in Google Docs API JSON, find the text of
the first cell in the first row, which should be a column header. """
return table['tableRows'][0]\
['tableCells'][0]\
['content'][0]\
['paragraph']['elements'][0]\
['textRun']['content']
Assuming you've already created a table with the target text in it: now, start by pulling the document JSON object from the API, and then use get_target_table()
to find the chunk of JSON related to the table.
doc = build("docs", "v1", credentials=creds).documents().get(documentId=doc_id).execute()
table = get_target_table(doc, "my target")
From there you'll see the nested tableRows
and tableCells
objects, and the content
inside each cell has a startIndex
. Construct a matrix of table cell start indices, and then, for populating them, work backwards from the bottom right cell to the upper left, to avoid displacing your stored indices (as suggested in the docs and in one of the comments).
It's definitely a bit of a slog. And styling table cells is a whole 'nother beast, which is a dizzying maze of JSON options. The interactive JSON constructor tool on the Docs API site is useful to get the syntax write.
Hope this helps, good luck!
Is there an approach, or a higher level api, that allows me construct a document without this kind of house keeping?
, I cannot understand. Can I ask you about your goal? I thought that when I could correctly see the vision of your goal, it might help to think of the solution and workaround. – Tanaike