Copy entire word document including tables to another using Python

Question

I need to copy the entire contents of a template to a new document. The problem is that tables just cannot be copied. Currently, my code takes care of copying styles like bold and italic.

from docx import Document


def get_para_data(output_doc_name, paragraph):
    output_para = output_doc_name.add_paragraph()
    for run in paragraph.runs:
        output_run = output_para.add_run(run.text)
        # Run's bold data
        output_run.bold = run.bold
        # Run's italic data
        output_run.italic = run.italic
        # Run's underline data
        output_run.underline = run.underline
        # Run's color data
        output_run.font.color.rgb = run.font.color.rgb
        # Run's character style: assign it by name (the style must already
        # exist in the output document). Setting output_run.style.name
        # would rename the output document's style instead.
        output_run.style = run.style.name
    # Paragraph's alignment data
    output_para.paragraph_format.alignment = paragraph.paragraph_format.alignment


input_doc = Document('templatemain.docx')
output_doc = Document()
for para in input_doc.paragraphs:
    get_para_data(output_doc, para)
output_doc.save('OutputDoc.docx')

Most of the help I've found for copying tables is to append them. But I am copying a template into a blank document so that doesn't help me at all.

Please clarify what kind of tables you have. You used the 'excel' tag on the post, do you have an embedded Excel spreadsheet? Or are these just regular formatted Word tables? — Martijn Pieters
And what library are you using to open the word document? I'm presuming it's python-docx? — Martijn Pieters
I am using python-docx. Excel I've used in the rest of my code. Included it here by mistake. It has no relevance in this snippet. — nonamelowlife

Martijn Pieters · Accepted Answer · 2018-06-14T12:11:45

You are only iterating over the .paragraphs attribute of the document. Tables are listed separately, via the .tables attribute.

You'd need to loop over all the child elements of the document body together, in document order; otherwise you end up with all the paragraphs bunched together, separate from all the tables. The python-docx library doesn't offer this functionality directly, so you'd need to create your own iterator.

For example, a simplified version would be:

from docx.oxml.text.paragraph import CT_P
from docx.oxml.table import CT_Tbl
from docx.table import Table
from docx.text.paragraph import Paragraph


# select only paragraphs or table nodes
for child in input_doc.element.body.xpath('w:p | w:tbl'):
    if isinstance(child, CT_P):
        paragraph = Paragraph(child, input_doc)
        get_para_data(output_doc, paragraph)
    elif isinstance(child, CT_Tbl):
        table = Table(child, input_doc)
        # do something with the table

Tables can only be contained in the document body, in table cells (so nested inside other tables), in headers and footers, footnotes, and tracked changes, but not inside paragraphs.
