I have 2 docx files that I am working with. One docx file contains text information of a product (start serial number, length, width, and height). The other docx file contains a sticker label with an image and all of the text information from the first file.

This is what I do currently: I open the first docx file and copy all of the text information (serial number, length, width, and height). Then I paste each value into the second docx file, which contains the formatted label. If I need to make more than one label, I copy the label and increment the serial number by 1.

Making several labels for different products this way takes a lot of time. My goal is to come up with an easier way to take data from one docx file and inject it into the other, and to generate more labels when needed.

My first thought was to extract the docx file to get its XML contents, then read the data using JavaScript, C++, or any other language. The program would then ask the user how many labels to generate, manipulate the XML, and repack it as a docx file.
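The unzip, edit, repack idea above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the `{{SERIAL}}`-style placeholder markers, the function names, and the output file naming are all hypothetical, and it writes one output file per label rather than several labels in one document. A .docx file is a zip archive whose main body is stored in `word/document.xml`. Note that Word often splits typed text across several XML runs, so a placeholder may not appear as one contiguous string unless it was typed in a single pass.

```python
import os
import zipfile
from xml.sax.saxutils import escape


def fill_docx_template(template_path, output_path, values):
    """Copy a .docx, replacing {{KEY}} markers in word/document.xml.

    The {{KEY}} marker syntax is an assumption made for this sketch.
    """
    with zipfile.ZipFile(template_path) as src, \
         zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            data = src.read(name)
            if name == "word/document.xml":
                text = data.decode("utf-8")
                for key, value in values.items():
                    # Escape &, < and > so the XML stays well-formed.
                    text = text.replace("{{" + key + "}}", escape(str(value)))
                data = text.encode("utf-8")
            # Every other part (styles, images, ...) is copied unchanged.
            dst.writestr(name, data)


def make_labels(template_path, out_dir, start_serial, count, dims):
    """Write `count` label files, incrementing the serial number by 1 each time."""
    paths = []
    for i in range(count):
        values = dict(dims, SERIAL=start_serial + i)
        out_path = os.path.join(out_dir, "label_%d.docx" % (start_serial + i))
        fill_docx_template(template_path, out_path, values)
        paths.append(out_path)
    return paths
```

A call such as `make_labels("label_template.docx", "out", 1000, 5, {"LENGTH": "10", "WIDTH": "4", "HEIGHT": "2"})` would then produce five files with serial numbers 1000 through 1004, assuming the template contains the matching markers and the `out` directory exists.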

Then I thought about trying the "mail merge" feature in Microsoft Word, but I have never used it before.
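For the mail merge route, the repetitive part (one row per label, with the serial number going up by 1) could be generated as a CSV file, which Word can use as a mail merge data source. The following is a minimal sketch, assuming the serial number is a plain integer; the function name and the column names `Serial`, `Length`, `Width`, and `Height` are hypothetical and would have to match the merge fields in the label document.

```python
import csv


def write_merge_source(path, start_serial, count, length, width, height):
    """Write a CSV with one row per label for use as a mail merge data source."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Header row: these names become the merge field names.
        writer.writerow(["Serial", "Length", "Width", "Height"])
        for i in range(count):
            # Only the serial number changes from label to label.
            writer.writerow([start_serial + i, length, width, height])
```

The dimensions still have to be read out of the first docx file by hand or by a separate step; this sketch only covers generating the rows.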

I would like to know if anyone has suggestions for an easy way to import data from one docx file and generate labels in another.

I am open to any suggestion.

Also, I am not a professional programmer. I am an undergraduate computer engineering student with some experience in C, C++, Java, JavaScript, Python, MIPS assembly, and PHP.

If you want to look at the MailMerge option then I suggest that you start with gmayor.com/graphics_on_labels.htm (but you already have your graphic, so you can ignore the stuff about WordArt). But it's not easy to get the layout correct, and you will need control over the layout of your first .docx to ensure it can be used as a MailMerge data source. (comment by user1379931)

2 Answers

  1. The only open-source (and probably easier to come by) solution I know of is:

http://poi.apache.org/

http://poi.apache.org/document/quick-guide-xwpf.html

This is a good bet when it comes to speed and it is free software.

But if you open a file, alter it, and save it again, the result can be flaky: the formatting can be slightly off. At least that was the case in my tests with the pptx counterpart.

I reckon that if you need user interaction (a web page?) in order to create the document, you can build a small HTTP API around the library.

There is also: http://www.docx4java.org/trac/docx4j - which I have not tested yet.

  2. You can also go the C#/Redmond way: How do I create the .docx document with Microsoft.Office.Interop.Word?

The Interop approach (the second example in the first answer to the question above) gives the best results for formatting accuracy. Basically, when you open a file with Interop, alter it, and save it, it will look the same as before. But you cannot use this when interacting with a user, because it starts a separate MS Office process, and from my own experience I would not count on that. If you want to generate these files as a batch in a single user session, though, it will deliver a good result.

I cannot comment on the "OpenXML SDK" library described in the above SO question.