3
votes

I have a file, "template.docx", that I would like to contain placeholders (e.g. [serial number]) that can be replaced with a string or maybe a table. I am using Apache POI and no, I cannot use docx4j.

Is there a way to have the program iterate over all occurrences of "[serial number]" and replace them with a string? Many of these tags will be inside a large table, so is there an Apache POI equivalent of pressing Ctrl+F in Word and using "Replace All"?

Any suggestions would be appreciated, thanks

2
Don't know if it's possible with Apache POI, but docxtemplater provides a command line interface that does exactly that: github.com/edi9999/docxtemplater and javascript-ninja.fr/docxgenjs/examples/demo.html for a demo. - edi9999
There is also the YARG template engine, based on POI: github.com/Haulmont/yarg/wiki - Konstantin V. Salikhov

2 Answers

3
votes

An XWPFDocument (docx) has different kinds of sub-elements, such as XWPFParagraph, XWPFTable, XWPFNumbering, etc.

Once you create an XWPFDocument object via:

XWPFDocument document = new XWPFDocument(inputStream);

you can iterate through all of its paragraphs:

Iterator<XWPFParagraph> paragraphs = document.getParagraphsIterator();

When you iterate through the paragraphs, each paragraph gives you multiple XWPFRuns, which are blocks of text with the same styling. Sometimes text with the same styling is still split across several XWPFRuns; in that case you should look into this question to avoid splitting your runs. Doing so makes it possible to identify your placeholders without merging multiple runs within the same paragraph.

From here on, assume that your placeholder is not split across multiple runs. If that is the case, you can iterate over the XWPFRuns of each paragraph and look for text matching your placeholder. Something like this will help:

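// 'expression' is a precompiled java.util.regex.Pattern for the placeholder;
// 'xwpfParagraphElement' is the current element from the paragraph iterator.
XWPFParagraph para = (XWPFParagraph) xwpfParagraphElement;
for (XWPFRun run : para.getRuns()) {
    // getText(0) returns null when the run has no text at position 0
    String text = run.getText(0);
    if (text != null) {
        Matcher expressionMatcher = expression.matcher(text);
        // groupCount() > 0 means the pattern defines at least one capturing group
        if (expressionMatcher.find() && expressionMatcher.groupCount() > 0) {
            System.out.println("Expression Found...");
        }
    }
}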

Here expressionMatcher is a Matcher based on a regular expression for a particular placeholder. Try a regex that also matches optional text before and after your placeholder, e.g. \([]*)(PlaceHolderGroup)([]*)^; in my experience that works best.
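As a rough illustration of the matching step, here is a minimal sketch in plain Java. It assumes placeholders are written in square brackets, as in the question. The class name PlaceholderReplacer, the method replaceAll, and the bracket pattern are my own assumptions for illustration; none of them come from Apache POI. The returned string is what you would then write back into the run.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderReplacer {
    // Assumed placeholder syntax: [name], e.g. [serial number]
    private static final Pattern PLACEHOLDER = Pattern.compile("\\[([^\\[\\]]+)\\]");

    // Replaces every known [name] in 'text' with its value from 'values'.
    public static String replaceAll(String text, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String key = m.group(1);
            // Placeholders without a mapping are left untouched
            String replacement = values.getOrDefault(key, m.group());
            // quoteReplacement keeps '$' and '\' in the value literal
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Capturing the name inside the brackets lets a single pattern serve every placeholder, so the text of each run only needs to be scanned once.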

Once you find the right XWPFRun, extract the text of interest from it and build the replacement text, which should be easy enough. Then replace the previous text in this particular run with the new text:

run.setText(text, 0);
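Since the question mentions that many tags sit inside a large table, note that the document-level paragraph iterator only covers body paragraphs; paragraphs inside table cells have to be reached through the tables. The following is a minimal sketch of a "replace all" that walks both, assuming (as above) that each placeholder lives entirely inside one run. The class name DocxReplaceAll and its method names are hypothetical; the POI calls used are getParagraphs(), getTables(), getRows(), getTableCells(), getRuns(), getText(0) and setText(text, 0). Headers and footers are not handled here.

```java
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
import org.apache.poi.xwpf.usermodel.XWPFTable;
import org.apache.poi.xwpf.usermodel.XWPFTableCell;
import org.apache.poi.xwpf.usermodel.XWPFTableRow;

public class DocxReplaceAll {

    // Replaces 'placeholder' with 'value' in body paragraphs and in all tables.
    public static void replaceInDocument(XWPFDocument doc, String placeholder, String value) {
        replaceInParagraphs(doc.getParagraphs(), placeholder, value);
        replaceInTables(doc.getTables(), placeholder, value);
    }

    private static void replaceInTables(List<XWPFTable> tables, String placeholder, String value) {
        for (XWPFTable table : tables) {
            for (XWPFTableRow row : table.getRows()) {
                for (XWPFTableCell cell : row.getTableCells()) {
                    replaceInParagraphs(cell.getParagraphs(), placeholder, value);
                    // A cell can itself contain tables, so recurse
                    replaceInTables(cell.getTables(), placeholder, value);
                }
            }
        }
    }

    private static void replaceInParagraphs(List<XWPFParagraph> paragraphs, String placeholder, String value) {
        for (XWPFParagraph para : paragraphs) {
            for (XWPFRun run : para.getRuns()) {
                String text = run.getText(0);
                // Only the text is changed; the run list itself is not modified
                if (text != null && text.contains(placeholder)) {
                    run.setText(text.replace(placeholder, value), 0);
                }
            }
        }
    }
}
```

A typical call would be DocxReplaceAll.replaceInDocument(document, "[serial number]", "A-123") followed by document.write(outputStream) to save the result.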

If you were to replace this whole XWPFRun with a completely new XWPFRun, or perhaps insert a new paragraph or table after the paragraph owning this run, you would probably run into a few problems: A. a ConcurrentModificationException, which means you cannot modify the list (of XWPFRuns) you are iterating over, and B. finding the position at which to insert the new element. To resolve these issues, keep a List<XWPFParagraph> holding the paragraphs after which a new element is to be inserted. Once you have that list, iterate over it and, for each paragraph, simply get a cursor and insert the new element at that cursor:

for (XWPFParagraph para: paras) {
    XmlCursor cursor = (XmlCursor) para.getCTP().newCursor();
    XWPFTable newTable = para.getBody().insertNewTbl(cursor);
    //Generate your XWPF table based on what's inside para with your own logic
}
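To illustrate why the targets are collected first and the insertions done afterwards, here is a small plain-Java sketch that is independent of POI. The list of strings merely stands in for a list of paragraphs or runs, and the class DeferredInsert and its method are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class DeferredInsert {

    // Returns a copy of 'items' with 'addition' inserted after every element containing 'marker'.
    public static List<String> insertAfterMatches(List<String> items, String marker, String addition) {
        // Pass 1: record the matching positions without modifying anything
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            if (items.get(i).contains(marker)) {
                hits.add(i);
            }
        }
        // Pass 2: insert from the end so that earlier indexes stay valid
        List<String> result = new ArrayList<>(items);
        for (int k = hits.size() - 1; k >= 0; k--) {
            result.add(hits.get(k) + 1, addition);
        }
        return result;
    }
}
```

Adding elements inside a for-each loop over the same ArrayList would instead fail with a ConcurrentModificationException on the next iteration step, which is the problem described above.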

To create an XWPFTable, read this.

Hope this helps someone.

-1
votes
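
import java.io.File;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.docx4j.XmlUtils;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;
import org.docx4j.wml.Text;

// Note: this uses docx4j rather than Apache POI. The class name is arbitrary.
public class ReplaceTextNodes {

    // Text nodes are w:t elements in the Word document
    private static final String XPATH_TO_SELECT_TEXT_NODES = "//w:t";

    public static void main(String[] args) {
        try {
            // Open the input file
            File input = new File("D:\\temp\\test.docx");
            WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(input);

            // Build a list of "text" elements
            List<Object> texts = wordMLPackage.getMainDocumentPart()
                    .getJAXBNodesViaXPath(XPATH_TO_SELECT_TEXT_NODES, true);

            // Text to look for, mapped to its replacement
            Map<String, String> mappings = new HashMap<String, String>();
            mappings.put("1", "one");
            mappings.put("2", "two");

            // Loop through all "text" elements
            for (Object obj : texts) {
                // unwrap() strips the JAXBElement wrapper if there is one
                Text text = (Text) XmlUtils.unwrap(obj);
                String textToReplace = text.getValue();
                // Only a text node whose whole value equals a key is replaced
                if (mappings.containsKey(textToReplace)) {
                    text.setValue(mappings.get(textToReplace));
                }
            }

            wordMLPackage.save(new File("D:/temp/forPrint.docx")); // your path
        } catch (Exception e) {
            // Report the failure instead of silently swallowing it
            e.printStackTrace();
        }
    }
}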