4
votes

I am using Apache POI to replace words in a docx file. For normal paragraphs, I successfully used XWPFParagraph and XWPFRun to replace the words. Then I tried to replace words in a text box. I followed https://stackoverflow.com/a/25877256 to get the text in the text box, and I can print that text to the console. However, I failed to replace words in the text box. Here is some of my code:

    for (XWPFParagraph paragraph : doc.getParagraphs()) {
        XmlObject[] textBoxObjects = paragraph.getCTP().selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' declare namespace wps='http://schemas.microsoft.com/office/word/2010/wordprocessingShape' .//*/wps:txbx/w:txbxContent");
        for (int i = 0; i < textBoxObjects.length; i++) {
            XWPFParagraph embeddedPara = null;
            try {
                XmlObject[] paraObjects = textBoxObjects[i].selectChildren(
                    new QName("http://schemas.openxmlformats.org/wordprocessingml/2006/main", "p"));

                for (int j = 0; j < paraObjects.length; j++) {
                    embeddedPara = new XWPFParagraph(CTP.Factory.parse(paraObjects[j].xmlText()), paragraph.getBody());
                    List<XWPFRun> runs = embeddedPara.getRuns();
                    for (XWPFRun r : runs) {
                        String text = r.getText(0);
                        if (text != null && text.contains(someWords)) {
                            text = text.replace(someWords, "replaced");
                            r.setText(text, 0);
                        }
                    }
                }
            } catch (XmlException e) {
                // handle
            }
        }
    }

I think the problem is that I create a new XWPFParagraph embeddedPara, so the replacement happens in embeddedPara and not in the original paragraph. As a result, the words are still unchanged after I write the document to a file.

How can I read and replace the words in the text box without creating a new XWPFParagraph?

1
See stackoverflow.com/questions/35459386/…. The problem is not creating the new XWPFParagraph but creating a CTP that is independent of the document. Your XmlObject[] paraObjects is an array of XmlObjects, each of which should be an instance of CTP. So try: embeddedPara = new XWPFParagraph((CTP)paraObjects[j], paragraph.getBody());. Not tested, which is why this is a comment and not an answer. - Axel Richter
@AxelRichter I tried embeddedPara = new XWPFParagraph((CTP)paraObjects[j], paragraph.getBody()); and it gives an error: Cannot cast org.apache.xmlbeans.impl.values.XmlAnyTypeImpl to org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP. I have read your answer before, but I still don't know how to modify my code. - KC L
You are right. The problem is bigger. See my answer. - Axel Richter

1 Answer

8
votes

The problem occurs because Word text boxes may be contained in several different XmlObjects, depending on the Word version. Those XmlObjects may also be in very different namespaces. So selectChildren cannot follow the namespace route, and it returns an XmlAnyTypeImpl.

What all text box implementations have in common is that their runs are in the path .//*/w:txbxContent/w:p/w:r. So we can use an XmlCursor that selects that path. Then we collect all selected XmlObjects in a List<XmlObject>. Then we parse CTRs from those objects; these are, of course, only CTRs outside the document context. But we can create XWPFRuns from them, do the replacement there, and then set the XML content of those XWPFRuns back to the objects. After this, the objects contain the replaced content.

Example:


import java.io.FileOutputStream;
import java.io.FileInputStream;

import org.apache.poi.xwpf.usermodel.*;

import org.apache.xmlbeans.XmlObject;
import org.apache.xmlbeans.XmlCursor;

import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTR;

import java.util.List;
import java.util.ArrayList;

public class WordReplaceTextInTextBox {

 public static void main(String[] args) throws Exception {

  XWPFDocument document = new XWPFDocument(new FileInputStream("WordReplaceTextInTextBox.docx"));

  String someWords = "TextBox";

  for (XWPFParagraph paragraph : document.getParagraphs()) {
   // select all runs inside any text box content below this paragraph
   XmlCursor cursor = paragraph.getCTP().newCursor();
   cursor.selectPath("declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' .//*/w:txbxContent/w:p/w:r");

   // collect the selected objects first, then modify them
   List<XmlObject> ctrsintxtbx = new ArrayList<XmlObject>();

   while(cursor.hasNextSelection()) {
    cursor.toNextSelection();
    XmlObject obj = cursor.getObject();
    ctrsintxtbx.add(obj);
   }
   for (XmlObject obj : ctrsintxtbx) {
    // parse a CTR copy that is outside the document context
    CTR ctr = CTR.Factory.parse(obj.xmlText());
    //CTR ctr = CTR.Factory.parse(obj.newInputStream());
    // wrap the copy in a XWPFRun only to use its text methods
    XWPFRun bufferrun = new XWPFRun(ctr, (IRunBody)paragraph);
    String text = bufferrun.getText(0);
    if (text != null && text.contains(someWords)) {
     text = text.replace(someWords, "replaced");
     bufferrun.setText(text, 0);
    }
    // write the (possibly changed) XML back into the object in the document
    obj.set(bufferrun.getCTR());
   }
  }

  FileOutputStream out = new FileOutputStream("WordReplaceTextInTextBoxNew.docx");
  document.write(out);
  out.close();
  document.close();
 }
}
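
The copy, edit, write-back idea used above can also be seen in isolation. The following is only a rough analogy, not POI code: it uses the JDK's built-in DOM API instead of XMLBeans, and the class name DetachedCopyDemo, the method replaceViaCopy, and the simplified namespace-free element names p, r and t are all made up for illustration. It shows that editing a re-parsed copy leaves the original tree untouched unless the copy is explicitly written back, which is the role obj.set(...) plays in the answer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DetachedCopyDemo {

    // Parse an XML string into a new, independent DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Serialize a node to an XML string without the XML declaration.
    static String serialize(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    // Edit a re-parsed copy of the first <t> element (hypothetical stand-in
    // for a text run); write it back into the original only if requested.
    static String replaceViaCopy(String xml, String from, String to, boolean writeBack)
            throws Exception {
        Document original = parse(xml);
        Element target = (Element) original.getElementsByTagName("t").item(0);

        // Serializing and re-parsing yields a copy outside the original document.
        Element copy = parse(serialize(target)).getDocumentElement();
        copy.setTextContent(copy.getTextContent().replace(from, to));

        if (writeBack) {
            // Without this step the change would exist only in the copy.
            Node imported = original.importNode(copy, true);
            target.getParentNode().replaceChild(imported, target);
        }
        return serialize(original);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<p><r><t>Text in TextBox</t></r></p>";
        System.out.println(replaceViaCopy(xml, "TextBox", "replaced", false));
        System.out.println(replaceViaCopy(xml, "TextBox", "replaced", true));
    }
}
```

With writeBack set to false, the serialized original still contains the old text, which mirrors the behavior described in the question. With writeBack set to true, the replacement shows up in the original tree.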
