I want to do replacements in an MS Word (.docx) document using regular expressions (Java regex).
Example:
…, с одной стороны, и %SOME_TEXT% именуемое в дальнейшем «Заказчик», в
лице %SOME_TEXT% действующего на основании %SOME_TEXT% с другой стороны,
заключили настоящий Договор о нижеследующем: …
I tried to find the placeholders (like %SOME_TEXT%) with Apache POI (XWPF) and replace their text, but the replacement is not guaranteed to work, because the placeholder text is split across several runs. Printing each run with System.out.println(run.getText(0)) gives something like this:
…
, с одной стороны, и
%
SOME_TEXT
%
именуемое
в дальнейшем «Заказчик», в лице
%
SOME
_
TEXT
%
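Code example:
import java.io.File;
import java.io.FileInputStream;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

FileInputStream fis = new FileInputStream(new File("document.docx"));
XWPFDocument document = new XWPFDocument(fis);
List<XWPFParagraph> paragraphs = document.getParagraphs();
paragraphs.forEach(para -> {
    para.getRuns().forEach(run -> {
        // getText(0) is null when the run has no text at position 0
        String text = run.getText(0);
        if (text != null) {
            System.out.println(text);
            // text replacement process
            // run.setText(newText, 0);
        }
    });
});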
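One workaround I can imagine, as a rough sketch only: join the run texts of a paragraph into one string, run the regex over the joined text, and write the result back so that each replacement lands in the run where its placeholder starts. The sketch below models run texts as plain strings so it does not depend on POI; with XWPF the strings would presumably come from run.getText(0) (with null mapped to "") and go back via run.setText(text, 0). The class name RunPlaceholderReplacer, the method replaceAcrossRuns and the pattern %([A-Z_]+)% are made up for illustration and are not part of POI or docx4j.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: not part of Apache POI or docx4j.
final class RunPlaceholderReplacer {

    // Assumed placeholder syntax: %NAME% with upper-case letters and underscores.
    private static final Pattern PLACEHOLDER = Pattern.compile("%([A-Z_]+)%");

    // Takes the texts of consecutive runs (no null entries) and returns the
    // new text for each run; the number of runs stays the same.
    static List<String> replaceAcrossRuns(List<String> runs, Map<String, String> values) {
        StringBuilder joined = new StringBuilder();
        int[] starts = new int[runs.size()]; // offset of each run in the joined text
        List<StringBuilder> out = new ArrayList<>();
        for (int i = 0; i < runs.size(); i++) {
            starts[i] = joined.length();
            joined.append(runs.get(i));
            out.add(new StringBuilder());
        }

        Matcher m = PLACEHOLDER.matcher(joined);
        int pos = 0; // first character of the joined text not yet written back
        while (m.find()) {
            String replacement = values.get(m.group(1));
            if (replacement == null) {
                continue; // unknown name: the text is copied unchanged later
            }
            copy(joined, pos, m.start(), starts, out);
            // The replacement goes into the run that holds the opening '%'.
            out.get(runIndexAt(starts, m.start())).append(replacement);
            pos = m.end();
        }
        copy(joined, pos, joined.length(), starts, out);

        List<String> result = new ArrayList<>();
        for (StringBuilder sb : out) {
            result.add(sb.toString());
        }
        return result;
    }

    // Copies characters [from, to) back into the runs they came from.
    private static void copy(CharSequence joined, int from, int to,
                             int[] starts, List<StringBuilder> out) {
        for (int i = from; i < to; i++) {
            out.get(runIndexAt(starts, i)).append(joined.charAt(i));
        }
    }

    // Index of the last run whose start offset is <= offset.
    private static int runIndexAt(int[] starts, int offset) {
        int r = 0;
        while (r + 1 < starts.length && starts[r + 1] <= offset) {
            r++;
        }
        return r;
    }
}
```

For the run texts "%", "SOME", "_", "TEXT", "%" and the mapping SOME_TEXT -> XXXX, this should leave "XXXX" in the first of those runs and empty strings in the others, so the replacement takes the formatting of the run where the placeholder starts. Whether that formatting trade-off is acceptable depends on the document.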
I have found many similar questions (like "Replacing a text in Apache POI XWPF"), but did not find an answer to my problem (the answer to "Seperated text line in Apache POI XWPFRun object" offers an inconvenient solution).
I also tried docx4j with the example from "docx4j find and replace", but docx4j behaves similarly.
For docx4j, see stackoverflow.com/questions/17093781/… – JasonPlutext
I tried docx4j's documentPart.variableReplace(mappings); but the replacement is not guaranteed (plutext/docx4j).
Did you use VariablePrepare? stackoverflow.com/a/17143488/1031689 – JasonPlutext
Yes, with no result:
import java.io.File;
import java.util.HashMap;
import org.docx4j.model.datastorage.migration.VariablePrepare;
import org.docx4j.openpackaging.packages.WordprocessingMLPackage;

WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new File("test.docx"));
HashMap<String, String> mappings = new HashMap<>();
VariablePrepare.prepare(wordMLPackage); // see notes
mappings.put("SOME_TEXT", "XXXX");
wordMLPackage.getMainDocumentPart().variableReplace(mappings);
wordMLPackage.save(new File("out.docx"));
Input/output text:
Input:
…, с одной стороны, и ${SOME_TEXT} именуемое в дальнейшем «Заказчик» ...
Output:
…, с одной стороны, и SOME_TEXT именуемое в дальнейшем «Заказчик» ...
To see your runs after VariablePrepare, turn on INFO level logging for VariablePrepare, or just
System.out.println(wordMLPackage.getMainDocumentPart().getXML())
I understand that the placeholders were split into different runs, but the main question of this topic is how to keep a placeholder from being split into different runs. I printed System.out.println(wordMLPackage.getMainDocumentPart().getXML()) and saw:
<w:r>
    <w:t xml:space="preserve">, с одной стороны, и </w:t>
</w:r>
<w:r><w:t>$</w:t></w:r>
<w:r><w:t>{</w:t></w:r>
<w:r>
    <w:rPr>
        <w:rFonts w:eastAsia="Times-Roman"/>
        <w:color w:val="000000" w:themeColor="text1"/>
        <w:lang w:val="en-US"/>
    </w:rPr>
    <w:t>SOME</w:t> <!-- First part of template: "SOME" -->
</w:r>
<w:r>
    <w:rPr>
        <w:rFonts w:eastAsia="Times-Roman"/>
        <w:color w:val="000000" w:themeColor="text1"/>
    </w:rPr>
    <w:t>_</w:t> <!-- Second part of template: "_" -->
</w:r>
<w:r>
    <w:rPr>
        <w:rFonts w:eastAsia="Times-Roman"/>
        <w:color w:val="000000" w:themeColor="text1"/>
        <w:lang w:val="en-US"/>
    </w:rPr>
    <w:t>TEXT</w:t> <!-- Third part of template: "TEXT" -->
</w:r>
<w:r>
    <w:rPr>
        <w:rFonts w:eastAsia="Times-Roman"/>
        <w:color w:val="000000" w:themeColor="text1"/>
    </w:rPr>
    <w:t>}</w:t>
</w:r>
So the placeholder is spread across different XML tags, and I do not understand why.
Please help me find a convenient approach to replacing the text.
I tried documentPart.variableReplace(mappings); but the replacement is not guaranteed (see question updates). – kozmo