I have been using Apache POI to manipulate Microsoft Word .docx files — ie open a document that was originally created in Microsoft Word, modify it, save it to a new document.
I notice that new paragraphs created by Apache POI are missing a Revision Save ID, often known as an RSID or rsidR. This is used by Word to identify changes made to a document in one session, say between saves. It is optional — users could turn it off in Microsoft Word if they want — but in reality almost everyone has it on so almost every document is fulls of RSIDs. Read this excellent explanation of RSIDs for more about that.
In a Microsoft Word document, word/document.xml
contains paragraphs like this:
<w:p w:rsidR="007809A1" w:rsidRDefault="007809A1" w:rsidP="00191825">
<w:r>
<w:t>Paragraph of text here.</w:t>
</w:r>
</w:p>
However the same paragraph created by POI will look like this in word/document.xml
:
<w:p>
<w:r>
<w:t>Paragraph of text here.</w:t>
</w:r>
</w:p>
I've figured out that I can force POI to add an RSID to each paragraph using code like this:
byte[] rsid = ???;
XWPFParagraph paragraph = document.createParagraph();
paragraph.getCTP().setRsidR(rsid);
paragraph.getCTP().setRsidRDefault(rsid);
However I don't know how I should be generating the RSIDs.
Does POI have a way or generate and/or keep track of RSIDs? If not, is there any way I can ensure that an RSID that I generate doesn't conflict with one that's already in the document?