I'm using the code below to generate a Word document from HTML content with docx4j, and the document is generated successfully.

My requirement is to write the content with custom properties attached, so that the same document is easy to read back after a user has modified it.

String finalData = "<h1> Heading One </h1>" + "<h2> Heading two </h2>";

// Wrap the fragment in a minimal XHTML document before importing it.
String str1 = new StringBuffer()
                .append("<html><head><meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />")
                .append("<style type='text/css'> * { font-family: 'Arial Unicode MS'; } </style></head>")
                .append(finalData).append("</html>").toString();


        str1 = fixWhitespaceIssue(str1);
        str1 = cleanHTML(str1);
        
        System.out.println(str1);
        WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.createPackage();
        XHTMLImporterImpl XHTMLImporter = new XHTMLImporterImpl(wordMLPackage);
        XHTMLImporter.setRunFormatting(FormattingOption.CLASS_PLUS_OTHER);
        NumberingDefinitionsPart ndp = new NumberingDefinitionsPart();
        wordMLPackage.getMainDocumentPart().addTargetPart(ndp);
        ndp.unmarshalDefaultNumbering();
        wordMLPackage.getMainDocumentPart().getContent().addAll(XHTMLImporter.convert(str1, null));
        File exportFile = new File("test.docx");
        wordMLPackage.save(exportFile);

For example:

<h1> Heading One </h1> // I'll bind a custom property, c_property1, to the first element
<h2> Heading two </h2> // I'll bind a custom property, c_property2, to the second element

The generated document will be reviewed by a user, who may make some changes. The updated document then comes back to my code, so the code must be able to read the document using the custom properties.

To pull the updated values from the document, I want to supply just the custom property name and get back the value associated with it.

For c_property1, the code should return Heading One or the updated value, e.g. Updated Heading One.

For c_property2, the code should return Heading two or the updated value, e.g. Updated Heading two.

I recommend you look into the concept of Content controls. These can be linked to a Custom XML Part so that the control's content is saved to the XML file (Custom XML Part). Your code can then read the XML file. – Cindy Meister
@CindyMeister Thanks for the comment. Can you please share some examples or a reference link? – KhAn SaAb
Just to clarify, you can bind a content control to one of the standard properties parts, so no need for an additional custom xml part. I'll see if I can find an example over the next day or so. – JasonPlutext
@JasonPlutext Thanks, buddy. Awaiting your response with some examples or references. – KhAn SaAb

1 Answer

TLDR: you can use docx4j's FieldUpdater to update the document surface from your custom properties, but you'll need to write some code to put suitable DOCPROPERTY fields into the docx (i.e. in your case, after you've converted your XHTML to docx).

Content controls don't help for custom properties

To set up your docx, in Word (recent versions), first enable the Developer menu (if you haven't already done so).

Click on "XML Mapping Pane". The task pane which appears allows you to choose from "core" or "extended" properties.

Right click on the property of interest; "Insert Content Control" > "Plain Text".

You'll see this in your docx.

The XML Mapping Pane doesn't include "custom" properties though, so you can't readily add these to your document this way.

Plus, you are creating your docx from XHTML, so I guess what you need is a programmatic solution for converting some specific tags to content controls which are bound to the custom properties part.

The Java code to create a content control can be generated from a sample docx, either using the Docx4j Helper Word AddIn, or the docx4j webapp.

You have to bind your content control to a Custom XML part by ItemID. Certain custom xml parts are "well-known": https://msdn.microsoft.com/en-us/library/ff531265(v=office.12).aspx

And when I bind a core-property, it uses w:storeItemID="{6C3C8BC8-F283-45AE-878A-BAB7291924A1}"
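To make that binding concrete, here is a rough sketch of the markup involved, built as a plain string so it has no dependencies. The storeItemID is the one quoted above; the XPath, the prefix mappings, and the choice of the title property are my assumptions about what Word typically writes, and the class and method names are made up for illustration.

```java
class BoundContentControlSketch {
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    // storeItemID of the core-properties part, as quoted in this answer.
    static final String CORE_PROPS_STORE_ITEM_ID =
            "{6C3C8BC8-F283-45AE-878A-BAB7291924A1}";

    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Assumption: a plain-text content control bound to the core "title" property.
    static String titleContentControl(String currentText) {
        String prefixMappings =
                "xmlns:dc='http://purl.org/dc/elements/1.1/' "
              + "xmlns:cp='http://schemas.openxmlformats.org/package/2006/metadata/core-properties'";
        return "<w:sdt xmlns:w=\"" + W_NS + "\">"
             + "<w:sdtPr>"
             + "<w:dataBinding w:prefixMappings=\"" + prefixMappings + "\""
             + " w:xpath=\"/cp:coreProperties[1]/dc:title[1]\""
             + " w:storeItemID=\"" + CORE_PROPS_STORE_ITEM_ID + "\"/>"
             + "<w:text/>"
             + "</w:sdtPr>"
             + "<w:sdtContent><w:r><w:t>" + escapeXml(currentText) + "</w:t></w:r></w:sdtContent>"
             + "</w:sdt>";
    }

    public static void main(String[] args) {
        System.out.println(titleContentControl("Heading One"));
    }
}
```

The point is only that the binding is identified by w:storeItemID plus an XPath into the bound part.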

But there isn't a storeItemID for custom-properties?

https://social.msdn.microsoft.com/Forums/office/en-US/c7e66714-3224-4298-8673-1ce095db092a/how-to-create-databinding-between-custom-property-value-and-content-control-such-as-text?forum=oxmlsdk

You could try adding an itemProps part to your custom-properties part, but I doubt it would work!

So, if you really want to use custom properties via a content control approach, you'd need to modify docx4j a bit to bind those.

DOCPROPERTY field to the rescue

But there is a legacy approach available: you can use a DOCPROPERTY field which points to a custom property.
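As a minimal sketch of what such a field looks like in the document XML, the snippet below builds a simple field (w:fldSimple) as a string. It assumes the simple-field form rather than the complex w:fldChar form that Word often writes, and a property name without spaces (a name containing spaces would need quoting in the instruction). The class and method names are hypothetical, not docx4j API.

```java
class DocPropertyFieldSketch {
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Builds a simple field whose instruction points at the named custom
    // property; currentValue is the field result shown until fields are updated.
    static String docPropertyField(String propertyName, String currentValue) {
        String instr = " DOCPROPERTY " + propertyName + " \\* MERGEFORMAT ";
        return "<w:fldSimple xmlns:w=\"" + W_NS + "\" w:instr=\"" + escapeXml(instr) + "\">"
             + "<w:r><w:t>" + escapeXml(currentValue) + "</w:t></w:r>"
             + "</w:fldSimple>";
    }

    public static void main(String[] args) {
        System.out.println(docPropertyField("c_property1", "Heading One"));
    }
}
```

In the question's scenario, one such field would go into each heading paragraph, pointing at c_property1 and c_property2 respectively.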

And docx4j's DocPropertyResolver knows what to do with these. See FieldUpdater: https://github.com/plutext/docx4j/blob/master/src/main/java/org/docx4j/model/fields/FieldUpdater.java.
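One caveat for the round trip: when a reviewer edits the text of a field in Word, only the field result in the document changes, not the value stored in the custom properties part, and updating fields afterwards would replace the edited text with the stored property value. So to pick up the reviewer's edits you would read the field result itself. Below is a rough sketch of that, using only the JDK's XML parser on the main document XML. It assumes simple fields (w:fldSimple) and ignores complex w:fldChar fields; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class DocPropertyFieldReader {
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the visible text of the first simple DOCPROPERTY field that
    // names propertyName, or null if no such field exists.
    static String readFieldResult(String documentXml, String propertyName) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(
                new ByteArrayInputStream(documentXml.getBytes(StandardCharsets.UTF_8)));
        NodeList fields = doc.getElementsByTagNameNS(W_NS, "fldSimple");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            // Instruction looks like: DOCPROPERTY c_property1 \* MERGEFORMAT
            String[] tokens = field.getAttributeNS(W_NS, "instr").trim().split("\\s+");
            if (tokens.length >= 2 && tokens[0].equalsIgnoreCase("DOCPROPERTY")
                    && tokens[1].replace("\"", "").equals(propertyName)) {
                StringBuilder text = new StringBuilder();
                NodeList texts = field.getElementsByTagNameNS(W_NS, "t");
                for (int j = 0; j < texts.getLength(); j++) {
                    text.append(texts.item(j).getTextContent());
                }
                return text.toString();
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<w:document xmlns:w=\"" + W_NS + "\"><w:body><w:p>"
                + "<w:fldSimple w:instr=\" DOCPROPERTY c_property1 \\* MERGEFORMAT \">"
                + "<w:r><w:t>Updated Heading One</w:t></w:r></w:fldSimple>"
                + "</w:p></w:body></w:document>";
        System.out.println(readFieldResult(xml, "c_property1")); // prints: Updated Heading One
    }
}
```

In a real run, the documentXml string would come from the main document part of the returned docx (word/document.xml inside the zip).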