3
votes

I'm creating WordprocessingDocuments using Open XML (which works fine, and the produced Word document is exactly what I want). Now I'm trying to convert these newly created documents to HTML using the Open XML PowerTools. I'm new to this, so I'm hoping it's something stupid that I'm missing and that someone can point me in the right direction with the null reference errors I'm receiving.

This is the exact error:

System.NullReferenceException: Object reference not set to an instance of an object.
at OpenXmlPowerTools.HtmlConverter.ConvertToHtmlTransform(WordprocessingDocument wordDoc, HtmlConverterSettings settings, XNode node, Func`2 imageHandler)
at OpenXmlPowerTools.HtmlConverter.<>c__DisplayClass37.<ConvertToHtmlTransform>b__1d(XElement e)
at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
at System.Xml.Linq.XContainer.AddContentSkipNotify(Object content)
at System.Xml.Linq.XElement..ctor(XName name, Object content)
at OpenXmlPowerTools.HtmlConverter.ConvertToHtmlTransform(WordprocessingDocument wordDoc, HtmlConverterSettings settings, XNode node, Func`2 imageHandler)
at OpenXmlPowerTools.HtmlConverter.<>c__DisplayClass37.<ConvertToHtmlTransform>b__1c(XElement e)
at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
at System.Xml.Linq.XContainer.AddContentSkipNotify(Object content)
at System.Xml.Linq.XContainer.AddContentSkipNotify(Object content)
at System.Xml.Linq.XElement..ctor(XName name, Object[] content)
at OpenXmlPowerTools.HtmlConverter.ConvertToHtmlTransform(WordprocessingDocument wordDoc, HtmlConverterSettings settings, XNode node, Func`2 imageHandler)

I'm using exactly the same code that you can find on Eric White's blog.

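using System.IO;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;
using OpenXmlPowerTools;

public static void PrintHTML(string file)
{
    byte[] byteArray = File.ReadAllBytes(file);
    // Copy the bytes into an expandable MemoryStream so the document
    // can be opened for editing without touching the file on disk.
    using (MemoryStream memoryStream = new MemoryStream())
    {
        memoryStream.Write(byteArray, 0, byteArray.Length);
        using (WordprocessingDocument doc =
            WordprocessingDocument.Open(memoryStream, true))
        {
            HtmlConverterSettings settings = new HtmlConverterSettings()
            {
                //PageTitle = "some title"
            };
            XElement html = HtmlConverter.ConvertToHtml(doc, settings);

            // Verbatim string: backslashes are literal, so no doubling is needed.
            File.WriteAllText(@"C:\Temp\Test.html", html.ToStringNewLineOnAttributes());
        }
    }
}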

I know the code works because if I pass it a normal Word document that I haven't created, it converts to HTML fine. If I create a Word document using Open XML, then manually copy the contents into a new Word file, save it, and pass that through the conversion code, that works as well. So I'm thinking it must be something to do with the way I'm creating the Word document in Open XML initially. Maybe I'm not adding a part to the file that is required.

Using the Open XML SDK, I have compared a working and a non-working file, and they appear to have the same components/parts.

From the errors I've posted, does anyone have any idea where the problem could be, i.e., what is null? I can post the creation code for the Word document, but it's quite extensive and it might just confuse people more.


2 Answers

1
votes

I finally got to the bottom of this. I had to dig out the source code for the HtmlConverter in the Open XML PowerTools, and after some debugging I found that this line in the code was throwing the error:

Line 371:

styleId = (string)wordDoc.MainDocumentPart.StyleDefinitionsPart
          .GetXDocument().Root.Elements(W.style)
          .Where(e => (string)e.Attribute(W.type) == "paragraph" &&
          (string)e.Attribute(W._default) == "1")
          .FirstOrDefault().Attributes(W.styleId).FirstOrDefault();

Basically, in my debugging, the expression

(string)e.Attribute(W._default) 

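was returning True or False rather than "1". No style therefore passed the filter, FirstOrDefault() returned null, and the chained .Attributes(W.styleId) call threw the NullReferenceException.

So I changed the following lines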

 .Where(e => (string)e.Attribute(W.type) == "paragraph" &&
          (string)e.Attribute(W._default) == "1")

to

.Where(e => (string)e.Attribute(W.type) == "paragraph" && (
          (string)e.Attribute(W._default) == "1" || (string)e.Attribute(W._default) == "true"))

and it now works as expected.
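
For anyone who wants to reproduce the check outside the PowerTools source, here is a minimal sketch of the same idea using only LINQ to XML. The class and method names (DefaultStyleLookup, IsOn, FindDefaultParagraphStyleId) are hypothetical and not part of OpenXmlPowerTools, and the namespace is built by hand instead of using the library's W class. It accepts both "1" and "true" for w:default (a case-sensitive comparison, as in the fix above) and returns null instead of throwing when no default paragraph style is found.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DefaultStyleLookup
{
    // WordprocessingML main namespace (the "w:" prefix in styles.xml).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: true when the attribute value is "1" or "true".
    // Casting a null XAttribute to string yields null, so a missing
    // attribute simply counts as "not on".
    public static bool IsOn(XAttribute attribute)
    {
        var value = (string)attribute;
        return value == "1" || value == "true";
    }

    // Returns the styleId of the first paragraph style marked as default,
    // or null when no such style exists (instead of throwing).
    public static string FindDefaultParagraphStyleId(XDocument stylesXDoc)
    {
        if (stylesXDoc == null || stylesXDoc.Root == null)
            return null;

        return stylesXDoc.Root
            .Elements(W + "style")
            .Where(e => (string)e.Attribute(W + "type") == "paragraph"
                        && IsOn(e.Attribute(W + "default")))
            .Select(e => (string)e.Attribute(W + "styleId"))
            .FirstOrDefault();
    }
}
```

Projecting to the styleId with Select before calling FirstOrDefault is what avoids the null dereference: an empty sequence yields null rather than an exception.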

0
votes

I had the same issue: I was saving a reportbuilder report to OpenWordXML and could not convert the bytes to HTML.

I had to add the following line of code for it to work correctly with version 2.8.1.0:


private static IEnumerable<XElement> ParaStyleParaPropsStack(XDocument stylesXDoc,
    string paraStyleName, XElement para)
{
    if (stylesXDoc == null)
        yield break;
    var localParaStyleName = paraStyleName;
    while (localParaStyleName != null)
    {
        XElement paraStyle = stylesXDoc.Root.Elements(W.style).FirstOrDefault(s =>
            s.Attribute(W.type) != null &&
            s.Attribute(W.type).Value == "paragraph" &&
            s.Attribute(W.styleId).Value == localParaStyleName);

s.Attribute(W.type) != null && // the line that was added
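
As an alternative to an explicit null check, LINQ to XML lets you cast an XAttribute to string, and the cast returns null when the attribute is missing. The sketch below shows the same style lookup written that way. It is not the PowerTools code: ParagraphStyleLookup and FindParagraphStyle are hypothetical names, and the namespace is declared by hand so the example is self-contained.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class ParagraphStyleLookup
{
    // WordprocessingML main namespace (the "w:" prefix in styles.xml).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Finds the paragraph style with the given styleId, or returns null.
    // (string)attribute is null for a missing attribute, so styles that
    // lack w:type or w:styleId are skipped instead of causing a
    // NullReferenceException via .Value.
    public static XElement FindParagraphStyle(XDocument stylesXDoc, string styleName)
    {
        if (stylesXDoc == null || stylesXDoc.Root == null)
            return null;

        return stylesXDoc.Root
            .Elements(W + "style")
            .FirstOrDefault(s =>
                (string)s.Attribute(W + "type") == "paragraph" &&
                (string)s.Attribute(W + "styleId") == styleName);
    }
}
```

This also covers a style element that has no w:styleId attribute, which the one-line fix above would still trip over.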