2
votes

The input is a set of Excel files whose cells may contain some basic HTML formatting such as <b>, <br>, <h2>.

I want to read the strings and insert the text as formatted text into Word documents, i.e. <b>Foo</b> should be shown as a bold string in Word.

I don't know which tags are used, so I need a generic solution; a find/replace approach does not work for me.

I found a solution from January 2011 that uses the WebBrowser component: the HTML is converted to RTF, and the RTF is inserted into Word. Is there a better solution today?

Using a commercial component is fine for me.

Update

I came across Matthew Manela's MarkupConverter class. It converts HTML to RTF. I then use the clipboard to insert the snippet into the Word file:

// rtf contains the converted html string using MarkupConverter
Clipboard.SetText(rtf, TextDataFormat.Rtf);
// objTable is a table in my word file
objTable.Cell(1, 1).Range.Paste();

This works, but will copy/pasting up to a few thousand strings using the clipboard break anything?

3
Do you need to use Office Interop, or would OpenXML be fine too? – flipchart
I need to insert my strings into Word tables and measure the height of the table cells. Does this work with OpenXML, too? – herrjeh42
OpenXML can be used to manipulate docx files, including inserting HTML (into tables). After the document is built you can measure the heights using Office Interop. I'm not sure if OpenXML would be able to give you the correct height. I'll put up an example later today. – flipchart
That sounds better than my copy-paste approach :-) – herrjeh42

3 Answers

3
votes

You will need the OpenXML SDK in order to work with OpenXML. It can be tricky to get into, but it is very powerful, and a lot more stable and reliable than Office Automation or Interop.

The following will open a document, create an AltChunk part, add the HTML to it, and embed it into the document. For a broader overview of AltChunk, see Eric White's blog.

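// Namespaces needed by the snippet below
using System.IO;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

using (var wordDoc = WordprocessingDocument.Open("DocumentName.docx", true))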
{
    var altChunkId = "AltChunkId1";
    var mainPart = wordDoc.MainDocumentPart;

    var chunk = mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.Html, altChunkId);
    using (var textStream = new MemoryStream())
    {
        var html = "<html><body>...</body></html>";
        var data = Encoding.UTF8.GetBytes(html);
        textStream.Write(data, 0, data.Length);
        textStream.Position = 0;
        chunk.FeedData(textStream);
    }

    var altChunk = new AltChunk();
    altChunk.Id = altChunkId;
    mainPart.Document.Body.InsertAt(altChunk, 0);
    mainPart.Document.Save();
}

For your case, you will want to find (or build) the table you need and insert the AltChunk there instead of at the first position in the body. Note that the HTML you insert into the Word document must be a full HTML document, with an <html> tag. I'm not sure if <body> is required, but it doesn't hurt. If you just have HTML-formatted text, simply wrap the text in these tags and insert it into the document.
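The wrapping step can be sketched as a small helper. This is only an illustration: `HtmlChunk` and `WrapFragment` are hypothetical names, not part of the OpenXML SDK, and the `<meta charset>` declaration is an assumption added so that the bytes produced by `Encoding.UTF8.GetBytes` are more likely to be read as UTF-8.

```csharp
using System;

public static class HtmlChunk
{
    // Hypothetical helper (not part of the OpenXML SDK): turns an HTML
    // fragment from an Excel cell into a complete HTML document.
    public static string WrapFragment(string fragment)
    {
        if (fragment == null) throw new ArgumentNullException(nameof(fragment));

        // Leave the input alone if it already looks like a full document.
        if (fragment.IndexOf("<html", StringComparison.OrdinalIgnoreCase) >= 0)
            return fragment;

        // Assumption: declaring the charset helps when the text is later
        // encoded as UTF-8. The fragment is not escaped, because it is
        // expected to contain HTML tags already.
        return "<html><head><meta charset=\"utf-8\"></head><body>"
             + fragment
             + "</body></html>";
    }
}
```

The result would replace the hard-coded `html` string in the snippet above, e.g. `var html = HtmlChunk.WrapFragment(cellText);`, where `cellText` stands for whatever string was read from the Excel cell.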

It seems that you will need to use Office Automation/Interop to get the table heights. See this answer which says that the OpenXML SDK does not update the heights, only Word does.

2
votes

Use this code; it works:

Response.AppendHeader("content-disposition", "attachment;filename=FileEName.xls");
Response.Charset = "";
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.ContentType = "application/vnd.ms-excel";
this.EnableViewState = false;
//Response.Write("Your HTML Code");
Response.Write("<table border='1 px solid'><tr><th>sfsd</th><th>sfsdfssd</th></tr><tr>" +
    "<td>ssfsdf</td><td><table border='1 px solid'><tr><th>sdf</th><th>hhsdf</th></tr><tr>" +
    "<td>sdfds</td><td>sdhjhfds</td></tr></table></td></tr></table>");
Response.End();

1
votes

Why not let Word do its own translation, since it understands HTML?

  1. Read your Excel cells.
  2. Write your values into an HTML text file as if it were a Word document.
  3. Open Word and let it read that HTML file.
  4. Instruct Word to save the document as a new Word document (if that is required).
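Step 2 could look roughly like the sketch below. It is a minimal illustration, not a tested recipe: `HtmlFileBuilder` and `BuildHtml` are hypothetical names, and the one-column table layout is an assumption based on the question's mention of Word tables.

```csharp
using System.Collections.Generic;
using System.Text;

public static class HtmlFileBuilder
{
    // Hypothetical helper: puts each cell value into its own row of a
    // one-column HTML table and returns the complete HTML document.
    public static string BuildHtml(IEnumerable<string> cellValues)
    {
        var sb = new StringBuilder();
        sb.Append("<html><head><meta charset=\"utf-8\"></head><body><table>");
        foreach (var value in cellValues)
        {
            // Values are inserted unescaped on purpose:
            // they are expected to contain HTML tags already.
            sb.Append("<tr><td>").Append(value).Append("</td></tr>");
        }
        sb.Append("</table></body></html>");
        return sb.ToString();
    }
}
```

The string can then be written to disk with `File.WriteAllText(path, html, new UTF8Encoding(false))`. Steps 3 and 4 would go through Word's object model, presumably `Documents.Open` on the HTML file followed by `Document.SaveAs2` with `WdSaveFormat.wdFormatXMLDocument`; that part needs Word installed and is not shown here.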