1
votes

In a C# VSTO add-in project, we add content controls to a Word document to keep track of the document's structure. We use content controls because they can be nested, which lets us represent the different elements of the document. The nesting is basically like a book with elements on different levels: chapter, subchapter, paragraph. We need to preserve this structure so that we can export it to a specific XML format, which we validate against an XSD to check the structure of the document.

Everything works fine with the content controls except when we have to handle a big document that needs many content controls. I'm talking about more than 2000 content controls, so I realize that is a lot for Word to handle. In that case Word becomes very slow: for instance, scrolling all the way to the bottom of the document takes a long time while Word reports that it is repaginating and checking spelling. Sometimes Word even crashes when opening such a document.

I already tried removing the undo information from the document, because I read somewhere that it can slow Word down with very big documents. The document size did shrink a bit after that, but the performance problem persists. Is there anything else I can do to speed this up, or are content controls simply a no-go when this many are needed (i.e. > 500 content controls)?

And if content controls are a no-go, are there any alternatives for keeping track of the structure of the document? I've tried using styles, but that way you lose the nesting information of the individual elements, so the document becomes much harder to parse. I also tried putting bookmarks at the beginning of every grouping element, but I noticed that bookmarks can be deleted while the user is typing.

Any ideas, hints and tips are welcome. Thanks in advance!

Ruben.

2 Answers

0
votes

Try using http://docx.codeplex.com/. Then you don't even need to have MS Word installed.
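To illustrate the "without Word" idea, here is a minimal sketch that reads the nesting of content controls straight from the .docx package. It does not use the library linked above; it uses only `System.IO.Compression` and `System.Xml.Linq`. The class and method names (`SdtStructure`, `LoadBody`, `ExportDocument`) are made up for this example. It assumes that each content control carries a tag that is a valid XML element name such as `chapter`, and that the main document part is stored at `word/document.xml`, which is the usual location but not guaranteed.

```csharp
using System.Collections.Generic;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: names are illustrative, not part of any library.
public static class SdtStructure
{
    // WordprocessingML main namespace (w:sdt, w:sdtPr, w:tag, ...).
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Loads w:body from the package. Assumes the main part is
    // word/document.xml (the usual, but not guaranteed, location).
    public static XElement LoadBody(string docxPath)
    {
        using (var zip = ZipFile.OpenRead(docxPath))
        using (var stream = zip.GetEntry("word/document.xml").Open())
            return XDocument.Load(stream).Root.Element(W + "body");
    }

    // Yields the content controls nested most directly under 'parent',
    // looking through intermediate elements such as paragraphs or tables.
    static IEnumerable<XElement> ChildControls(XElement parent)
    {
        foreach (var child in parent.Elements())
        {
            if (child.Name == W + "sdt")
                yield return child;
            else
                foreach (var nested in ChildControls(child))
                    yield return nested;
        }
    }

    // Converts one w:sdt into an element named after its tag.
    // Assumption: the tag value is a valid XML element name.
    static XElement Export(XElement sdt)
    {
        string tag = (string)sdt.Element(W + "sdtPr")
                               ?.Element(W + "tag")
                               ?.Attribute(W + "val") ?? "untagged";
        var node = new XElement(tag);
        var content = sdt.Element(W + "sdtContent");
        if (content != null)
            node.Add(ChildControls(content).Select(Export));
        return node;
    }

    // Builds a tree that mirrors the nesting of the content controls.
    public static XElement ExportDocument(XElement body)
    {
        return new XElement("document", ChildControls(body).Select(Export));
    }
}
```

Calling `SdtStructure.ExportDocument(SdtStructure.LoadBody(path))` should return a tree that mirrors the nesting, for example a `chapter` element containing `subchapter` elements, which could then be mapped to the target XML format and validated against the XSD. Text content is ignored here to keep the sketch short, and a tag that is not a valid XML name would make the `XElement` constructor throw.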

0
votes

If you are not using the Tag property of the content controls, have you looked at using merge fields instead? Depending on how you are processing the document with the content controls, they could give you the same functionality with much better performance. Merge fields require less memory and are populated a lot quicker than content controls.