Context: MS Word automation with interop, C#, a document with 2000 sections and 3 headers.

Problem: efficient processing of headers and footers (e.g. text search & replace in a header).

It seems the way to process headers (or footers) in a Word document is something like this:

foreach (Microsoft.Office.Interop.Word.Section section in theDoc.Sections)
{
   foreach (Microsoft.Office.Interop.Word.HeaderFooter header in section.Headers)
   {
      //processRange(theDoc, header.Range);
   }
}

The problem is that the loop runs thousands of times for the document I ran into, even though it contains only a few distinct headers. Since sections can have different headers, I obviously can't just stop after the first hit on a header. But in this case that is exactly what I want, since there are only 3 headers, not 6000.

I was thinking about what makes a header unique, so I could keep keys in a HashSet and skip any header that is already there.

What is that key, if there is one? range.Start + range.End + range.StoryType (range of the HeaderFooter)? range.Text? Something else?

Is there perhaps a better approach to avoid the redundancy?

Thanks for your help.

-Cristian

1 Answer

I ended up with a hybrid solution: use section mode if there are only one or a few sections, and story-range mode if there are many sections. Based on posts on the internet, section mode actually seems to be faster if you have just one section.

The section mode is described in the question. The story range mode is described in these posts:

https://wls.wwco.com/blog/2010/07/03/find-and-replace-in-word-using-c-net/
http://word.mvps.org/faqs/customization/ReplaceAnywhere.htm
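
For illustration, here is a rough sketch of how the hybrid could look. It is not the exact code I used: the class name, the processRange callback and the sectionThreshold parameter (with its arbitrary default of 10) are made up for this example. The story-range part follows the StoryRanges / NextStoryRange pattern from the posts above; as far as I understand it, that chain only visits the header and footer stories Word actually keeps, not one per section, but I have not verified this for every kind of document.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

static class HeaderFooterProcessor
{
    // Hybrid entry point. sectionThreshold is a hypothetical tuning knob;
    // the default of 10 is arbitrary and should be measured on real documents.
    public static void Process(Word.Document doc, Action<Word.Range> processRange,
                               int sectionThreshold = 10)
    {
        if (doc.Sections.Count <= sectionThreshold)
            ProcessBySections(doc, processRange);
        else
            ProcessByStoryRanges(doc, processRange);
    }

    // Section mode: the loop from the question, extended to footers.
    static void ProcessBySections(Word.Document doc, Action<Word.Range> processRange)
    {
        foreach (Word.Section section in doc.Sections)
        {
            foreach (Word.HeaderFooter header in section.Headers)
                processRange(header.Range);
            foreach (Word.HeaderFooter footer in section.Footers)
                processRange(footer.Range);
        }
    }

    // Story-range mode: walk each story type and follow its NextStoryRange chain.
    static void ProcessByStoryRanges(Word.Document doc, Action<Word.Range> processRange)
    {
        foreach (Word.Range story in doc.StoryRanges)
        {
            Word.Range current = story;
            while (current != null)
            {
                if (IsHeaderOrFooter(current.StoryType))
                    processRange(current);
                current = current.NextStoryRange; // null at the end of the chain
            }
        }
    }

    // True for the six header/footer story types; everything else
    // (main text, footnotes, text frames, ...) is skipped.
    static bool IsHeaderOrFooter(Word.WdStoryType type)
    {
        switch (type)
        {
            case Word.WdStoryType.wdPrimaryHeaderStory:
            case Word.WdStoryType.wdFirstPageHeaderStory:
            case Word.WdStoryType.wdEvenPagesHeaderStory:
            case Word.WdStoryType.wdPrimaryFooterStory:
            case Word.WdStoryType.wdFirstPageFooterStory:
            case Word.WdStoryType.wdEvenPagesFooterStory:
                return true;
            default:
                return false;
        }
    }
}
```

Usage would be something like HeaderFooterProcessor.Process(theDoc, r => processRange(theDoc, r)); where processRange is the search & replace routine from the question.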