
I am having trouble removing a block of text from a Word document using Word Interop. My plan was to read through the document to find the starting text, then the ending text, and save each paragraph index in its own variable. Then I would loop from the starting index to the ending index and delete all the text in between.

The problem is that it doesn't produce the expected results. I must be misunderstanding how the Range interface works in document.Paragraphs[i+1].Range.Delete();. It deletes some lines but not all, and seems to go beyond the paragraphs that I want to delete. What am I missing? There must be a better way to do this. Documentation for Interop seems sparse.

string text = " "; 
int StartLocation = 0;
int EndLocation = 0;
//I roughly know the starting location,
//so I start at i = 2248 to avoid
//looping through the entire document
for (int i = 2248; i < 2700; i++)
{
  text = document.Paragraphs[i + 1].Range.Text.ToString();
  if (text.Contains("firstWordImSearchingFor"))
  {
     StartLocation = i;
  }
  if (text.Contains("lastWordImSearchingFor"))
  {
     EndLocation = i;                        
  }
}
//delete everything between those paragraph locations 
//(not working correctly/ skips lines)
for (int i = StartLocation; i < EndLocation - 1; i++)
{
   document.Paragraphs[i+1].Range.Delete(); 
}

Do a Google search on how to remove text from a Word doc with C# Microsoft.Interop. Look at this for some possible pointers as well: stackoverflow.com/questions/10231132/… – MethodMan
If you are working with .docx files and not .doc files, I would recommend you stop using the interop classes and switch to the newer SDK Microsoft released specifically for working with docx files. – Scott Chamberlain
Yes, I have done many Google searches and looked over other questions/answers. Either I'm not finding answers that pertain to my problem or I'm missing something. For the most part I haven't had much trouble using Interop to create a document. For some reason I'm stuck on this, as simple as it sounds. – Bobby
Any reason not to use OpenXML for docx generation/modification? – trailmax
Currently yes, there is a reason, but hopefully that will change in the near future. So .doc is what I have to deal with for now. – Bobby

2 Answers

1
votes

The drawback to the approach you're trying is that the Start and End locations (number of characters from the beginning of the Document story) will vary depending on what non-visible / non-printing characters are present. Content Controls, field codes and other things affect this - all in different ways depending on how things are being queried.

More reliable would be to store the starting point in one Range then extend it to the end point.

I also recommend using Range.Find to search for the start and end points.

Bare-bones pseudo-code example, since I don't really have enough information to go on to give you full, working code:

Word.Range rngToDelete = null;
Word.Range rngFind = document.Content;
bool wasFound = false;
object missing = System.Type.Missing;
object oEnd = Word.WdCollapseDirection.wdCollapseEnd;
wasFound = rngFind.Find.Execute("firstWordImSearchingFor", ref missing, ref missing, 
           ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, 
           ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
if (wasFound)
{
  rngToDelete = rngFind.Duplicate; //rngFind is now where the term was found!
  //reset the range to Find so it moves forward
  rngFind.Collapse(ref oEnd);
  rngFind.End = document.Content.End;
  wasFound = rngFind.Find.Execute("lastWordImSearchingFor", ref missing, ref missing, 
           ref missing, ref missing, ref missing, ref missing, ref missing, ref missing, 
           ref missing, ref missing, ref missing, ref missing, ref missing, ref missing);
  if (wasFound)
  {
    rngToDelete.End = rngFind.End;
    rngToDelete.Delete();
  }
}
0
votes

This is completely untested and is offered as a suggestion:

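// Assumes: using Word = Microsoft.Office.Interop.Word;
Word.Paragraphs paras = document.Content.Paragraphs;
int startIndex = 0, endIndex = 0;
// Pass 1: locate the flagged paragraphs (the collection is 1-based).
for (int i = 1; i <= paras.Count; i++)
{
    string text = paras[i].Range.Text;
    if (startIndex == 0 && text.Contains("Start flag"))
    {
        startIndex = i;
    }
    if (startIndex != 0 && text.Contains("End flag"))
    {
        endIndex = i;
        break;
    }
}
// Pass 2: delete from the bottom up so the remaining indexes stay valid.
// Start at endIndex - 1 instead to retain the "End flag" paragraph.
if (startIndex != 0 && endIndex != 0)
{
    for (int i = endIndex; i >= startIndex; i--)
    {
        paras[i].Range.Delete();
    }
}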
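
As a side note on why a forward index loop like the one in the question can skip paragraphs and run past the intended end: each deletion shifts the later paragraphs down by one index. The sketch below imitates that with a plain List<string> instead of Word, so it runs without Interop. The class and method names (DeleteOrderDemo, DeleteForward, DeleteBackward) and the sample data are made up for illustration, and it assumes the Paragraphs collection re-indexes after a delete the same way a list does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeleteOrderDemo
{
    // Forward loop with fixed bounds, as in the question: every removal shifts
    // the later items down by one, so the next index skips an item.
    public static List<string> DeleteForward(IEnumerable<string> items, int start, int end)
    {
        var list = items.ToList();
        for (int i = start; i <= end && i < list.Count; i++)
        {
            list.RemoveAt(i);
        }
        return list;
    }

    // Backward loop: removing from the end never moves the items still to be visited.
    public static List<string> DeleteBackward(IEnumerable<string> items, int start, int end)
    {
        var list = items.ToList();
        for (int i = end; i >= start; i--)
        {
            list.RemoveAt(i);
        }
        return list;
    }

    public static void Main()
    {
        // Hypothetical stand-in for paragraphs; indexes 1..3 are the ones to delete.
        var paragraphs = new[] { "keep0", "del1", "del2", "del3", "keep4", "keep5", "keep6" };
        Console.WriteLine(string.Join(",", DeleteForward(paragraphs, 1, 3)));
        Console.WriteLine(string.Join(",", DeleteBackward(paragraphs, 1, 3)));
    }
}
```

The forward version leaves "del2" behind and removes "keep5", which matches the symptoms described in the question. Deleting one combined Range, or looping from the bottom up, avoids the problem.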