I'm trying to read the paragraph contents and the text of shapes from a Word file.
I have written the following code:

foreach (Microsoft.Office.Interop.Word.Shape shape in document.Shapes)
{
    ParaInfo.Add(new ParaInfo{Text = shape.TextFrame.TextRange.Text});
}

foreach (Microsoft.Office.Interop.Word.Paragraph para in document.Paragraphs)
{
    ParaInfo.Add(new ParaInfo{Text = para.Range.Text});
}

However, this loses the original order: all the shape text is collected first, followed by all the paragraph text. I want to get them in the same sequence as they appear in the Word document.
How can I achieve this using Word Interop?

1 Answer

Word documents have no single linear sequence of elements, so you can't get the structure you are asking for. See "How to enumerate word document using office interop API?"

The sequence appears "changed" because you enumerate all the shapes first and then all the paragraphs. Since you only need the text content, it may make sense to try document.Content.Text and see whether you can build any "structure" out of it.
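
As a rough illustration of that suggestion, here is a minimal sketch that splits the string returned by document.Content.Text into paragraphs, in the order they occur in that string. It relies on Word terminating each paragraph with a carriage return ('\r'). The ContentText class and SplitParagraphs method are hypothetical names introduced for this example, and whether the text of your shapes shows up in document.Content.Text at all is an assumption you would need to check against your own document.

```csharp
using System;
using System.Collections.Generic;

static class ContentText
{
    // Hypothetical helper: splits Word's content text into paragraphs.
    // Word ends each paragraph with '\r', so splitting on that character
    // yields the paragraphs in the order they occur in the string.
    public static List<string> SplitParagraphs(string contentText)
    {
        var paragraphs = new List<string>();
        if (string.IsNullOrEmpty(contentText))
        {
            return paragraphs;
        }

        foreach (string piece in contentText.Split('\r'))
        {
            // Skip empty pieces, such as the one after the final '\r'.
            if (piece.Length > 0)
            {
                paragraphs.Add(piece);
            }
        }
        return paragraphs;
    }
}

// Possible usage, assuming `document` is an open
// Microsoft.Office.Interop.Word.Document:
//   List<string> paragraphs = ContentText.SplitParagraphs(document.Content.Text);
```

Keeping the splitting logic separate from the Interop call makes it possible to try it on plain strings without Word installed.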