I have a Word document that I want to convert to a text (.txt) file programmatically, using C#. I can already read paragraphs and tables from the document and convert them to text. The document also contains some text boxes, and I want to read their text and write it to the text file as well. My problem is that I don't know which collection stores those text boxes. For example, all tables are in the Tables collection and all paragraphs are in the Paragraphs collection. Can anyone tell me how to read from these text boxes? Please let me know if you need any additional information.

Do the text boxes have any formatting you can look for in code? - Fitzchak Yitzchaki
Yes, the background color of those text boxes is gray and the text is bold. At first I thought it was a single-row, single-column table, but it's a text box. - Shekhar
Which collection holds the text boxes: InlineShapes, Shapes, or the formcontrol collection? - Shekhar
I just found out that I can get the text boxes through the 'Shapes' collection. The problem now is that the Shapes collection and the Paragraphs collection are separate. If I process the two collections separately, the structure of the final text file will change. Does anyone know a way to process each item of a Word document sequentially? For example, take the first item, put its text in the text file, then take the next item, and so on... - Shekhar

1 Answer

There are text boxes and text frames. I'm pretty sure any text inside text boxes will be part of the Doc.Content range.

To find all the text frames in a document, I use this VBA code:

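Dim Doc As Document
Dim StoryRange As Range   ' named StoryRange so it does not shadow the Range type

' Load document

' Start at the first text frame story; this raises an error if the
' document has no text frame story at all
Set StoryRange = Doc.StoryRanges(wdTextFrameStory)
Do Until StoryRange Is Nothing
    ' Do something with StoryRange.Text
    ' Move on to the next story of the same type (Nothing after the last one)
    Set StoryRange = StoryRange.NextStoryRange
Loop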
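
On the follow-up about processing paragraphs and text boxes in one pass: one possible approach, sketched here in C# since that is what the question uses, is to tag each piece of text with its character offset in the document and sort on that offset before writing. The `DocItem` and `DocumentOrder` names below are hypothetical helpers, not part of the Word object model, and the sketch uses plain data so it runs without Word. It also assumes a text box's anchor position is an acceptable stand-in for where the box belongs in the text flow, which may not match the visual layout.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: one piece of extracted text plus its start offset in the document.
public sealed class DocItem
{
    public int Start { get; }
    public string Text { get; }

    public DocItem(int start, string text)
    {
        Start = start;
        Text = text;
    }
}

// Hypothetical helper: merges two separately collected lists into document order.
public static class DocumentOrder
{
    public static List<string> Merge(IEnumerable<DocItem> paragraphs,
                                     IEnumerable<DocItem> textBoxes)
    {
        return paragraphs
            .Concat(textBoxes)
            // LINQ's OrderBy is a stable sort, so on equal offsets
            // the paragraph stays ahead of the text box.
            .OrderBy(item => item.Start)
            .Select(item => item.Text)
            .ToList();
    }
}
```

With Word interop, the offsets would presumably come from `paragraph.Range.Start` for paragraphs and `shape.Anchor.Start` for shapes, and the text box text from `shape.TextFrame.TextRange.Text`. The merged list can then be written to the .txt file line by line.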