
I'm extracting text from an MS Word document (.docx). I'm using the DocX C# library for this, which in general works quite well. Now I want to be able to extract tables. The main problem is that when I loop through the paragraphs, I can tell whether I'm in a table cell with:

        ParentContainer == Cell

but I get no information about how many rows and cells the table has. The second possibility I see is the list of tables exposed as a property of the document object. There I can see how many rows and columns each table has, and so on, but I do not know where in the document the tables are located.

Does anyone have an idea how to deal with tables correctly? Any other solution would be appreciated as well :)

1 Answer

I figured it out. The trick is to check whether each paragraph is followed by a table. This can be done with:

...
// FollowingTable is null unless a table comes directly after this paragraph.
if (paragraph.FollowingTable != null)
{
    tableId = paragraph.FollowingTable.Index;
}
...

FollowingTable.Index gives you the index of the table, which you can use to look up all of the table's details in the Document.Tables list.
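
To show how this fits into a full extraction loop, here is a minimal sketch rather than a definitive implementation. It assumes the classic DocX release with the `Novacode` namespace (newer Xceed releases use different namespaces and may expose this differently), and `input.docx` is a placeholder path. It also assumes, as described in the question, that `document.Paragraphs` includes the paragraphs inside table cells, so those are skipped in the main loop and printed together with their table instead.

```csharp
using System;
using Novacode; // assumption: classic DocX namespace; adjust for your DocX version

class TableExtractor
{
    static void Main()
    {
        // "input.docx" is a placeholder path, not taken from the original post.
        using (DocX document = DocX.Load("input.docx"))
        {
            foreach (Paragraph paragraph in document.Paragraphs)
            {
                // Paragraphs inside cells are printed with their table below,
                // so skip them here to avoid emitting the same text twice.
                if (paragraph.ParentContainer == ContainerType.Cell)
                    continue;

                Console.WriteLine(paragraph.Text);

                // Null unless a table directly follows this paragraph.
                Table table = paragraph.FollowingTable;
                if (table == null)
                    continue;

                Console.WriteLine("[table {0}: {1} rows x {2} columns]",
                    table.Index, table.RowCount, table.ColumnCount);

                foreach (Row row in table.Rows)
                {
                    foreach (Cell cell in row.Cells)
                    {
                        // A cell can hold several paragraphs; join them on one line.
                        foreach (Paragraph cellParagraph in cell.Paragraphs)
                            Console.Write(cellParagraph.Text + " ");
                        Console.Write("| ");
                    }
                    Console.WriteLine();
                }
            }
        }
    }
}
```

The sketch reads the table straight from `FollowingTable` instead of going through the index, which keeps body text and table contents in document order without a second lookup.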