1
votes

I am looking for a way to get the table of contents (not created but headings available) from word and store the chapter numbers and headings on Excel. Is there a method using Excel VBA to take those headings from word doc to excel? I have searched for this but everybody suggest using paste special however I want it automated since the data from TOC is sorted into a different table in Excel afterwards.

Sub importwordtoexcel()
    MsgBox ("This Macro Might Take a While, wait until next Message")
    Application.ScreenUpdating = False
    Sheets("Temp").Cells.Clear

     'Import all tables to a single sheet
    Dim wdDoc As Object
    Dim wdFileName As Variant
    Dim TableNo As Integer 'table number in Word
    Dim iRow As Long 'row index in Word
    Dim jRow As Long 'row index in Excel
    Dim iCol As Integer 'column index in Excel
    wdFileName = Application.GetOpenFilename("Word files               (*.docx),*.docx", , _
"Browse for file containing table to be imported")
If wdFileName = False Then Exit Sub '(user cancelled import file browser)
Set wdDoc = GetObject(wdFileName) 'open Word file
With wdDoc
    If wdDoc.Tables.Count = 0 Then
        MsgBox "This document contains no tables", _
        vbExclamation, "Import Word Table"
    Else
        jRow = 0
        For TableNo = 1 To wdDoc.Tables.Count
            With .Tables(TableNo)
                 'copy cell contents from Word table cells to Excel cells
                For iRow = 1 To .Rows.Count
                    jRow = jRow + 1
                    For iCol = 1 To .Columns.Count
                        On Error Resume Next
                        Sheets("Temp").Cells(jRow, iCol) = WorksheetFunction.Clean(.Cell(iRow, iCol).Range.Text)
                        On Error GoTo 0
                    Next iCol
                Next iRow
            End With
            jRow = jRow + 1
        Next TableNo
    End If
End With
Set wdDoc = Nothing

'Takes data from temp to RTM_FD
Dim nRow As Long
Dim mRow As Long
Dim Temp As Worksheet
Dim RTM As Worksheet
Set Temp = Sheets("Temp")
Set RTM = Sheets("RTM_FD")

mRow = 16
For nRow = 1 To Temp.Rows.Count
    If Temp.Cells(nRow, 1).Value = "Position" Or Temp.Cells(nRow, 1).Value = "" Then
    Else
        RTM.Cells(mRow, 1).Value = Temp.Cells(nRow, 1)
        RTM.Cells(mRow, 2).Value = Temp.Cells(nRow, 4)
        RTM.Cells(mRow, 2).Font.Bold = False
        RTM.Cells(mRow, 3).Value = Temp.Cells(nRow, 5)
        RTM.Cells(mRow, 3).Font.ColorIndex = 32
        If Temp.Cells(nRow, 3).Value = "P" Then
            RTM.Cells(mRow, 9).Value = "X"
            RTM.Cells(mRow, 9).Interior.ColorIndex = 44
        ElseIf Temp.Cells(nRow, 3) = "Q" Then
            RTM.Cells(mRow, 7).Value = "X"
            RTM.Cells(mRow, 7).Interior.ColorIndex = 44
        ElseIf Temp.Cells(nRow, 3) = "TA" Then
            RTM.Cells(mRow, 8).Value = "X"
            RTM.Cells(mRow, 8).Interior.ColorIndex = 44
        Else
        End If
        mRow = mRow + 1
    End If
Next nRow

Application.ScreenUpdating = True
MsgBox ("DONE")
Sheets("Temp").Cells.Clear
Dim SaveName As String
SaveName = InputBox("What Do You Want to Save the File As:")
ActiveWorkbook.SaveAs (SaveName)
MsgBox ("Your file is saved as " & SaveName)
MsgBox ("Please Accept Delete Operation")
Sheets("Temp").Delete
ActiveWorkbook.Save
End Sub
2
You would have to try different methods by yourselves first see what is working for you and what is not and if stuck you can ask for help here.Stupid_Intern
@Santosh copying and pasting works but I am trying to integrate it to an existing code, and that code is my 3rd attempt in optimizing something. Finally it works. It did not work in word vba for which I had the TOC code but for Excel I do not even know where to begin with.Mert Karakaya
To begin. In Excel, add a reference to Word. Create a WordDoc object. Set it/open it to the word doc you want to read from. Access the table, read from table, write to cells...MatthewD
@MatthewD I have posted the code I have above. So what you are saying is as long as there is TOC, it will read it as a table? Also, how can I create TOC in the beggining of the codeMert Karakaya
What do Word tables have to do with your requirement? How are the "TOC" entries you want to have identifiable in Word?Cindy Meister

2 Answers

1
votes

One way to get section headings without creating a TOC is by iterating with the selection object, using Selection.Goto. The folowing example prints all the sections headings in a document to the immediate window. I am sure you can adapt the concept to your code.

Sub PrintHeadings()
 Dim wrdApp As Word.Application
 Dim wrdDoc As Document
 Dim Para As Paragraph
 Dim oldstart As Variant

 Set wrdApp = CreateObject("Word.Application") 'open word
 Set wrdDoc = wrdApp.Documents.Open("C:\sample.docx", , True, False, , , , , , , , True) 'open file

 wrdDoc.ActiveWindow.ActivePane.View.Type = wdPrintView 'avoids crashing if opens on read view

  With wrdDoc.ActiveWindow.Selection
    .GoTo What:=wdGoToHeading, which:=wdGoToFirst 'go to first heading
    Do
      Set Para = .Paragraphs(1) 'get first paragraph
      Title = Replace(Para.Range.Text, Chr(13), "") 'gets title and remove trailing newline
      Debug.Print Title, "pg. "; .Information(wdActiveEndAdjustedPageNumber) 'prints title and page to console
      oldstart = .Start 'stores position
      .GoTo What:=wdGoToHeading, which:=wdGoToNext 'go to next heading
      If .Start <= oldstart Then Exit Do 'if looped around to first section (i.e. new heading is before old heading) we are done
    Loop
  End With

  wrdDoc.Close
  wrdApp.Quit

  Set Para = Nothing
  Set wrdDoc = Nothing
  Set wrdApp = Nothing

End Sub

I use early binding, so you will need to either add a reference to Word object model, or tweak the code to late binding (including finding out the numeric value of the enums).

1
votes

I worked fine with My Chinese words documents, it may require to change some of the codes for different heading style. If it won't work for you, I would love to have your words sample file and figure out why.

PS: The key point is to have the correct #OLE_LINK format.

My codes is as follows:

' Get your file and save in Range("A1")

Public Sub SelectAFile()

Dim intChoice As Integer
Dim strPath As String

'only allow the user to select one file
Application.FileDialog(msoFileDialogOpen).AllowMultiSelect = False
'make the file dialog visible to the user
intChoice = Application.FileDialog(msoFileDialogOpen).Show
'determine what choice the user made
If intChoice <> 0 Then
    'get the file path selected by the user
    strPath = Application.FileDialog( _
    msoFileDialogOpen).SelectedItems(1)
    'print the file path to sheet 1
    Cells(1, 1) = strPath
End If

End Sub

' Main program start here

Sub genWordIndex()

Dim rng As Range
Dim r As Range
Dim PageName As String
Dim TestValue As String

Dim WshShell As Variant
Set WshShell = CreateObject("WScript.Shell")

Set rng = Range("A1")   'Selection
Call CleanOldText(1)

PageName = rng.text

Call ReadIndexFromWords3(PageName)

End Sub

Sub ReadIndexFromWords3(ByVal FileName As String)

'
' This is a common routine for handling open file
'
Dim WA As Object
Dim wdDoc As Word.Document

On Error Resume Next
Set WA = GetObject(, "Word.Application")
If WA Is Nothing Then
    Set WA = CreateObject("Word.Application")
    Set wdDoc = WA.Documents.Open(FileName)
Else
    On Error GoTo notOpen
    Set wdDoc = WA.Documents(FileName)
    GoTo OpenAlready

notOpen: Set wdDoc = WA.Documents.Open(FileName) End If

OpenAlready:

wdDoc.Activate

'
' read index program start here。
'

Dim i As Integer: i = 2

Dim H_start, H_end, H_Caption, H_lvl, H_page As String
Dim H_txt As String

Dim Para As Paragraph

For Each Para In wdDoc.Paragraphs
    Para.Range.Select
    If Not Para.Range.Style Is Nothing Then

        If IsMyHeadingStype(Para.Range.Style) = True Then
            H_start = Para.Range.Start
            H_end = Para.Range.End
            H_txt = Para.Range.text
            H_Caption = Para.Range.ListFormat.ListString
            H_page = Para.Range.Information(wdActiveEndPageNumber)
            Dim myLinkAddress As String
            myLinkAddress = FileName & "#OLE_LINK" & i & vbTab & "1," & H_start & "," & H_end & ",2,," & H_txt

            Application.ActiveWorkbook.Activate
            ActiveSheet.Cells(i, 1).Select
            Dim CapLen As Integer:
            CapLen = Len(H_Caption) - 1
            If CapLen < 0 Then CapLen = 0
            ActiveSheet.Cells(i, 1) = Space(CapLen) & H_Caption & " " & H_txt
            ActiveCell.Hyperlinks.Add Anchor:=Selection, Address:=myLinkAddress, SubAddress:=""                    'TextToDisplay:=H_txt,
            ActiveSheet.Cells(i, 2) = H_page

            i = i + 1
        End If
    End If

Next

End Sub

' ' you may have to change your InStyle here ' Function IsMyHeadingStype(ByVal InStyle As String) As Boolean

Dim rc As Boolean: rc = False
If InStr(InStyle, "標題 1") Or InStr(InStyle, "標題 2") Or InStr(InStyle, "標題 3") Then
    rc = True
End If

IsMyHeadingStype = rc

End Function

' sub routine Sub CleanOldText(ByVal col1 As Integer)

Dim i As Integer
Dim lastR As Integer

lastR = Cells(10000, col1).End(xlUp).Row
For i = 2 To lastR
    Cells(i, col1).ClearContents
    Cells(i, col1 + 1).ClearContents
Next i

End Sub