Scenario: I have about 14,000 Word documents that need to be converted from "Microsoft Word 97 - 2003 Document" (.doc) to "Microsoft Word Document", in other words upgraded to the 2010 format (.docx).
Question: Is there an easy way to do this using an API or something similar?
Note: I've only been able to find a Microsoft program that converts the documents to .docx, but they still open in compatibility mode. It would be nice if they could be fully converted to the new format, which is the same thing that happens when you open an old document and Word gives you the option to convert it.
Edit: Just found http://msdn.microsoft.com/en-us/library/microsoft.office.interop.word._document.convert.aspx and am looking into how to use it.
EDIT2: This is my current function for converting the documents:
    Private Sub btnConvert_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles btnConvert.Click
        FolderBrowserDialog1.ShowDialog()
        If Not String.IsNullOrEmpty(FolderBrowserDialog1.SelectedPath) Then
            lstFiles.Clear()
            ' Recursively collect the Word files under the chosen folder into lstFiles.
            DirSearch(FolderBrowserDialog1.SelectedPath)
            ' Intended to run one conversion at a time; note that SetMaxThreads returns
            ' False and has no effect if the values are below the number of processors.
            ThreadPool.SetMaxThreads(1, 1)
            ' Skip files that already have the .docx extension.
            lstFiles.RemoveAll(Function(y) y.EndsWith(".docx", StringComparison.OrdinalIgnoreCase))
            TextBox1.Text &= "Conversion started at " & DateTime.Now.ToString() & Environment.NewLine
            For Each x In lstFiles
                ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf ConvertDoc), x)
            Next
        End If
    End Sub

    Private Sub ConvertDoc(ByVal path As String)
        ' A new Word instance is started for every document.
        Dim word As New Microsoft.Office.Interop.Word.Application
        Dim doc As Microsoft.Office.Interop.Word.Document = Nothing
        word.Visible = False
        Try
            Debug.Print(path)
            doc = word.Documents.Open(path)
            doc.Convert()
        Catch ex As Exception
            ' Errors are ignored apart from a debug trace.
            Debug.Print(ex.Message)
        Finally
            ' doc is still Nothing if Open failed.
            If doc IsNot Nothing Then doc.Close()
            word.Quit()
        End Try
    End Sub
It lets me select a folder and then finds all .doc files in its subfolders. That code isn't important; all the files to convert end up in lstFiles. The only problem at the moment is that it takes a really long time to convert even just 10 documents. Should I be using one Word application per document instead of reusing it? Any suggestions?
Also, Word opens after about 2 or 3 conversions and starts flashing, but the conversion keeps going.
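For comparison, here is a minimal sketch of the other option asked about above: starting Word once and reusing that instance for every file. The module name BatchConverter and the method ConvertAll are hypothetical, and the sketch assumes the Word 2010 (or later) interop assembly is referenced, since SaveAs2 is not available in older versions. It also assumes the converted copy should be written next to the original with a .docx extension. Whether this is actually faster on a given machine is something to measure, not a guarantee.

```vbnet
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.IO
Imports System.Runtime.InteropServices
Imports Word = Microsoft.Office.Interop.Word

Module BatchConverter
    ' Hypothetical helper: converts every file in "files" using one shared Word instance.
    Public Sub ConvertAll(ByVal files As IEnumerable(Of String))
        Dim app As New Word.Application()
        app.Visible = False
        ' Suppress message boxes while Word runs unattended.
        app.DisplayAlerts = Word.WdAlertLevel.wdAlertsNone
        Try
            For Each sourcePath As String In files
                Dim doc As Word.Document = Nothing
                Try
                    ' Object variables avoid ByRef copy-back issues under Option Strict On.
                    Dim fileName As Object = sourcePath
                    doc = app.Documents.Open(FileName:=fileName, AddToRecentFiles:=False, Visible:=False)
                    ' Upgrade the in-memory document to the newest format.
                    doc.Convert()
                    ' Assumption: write the result next to the original as .docx.
                    Dim targetPath As Object = Path.ChangeExtension(sourcePath, ".docx")
                    doc.SaveAs2(FileName:=targetPath, FileFormat:=Word.WdSaveFormat.wdFormatXMLDocument)
                Catch ex As Exception
                    ' Log the failure and continue with the next file.
                    Debug.Print(sourcePath & ": " & ex.Message)
                Finally
                    If doc IsNot Nothing Then
                        doc.Close(SaveChanges:=Word.WdSaveOptions.wdDoNotSaveChanges)
                        Marshal.ReleaseComObject(doc)
                    End If
                End Try
            Next
        Finally
            ' Quit Word once, after the whole batch.
            app.Quit()
            Marshal.ReleaseComObject(app)
        End Try
    End Sub
End Module
```

With this shape, the startup and shutdown cost of Word is paid once per batch instead of once per document, and a call such as BatchConverter.ConvertAll(lstFiles) would replace the per-file queueing.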
EDIT3: Tweaked the code above a little bit and it runs cleaner. It still takes 1 min 10 sec to convert 8 files though, which is about 8.75 seconds per file. With 14,000 files to convert, that works out to roughly 34 hours at this rate.
EDIT4: Changed the code again. It uses a thread pool now and seems to run a bit faster. I would still need to run it on a better computer to convert all the documents, or convert them slowly, folder by folder. Can anyone think of any other way to optimize this?
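One variation on the thread pool approach, sketched here only as an idea to try: run the whole batch on a single dedicated background thread marked as STA. As far as I understand, Word's automation objects are apartment-threaded while thread pool threads are MTA, so a dedicated STA thread may avoid some cross-apartment call overhead; I have not measured this. The module name StaWorker and the function StartOnStaThread are hypothetical.

```vbnet
Imports System.Threading

Module StaWorker
    ' Hypothetical helper: runs "work" on one dedicated background STA thread
    ' and returns the thread so the caller can Join it if needed.
    Public Function StartOnStaThread(ByVal work As ThreadStart) As Thread
        Dim worker As New Thread(work)
        ' The apartment state must be set before the thread is started.
        worker.SetApartmentState(ApartmentState.STA)
        ' A background thread does not keep the process alive after the form closes.
        worker.IsBackground = True
        worker.Start()
        Return worker
    End Function
End Module
```

The click handler could then call StartOnStaThread with a Sub that loops over lstFiles, instead of queueing one work item per file. Any updates to TextBox1 from that thread would need to go through Control.Invoke, since the worker is not the UI thread.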