PDF to DOCX (PDF MemoryStream to DOCX MemoryStream)… Stream-to-stream conversion

Question

My server receives PDF data as a MemoryStream (the PDF may contain hyperlinks, images, etc.). I never get a file path, only the file data in memory. I need to convert this MemoryStream to DOCX, with the DOCX output also held in a MemoryStream.

I have used the Aspose DLLs for this scenario, but for larger PDF files they throw an Array Index Out Of Bounds Exception, and sometimes the process gets stuck at doc.save...

I have posted my query on the Aspose blog as well. They are seeing the same issue on their end. Refer to this link...

I have checked the following sites, but still no luck..

Is there any solution available for this scenario?

You can check out GemBox.Document to see if it suits your needs. We have recently released support for reading PDF files (it's currently in beta), you can read about it here. — GemBox Dev Team
We had a discussion with the GemBox support team, and according to that discussion only physical PDF files can be converted to DOCX. But we will not get any physical file; we will be getting the PDF data as a stream. — Arun
I apologize, but I don't recall such a discussion. Nevertheless, note that GemBox.Document supports loading and saving both physical files and in-memory files. Please refer to the following article about working with a document file stream. — Mario Z
Thanks Mario, I will check the details. Where can I find a sample SDK/DLLs so that I can include them in my project and test some scenarios? — Arun

gn1 gn1 · Accepted Answer · 2016-01-04T04:06:25

You are trying to convert very large PDFs in the limited time before an ASP.NET request times out. That is bound to fail. Do the conversion in a Windows service process instead, and use ASP.NET only to receive the documents for conversion. Have the service mail the converted documents when they are ready.
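
The hand-off described above could be sketched roughly as follows. This is only an illustration under some assumptions: `ConversionJob`, `ConversionWorker`, and the `convert` and `deliver` delegates are hypothetical names, not part of any library, and the in-memory queue is a stand-in for whatever durable hand-off (a database table, a watched folder, a message queue) the web application and the separate service process would really share.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// Hypothetical job record: the uploaded PDF bytes plus where to send the result.
public sealed class ConversionJob
{
    public byte[] PdfBytes { get; set; }
    public string ReplyTo { get; set; }
}

// Hypothetical worker: the receiving side only enqueues, the conversion runs elsewhere.
public sealed class ConversionWorker
{
    // Bounded so that a burst of uploads cannot exhaust memory.
    private readonly BlockingCollection<ConversionJob> _queue =
        new BlockingCollection<ConversionJob>(boundedCapacity: 100);

    private readonly Action<Stream, Stream> _convert;  // assumed: wraps the PDF-to-DOCX library call
    private readonly Action<string, byte[]> _deliver;  // assumed: mails the DOCX bytes to the recipient

    public ConversionWorker(Action<Stream, Stream> convert, Action<string, byte[]> deliver)
    {
        _convert = convert;
        _deliver = deliver;
    }

    // Called by the receiving side; returns as soon as the job is queued.
    public void Enqueue(ConversionJob job) => _queue.Add(job);

    // Runs on a long-lived thread with no request timeout, e.g. inside the service.
    public void Run(CancellationToken token)
    {
        foreach (var job in _queue.GetConsumingEnumerable(token))
        {
            try
            {
                using (var input = new MemoryStream(job.PdfBytes, writable: false))
                using (var output = new MemoryStream())
                {
                    _convert(input, output);                  // stream in, stream out
                    _deliver(job.ReplyTo, output.ToArray());  // send the converted document
                }
            }
            catch (Exception ex)
            {
                // One failed document should not stop the worker.
                Console.Error.WriteLine("Conversion failed for " + job.ReplyTo + ": " + ex.Message);
            }
        }
    }
}
```

The `convert` delegate is where the stream-to-stream call of whichever conversion library is chosen would go. Keeping it as a delegate means the queueing logic does not depend on that library, and a slow or failing conversion no longer blocks the upload request.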
