1
votes

I am trying to read the contents of files such as .txt, .docx, .pdf and so on with textract. When I use the code below, it throws an error:

    @app.route('/upload', methods=['POST'])
    def upload():
        request_file = request.files['file']
        text = textract.process(request_file.stream)
        return text

When I upload a .docx file, I get:

    File "/usr/lib/python2.7/genericpath.py", line 26, in exists
        os.stat(path)
    TypeError: coercing to Unicode: need string or buffer, instance found
    10.0.2.2 -- [12/Apr/2018 09:04:58] "POST /upload HTTP/1.1" 500 -

How can I pass uploaded files with different extensions to textract from Flask?

2 Answers

1
votes

I was having the same problem. We have to save the uploaded file on the server first and then pass its path to textract. It worked!
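Here is a minimal sketch of that approach, not a definitive implementation. The helper name `extract_text` and its `process` parameter are hypothetical and only there for illustration. It copies the upload stream to a temporary file that keeps the original extension, then hands the path to `textract.process`.

```python
import os
import shutil
import tempfile


def extract_text(stream, filename, process=None):
    """Save an uploaded stream to a temp file and run textract on its path.

    `extract_text` and the `process` argument are illustrative names,
    not part of textract or Flask.
    """
    if process is None:
        # Imported lazily so the helper can be tried without textract installed.
        import textract
        process = textract.process

    # Keep the original extension so textract can choose a parser from it.
    suffix = os.path.splitext(filename)[1]
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Copy the upload to disk and close the file before textract reads it.
        with os.fdopen(fd, "wb") as tmp:
            shutil.copyfileobj(stream, tmp)
        return process(path)
    finally:
        # Remove the temporary copy whether or not extraction succeeded.
        os.remove(path)
```

In the route from the question, this would be called as `text = extract_text(request_file.stream, request_file.filename)`, assuming the client sends a filename with the right extension.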

0
votes

I think textract cannot process a file stream; it expects a path to a file on disk.

Try passing the exact file path and its extension instead, like:

    textdata = textract.process(r"C:\some_path_to_file", extension=".pdf")

That works for me, so give it a try.