
I have been trying for a week to work out how to pass a file uploaded through a Flask POST request to textract.

@app.route('/input', methods=['POST'])
def input():
    request_file = request.files.get('file')
    r = textract.process(io.BytesIO(request_file.read()))
    return r 

The code above raises this error:

TypeError: coercing to Unicode: need string or buffer, _io.BytesIO found

I also ran a small test with send_file to check that the upload is actually received and that BytesIO works in my case:

@app.route('/input', methods=['POST'])
def input():
    request_file = request.files.get('file')
    return send_file(io.BytesIO(request_file.read()),
                     attachment_filename=request_file.filename)

The code above works fine for PDF files and sends a response (the PDF is offered for download). But when I tried .docx and .txt files, it showed some strange output on the screen: PK

My question: how do I pass io.BytesIO(request_file.read()) to textract as a file? I have looked everywhere for an answer but could not find one.

Am I supposed to decode or encode it first?

r = textract.process(io.BytesIO(request_file.read()).read())? - CristiFati
Not working. It throws this error: TypeError: stat() argument 1 must be encoded string without null bytes, not str - jane1912

1 Answer


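textract.process() expects a string holding the path of a file on disk, but you are passing io.BytesIO(request_file.read()) instead. Wrapping the upload in io.BytesIO does not help, and passing the raw bytes from request_file.read() fails too, because textract treats its first argument as a filename. That is consistent with the stat() error reported in the comments. Save the upload to disk first, then pass that path:

textract.process(saved_path)  # saved_path: wherever you wrote the uploaded file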
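
Since textract.process() takes a file path rather than file contents, one way to handle an upload is to write it to a temporary file and pass that file's path. The following is only a minimal sketch under that assumption, not a definitive implementation. The helper name extract_text_from_upload and its process parameter are hypothetical. The parameter is there only so the helper can be tried without textract installed. The temporary file keeps the original extension because textract chooses its parser from the file extension.

```python
import os
import tempfile


def extract_text_from_upload(data, filename, process=None):
    """Write uploaded bytes to a temporary file and pass its path to textract.

    `data` is the raw upload (e.g. request_file.read()) and `filename` is the
    client-supplied name (e.g. request_file.filename). `process` is a
    hypothetical hook for testing; it defaults to textract.process.
    """
    if process is None:
        # Imported lazily so the helper can be exercised without textract.
        import textract
        process = textract.process

    # Keep the original extension: textract selects its parser from it.
    suffix = os.path.splitext(filename)[1]
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Write the bytes unchanged; no decoding or encoding is needed.
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        return process(path)
    finally:
        # Always remove the temporary file, even if extraction fails.
        os.remove(path)
```

Inside the Flask view from the question, this would be called as extract_text_from_upload(request_file.read(), request_file.filename), and the result returned as the response body.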