Google Cloud Vision Document OCR - keep layout in the resulting text

Question

I use the Google Cloud Vision Document OCR API. The text returned by com.google.cloud.vision.v1.AnnotateImageResponse.getFullTextAnnotation().getText() is somewhat messy and loses the formatting present in the original image.

Is there a way to keep the layout (formatting) in the resulting text with the Google Cloud Vision Document OCR API?

ExtractTable.com · Accepted Answer · 2018-01-24T14:09:11

@alexanoid, as of now (1/24/2018), Google Vision OCR has no rich-text output. You can only tell the lines apart. There is a workaround using the bounding boxes, but it is pretty time-consuming, and I don't think it's the solution you are looking for.
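
For anyone who wants to try the bounding-box route anyway, here is a rough sketch of the idea, not an official method. It assumes you have already walked the full text annotation (pages, blocks, paragraphs, words) and copied each word's text and the top-left corner and size of its bounding box into a plain object. The `Word` class, the `render` method, and the `charWidth` parameter (an estimate of how many pixels one character occupies) are all hypothetical names made up for this example. Words are grouped into lines by vertical position, then padded with spaces so that each word lands roughly in the column its x coordinate suggests.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LayoutSketch {

    /** Hypothetical stand-in for one OCR word and its bounding box, in pixels. */
    static final class Word {
        final String text;
        final int x, y, width, height;

        Word(String text, int x, int y, int width, int height) {
            this.text = text;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        double centerY() {
            return y + height / 2.0;
        }
    }

    /**
     * Rebuilds an approximate plain-text layout.
     * charWidth is an assumed average character width in pixels.
     */
    static String render(List<Word> words, double charWidth) {
        // Sort top to bottom by the vertical center of each box.
        List<Word> sorted = new ArrayList<>(words);
        sorted.sort(Comparator.comparingDouble(Word::centerY));

        // Group words into lines: a word joins the current line when its
        // center is within half a word height of the line's first word.
        List<List<Word>> lines = new ArrayList<>();
        for (Word w : sorted) {
            List<Word> last = lines.isEmpty() ? null : lines.get(lines.size() - 1);
            if (last != null
                    && Math.abs(w.centerY() - last.get(0).centerY()) < last.get(0).height / 2.0) {
                last.add(w);
            } else {
                List<Word> line = new ArrayList<>();
                line.add(w);
                lines.add(line);
            }
        }

        StringBuilder out = new StringBuilder();
        for (List<Word> line : lines) {
            // Within a line, go left to right.
            line.sort(Comparator.comparingInt((Word w) -> w.x));
            StringBuilder row = new StringBuilder();
            for (Word w : line) {
                int column = (int) Math.round(w.x / charWidth);
                // Keep at least one space between neighbouring words.
                int target = row.length() == 0 ? column : Math.max(column, row.length() + 1);
                while (row.length() < target) {
                    row.append(' ');
                }
                row.append(w.text);
            }
            out.append(row).append('\n');
        }
        return out.toString();
    }
}
```

With a monospaced font this keeps columns roughly aligned. Skewed scans, mixed font sizes, or multi-column pages would need more careful line grouping than this simple threshold.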
