I'm exploring the Google Vision API for OCR. We have a lot of computer-generated forms that are filled in by users, such as medical reports and registration forms. We need to process images of those forms and extract the text. The Google Vision API works great on computer-generated forms, but the ones filled in by hand cause problems. For example, if a value is written slightly above the baseline of its label, the word is treated as part of the previous or next line. This is the output I get:
Study Contact Name:
Test
Expected:
Study Contact Name: Test
Code reference: https://cloud.google.com/vision/docs/detecting-text#vision-text-detection-java
Is there a way to get this on one line, or at least to detect that the word belongs to that line?
Is there any other API that can help in this scenario?
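One workaround I'm considering is to ignore the line breaks in the returned text and regroup the words myself, using the bounding box that comes back with each text annotation (the linked sample prints it via `getBoundingPoly()`). Below is a minimal sketch of that idea, not a tested solution. `LineGrouper`, `Word`, and the `tolerance` parameter are hypothetical names of mine, not part of the Vision API, and I'm assuming each word has already been reduced to its left, top, and bottom pixel coordinates. Words whose vertical centers are within a fraction of the word height of each other are put on the same line.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class LineGrouper {

    /** Hypothetical holder for one recognized word and its bounding box, in pixels. */
    public static final class Word {
        final String text;
        final int left, top, bottom;

        public Word(String text, int left, int top, int bottom) {
            this.text = text;
            this.left = left;
            this.top = top;
            this.bottom = bottom;
        }

        double centerY() { return (top + bottom) / 2.0; }
        int height() { return bottom - top; }
    }

    /**
     * Groups words into lines by vertical position.
     * tolerance is an assumed tuning knob: the allowed vertical shift,
     * as a fraction of the taller word's height (for example 0.5).
     */
    public static List<String> groupIntoLines(List<Word> words, double tolerance) {
        // Sort top to bottom by vertical center so lines are built in reading order.
        List<Word> sorted = new ArrayList<>(words);
        sorted.sort(Comparator.comparingDouble(Word::centerY));

        List<List<Word>> lines = new ArrayList<>();
        for (Word w : sorted) {
            List<Word> current = lines.isEmpty() ? null : lines.get(lines.size() - 1);
            if (current != null) {
                // Compare against the first word of the current line.
                Word anchor = current.get(0);
                double maxShift = tolerance * Math.max(anchor.height(), w.height());
                if (Math.abs(w.centerY() - anchor.centerY()) <= maxShift) {
                    current.add(w);
                    continue;
                }
            }
            // Too far below the current line: start a new one.
            List<Word> fresh = new ArrayList<>();
            fresh.add(w);
            lines.add(fresh);
        }

        // Within each line, order words left to right and join them with spaces.
        List<String> result = new ArrayList<>();
        for (List<Word> line : lines) {
            line.sort(Comparator.comparingInt(word -> word.left));
            result.add(line.stream().map(word -> word.text).collect(Collectors.joining(" ")));
        }
        return result;
    }
}
```

With a tolerance of 0.5, a handwritten "Test" that sits a few pixels higher than the printed "Study Contact Name:" would be merged into the same line. I'm not sure how well a fixed tolerance holds up on skewed scans or tightly spaced rows, so treat the threshold as something to tune.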