OpenCV/EmguCV Chinese word detection

Question

I am currently developing commercial software. I need to add Chinese character and word detection, but it seems the Scene Text Detection functions can only detect English characters and words. I searched on Google and nothing related showed up.

I will feed a scanned image of an A4 page to the application, which should find certain Chinese words based on some pre-set conditions. For example, the image contains the word "你好" (which means "Hello" in Chinese) twice, but the application should extract it only once and save it as a string: the occurrence that meets the pre-set condition of being next to the title 姓名 (Name).

Here is a small illustration of the example:

Greeting: 你好

姓名（Name）: 你好 <--- detect only this occurrence

Can someone with decent experience in OpenCV or EmguCV please help me out?

If a custom dataset is needed to achieve my goal, can someone guide me on how to train on a dataset for word detection in OpenCV or EmguCV?

I would recommend taking a look at github.com/tesseract-ocr/tesseract. This is an OCR engine which is able to detect text in scanned documents. The newest version ships with an already trained neural network. OpenCV has a wrapper for it. youtube.com/watch?v=vtSGSXKggEo – TruckerCat
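Once an OCR engine has turned the scan into text, the pre-set condition from the question (keep only the word next to the 姓名 label) can be applied as an ordinary string filter. The following is a minimal Python sketch of that filtering step only. It assumes the OCR step has already produced one string per page, for example from Tesseract with a Chinese language model; the helper name `extract_name_field` and the pattern `NAME_FIELD` are hypothetical and not part of OpenCV, EmguCV, or Tesseract.

```python
import re

# Hypothetical pattern: the label 姓名, an optional parenthesised gloss such as
# (Name) or （Name）, a half-width or full-width colon, then the value itself.
NAME_FIELD = re.compile(r"姓名\s*(?:[（(][^）)]*[）)])?\s*[:：]\s*(\S+)")


def extract_name_field(ocr_text):
    """Return the first value that follows the 姓名 label, or None if absent."""
    for line in ocr_text.splitlines():
        match = NAME_FIELD.search(line)
        if match:
            return match.group(1)
    return None


# Assumed OCR output for the illustration in the question.
sample = "Greeting: 你好\n姓名（Name）: 你好\n"
print(extract_name_field(sample))
```

The "Greeting" line is ignored because it does not contain the label, so only the occurrence next to 姓名 is returned. Real OCR output is noisier than this sample, so the pattern would likely need loosening for misread punctuation or stray spaces.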

jojobarcream · Accepted Answer · 2017-06-14T07:28:46

0 votes

OpenCV or EmguCV is not your solution. You need a deep neural network (DNN), built with a framework such as TensorFlow.
