3
votes

Get different result for text detection from .Net code and demo app for the same image google vision api result and .net result

this is my code:

            var response = vision.Images.Annotate(
            new BatchAnnotateImagesRequest()
            {
                Requests = new[]
                {
                    new AnnotateImageRequest()
                    {
                        Features = new[]
                        {
                            new Feature()
                            {
                                Type =
                                    "TEXT_DETECTION"
                            }
                        },
                        Image = image
                    }
                }
            }).Execute();
2
I'm facing the same issue with PHP library, result is different between browser and REST API call, why?Emil Borconi

2 Answers

7
votes

As noted in Emil's answer, you want the DOCUMENT_TEXT_DETECTION feature rather than TEXT_DETECTION. However, you can do it all rather more simply than with the current code.

Rather than using Google.Apis.Vision.V1 (which it looks like you're doing, and which uses the REST endpoint), I'd suggest using Google.Cloud.Vision.V1 (which uses the gRPC endpoint, and is hopefully easier to use. Disclaimer: I work on the latter library). You can do most work with the REST endpoint, mind you.

Here's a complete example using the latter library.

using Google.Cloud.Vision.V1;
using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var client = ImageAnnotatorClient.Create();
        var image = Image.FromUri("https://i.stack.imgur.com/H21rL.png");
        var annotations = client.DetectDocumentText(image);
        var paragraphs = annotations.Pages
            .SelectMany(page => page.Blocks)
            .SelectMany(block => block.Paragraphs);
        foreach (var para in paragraphs)
        {
            var box = para.BoundingBox;
            Console.WriteLine($"Bounding box: {string.Join(" / ", box.Vertices.Select(v => $"({v.X}, {v.Y})"))}");
            var symbols = string.Join("", para.Words.SelectMany(w => w.Symbols).SelectMany(s => s.Text));
            Console.WriteLine($"Paragraph: {symbols}");
            Console.WriteLine();
        }
    }
}

That loses the spaces between the symbols, but shows that all the text is being detected - and the method call to perform the actual detection is very simple:

var client = ImageAnnotatorClient.Create();
var image = Image.FromUri("https://i.stack.imgur.com/H21rL.png");
var annotations = client.DetectDocumentText(image);

Most of the code above is handling the response.

1
votes

Found the problem, as per documentation when using Vision API we should use DOCUMENT_TEXT_DETECTION

+-------------------------+-----------------------------------------------------------------------------------------------------------------+
| TEXT_DETECTION          | Run OCR.                                                                                                        |
+-------------------------+-----------------------------------------------------------------------------------------------------------------+
| DOCUMENT_TEXT_DETECTION | Run dense text document OCR. Takes precedence when both DOCUMENT_TEXT_DETECTION and TEXT_DETECTION are present. |
+-------------------------+-----------------------------------------------------------------------------------------------------------------+

Therefore the code should look like this:

var response = vision.Images.Annotate(
            new BatchAnnotateImagesRequest()
            {
                Requests = new[]
                {
                    new AnnotateImageRequest()
                    {
                        Features = new[]
                        {
                            new Feature()
                            {
                                Type =
                                    "DOCUMENT_TEXT_DETECTION"
                            }
                        },
                        Image = image
                    }
                }
            }).Execute();