According to the Cloud Vision documentation, the BoundingPoly object in the blocks array should have the following format:

{
  "vertices": [
    {
      object (Vertex)
    }
  ],
  "normalizedVertices": [
    {
      object (NormalizedVertex)
    }
  ]
}

But when we called https://vision.googleapis.com/v1/files:annotate?key=xxxxxx to perform OCR on a PDF file with this request:

{
    "requests": [{
        "inputConfig": {
            "content": "encoded content",
            "mimeType": "application/pdf"
        },
        "features": [{
            "type": "DOCUMENT_TEXT_DETECTION",
            "maxResults": 50
        }]
    }]
}

the response from the server was:

{
    "responses": [
        {
            "responses": [
                {
                    "fullTextAnnotation": {
                        "pages": [
                            {
                                "property": {
                                    "detectedLanguages": [
                                        {
                                            "languageCode": "en",
                                            "confidence": 0.65
                                        },
                                        {
                                            "languageCode": "fil",
                                            "confidence": 0.01
                                        }
                                    ]
                                },
                                "width": 841,
                                "height": 595,
                                "blocks": [
                                    {
                                        "property": {
                                            "detectedLanguages": [
                                                {
                                                    "languageCode": "en",
                                                    "confidence": 1
                                                }
                                            ]
                                        },
                                        "boundingBox": {
                                            "normalizedVertices": [
                                                {
                                                    "x": 0.4351962,
                                                    "y": 0.057142857
                                                },
                                                {
                                                    "x": 0.6052319,
                                                    "y": 0.057142857
                                                },
                                                {
                                                    "x": 0.6052319,
                                                    "y": 0.08571429
                                                },
                                                {
                                                    "x": 0.4351962,
                                                    "y": 0.08571429
                                                }
                                            ]
                                        },
                                        "paragraphs": [
                                            {
                                                "property": {
                                                    "detectedLanguages": [
                                                        {
                                                            "languageCode": "en",
                                                            "confidence": 1
                                                        }
                                                    ]
                                                },
                                                "boundingBox": {
                                                    "normalizedVertices": [
                                                        {
                                                            "x": 0.4351962,
                                                            "y": 0.057142857
                                                        },
                                                        {
                                                            "x": 0.6052319,
                                                            "y": 0.057142857
                                                        },
                                                        {
                                                            "x": 0.6052319,
                                                            "y": 0.08571429
                                                        },
                                                        {
                                                            "x": 0.4351962,
                                                            "y": 0.08571429
                                                        }
                                                    ]
                                                },
                                                "words": [
                                                    {
                                                        "property": {
                                                            "detectedLanguages": [
                                                                {
                                                                    "languageCode": "en"
                                                                }
                                                            ]
                                                        },
                                                        "boundingBox": {
                                                            "normalizedVertices": [
                                                                {
                                                                    "x": 0.4351962,
                                                                    "y": 0.057142857
                                                                },
                                                                {
                                                                    "x": 0.49346018,
                                                                    "y": 0.057142857
                                                                },
                                                                {
                                                                    "x": 0.49346018,
                                                                    "y": 0.08571429
                                                                },
                                                                {
                                                                    "x": 0.4351962,
                                                                    "y": 0.08571429
                                                                }
                                                            ]
                                                        },
                                                        "symbols": [
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "F",
                                                                "confidence": 0.99
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "a",
                                                                "confidence": 1
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "c",
                                                                "confidence": 0.99
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "t",
                                                                "confidence": 0.99
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "o",
                                                                "confidence": 1
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "r",
                                                                "confidence": 1
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ],
                                                                    "detectedBreak": {
                                                                        "type": "SPACE"
                                                                    }
                                                                },
                                                                "text": "y",
                                                                "confidence": 1
                                                            }
                                                        ],
                                                        "confidence": 0.99
                                                    },                                              
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "i",
                                                                "confidence": 0.99
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "n",
                                                                "confidence": 1
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ],
                                                                    "detectedBreak": {
                                                                        "type": "SPACE"
                                                                    }
                                                                },
                                                                "text": "g",
                                                                "confidence": 1
                                                            }
                                                        ],
                                                        "confidence": 0.99
                                                    },
                                                    {
                                                        "property": {
                                                            "detectedLanguages": [
                                                                {
                                                                    "languageCode": "en"
                                                                }
                                                            ]
                                                        },
                                                        "boundingBox": {
                                                            "normalizedVertices": [
                                                                {
                                                                    "x": 0.57431626,
                                                                    "y": 0.057142857
                                                                },
                                                                {
                                                                    "x": 0.6052319,
                                                                    "y": 0.057142857
                                                                },
                                                                {
                                                                    "x": 0.6052319,
                                                                    "y": 0.08571429
                                                                },
                                                                {
                                                                    "x": 0.57431626,
                                                                    "y": 0.08571429
                                                                }
                                                            ]
                                                        },
                                                        "symbols": [
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "L",
                                                                "confidence": 0.99
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "i",
                                                                "confidence": 0.99
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ]
                                                                },
                                                                "text": "s",
                                                                "confidence": 0.99
                                                            },
                                                            {
                                                                "property": {
                                                                    "detectedLanguages": [
                                                                        {
                                                                            "languageCode": "en"
                                                                        }
                                                                    ],
                                                                    "detectedBreak": {
                                                                        "type": "LINE_BREAK"
                                                                    }
                                                                },
                                                                "text": "t",
                                                                "confidence": 1
                                                            }
                                                        ],
                                                        "confidence": 0.99
                                                    }
                                                ],
                                                "confidence": 0.99
                                            }
                                        ],
                                        "blockType": "TEXT",
                                        "confidence": 0.99
                                    }

Is there anything we need to take into account, given that the vertices field is missing from the BoundingPoly object (the boundingBox property) in the JSON above?

When we tried the drag & drop demo, the JSON response for OCR on an image was:

  "fullTextAnnotation": {
    "pages": [
      {
        "blocks": [
          {
            "blockType": "TEXT",
            "boundingBox": {
              "vertices": [
                {
                  "x": 31,
                  "y": 63
                },
                {
                  "x": 147,
                  "y": 63
                },
                {
                  "x": 147,
                  "y": 81
                },
                {
                  "x": 31,
                  "y": 81
                }
              ]
            },
            "confidence": 0.99,
            "paragraphs": [
              {
                "boundingBox": {
                  "vertices": [
                    {
                      "x": 31,
                      "y": 63
                    },
                    {
                      "x": 147,
                      "y": 63
                    },
                    {
                      "x": 147,
                      "y": 81
                    },
                    {
                      "x": 31,
                      "y": 81
                    }
                  ]
                },

Is this the intended behavior, or is it a bug? Which field should we rely on: normalizedVertices or vertices?

1 Answer

The difference is that in the request made from your code you're sending a PDF, whereas in the drag & drop demo you're sending an image (the demo doesn't accept files).

I replicated this and the behavior seems to be consistent: PDF files are annotated with normalizedVertices, whereas images are annotated with vertices. My guess is that this is the intended behavior, meant to improve the performance of annotation requests on large PDF files (large in terms of page count).
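If you need pixel-style coordinates in both cases, here is a minimal sketch of one way to handle it. It assumes that normalizedVertices are fractions of the page size in the range 0 to 1, so that multiplying by the page's width and height gives absolute coordinates. The numbers in your response are consistent with that: 0.4351962 * 841 is approximately 366 and 0.057142857 * 595 is approximately 34. The helper name bounding_box_pixels is my own and is not part of any Google client library. I'm also assuming that a missing x or y means 0.

```python
from typing import Dict, List, Tuple


def bounding_box_pixels(
    bounding_box: Dict, page_width: int, page_height: int
) -> List[Tuple[int, int]]:
    """Return the corners of a boundingBox as absolute (x, y) coordinates.

    Hypothetical helper: it works on the parsed JSON response, not on
    client-library objects.
    """
    vertices = bounding_box.get("vertices")
    if vertices:
        # Image responses: coordinates are already absolute.
        # Assumption: a missing x or y key means 0.
        return [(v.get("x", 0), v.get("y", 0)) for v in vertices]

    # PDF responses: scale the normalized coordinates by the page size.
    return [
        (
            round(v.get("x", 0.0) * page_width),
            round(v.get("y", 0.0) * page_height),
        )
        for v in bounding_box.get("normalizedVertices", [])
    ]


# Values copied from the first block of the PDF response in the question.
page = {"width": 841, "height": 595}
pdf_box = {
    "normalizedVertices": [
        {"x": 0.4351962, "y": 0.057142857},
        {"x": 0.6052319, "y": 0.057142857},
        {"x": 0.6052319, "y": 0.08571429},
        {"x": 0.4351962, "y": 0.08571429},
    ]
}
print(bounding_box_pixels(pdf_box, page["width"], page["height"]))
# [(366, 34), (509, 34), (509, 51), (366, 51)]
```

The width and height come from the page object that contains the block, so pass the values of the page you are currently iterating over.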

I sent a request to the Google documentation team so that they can add this information to their docs.