I set up blob indexing and full-text search in Azure as described in the article "Indexing Documents in Azure Blob Storage with Azure Search".
Some of my PDFs, however, fail in the indexer:
[
  {
    "key": null,
    "errorMessage": "Error processing blob 'https://my-storage.blob.core.windows.net/my-container/mydocument.pdf' with content type '': 422"
  }
]
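To make the failure easier to read, here is a minimal Python sketch that splits such an error entry into the blob URL, the content type the indexer reports, and the trailing number. It assumes the error list has exactly the shape shown above; the regular expression and the `parse_indexer_error` helper are my own illustration, not part of any Azure SDK, and I am only guessing that the trailing `422` is a status code.

```python
import json
import re

# The indexer error list, copied from the output above.
errors_json = """
[
  {
    "key": null,
    "errorMessage": "Error processing blob 'https://my-storage.blob.core.windows.net/my-container/mydocument.pdf' with content type '': 422"
  }
]
"""

# Hypothetical pattern inferred from the single message above.
ERROR_PATTERN = re.compile(
    r"Error processing blob '(?P<url>[^']*)' "
    r"with content type '(?P<content_type>[^']*)': (?P<status>\d+)"
)


def parse_indexer_error(message):
    """Return the URL, reported content type, and trailing code, or None."""
    match = ERROR_PATTERN.search(message)
    if match is None:
        return None
    return {
        "url": match.group("url"),
        "content_type": match.group("content_type"),
        "status": int(match.group("status")),  # assumed to be a status code
    }


parsed = [parse_indexer_error(e["errorMessage"]) for e in json.loads(errors_json)]
for entry in parsed:
    print(entry)
```

Note that the content type reported in the message is an empty string.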
I double-checked the properties on the blob to make sure its content type was set:
{
  "container": "my-container",
  "name": "mydocument.pdf",
  "metadata": {},
  "lastModified": "Fri, 08 Jul 2016 19:43:15 GMT",
  "etag": "0xXXXXXXXXXXXXXXX",
  "blobType": "BlockBlob",
  "contentLength": "3863790",
  "requestId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "contentSettings": {
    "contentType": "application/pdf",
    "contentMD5": "xxxxxxxxxxxxxxxxxxxxxx=="
  },
  "lease": {
    "status": "unlocked",
    "state": "available"
  }
}
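The same check can be done programmatically. This is only a sketch, assuming the properties are available as JSON in the shape shown above (trimmed here to the fields that matter); `content_type_of` is a hypothetical helper, not an SDK call.

```python
import json

# Blob properties, trimmed to the fields used below (values from the dump above).
properties_json = """
{
  "name": "mydocument.pdf",
  "blobType": "BlockBlob",
  "contentLength": "3863790",
  "contentSettings": {
    "contentType": "application/pdf"
  }
}
"""


def content_type_of(properties):
    """Return the blob's content type, or an empty string if it is not set."""
    return properties.get("contentSettings", {}).get("contentType", "")


props = json.loads(properties_json)
content_type = content_type_of(props)
print(props["name"], "->", content_type or "(no content type set)")
```

For this blob the stored content type is `application/pdf`, even though the indexer error reports an empty one.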
Now, this particular PDF has some security restrictions (no printing), so I thought those might be the cause. To test that, I created some PDFs from scratch, and they were indexed just fine, both with and without the restrictions.