
Azure's documentation suggests using blobs in order to index documents such as MS Word and PDF files. We have an Azure SQL Server database with thousands of documents stored in a table's nvarchar(MAX) column. The content of each record is plain English text: the application converted the PDF / MS Word files into plain text before storing them in the database.

My question: is it possible to index the "documents" stored in the database in the same way that Azure indexes blobs? I know how to create an SQL Azure indexer, but I'd like to make sure that the underlying search behaves the same for documents stored in a database table as it does for blobs.

Thanks in advance!

1
Just curious - if you're storing PDF & Word documents in a column with nvarchar(MAX) data type, considering the format is binary, the content stored is not plain text. Am I correct in my understanding? - Gaurav Mantri
@GauravMantri Sorry for the confusion. I edited the question to clarify this. - Arash

1 Answer


This is not currently possible - document extraction can only be done on blobs stored in Azure storage.
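
For text that has already been extracted into an nvarchar(MAX) column, no document extraction step is needed: a regular Azure SQL indexer can map that column to a searchable string field, which is then full-text analyzed like any other string. Below is a rough sketch of the three REST payloads involved (data source, index, indexer). The names `documents-ds`, `documents-index`, `documents-indexer`, `dbo.Documents`, `Id`, and `Content` are all hypothetical placeholders, and the helper functions are my own illustration, not part of any SDK. The code only builds the JSON bodies; sending them to your search service (with its URL, API version, and admin key) is left out.

```python
import json


def build_data_source(name, connection_string, table):
    """Body for an Azure SQL data source definition (type "azuresql")."""
    return {
        "name": name,
        "type": "azuresql",
        "credentials": {"connectionString": connection_string},
        "container": {"name": table},  # table or view to read rows from
    }


def build_index(name, key_field, text_field):
    """Body for an index with a key field and one full-text searchable field."""
    return {
        "name": name,
        "fields": [
            {"name": key_field, "type": "Edm.String", "key": True},
            # The nvarchar(MAX) column lands here as plain, searchable text.
            {"name": text_field, "type": "Edm.String", "searchable": True},
        ],
    }


def build_indexer(name, data_source_name, index_name):
    """Body for an indexer that copies rows from the data source into the index."""
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": index_name,
    }


if __name__ == "__main__":
    # Hypothetical values; replace with your own names and connection string.
    data_source = build_data_source(
        "documents-ds", "<your-connection-string>", "dbo.Documents"
    )
    index = build_index("documents-index", "Id", "Content")
    indexer = build_indexer("documents-indexer", "documents-ds", "documents-index")
    for payload in (data_source, index, indexer):
        print(json.dumps(payload, indent=2))
```

The index field names are chosen to match the (assumed) column names so that no explicit field mappings are needed. Anything the blob indexer adds on top of this, such as parsing binary formats or extracting document metadata, is not applied to database rows.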