Are there any cloud-based alternatives to Azure Search that can index the contents of Azure blobs (mainly Office documents)?

I have an application that exposes full-text search, which is rarely used. Azure Search works great for DocumentDB etc., and the Basic tier covers this usage.

However, when indexing blobs that may be searched a couple of times a day, if that, the cost is extremely high compared to the other functionality used in the stack.

We have also hit the 2 GB storage limit while using less than 20% of the document limit. Ideally we would increase storage, but that isn't an option without tripling our costs for storage alone by upgrading to S1.

The alternatives found so far are running Solr VMs or building our own capability, which would likely still require VMs, so Solr would be the better choice in that case.

It seems others hit this cost-scaling problem too, but usually because of QPS. Our QPS is very low: a few searches within a 24-hour period.

Note: we are looking to keep this within Azure, although AWS CloudSearch billing seems to fit our use case well.

1 Answer

I am on the Azure Search engineering team. Sorry to hear that the pricing is not working for you. As you mentioned, running your own Solr or Elasticsearch implementation in Azure is certainly an option, but I suspect one of the reasons you are looking at Azure Search is that you do not want to add search management to your solution.

It is hard to get into specific options without knowing exactly what you are looking to do (for example, are you simply doing full-text search over this content, or are you doing more, such as faceting and filtering?). Let me throw out one option.

Can you reduce the content size? For example, do you really need all of this content in Azure Search, or could you index just the key terms and phrases from it, so that you can still identify the documents that contain the terms you are looking for? There are a lot of great technologies (such as Word2Vec) for extracting terms and phrases.

The other advantage is that these terms can also be used for faceting and filtering, and you can then load the full content from some other store as needed.

The downside is that the term extraction might miss some terms that you find important.
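
To make the idea concrete, here is a minimal sketch in Python. It assumes the text has already been extracted from each Office document, and it uses plain term frequency with a small stop-word list rather than Word2Vec, purely to keep the example self-contained. The function names, the field names (blobName, keyTerms), and the stop-word list are all illustrative and not part of any Azure Search API.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a real one would be much larger.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
    "for", "on", "with", "that", "this", "it", "as", "be", "by", "at",
}


def extract_key_terms(text, max_terms=10, min_length=3):
    """Return the most frequent terms in `text`, ignoring stop words and short tokens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in STOP_WORDS
    )
    # Sort by descending count, then alphabetically so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:max_terms]]


def build_index_document(blob_name, text, max_terms=10):
    """Build a compact record to index in place of the full content.

    The field names are hypothetical; map them to whatever your index schema uses.
    """
    return {
        "blobName": blob_name,
        "keyTerms": extract_key_terms(text, max_terms),
    }


if __name__ == "__main__":
    sample = "Quarterly budget report. The budget covers travel and the travel policy."
    print(build_index_document("reports/q1.docx", sample, max_terms=3))
```

Only the compact record would be sent to the index; the full document stays in blob storage and is fetched by name when a search returns it. Raising max_terms trades index size against the risk of dropping a term someone later searches for.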

There are many other options, but if I understood more about what you are looking to do, I might be able to help more.

Liam