1
votes

I work at a company that is thinking about migrate his BI structure ( SSAS Olap and Power BI ) to the ElasticSearch/Kibana. Our BI work basically with aggregations as sum, max, min, count, first, last, average at aggregation in multiples hierarchy levels( organization level, customer level, unit level, etc...), and expressions between multiples measures( getting by aggregation and find first/last based on datetime ) like sum, subtraction, multiplication, division, percentage, we also do some date manipulations. We work with more than 1 billion rows at total. Before SSAS Olap we worked with SQL Server OLTP, and our queries were taking many minutes. Because that, we change to OLAP and we now have our aggregate measures in a few seconds. But the Microsoft License is bursting our new budget and now the company requires a free software. At the moment, our Business Intelligence solution doesn't have data mining, neither machine learning, only aggregations measures at multiples hierarchy levels, and operations between these aggregations.

And our solution is the entire on-premises.

I have 2 questions:

1-With the free/basic Elasticsearch license, we will be able to migrate our business intelligence solution to the ElasticEngine and do all these hierarchies, aggregation and search in multiple levels operations?

2-With the free/basic Elasticsearch engine, we will have these measures aggregations between these millions registers and operations between measures at a similar time as our OLAP cube?

Best Regards,

Luis

1

1 Answers

1
votes

In general, it is possible to use ElasticSearch in the way you described. However, this has real sense only if you often use 'like'-based filtering by text columns, this use-case is perfect for ElasticSearch. All facts needed for the reports should be included into one ES document; avoid usage of sub-collections, aggregate queries on fields from sub-collections (nested queries) may kill the performance.

If all you need is damn-fast aggregations and 'exact match' filtering it might be much better to use columnar SQL-compatible databases:

  • MemSQL - free version should be enough for 1 billion of rows, this is good choice if you need to support both OLAP/OLTP DB usage (say, often update rows)
  • Yandex ClickHouse - free/open source, suitable if your data is append-only (no frequent updates/deletes). Its aggregate queries performance is really ultimate, and it can handle queries to 1 billion of rows fast enough even in the single-node configuration

You mentioned that your existing BI infrastructure uses SSAS OLAP, and I may assume that many reports are in fact pivot tables. Most free BI tools (including Kibana) don't support pivot tables at all or only very-very primitive basic pivot tables. Fortunately, good alternative to PowerBI/Excel PivotTable exists - but it is not completely free. Here I mean SeekTable: it can connect to all DBs I mentioned (MemSQL, ClickHouse) and even to ElasticSearch [Disclaimer: I'm affiliated with this BI tool].