0
votes

We're considering Snowflake and want to understand how we could use it, and possibly other tools, to overcome one of our main problems - ETL! We currently use a legacy DWH with an ETL process consisting of SSIS and some views. This has all the common pitfalls of this methodology - most notably that it takes ages!

I was under the assumption that we'd move to an ELT model in Snowflake, I started to research tools to do the 'T' part of it, however, I'm just listening to this podcast: https://www.dataengineeringpodcast.com/snowflakedb-cloud-data-warehouse-episode-110/

And it's suggesting that just slapping a SQL View over something and exposing it in say PowerBI or Tableau is enough for the T part of things!...

Just wondering what people's experience was here? - Do you do transformations just by writing a view in Snowflake? - Do you use a third party tool specifically to address this need?

Secondary to this, for the Extraction and Loading, do you: - Do this using Snowflake only - Use a third party tool

I'm specifically interested if you do this to create some kind of timeseries in Snowflake from a non timeseries source. That's something we'd be keen to do.

2
While you have legitimate questions, software discussions/recommendations/opinions are off-topic for Stack Overflow (See What topics can I ask about here? and What types of questions should I avoid asking?) so I'm voting to close the question. Perhaps consider starting a thread on the Database sub-Reddit. - Eric Brandt
I've rephrased the question, hopefully that helps. I was keen to ask here as this was where Snowflake's site specifically directly me to in terms of a 'forum' as it seems Snowflake have migrated theirs off their site and on to here. I felt this would provide me with the most experienced Snowflake users which would be great :) - ChilliBob

2 Answers

1
votes

This question is hard to answer without sounding opinionated, especially not knowing your use case. For what it's worth here is what I think:

  • Don't stick views on top of your tables and expose to a reporting tool unless you have a very very simple setup. If you're considering a tool like Snowflake then you will probably want to go for something more sustainable, this approach can become prohibitive in terms of cost and complexity in your views.

  • Use a third-party tool to manage your ELT process. Your choice of tool will depend on your internal skills and cloud strategy, have a look at the tools out there like Stich, Fivetran etc. If you don't mind having on-premise technologies why not stick with SSIS or use something like Apache Airflow (requires up-skilling)

  • Snowflake will not help you with the E of ELT, you will need to use a third-party tool to manage the extract of data from your other systems like SSIS. It will help with the L part, for this you can use Snowpipe or COPY commands which are available within the Snowflake ecosystem. Snowflake will also help you share your data with external parties which is really nice.

0
votes

My organization has created a fairly complicated dimensional model in Snowflake using layers of SQL views, against which we can point our reporting tools. We use a separate replication tool for extraction from source systems and loading into Snowflake. Using views simplifies our approach in that we don't need to use an additional tool. It also makes managing the code easier than something like SSIS. For instance, we can search for code using the Snowflake interface or our version control tool instead of having to open individual SSIS packages.