
For an on-prem -> S3 -> Snowflake ETL solution

Currently I use these tools:

  • SnowSQL
  • Snowflake UI

Currently I follow this load process:

  1. Create warehouse
  2. Create indexes (DDL)
  3. Create tables (DDL)
  4. Copy files into the user S3 stage
  5. Copy S3 stage files into tables (a SnowSQL sketch of these steps follows the list)
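
For reference, a minimal SnowSQL sketch of the warehouse/table/stage/copy steps above; the warehouse, table, and file names are hypothetical placeholders, and the "user S3 stage" is taken to mean the Snowflake user stage @~:

    CREATE WAREHOUSE IF NOT EXISTS load_wh
      WAREHOUSE_SIZE = 'XSMALL'
      AUTO_SUSPEND   = 60;

    CREATE TABLE IF NOT EXISTS my_db.my_schema.orders (
      order_id   NUMBER,
      order_date DATE,
      amount     NUMBER(12,2)
    );

    -- Upload local files into the user stage (PUT runs from SnowSQL, not the web UI)
    PUT file:///data/export/orders_*.csv @~/orders/ AUTO_COMPRESS = TRUE;

    -- Load the staged files into the table
    COPY INTO my_db.my_schema.orders
      FROM @~/orders/
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
      ON_ERROR = 'ABORT_STATEMENT';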

Question: Is there a standard ingestion process that I should follow? I am looking for a proven, production-grade process that I can use to develop my data load (ingestion).

Criteria: I am open to using utilities provided by Snowflake, but not open to using an ETL tool (Informatica, Talend, or any other).

I have worked with other DB vendors in the past to come up with such processes, so I know that they exist.

You should really look at the Snowflake documentation on "continuous data pipelines"; a little googling will provide loads of information on this. – Simon D
That helps. Also, are there any other user communities for Snowflake outside of Stack Overflow? Maybe on LinkedIn or Slack or anywhere else? – osr

1 Answer


There are two kinds of source data load processes that one encounters in data integration:

  1. Bulk load
  2. Continuous ingestion

Since you have already indicated limited interest in ETL/ELT tools, here is what I would suggest:

  1. For the bulk-load operation (one-off), use the standard COPY INTO command, either by creating an external stage directly or by using a named external stage, whichever you like (the steps you have provided should suffice). See the sketch after this list.
  2. For continuous ingestion, Snowpipe would be the ideal option; check out more on that here:
     https://docs.snowflake.com/en/user-guide/data-load-snowpipe.html
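
For illustration, a minimal sketch of both options; the bucket, stage, pipe, and table names, as well as the credential placeholders, are hypothetical, and AUTO_INGEST additionally requires S3 event notifications on the bucket pointing at the pipe's queue:

    -- 1. Bulk (one-off) load: named external stage over the S3 bucket, then COPY INTO
    CREATE STAGE my_db.my_schema.s3_orders_stage
      URL = 's3://my-bucket/orders/'
      CREDENTIALS = (AWS_KEY_ID = '<key>' AWS_SECRET_KEY = '<secret>')
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

    COPY INTO my_db.my_schema.orders
      FROM @my_db.my_schema.s3_orders_stage
      ON_ERROR = 'ABORT_STATEMENT';

    -- 2. Continuous ingestion: Snowpipe wraps the same COPY statement and loads
    --    new files automatically as they arrive in the bucket
    CREATE PIPE my_db.my_schema.orders_pipe
      AUTO_INGEST = TRUE
      AS
      COPY INTO my_db.my_schema.orders
      FROM @my_db.my_schema.s3_orders_stage;

After the pipe is created, SHOW PIPES returns a notification_channel (an SQS ARN) that you configure as the event notification target on the S3 bucket.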

Thanks, Palash Chatterjee