I am new to Snowflake and want to load data using Snowpipe with auto-ingest from an external stage (an S3 bucket). The bucket contains multiple folders (one per database), and each folder contains multiple files (one per table). The files in the different folders will be updated at different times, and I'm currently using auto-ingest, so the flow is S3 → SQS → Snowpipe.
My example S3 bucket is set up as follows:
```
myS3bucket.s3amazonaws.com
├── db1/
│   ├── account.csv
│   └── product.csv
└── db2/
    ├── invoice.csv
    └── employees.csv
```
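For reference, here is a rough sketch of the external stage I have in mind over that bucket (the stage and storage integration names are placeholders, not my real setup):

```sql
-- Hypothetical: one external stage covering the whole bucket
CREATE STAGE my_s3_stage
  URL = 's3://myS3bucket/'
  STORAGE_INTEGRATION = my_s3_integration;  -- placeholder integration name
```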
My current understanding is that I would create one pipe per table: account.csv maps to one pipe that populates the account table, product.csv maps to a different pipe that populates the product table, and so on.
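A sketch of what one of those per-table pipes might look like, assuming the stage above (pipe, table, and file-format details are placeholders):

```sql
-- Hypothetical: one pipe per table, scoped to one file path in the stage
CREATE PIPE db1_account_pipe
  AUTO_INGEST = TRUE
  AS
  COPY INTO account
  FROM @my_s3_stage/db1/account.csv
  FILE_FORMAT = (TYPE = 'CSV');
```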
At the moment it looks like all the pipes will be triggered whenever I add a new file anywhere in the bucket, because the S3 event notification is configured on the whole bucket.
My questions are:
- Is it correct that all pipes are triggered when I add a new file? Do I need to set up db1 and db2 as separate external stages, so that if I add a file to the db1 folder only the pipes linked to that external stage are triggered?
- I've read that SNS can be used, but I'm unsure how it works with Snowpipe. Does S3 send a message to SNS, which then works out which pipe to call and fans the message out to that pipe?
- Is there a better approach than the one I've described above?
Thanks for any help