0
votes

I moved from manual clustering to auto clustering around 2 week back.

And the steps i used are below.

  1. Update AUTO_CLUSTERING_ON to yes for the table.
  2. create a middle table and insert the record in the table.
  3. then insert into the main table with order by clustering key from the middle table.

Then i see the clustering is all over the place. I once did the manual clustering as well and see the cluster doing good. however on next insert in the main table. clustering again looks trouble some.

Please suggest if I am missing anything.

please note:

The data loaded in middle table is insert from some other table as well. And that table is never clustered. I am not sure if that is the issue.(which i feel it should not be)

2
Curious what is the data type(s) for the clustering key columns, and how did you conclude that clustering is all-over-the-place?Stuart Ozer

2 Answers

1
votes

You may need to raise a case with Snowflake to enable automatic clustering. Accounts that were created a while ago won't have this enabled. From the documentation:

If manual reclustering is still available in your account, Automatic Clustering may not be enabled yet for your account.

You can request Automatic Clustering to be enabled for your account; however, it will only affect clustered tables that are defined from the time after the feature is enabled.

For clustered tables that were defined before the feature is enabled, you must explicitly “resume” Automatic Clustering for each table. You can use SQL to determine whether Automatic Clustering is enabled for a given table.

Also from the documentation here you should try to run the resume recluster command since the table may have been created prior to automatic clustering being enabled for your account:

alter table t1 resume recluster;

Dont forget that the table gets automatically gets reclustered at Snowflake discretion. Snowflake may simply not think the table requires reclustering based on a number of factors (which I don't know :))

I think raising a case with Snowflake will probably solve this pretty quickly so that may be the best route.

0
votes

Not specifically related to the question, but I have found that periodically rebuilding a table will achieve the best clustering results, especially for tables which churn frequently. To do this you can specify an ORDER BY clause which mimics your clustering keys.

CREATE OR REPLACE TABLE t1 COPY GRANTS AS
SELECT * FROM t1 ORDER BY a, b, c;