(Submitting on behalf of a Snowflake client)
.........................
I want to create a dimension table whose features are stored in a JSON attribute.
I am thinking of applying HASH across all columns, including the JSON column, to uniquely identify my rows.
I expect to have a few million rows in that dimension.
The Snowflake documentation (https://docs.snowflake.net/manuals/sql-reference/functions/hash.html) says that HASH is likely to produce duplicates for 4 billion rows or more, and warns against using HASH as a key.
Is using a HASH value as the key a reasonable approach when the dimension has only a few million rows?
.........................
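For a rough sense of scale, here is a minimal sketch of the birthday approximation for hash collisions. It assumes HASH behaves like a uniformly distributed 64-bit value, which is an idealization and not something the documentation guarantees; the function name `collision_probability` is made up for illustration.

```python
import math

def collision_probability(n_rows: int, hash_bits: int = 64) -> float:
    """Approximate chance of at least one collision among n_rows hash values.

    Birthday approximation: p ~ 1 - exp(-n(n-1) / (2 * 2^bits)).
    Assumes hash values are uniformly distributed (an idealization).
    """
    space = 2.0 ** hash_bits
    # expm1 keeps precision when the probability is very small
    return -math.expm1(-n_rows * (n_rows - 1) / (2.0 * space))

if __name__ == "__main__":
    for n in (1_000_000, 5_000_000, 4_000_000_000):
        print(f"{n:>13,} rows -> p ~ {collision_probability(n):.3e}")
```

Under those assumptions the estimate is well below one in a million for five million rows, and roughly one in three at 4 billion rows, which seems consistent with the documentation's warning. A small probability is still not a uniqueness guarantee, so the question of whether that risk is acceptable for a key remains.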
Any ideas, alternative recommendations, or possible workarounds? Thank you.