I'm trying to insert data from a Pandas dataframe into a table in Snowflake, and I'm having trouble figuring out how to do it properly. To start with, I have a created a table in Snowflake that has some columns of type VARIANT
. For example:
CREATE OR REPLACE TABLE
mydatabase.myschema.results(
DATE date,
PRODUCT string,
PRODUCT_DETAILS variant,
ANALYSIS_META variant,
PRICE float
)
Then in Pandas, I have a dataframe like this:
import pandas as pd
record = {'DATE': '2020-11-05',
'PRODUCT': 'blue_banana',
'PRODUCT_DETAILS': "{'is_blue': True, 'is_kiwi': nan}",
'ANALYSIS_META': "None",
'PRICE': 13.02}
df = pd.DataFrame(record, index=[0])
As you see, I've encoded VARIANT
columns as strings, as that's what I understood from the snowflake-connector
documentation, that a Snowflake VARIANT
type maps to str
dtype in Pandas and vice-versa.
So, what I've tried to far is the following:
from snowflake.connector import pandas_tools
pandas_tools.write_pandas(
conn=conn,
df=df,
table_name="results",
schema="myschema",
database="mydatabase")
And this does work, returning
(True,
1,
1,
[('czeau/file0.txt', 'LOADED', 1, 1, 1, 0, None, None, None, None)])
However, the results I get in Snowflake are not of the proper VARIANT
type. The field ANALYSIS_META
is correctly NULL
, but the field PRODUCT_DETAILS
is of type str
. See:
(also, for example this query throws an error:
SELECT * FROM
MYDATABASE.MYSCHEMA.RESULTS
WHERE PRODUCT_DETAILS:is_blue -- should work for json/variant fields
So with all that, my question is: how should I properly format my Pandas dataframe in order to insert he VARIANT
fields correctly as nested fields into a Snowflake table? I thought that casting a dictionary into a string would do the trick, but apparently it doesn't work as I expected. What I am missing here?
SELECT * FROM MYDATABASE.MYSCHEMA.RESULTS
? – demirciogluSQL compilation error: Invalid data type [VARIANT] for predicate [GET(TESTING.PRODUCT_DETAILS, 'is_blue')]
– taniaSELECT * FROM MYDATABASE.MYSCHEMA.RESULTS WHERE parse_json(PRODUCT_DETAILS):is_blue
– demirciogluSQL compilation error: Invalid data type [VARIANT] for predicate [GET(PARSE_JSON(TO_CHAR(TESTING.PRODUCT_DETAILS)), 'is_blue')]
), but even if it worked, I'm interested in making sure the type is properly inserted asVARIANT
. Some other tables in Snowflake that I can see are correctly inserted asVARIANT
type are highlighted blue and you can click on the values when you preview, as well as extract the nested subfields with the:
notation. I would like to achieve that without having to cast intojson
again when selecting. – tania