1
votes

I'm having trouble with the pandas connector for Snowflake.

The last line of this code immediately kills the Python kernel. Any suggestions on how to diagnose such a situation?

import pyarrow
import snowflake.connector
import pandas as pd

ctx = snowflake.connector.connect(
    user=********,
    password=********,
    account=********,
    warehouse='compute_wh',
    database='SNOWFLAKE_SAMPLE_DATA',
    schema='WEATHER'
)
cs = ctx.cursor()
cs.execute('select * from weather_14_total limit 10')
cs.fetch_pandas_all()

Note that if fetch_pandas_all() is replaced with fetchone() everything works fine.

Thanks in advance.

  • Keith
Any errors that you can share? – Mike Walton
All I get is a dialog panel that opens saying "Kernel Restarting. The kernel appears to have died. It will restart automatically." – Keith
Could it be that it's because you're not actually creating a dataframe with that statement? The last line should be df = cs.fetch_pandas_all(), where df becomes the name of your dataframe. I wouldn't expect that to kill a kernel, but worth a shot, right? – Mike Walton
Unfortunately it still kills the kernel just the same. – Keith
You should report the issue to Snowflake support. I believe this is fairly new functionality for the Snowflake connector, so they will likely want to take a look at it. – Mike Walton

3 Answers

1
votes

This worked for me:

import pandas as pd
from snowflake.connector import connect

# Placeholder query; replace TABLE with your table name
qry = "SELECT * FROM TABLE LIMIT 5"

# Placeholder credentials; fill in your own account details
con = connect(
    account='ACCOUNT',
    user='USER',
    password='PASSWORD',
    role='ROLE',
    warehouse='WAREHOUSE',
    database='DATABASE',
    schema='SCHEMA'
)

# pd.read_sql fetches through the cursor's standard row interface,
# avoiding the Arrow-based fetch_pandas_all() path
df = pd.read_sql(qry, con)

However, this was the most upvoted answer for a similar question:

import pandas as pd
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL

# Placeholder credentials; fill in your own account details
url = URL(
    account='xxxx',
    user='xxxx',
    password='xxxx',
    database='xxx',
    schema='xxxx',
    warehouse='xxx',
    role='xxxxx',
    authenticator='https://xxxxx.okta.com',
)
engine = create_engine(url)  # pass the url instance, not the URL class
connection = engine.connect()

query = '''
    select * from MYDB.MYSCHEMA.MYTABLE
    LIMIT 10;
'''

df = pd.read_sql(query, connection)
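
One follow-up on usage: close the connection and dispose of the engine when you are done, so the Snowflake session is released. Standard SQLAlchemy cleanup looks like this:

# Release the Snowflake session when finished
connection.close()
engine.dispose()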
0
votes

Our team hit this same issue last week. The workaround for us ended up being to use pandas' read_sql instead, as sketched below.
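
A minimal sketch of that workaround, reusing the sample database from the question (credentials are placeholders):

import pandas as pd
import snowflake.connector

con = snowflake.connector.connect(
    user='USER',
    password='PASSWORD',
    account='ACCOUNT',
    warehouse='compute_wh',
    database='SNOWFLAKE_SAMPLE_DATA',
    schema='WEATHER'
)

# read_sql materializes the result through the cursor's regular fetch path,
# sidestepping the Arrow-based fetch_pandas_all() that crashes the kernel
df = pd.read_sql('select * from weather_14_total limit 10', con)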

0
votes

The pyarrow warning the connector throws is not to be ignored. The current Databricks LTS 9.1 runtime ships incompatible package versions, so fetch_pandas_all() crashes the Python REPL.

Solution:

Go to Cluster -> Libraries -> Install New -> PyPI -> pyarrow==5.0.0, then restart the cluster.
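
After the cluster restarts, a quick sanity check that the pin took effect (both packages expose __version__):

import pyarrow
import snowflake.connector

print(pyarrow.__version__)              # should now report 5.0.0
print(snowflake.connector.__version__)  # confirm the connector version in use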