10
votes

In Amazon Redshift's Getting Started Guide, it's mentioned that you can utilize SQL client tools that are compatible with PostgreSQL to connect to your Amazon Redshift Cluster.

In the tutorial they use the SQL Workbench/J client, but I'd like to use Python (in particular SQLAlchemy). I've found a related question, but it doesn't go into detail or show a Python script that connects to the Redshift cluster.

I've been able to connect to the cluster via SQL Workbench/J, since I have the JDBC URL, as well as my username and password, but I'm not sure how to connect with SQLAlchemy.

Based on this documentation, I've tried the following:

from sqlalchemy import create_engine
engine = create_engine('jdbc:redshift://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy')

ERROR:

Could not parse rfc1738 URL from string 'jdbc:redshift://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy'
5
Have you tried using the Postgres engine? - kylieCatt
Expanding on the above comment: your connection string uses jdbc:redshift:, which means it's trying to connect to the Redshift endpoint, not the Postgres adaptor for your Redshift DB. I don't know if Redshift gives you a different connection endpoint (maybe it's the same hostname but a different port)? - Tom Dalton

5 Answers

7
votes

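I don't think SQLAlchemy "natively" knows about Redshift. It also cannot parse a JDBC URL, so drop the jdbc: prefix and change the scheme to use Postgres (add your username and password before the host if the cluster requires them):

postgresql://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy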

Alternatively, you may want to try sqlalchemy-redshift, following the instructions they provide.
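To make the URL change concrete, here is a minimal sketch that rewrites a JDBC-style URL into the `scheme://host:port/database` form. It uses only the standard library; the helper name `jdbc_to_sqlalchemy_url` is hypothetical, and the sketch assumes the JDBC URL carries no query parameters you need to keep (anything after `?` is dropped).

```python
from urllib.parse import urlsplit


def jdbc_to_sqlalchemy_url(jdbc_url, scheme="postgresql"):
    """Turn 'jdbc:redshift://host:port/db' into '<scheme>://host:port/db'."""
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        raise ValueError("expected a URL starting with 'jdbc:'")
    # Drop 'jdbc:' so the remainder is an ordinary URL that urlsplit can parse.
    parts = urlsplit(jdbc_url[len(prefix):])
    if parts.hostname is None or not parts.path.strip("/"):
        raise ValueError("JDBC URL must contain a host and a database name")
    port = f":{parts.port}" if parts.port else ""
    # Query parameters (e.g. '?ssl=true') are intentionally not carried over.
    return f"{scheme}://{parts.hostname}{port}{parts.path}"


url = jdbc_to_sqlalchemy_url(
    "jdbc:redshift://shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy"
)
print(url)
```

The resulting string still has no credentials in it. If you go with sqlalchemy-redshift instead, passing `scheme="redshift+psycopg2"` should give the matching prefix.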

6
votes

I was running into the exact same issue, and then I remembered to include my Redshift credentials:

from sqlalchemy import create_engine
eng = create_engine('postgres://[LOGIN]:[PWORD]@shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com:5439/shippy')
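One thing to watch when putting credentials into the URL: characters such as `@`, `:` or `/` in the password need to be percent-escaped, otherwise the URL can be split in the wrong place. A small sketch using only the standard library is below; `build_url`, the user name `awsuser` and the password are made-up placeholders, and the scheme is spelled `postgresql` because, as far as I know, newer SQLAlchemy releases (1.4+) no longer accept the `postgres://` alias.

```python
from urllib.parse import quote_plus


def build_url(user, password, host, port, database, scheme="postgresql"):
    """Assemble 'scheme://user:password@host:port/database' with escaped credentials."""
    # quote_plus percent-encodes reserved characters such as '@', ':' and '/'.
    return (
        f"{scheme}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


url = build_url(
    user="awsuser",            # hypothetical login
    password="p@ss:w/rd",      # hypothetical password with reserved characters
    host="shippy.cx6x1vnxlk55.us-west-2.redshift.amazonaws.com",
    port=5439,
    database="shippy",
)
print(url)
```

The string can then be passed to `create_engine` in place of the hand-written URL above.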
2
votes

sqlalchemy-redshift works for me, but only after a few days of researching package versions (Python 3.4):

SQLAlchemy==1.0.14 sqlalchemy-redshift==0.5.0 psycopg2==2.6.2

First of all, I checked that my query works in SQL Workbench/J (http://www.sql-workbench.net), then I got it working in SQLAlchemy (this answer, https://stackoverflow.com/a/33438115/2837890, helped me realize that either autocommit or session.commit() is required):

from sqlalchemy import create_engine, text

# `config` is assumed to be a configparser-style mapping loaded elsewhere.
db_credentials = (
    'redshift+psycopg2://{p[redshift_user]}:{p[redshift_password]}'
    '@{p[redshift_host]}:{p[redshift_port]}/{p[redshift_database]}'
    .format(p=config['Amazon_Redshift_parameters']))
engine = create_engine(db_credentials, connect_args={'sslmode': 'prefer'})
connection = engine.connect()
# COPY has to be committed, hence autocommit=True on this statement.
result = connection.execute(text(
    "COPY assets FROM 's3://xx/xx/hello.csv' WITH CREDENTIALS "
    "'aws_access_key_id=xxx_id;aws_secret_access_key=xxx'"
    " FORMAT csv DELIMITER ',' IGNOREHEADER 1 ENCODING UTF8;").execution_options(autocommit=True))
# Read the table back to check that the rows were loaded.
result = connection.execute(text("select * from assets;"))
print(result, type(result))
print(result.rowcount)
connection.close()

After that, I got the sqlalchemy_redshift CopyCommand to work, perhaps in a bad way; it looks a little tricky:

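import sqlalchemy as sa
# Assumed imports: these names are used below but their origin was not shown.
from sqlalchemy_redshift import dialect as dialect_rs
from sqlalchemy_redshift.dialect import RedshiftDialect

# TableAssets holds the table name; `engine`, `access_key_id` and
# `secret_access_key` are assumed to be defined elsewhere.
assets = sa.Table(TableAssets, sa.MetaData())
copy = dialect_rs.CopyCommand(
    assets,
    data_location='s3://xx/xx/hello.csv',
    access_key_id=access_key_id,
    secret_access_key=secret_access_key,
    truncate_columns=True,
    delimiter=',',
    format='CSV',
    ignore_header=1,
    # empty_as_null=True,
    # blanks_as_null=True,
)

# Print the SQL that CopyCommand generates, with parameters inlined.
print(str(copy.compile(dialect=RedshiftDialect(), compile_kwargs={'literal_binds': True})))
print(dir(copy))
connection = engine.connect()
connection.execute(copy.execution_options(autocommit=True))
connection.close()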

This does the same thing as the plain SQLAlchemy version above (it executes a query), except that the query is assembled by CopyCommand. I don't see much benefit in it :(.
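To illustrate what "assembling the query" amounts to, here is a rough sketch that builds the same COPY statement as the raw SQL in the first snippet of this answer. It is plain string formatting and not how sqlalchemy-redshift implements CopyCommand; the helper name `build_copy_sql` is hypothetical, and the values are interpolated without any escaping, so it should only be fed trusted input.

```python
def build_copy_sql(table, s3_path, access_key_id, secret_access_key,
                   delimiter=",", ignore_header=1):
    """Return a Redshift COPY statement for a CSV file stored in S3."""
    # Redshift expects both keys inside one quoted, semicolon-separated string.
    credentials = (
        f"aws_access_key_id={access_key_id};"
        f"aws_secret_access_key={secret_access_key}"
    )
    # No escaping is done here: pass only trusted values.
    return (
        f"COPY {table} FROM '{s3_path}' "
        f"WITH CREDENTIALS '{credentials}' "
        f"FORMAT csv DELIMITER '{delimiter}' IGNOREHEADER {ignore_header} "
        "ENCODING UTF8;"
    )


sql = build_copy_sql("assets", "s3://xx/xx/hello.csv", "xxx_id", "xxx")
print(sql)
```

The resulting string can be wrapped in `text(...)` and executed with autocommit exactly like the hand-written statement.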

1
votes

The following works for me on Databricks with all kinds of SQL statements:

  import sqlalchemy as SA
  import psycopg2  # driver; the "redshift" dialect name also needs sqlalchemy-redshift installed
  host = 'your_host_url'
  username = 'your_user'
  password = 'your_passw'
  db = 'your_db'
  port = 5439
  url = "{d}+{driver}://{u}:{p}@{h}:{port}/{db}".format(
      d="redshift",
      driver='psycopg2',
      u=username,
      p=password,
      h=host,
      port=port,
      db=db,
  )
  engine = SA.create_engine(url)
  cnn = engine.connect()

  strSQL = "your_SQL ..."
  try:
      cnn.execute(SA.text(strSQL))
  finally:
      # Always release the connection, even if the statement fails.
      cnn.close()
0
votes
import sqlalchemy as db
engine = db.create_engine('postgres://username:password@url:5439/db_name')

This worked for me