
I have a SQLite file in an Object Storage instance on IBM's Bluemix. I want to access that file from a Python notebook in Spark as a Service. Right now, the following code does not work:

df = sqlContext.read.format('jdbc').\
     options(url='jdbc:sqlite:swift://container.spark/db.sqlite',\
     dbtable='schema.tablename').load()
print df.take(10)

The error I get is: No suitable driver found for jdbc:sqlite://container.spark/db.sqlite

I've been looking for documentation for this, but I haven't been able to find it. Any help would be much appreciated. How do I access a SQLite file on Object Storage from Spark as a Service on Bluemix?

1 Answer


You are missing one parameter in your options: which JDBC driver class Spark should use.

# driver tells Spark which JDBC driver class to load for the sqlite URL
df = sqlContext.read.format('jdbc').\
     options(url='jdbc:sqlite:Chinook_Sqlite.sqlite',\
     dbtable='employee', driver='org.sqlite.JDBC').load()

Add driver='org.sqlite.JDBC' and you will no longer get the "driver not found" error.

But reading SQLite directly from Object Storage is not supported by org.sqlite.JDBC. The driver supports only three access modes: an absolute path, a relative path on the local file system, and an in-memory database. https://bitbucket.org/xerial/sqlite-jdbc
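For reference, the three URL forms the driver accepts look like this (file names here are just illustrative):

```
jdbc:sqlite:/home/user/db.sqlite    -- absolute path on local disk
jdbc:sqlite:db.sqlite               -- path relative to the working directory
jdbc:sqlite::memory:                -- transient in-memory database
```

A swift:// URL does not fit any of these forms, which is why the driver cannot open the file in place on Object Storage.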

In the example below, I downloaded a sample SQLite database to the Bluemix Spark service's local disk and then used it.
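As a minimal sketch of that "work on a local copy" pattern, here is the same idea using only Python's built-in sqlite3 module (the file name and the employee table are throwaway examples, not part of the notebook above):

```python
import os
import sqlite3
import tempfile

# Create a SQLite file on local disk, standing in for a database
# downloaded from Object Storage to the Spark service's file system.
path = os.path.join(tempfile.gettempdir(), 'sample.sqlite')
conn = sqlite3.connect(path)
conn.execute('CREATE TABLE IF NOT EXISTS employee (id INTEGER, name TEXT)')
conn.execute("INSERT INTO employee VALUES (1, 'Alice')")
conn.commit()

# Once the file exists locally, plain local-path access works --
# the same precondition a jdbc:sqlite:<local-path> URL relies on.
rows = conn.execute('SELECT name FROM employee').fetchall()
print(rows)
conn.close()
```

The Spark JDBC read is then the same as the Chinook example above, pointing the url option at the local path.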

https://cdsx.ng.bluemix.net/data/notebooks/1fa97ca4-721a-41b2-a958-ee58bc47577b/view?access_token=5f4e48fd292c59949ffc969b048ab849861b0a95ed708d0d679bf65ce2bf1588

GitHub importable notebook: https://github.com/charles2588/bluemixsparknotebooks/blob/master/sqllite_jdbc.ipynb

Thanks, Charles.