After following the beginner Java tutorials in the Apache Flink documentation, I wanted to try some transformations on my own data. However, I'm having trouble reading input from my Microsoft SQL Server database, which runs on a server in the network.
The documentation section on possible sources for DataSets includes an example that looks like what I need, where a DataSet is built using env.createInput(...) with a JDBCInputFormat. So I added the Maven dependency for Flink JDBC:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-jdbc_2.11</artifactId>
    <version>0.10.2</version>
</dependency>
and adapted the given code to my own database like this:
// create and configure input format
JDBCInputFormat inputFormat = JDBCInputFormat.buildJDBCInputFormat()
    .setDrivername("org.apache.derby.jdbc.EmbeddedDriver")
    .setDBUrl(sqlserver)
    .setUsername(username)
    .setPassword(password)
    .setQuery(query)
    .finish();

// create and configure type information for DataSet
TupleTypeInfo typeInformation = new TupleTypeInfo(Tuple2.class, STRING_TYPE_INFO, INT_TYPE_INFO);

// Read data from a relational database using the JDBC input format
DataSet<Tuple2<String, Integer>> dbData = environment.createInput(inputFormat, typeInformation);
The server address, user name and password are the same ones that work in another Java program of mine, where I use plain JDBC only. The query is a simple SELECT on two columns, one containing String values, the other integers.
When running the program I get a ClassNotFoundException referring to the selected driver:
JDBC-Class not found. - org.apache.derby.jdbc.EmbeddedDriver
    at org.apache.flink.api.java.io.jdbc.JDBCInputFormat.open
Now, I seem to be missing some imports here, but I can't figure out which ones (or where to get them), as I was expecting Flink JDBC to support this minimal example. The same driver name is also given in the JDBCInputFormat Javadoc. I tried adding JDBC 4.2 manually, which did not work.
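To narrow the problem down independently of Flink, a check along the following lines can show whether a given driver class is visible on the classpath at all. This is only a minimal sketch: DriverCheck and isOnClasspath are hypothetical names, not part of Flink or of my program, and com.microsoft.sqlserver.jdbc.SQLServerDriver is listed on the assumption that Microsoft's JDBC driver is the one that would be needed for a SQL Server database.

```java
// Hypothetical helper for diagnosing the ClassNotFoundException; not part of Flink.
final class DriverCheck {

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isOnClasspath(String className) {
        try {
            // Class.forName throws ClassNotFoundException when no jar on the
            // classpath provides the requested class.
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Candidate driver classes: the one from the Flink example and, as an
        // assumption, the driver class of Microsoft's JDBC driver for SQL Server.
        String[] candidates = {
            "org.apache.derby.jdbc.EmbeddedDriver",
            "com.microsoft.sqlserver.jdbc.SQLServerDriver"
        };
        for (String name : candidates) {
            System.out.println(name + " on classpath: " + isOnClasspath(name));
        }
    }
}
```

If a class is reported as missing here, the jar that contains it is presumably not among the project's dependencies, regardless of what Flink JDBC itself provides.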
What do I need to add or change so that the driver will be found? Additionally, is there any official material about Flink JDBC and its usage, apart from the Javadoc? I am even having difficulty finding tutorials about Flink and SQL sources in general.