
The main purpose is to transfer data from a Kafka topic into a ClickHouse table. One might ask why I don't use the ClickHouse Kafka engine. There is a known problem with it: duplicated messages. I tried versions up to the latest ClickHouse server release, and the behavior is the same.

So, I decided to use Kafka Connect JdbcSinkConnector but got an error:

java.sql.SQLException: No suitable driver found for jdbc:clickhouse://localhost:8123/default
    at java.sql.DriverManager.getConnection(DriverManager.java:689)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at io.confluent.connect.jdbc.dialect.GenericDatabaseDialect.getConnection(GenericDatabaseDialect.java:211)
    at io.confluent.connect.jdbc.util.CachedConnectionProvider.newConnection(CachedConnectionProvider.java:88)
    at io.confluent.connect.jdbc.util.CachedConnectionProvider.getConnection(CachedConnectionProvider.java:62)
    at io.confluent.connect.jdbc.sink.JdbcDbWriter.write(JdbcDbWriter.java:56)
    at io.confluent.connect.jdbc.sink.JdbcSinkTask.put(JdbcSinkTask.java:74)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:538)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:321)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:224)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:192)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

The ClickHouse JDBC driver needs to be installed. I found the official JDBC driver and downloaded clickhouse-jdbc-0.2.4.jar from the 'releases' tab into the container.

I also installed a JDK:

apt-get update && apt-get update
apt-get install default-jdk

By the way, the Kafka Connect Docker container is built from this image:

confluentinc/cp-kafka-connect:5.2.1

I tried to run the jar-file in several ways (there is no Main-Class in MANIFEST.MF):

java -jar clickhouse-jdbc-0.2.4.jar 

This returned the error:

no main manifest attribute, in clickhouse-jdbc-0.2.4.jar

Then I found out that -jar won't work if there is no Main-Class in MANIFEST.MF, so I tried this command:

java -cp clickhouse-jdbc-0.2.4.jar ru.yandex.clickhouse.ClickHouseDriver

This failed again with the error:

Error: Could not find or load main class ru.yandex.clickhouse.ClickHouseDriver

What is the correct way to install clickhouse-jdbc?

You need clickhouse-jdbc-0.2.4-shaded.jar (5 MB), not clickhouse-jdbc-0.2.4.jar (202 KB); the latter contains only the ClickHouse JDBC code without third-party libraries. - Denny Crane

1 Answer

  1. Drivers don't have main classes. You cannot run them directly.

  2. The Docker image already has a valid JDK, and installing another won't solve the error.

  3. The ClickHouse Kafka Ingestor probably has "at-least-once" semantics, so duplicates cannot be avoided anyway. The JDBC Source Connector can have the same issue.

  4. Put the driver JAR under /usr/share/java/kafka-connect-jdbc and restart the Connect worker so that it is picked up, as described at https://www.confluent.io/blog/kafka-connect-deep-dive-jdbc-source-connector/#no-suitable-driver-found
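
To illustrate points 1 and 4: a JDBC driver is never executed. It only has to be on the classpath of the JVM that calls DriverManager. The following is a minimal sketch, not part of Kafka Connect, that asks DriverManager whether any visible driver accepts a given URL. The class name DriverCheck and the method hasDriverFor are hypothetical names chosen for this example.

```java
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper for illustration; not part of Kafka Connect or clickhouse-jdbc.
class DriverCheck {

    // Returns true if some JDBC driver visible to this JVM accepts the URL.
    // No connection is opened; DriverManager only asks each registered driver
    // whether it understands the URL.
    static boolean hasDriverFor(String jdbcUrl) {
        try {
            DriverManager.getDriver(jdbcUrl);
            return true;
        } catch (SQLException e) {
            // Thrown ("No suitable driver") when no registered driver matches.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default URL taken from the stack trace in the question.
        String url = args.length > 0 ? args[0] : "jdbc:clickhouse://localhost:8123/default";
        System.out.println(url + " -> driver found: " + hasDriverFor(url));
    }
}
```

Compiled and run with the shaded JAR on the classpath (for example, java -cp clickhouse-jdbc-0.2.4-shaded.jar:. DriverCheck), this should report whether the driver is found, assuming the JAR registers itself with DriverManager, as JDBC 4 drivers normally do. Without the JAR on the classpath it reports that no driver was found, which is the same condition behind the "No suitable driver found" error inside the Connect worker.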