0 votes

I want to benchmark Spark vs. Flink, and for this purpose I am running several tests. However, Flink doesn't work with Kafka, while with Spark it works perfectly.

The code is very simple:

import java.util.Properties

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
import org.apache.flink.streaming.util.serialization.SimpleStringSchema

val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment

// Kafka consumer configuration: broker address and consumer group
val properties = new Properties()
properties.setProperty("bootstrap.servers", "localhost:9092")
properties.setProperty("group.id", "myGroup")

// The topic name is passed as the first program argument
println("topic: " + args(0))
val stream = env.addSource(new FlinkKafkaConsumer09[String](args(0), new SimpleStringSchema(), properties))
stream.print()

env.execute()

I use Kafka 0.9.0.0 with the same topic on both sides (the Flink consumer and the Kafka console producer), but when I submit my jar to the cluster, nothing happens:

[Screenshot: Flink cluster web interface]

What could be happening?

Comments:

Fabian Hueske: Are you reading from a pre-filled Kafka topic (to have identical input for Flink and Spark), or simultaneously writing data to and reading data from Kafka?

Adrian P: I send data via the producer at the same time that Flink is up.

jagat: Have you tried the FlinkKafkaConsumer082 connector and/or specifying the zookeeper.connect property as shown here: link? While the docs say the zookeeper.connect property is not required for the FlinkKafkaConsumer09 connector, it may be a good experiment. If so, does the Flink job stay running? Where are you looking for the output?

Adrian P: I tried the connector matching my version (0.9), and I tested both options (with and without zookeeper.connect), but it didn't fix the problem :(
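
For reference, the extra property discussed in the comments would be set alongside the others like this (a sketch; localhost:2181 assumes a local ZooKeeper):

// zookeeper.connect is required by the 0.8.x connector (FlinkKafkaConsumer082);
// the 0.9 connector discovers brokers through bootstrap.servers instead.
properties.setProperty("zookeeper.connect", "localhost:2181")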

2 Answers

2 votes

Your stream.print will not print to the console when the job runs on a Flink cluster. The output goes to the TaskManager's .out file in the Flink log directory (e.g. flink-0.9/log/) instead. Alternatively, you can add your own logger to confirm the output.
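
For example, one way to add such a logger is a pass-through map function (a minimal sketch; LoggingMap is a hypothetical name, and the logged records end up in the TaskManager log):

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration
import org.slf4j.{Logger, LoggerFactory}

// Pass-through function that logs every record it sees, so output is visible
// in the TaskManager log even when stdout is not. The logger is created in
// open() because SLF4J loggers are not serializable.
class LoggingMap extends RichMapFunction[String, String] {
  @transient private var log: Logger = _

  override def open(parameters: Configuration): Unit = {
    log = LoggerFactory.getLogger(classOf[LoggingMap])
  }

  override def map(value: String): String = {
    log.info(s"received: $value")
    value  // forward the record unchanged
  }
}

// Usage: stream.map(new LoggingMap).print()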

0 votes

For this particular case (a source chained directly to a sink), the web interface will never report bytes/records sent or received. Note that this will change in the somewhat near future.

Please check whether the JobManager/TaskManager logs contain any output.