0
votes

I have 2 jobs that read data from 2 Kafka topics. The business logic of each job is different and they can run in parallel, but they share some common libraries and functions, so I wrote both jobs in one Java project. I have some questions about how to run these 2 jobs:

Opt1: Upload one jar (1 main class that includes both streams) and run 1 job.

=> But because the two streams then share checkpointing, one stream can affect the other and hurt performance.

Opt2: Upload one jar (2 main classes, one per stream) and run 2 jobs by setting a different EntryClass for each:

=> But when I run the 2 jobs, I get an error: org.apache.kafka.common.config.ConfigException: Invalid value org.apache.kafka.common.serialization.StringSerializer for configuration key.serializer: Class org.apache.kafka.common.serialization.StringSerializer could not be found. If I run only 1 job, there is no error. I think Flink hits a conflict when the same jar is deployed twice.

Opt3: Build one jar per job and run the 2 jobs from their 2 jars:

=> I think this will have the same problem as Opt2.


1 Answer

1
votes

Until you KNOW for sure that you've got an issue, simpler is better. So I would first use one jar with one workflow (your Opt1), and only if you run into issues would I look at creating two jars (each with their own workflow) that you run at the same time on your cluster (your Opt3).

As an aside, the issue you encountered with Opt2 sounds like a packaging problem with your jar.
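
If you want to test the packaging theory, a small check like the one below may help. It is only a minimal sketch: ClassLoadCheck and isLoadable are hypothetical names, not part of Flink or Kafka. It assumes that the Kafka client resolves the serializer class name through a class loader that may differ from the one that loaded your job classes (I believe it prefers the thread context class loader, but treat that as an assumption). It only reports whether a class name is visible to each loader.

```java
public class ClassLoadCheck {

    // Returns true if the named class can be found through the given class loader.
    // The class is looked up without being initialized.
    public static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the error message from the question.
        String name = args.length > 0
                ? args[0]
                : "org.apache.kafka.common.serialization.StringSerializer";

        // Loader attached to the current thread (assumed to be the one Kafka consults).
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        // Loader that loaded this class, i.e. the one that sees the contents of your jar.
        ClassLoader own = ClassLoadCheck.class.getClassLoader();

        System.out.println("visible to context class loader: " + isLoadable(name, context));
        System.out.println("visible to this class's loader:  " + isLoadable(name, own));
    }
}
```

Calling isLoadable from the main method of each entry class, just before the Kafka producer is created, should show whether the serializer is missing from the jar altogether (both checks fail) or only invisible to one of the loaders (the results differ). The second case would point at class loading on the cluster rather than at the contents of the jar.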