I'm researching docker/k8s deployment possibilities for Flink 1.9.1.
I have read/watched [1][2][3][4].
Currently we think we will go with the Job Cluster approach, although we would like to know what the community trend is here. We would rather not deploy more than one job per Flink cluster.
Anyway, I was wondering about a few things:
How can I change the number of task slots per task manager for a Job and a Session Cluster? In my case I'm running Docker on VirtualBox with 4 CPUs assigned to the machine. Each task manager of a Job Cluster is spawned with only one task slot, whereas on the same machine each task manager of a Session Cluster is spawned with 4 task slots.
In both cases Flink's UI shows that each task manager has 4 CPUs.
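My assumption is that this is controlled by taskmanager.numberOfTaskSlots in flink-conf.yaml (the key itself is documented; whether the job-cluster entrypoint honours it is my guess), e.g. by mounting a modified config into the container. The mount path and image name below are placeholders:

    # append the slot setting to a local copy of the config (documented key)
    echo "taskmanager.numberOfTaskSlots: 4" >> conf/flink-conf.yaml

    # mount it over the image's config; /opt/flink and the image name are my assumptions
    docker run -v $(pwd)/conf/flink-conf.yaml:/opt/flink/conf/flink-conf.yaml \
        flink-job:latest task-manager

Is that the right approach, or is there an environment variable I'm missing?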
How can I resubmit a job when using a Job Cluster? I'm referring to this use case [5]. You might say that I simply have to start the job again, but with different arguments; what is the procedure for this? I'm using checkpoints, by the way.
Should I kill all task manager containers and rerun them with different parameters?
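My current guess, based on [4], is to take a savepoint, tear the cluster down, and start it again from that savepoint. That --fromSavepoint is accepted by the job-cluster entrypoint is my assumption, and the paths, class name, and image name are placeholders:

    # take a savepoint and cancel the running job (job ID from the UI or `flink list`);
    # the savepoint directory must be on storage the new container can also reach
    flink cancel -s /savepoints <jobId>

    # restart the job cluster from the savepoint with the new arguments
    docker run flink-job:latest job-cluster \
        --job-classname com.example.MyJob \
        --fromSavepoint /savepoints/savepoint-xxxx \
        <new job arguments...>

Is that the intended flow, or is there a lighter-weight way?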
How can I resubmit a job using a Session Cluster?
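Here my understanding is that the standard CLI flow applies (savepoint paths and jar name are placeholders):

    # savepoint + cancel the running job
    flink cancel -s /savepoints <jobId>

    # resubmit the same jar from the savepoint with new arguments
    flink run -s /savepoints/savepoint-xxxx my-job.jar <new arguments...>

Is that correct?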
How can I provide a log config for a Job/Session Cluster? I have a case where I changed the log level and log format in log4j.properties, and this works fine in my local (IDE) environment. However, when I build the fat jar and run a Job Cluster based on it, it seems that my log4j.properties is not picked up by the cluster: I still see the original format and the original (INFO) level.
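My assumption is that the cluster processes read their log config from $FLINK_HOME/conf inside the container (log4j-console.properties for the dockerized foreground mode, if I read the scripts correctly), not from the fat jar, so I would mount my file over it. The paths, file name, and image name are my assumptions about the image layout:

    # override the container's log config; /opt/flink and the target file name
    # log4j-console.properties are guesses about where the entrypoint looks
    docker run -v $(pwd)/log4j.properties:/opt/flink/conf/log4j-console.properties \
        flink-job:latest job-cluster --job-classname com.example.MyJob

Is that the recommended way, or should the file be baked into the image instead?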
Thanks,
[1] https://youtu.be/w721NI-mtAA
[2] https://youtu.be/WeHuTRwicSw
[3] https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/docker.html
[4] https://github.com/apache/flink/blob/release-1.9/flink-container/docker/README.md