I am writing Kafka data stream to bucketing sink in a HDFS path. Kafka gives out string data. Using FlinkKafkaConsumer010 to consume from Kafka
-rw-r--r-- 3 ubuntu supergroup 4097694 2018-10-19 19:16 /streaming/2018-10-19--19/_part-0-1.in-progress
-rw-r--r-- 3 ubuntu supergroup 3890083 2018-10-19 19:16 /streaming/2018-10-19--19/_part-1-1.in-progress
-rw-r--r-- 3 ubuntu supergroup 3910767 2018-10-19 19:16 /streaming/2018-10-19--19/_part-2-1.in-progress
-rw-r--r-- 3 ubuntu supergroup 4053052 2018-10-19 19:16 /streaming/2018-10-19--19/_part-3-1.in-progress
This happens only when I use some mapping function to manipulate the stream data on the fly. If I directly write the stream to HDFS its working fine. Any idea why this might be happening? I am using Flink 1.6.1, Hadoop 3.1.1 and Oracle JDK1.8