I am writing a batch job based on https://github.com/dataArtisans/flink-training-exercises/blob/master/src/main/java/com/dataartisans/flinktraining/exercises/dataset_java/mail_count/MailCount.java

In the following code, the input has to be a .csv file; otherwise I get an error. I tried a .zip file with a CSV in it. In MailCount.java, I see that readCsvFile accepts a .gz file as input and works fine. Could you please help?

env.readCsvFile(input).ignoreFirstLine().includeFields(fields).types(String.class, String.class);

Thanks, Aruna

1 Answer

Flink can read compressed files out of the box, provided each file carries the extension that matches its compression format. However, not every compression format is supported. You can find the list of supported formats in [1].

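For example, .gz is supported, which is why the MailCount example works, but .zip is not, so you get an error. The list for release 1.2 covers DEFLATE (.deflate), GZip (.gz, .gzip), Bzip2 (.bz2) and XZ (.xz).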
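
If the data only arrives as .zip, one possible workaround is to repack the CSV as .gz before handing it to Flink. Below is a minimal sketch using only java.util.zip from the JDK. The class name ZipToGzip and the method convert are made up for illustration, and the sketch assumes the archive holds a single CSV file, so it copies only the first non-directory entry.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Flink or the training exercises.
public class ZipToGzip {

    /**
     * Copies the first non-directory entry of a .zip archive into a .gz file.
     * Assumption: the archive contains exactly one CSV file of interest.
     */
    public static void convert(Path zipFile, Path gzFile) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(Files.newInputStream(zipFile))) {
            // Skip directory entries until the first real file is found.
            ZipEntry entry = zin.getNextEntry();
            while (entry != null && entry.isDirectory()) {
                entry = zin.getNextEntry();
            }
            if (entry == null) {
                throw new IOException("No file entry found in " + zipFile);
            }
            // Stream the decompressed entry straight into a gzip stream.
            try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(gzFile))) {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = zin.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            }
        }
    }
}
```

After calling ZipToGzip.convert(zipPath, gzPath), the path of the resulting .gz file can be passed to env.readCsvFile(...) exactly as in the question. Give the output file a .gz extension, because Flink relies on the extension to recognise the compression.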

Best regards, Konstantin

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/batch/index.html#read-compressed-files