4
votes

I want to stream a CSV file and perform SQL operations on it using Flink, but the code I have written just reads the file once and stops; it does not stream. Thanks in advance.

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.Types;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.table.sources.CsvTableSource;
import org.apache.flink.types.Row;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = StreamTableEnvironment.getTableEnvironment(env);

// Define a CSV table source over the local file.
CsvTableSource csvtable = CsvTableSource.builder()
    .path("D:/employee.csv")
    .ignoreFirstLine()
    .fieldDelimiter(",")
    .field("id", Types.INT())
    .field("name", Types.STRING())
    .field("designation", Types.STRING())
    .field("age", Types.INT())
    .field("location", Types.STRING())
    .build();

tableEnv.registerTableSource("employee", csvtable);

// Filter and project with the Table API.
Table table = tableEnv.scan("employee").where("name='jay'").select("id,name,location");
//Table table1 = tableEnv.scan("employee").where("age > 23").select("id,name,age,location");

// Convert the result back to a DataStream and print it.
DataStream<Row> stream = tableEnv.toAppendStream(table, Row.class);
//DataStream<Row> stream1 = tableEnv.toAppendStream(table1, Row.class);

stream.print();
//stream1.print();

env.execute();

1 Answer

5
votes

The CsvTableSource is based on a FileInputFormat, which reads and parses the referenced file line by line. The resulting rows are forwarded into the streaming query. So the CsvTableSource is streaming in the sense that rows are continuously read and forwarded while the file is being consumed. However, the CsvTableSource terminates at the end of the file and hence emits a bounded stream.

I assume the behavior you expect is that the CsvTableSource reads the file until its end and then waits for writes that are appended to the file. However, this is not how the CsvTableSource works; you would need to implement a custom TableSource for that.
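
For illustration only (this sketch is not from the original answer), such a custom source in the 1.x Table API could implement StreamTableSource<Row> and build its DataStream with the DataStream API's readFile in FileProcessingMode.PROCESS_CONTINUOUSLY, so that the file is monitored instead of being read exactly once. The class name UnboundedCsvTableSource and the hard-coded schema are made up for this example, exact interface signatures vary between Flink 1.x versions, and note that PROCESS_CONTINUOUSLY re-processes the entire file whenever it is modified rather than tailing only the newly appended lines.

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;
import org.apache.flink.table.api.TableSchema;
import org.apache.flink.table.api.Types;
import org.apache.flink.table.sources.StreamTableSource;
import org.apache.flink.types.Row;

// Hypothetical unbounded CSV source: monitors the file and re-emits its rows
// whenever the file changes (the whole file is re-read on each modification).
public class UnboundedCsvTableSource implements StreamTableSource<Row> {

    private final String path;
    private final String[] fieldNames =
            {"id", "name", "designation", "age", "location"};
    private final TypeInformation<?>[] fieldTypes =
            {Types.INT(), Types.STRING(), Types.STRING(), Types.INT(), Types.STRING()};

    public UnboundedCsvTableSource(String path) {
        this.path = path;
    }

    @Override
    public DataStream<Row> getDataStream(StreamExecutionEnvironment execEnv) {
        TextInputFormat format = new TextInputFormat(new Path(path));
        return execEnv
                // Poll the file every second and forward its lines as they appear.
                .readFile(format, path, FileProcessingMode.PROCESS_CONTINUOUSLY, 1000L)
                .map(line -> {
                    // Naive CSV parsing for the fixed employee schema.
                    String[] parts = line.split(",");
                    Row row = new Row(5);
                    row.setField(0, Integer.parseInt(parts[0].trim()));
                    row.setField(1, parts[1].trim());
                    row.setField(2, parts[2].trim());
                    row.setField(3, Integer.parseInt(parts[3].trim()));
                    row.setField(4, parts[4].trim());
                    return row;
                })
                .returns(new RowTypeInfo(fieldTypes, fieldNames));
    }

    @Override
    public TypeInformation<Row> getReturnType() {
        return new RowTypeInfo(fieldTypes, fieldNames);
    }

    @Override
    public TableSchema getTableSchema() {
        return new TableSchema(fieldNames, fieldTypes);
    }
}

You could then register it in place of the CsvTableSource, e.g. tableEnv.registerTableSource("employee", new UnboundedCsvTableSource("D:/employee.csv")), and keep the rest of the query unchanged.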