I'm trying to create a batch-oriented Flink job with Flink 1.5.0 and want to use the Table and SQL APIs to process the data. My problem is that when I try to create the BatchTableEnvironment, I get a compile error:

BatchJob.java:[46,73] cannot access org.apache.flink.streaming.api.environment.StreamExecutionEnvironment

It is caused by this line:

final BatchTableEnvironment bTableEnv = TableEnvironment.getTableEnvironment(bEnv);

As far as I know, I have no dependency on the streaming environment. My code is shown in the snippet below.

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.BatchTableEnvironment;
import org.apache.flink.table.sources.CsvTableSource;
import org.apache.flink.table.sources.TableSource;

import java.util.Date;


public class BatchJob {

    public static void main(String[] args) throws Exception {
        final ExecutionEnvironment bEnv = ExecutionEnvironment.getExecutionEnvironment();
        // create a TableEnvironment for batch queries
        final BatchTableEnvironment bTableEnv = TableEnvironment.getTableEnvironment(bEnv);
        // ... do stuff
        // execute program
        bEnv.execute("My Batch Job");
    }
}

My pom dependencies are as below:

<dependencies>
        <!-- Apache Flink dependencies -->
        <!-- These dependencies are provided, because they should not be packaged into the JAR file. -->
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-java</artifactId>
            <version>${flink.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-table_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-scala_2.11</artifactId>
            <version>${flink.version}</version>
        </dependency>




        <!-- Add connector dependencies here. They must be in the default scope (compile). -->


        <!-- Example:

        <dependency>
            <groupId>org.apache.flink</groupId>
            <artifactId>flink-connector-kafka-0.10_${scala.binary.version}</artifactId>
            <version>${flink.version}</version>
        </dependency>
        -->

        <!-- Add logging framework, to produce console output when running in the IDE. -->
        <!-- These dependencies are excluded from the application JAR by default. -->
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <version>1.7.7</version>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
            <version>1.2.17</version>
            <scope>runtime</scope>
        </dependency>

    </dependencies>

Can someone please help me understand where the dependency on the Streaming API comes from and why I need it for a batch job? Thanks very much in advance for your help. Oliver

1 Answer

Flink's Table API and SQL support are unified APIs for batch and stream processing. Many internal classes are shared between batch and stream execution, and between the Scala and Java Table API and SQL, and hence link to both Flink's batch and streaming dependencies.

Because of these shared classes, batch queries also require the flink-streaming dependencies.
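As a rough sketch of what that could look like in the pom from the question: the fragment below adds the Java streaming module next to the existing flink-java dependency. The artifact name flink-streaming-java_2.11 and the provided scope are assumptions based on the Scala 2.11 suffix and the scoping already used in the question, not something confirmed for this exact project setup.

```xml
<!-- Assumed fix: make the streaming classes visible to the compiler. -->
<!-- Scope "provided" mirrors flink-java above; adjust if you run from the IDE. -->
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_2.11</artifactId>
    <version>${flink.version}</version>
    <scope>provided</scope>
</dependency>
<!-- If the compiler then complains about the Scala StreamExecutionEnvironment, -->
<!-- flink-streaming-scala_2.11 may be needed in the same way. -->
```

The batch code itself should not need to change. The streaming classes only have to be on the classpath so the compiler can resolve the overloads of TableEnvironment.getTableEnvironment.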