Spark 2.1.0: error when inserting data into Hive

Question

Spark version: 2.1.0

I want to insert a Dataset into a Hive table partitioned by the 'dt' field, but it fails.

When using 'insertInto()', the error is: 'spark2.0 insertInto() can't be used together with partitionBy()'

When using 'saveAsTable()', the error is: 'Saving data in the Hive serde table ad.ad_industry_user_profile_incr is not supported yet. Please use the insertInto() API as an alternative.'

The core code is as follows:

        rowRDD.foreachRDD(new VoidFunction<JavaRDD<Row>>() {
            @Override
            public void call(JavaRDD<Row> rowJavaRDD) throws Exception {
                Dataset<Row> profileDataFrame = hc.createDataFrame(rowJavaRDD, schema).coalesce(1);
                profileDataFrame.write().partitionBy("dt").mode(SaveMode.Append).insertInto(tableName);
//                profileDataFrame.write().partitionBy("dt").mode(SaveMode.Append).saveAsTable(tableName);
            }
        });

Any help would be appreciated.

Martin Martin · Accepted Answer · 2017-06-18T15:23:07

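Use profileDataFrame.write().mode(SaveMode.Append).insertInto(tableName), without .partitionBy("dt"). insertInto() writes into an existing table and takes the partition columns from that table's definition, so partitionBy() is unnecessary and is rejected.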
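
One thing to watch once partitionBy("dt") is gone: insertInto() matches columns by position, not by name, and Hive keeps partition columns at the end of the table schema. So the Dataset's 'dt' column should come last. Below is a minimal sketch of a helper that reorders column names this way. The helper name and the column names "user_id" and "industry" are made up for illustration, not taken from the question. The Spark call appears only as a comment so the snippet runs without Spark on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionColumnLast {

    // Returns the column names with the partition column moved to the end,
    // keeping the relative order of all other columns.
    static String[] partitionLast(String[] columns, String partitionColumn) {
        List<String> ordered = new ArrayList<>();
        boolean found = false;
        for (String column : columns) {
            if (column.equals(partitionColumn)) {
                found = true;
            } else {
                ordered.add(column);
            }
        }
        if (!found) {
            throw new IllegalArgumentException("No column named " + partitionColumn);
        }
        ordered.add(partitionColumn);
        return ordered.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Hypothetical column names; only "dt" comes from the question.
        String[] columns = partitionLast(new String[] {"dt", "user_id", "industry"}, "dt");
        System.out.println(String.join(", ", columns)); // prints "user_id, industry, dt"

        // Assumed usage inside the foreachRDD callback from the question:
        // profileDataFrame.selectExpr(columns)
        //         .write().mode(SaveMode.Append).insertInto(tableName);
    }
}
```

This assumes the non-partition columns are already in the same order as in the Hive table. If they are not, list them explicitly in table order instead of relying on the helper.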
