2
votes

I'm trying to use PySpark to refresh table partitions using the command below. I can issue any other SQL command, but MSCK REPAIR TABLE is causing me problems.

Code:

from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = SparkConf().setAppName("PythonHiveExample")\
                  .set("spark.executor.memory", "3g")\
                  .set("spark.driver.memory", "3g")\
                  .set("spark.driver.cores", "2")\
                  .set("spark.storage.memoryFraction", "0.4")
sc = SparkContext(conf=conf)
sqlContext = HiveContext(sc)
sqlContext.sql("MSCK REPAIR TABLE testdatabase.testtable;")

Error:

File "/usr/hdp/2.3.0.0-2557/spark/python/pyspark/sql/context.py", line 488, in sql
    return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
File "/usr/hdp/2.3.0.0-2557/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
File "/usr/hdp/2.3.0.0-2557/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o43.sql.
: org.apache.spark.sql.AnalysisException: missing EOF at ';' near '10'; line 1 pos 41

New error (after trying the suggestions in the comments):

            py4j.protocol.Py4JJavaError: An error occurred while calling o43.sql.
            : org.apache.spark.sql.AnalysisException: missing EOF at 'MSCK' near 'testdatabase'; line 1 pos 17
                    at org.apache.spark.sql.hive.HiveQl$.createPlan(HiveQl.scala:254)
                    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:41)
                    at org.apache.spark.sql.hive.ExtendedHiveQlParser$$anonfun$hiveQl$1.apply(ExtendedHiveQlParser.scala:40)
                    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:136)
                    at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:135)
                    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
                    at scala.util.parsing.combinator.Parsers$Parser$$anonfun$map$1.apply(Parsers.scala:242)
Can you try removing the ; at the end of the query? I remember a case where this solved the issue. - mehmetminanc
Try using this: sqlContext.sql("use testdatabase;") followed by sqlContext.sql("MSCK REPAIR TABLE testtable;") - Ankit Agrahari
Tried the above suggestions. Still getting an error; I have added it to the question above. - Colman
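
For anyone who wants to try the two comment suggestions together, here is a minimal sketch that strips the trailing semicolon and sends each statement in its own sql() call. The helper names are made up for illustration, and the split on ';' is naive (it would break on a semicolon inside a string literal). As the last comment shows, this may not be enough on every Spark version.

```python
def strip_trailing_semicolon(query):
    """Return the query without trailing semicolons or whitespace."""
    return query.rstrip("; \t\r\n")


def split_statements(script):
    """Naively split a script on ';' and drop empty pieces.

    Assumption: no statement contains a ';' inside a string literal.
    """
    return [part.strip() for part in script.split(";") if part.strip()]


# Intended usage with the sqlContext from the question (not executed here):
# for statement in split_statements("use testdatabase; MSCK REPAIR TABLE testtable;"):
#     sqlContext.sql(statement)
```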

2 Answers

0
votes

I am currently using Spark 1.6, and the statement below works for me to update partitions in the Hive metastore.

sqlContext.sql("alter table schema.table_name add partition (key = value )")
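
If you have several partitions to register, a small helper can build the statement for you. This is only a sketch: the function name and the `dt` partition column are hypothetical, every value is quoted as a string (which may not suit all partition column types), and IF NOT EXISTS is added so that re-running it does not fail on partitions that are already registered.

```python
def add_partition_sql(table, partition_spec):
    """Build an ALTER TABLE ... ADD PARTITION statement.

    partition_spec is a list of (column, value) pairs, kept in order.
    Assumption: values are simple strings that need no escaping.
    """
    spec = ", ".join("{0}='{1}'".format(col, val) for col, val in partition_spec)
    return "ALTER TABLE {0} ADD IF NOT EXISTS PARTITION ({1})".format(table, spec)


# Intended usage with the sqlContext from the question (not executed here);
# 'dt' is a hypothetical partition column:
# sqlContext.sql(add_partition_sql("testdatabase.testtable", [("dt", "2016-01-01")]))
```

Note that, unlike MSCK REPAIR TABLE, this only registers the partitions you name explicitly; it does not scan the table location for new directories.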

0
votes

You can try this command:

ALTER TABLE table_name ADD PARTITION