I am new to PySpark and nothing seems to be working. Please help. I want to read a Parquet file with PySpark, so I wrote the following code:
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
sqlContext.read.parquet("my_file.parquet")
I got the following error:
Py4JJavaError                             Traceback (most recent call last)
/usr/local/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
     62         try:
---> 63             return f(*a, **kw)
     64         except py4j.protocol.Py4JJavaError as e:

/usr/local/spark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    318                     "An error occurred while calling {0}{1}{2}.\n".
--> 319                     format(target_id, ".", name), value)
    320                 else:
Then I tried the following code:
from pyspark.sql import SQLContext
sc = SparkContext.getOrCreate()
SQLContext.read.parquet("my_file.parquet")
This time the error was:
AttributeError: 'property' object has no attribute 'parquet'
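That message is what Python reports when a property is looked up on a class itself rather than on an instance of it, which matches the second snippet calling `SQLContext.read` without creating an `SQLContext` object. The following is a minimal pure-Python sketch of that mechanism, not PySpark code: `Reader`, `Context`, `read_on_class`, and `read_on_instance` are hypothetical stand-ins invented for illustration.

```python
class Reader:
    """Hypothetical stand-in for a DataFrame reader object."""

    def parquet(self, path):
        # A real reader would load data; this only echoes the path.
        return "would read " + path


class Context:
    """Hypothetical stand-in for a context class exposing `read` as a property."""

    @property
    def read(self):
        return Reader()


def read_on_class(path):
    # Looking up `read` on the class yields the property object itself,
    # which has no `parquet` attribute, so this raises AttributeError.
    try:
        return Context.read.parquet(path)
    except AttributeError as exc:
        return str(exc)


def read_on_instance(path):
    # Looking up `read` on an instance runs the property getter
    # and returns a Reader, so `.parquet(...)` is available.
    return Context().read.parquet(path)


if __name__ == "__main__":
    print(read_on_class("my_file.parquet"))
    print(read_on_instance("my_file.parquet"))
```

If the same thing is happening here, the second attempt would need an instance, as in the first snippet (`sqlContext = SQLContext(sc)` followed by `sqlContext.read.parquet("my_file.parquet")`). That would bring back the original Py4JJavaError, whose underlying Java cause is not visible in the truncated traceback above.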
SQLContext.read.format("parquet").load("my_file.parquet"). Same error? – Steven