from pyspark import SparkContext, SparkConf, sql
from pyspark.sql import Row
sc = SparkContext.getOrCreate()
sqlContext = sql.SQLContext(sc)
df = sc.parallelize([
    Row(nama='Roni', umur=27, tingi=168),
    Row(nama='Roni', umur=6, tingi=168),
    Row(nama='Roni', umur=89, tingi=168),
])

df.show()

error: Traceback (most recent call last):

  File "<ipython-input-24-bfb18ebba99e>", line 8, in <module>
    df.show()

AttributeError: 'RDD' object has no attribute 'show'

1 Answer

The error message tells you what is wrong: sc.parallelize returns an RDD, not a DataFrame, and an RDD has no show method. Convert it to a DataFrame with toDF, as in the following code:

df = df.toDF()
df.show()