
I have created a DataFrame using the code below:

    import pyspark
    from pyspark.sql import functions as F

    sc = pyspark.SparkContext()
    spark = pyspark.sql.SparkSession(sc)

    data = [('A', 'B', 1), ('A', 'B', 2), ('A', 'C', 1)]
    columns = ['Column1', 'Column2', 'Column3']
    data = spark.createDataFrame(data, columns)
    data.printSchema()

    root
     |-- Column1: string (nullable = true)
     |-- Column2: string (nullable = true)
     |-- Column3: long (nullable = true)

I want to create a Hive table from my PySpark DataFrame's schema. I have shown only sample columns here, but my DataFrame has many columns, so is there a way to generate such a CREATE TABLE query automatically?


1 Answer


I believe your table creation is a one-time activity; keep in mind that data types can differ between a Spark DataFrame and a Hive table.

If you have a lot of columns, the best starting point is to print the full schema:

    print(data.schema)

That gives you every column name and type in one place.
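From there you can generate the CREATE TABLE statement programmatically. A minimal sketch: `df.dtypes` returns a list of `(column_name, type_string)` pairs, and for the common types (`string`, `bigint`, `double`, ...) those Spark type names are also valid Hive types. The table name `my_table` and the `STORED AS PARQUET` clause below are assumptions; adjust them to your environment. The `sample` list here just mimics what `data.dtypes` would return for the question's DataFrame, so the sketch runs without a Spark session.

```python
def hive_ddl(dtypes, table_name):
    """Build a Hive CREATE TABLE statement from (name, type) pairs,
    as returned by DataFrame.dtypes."""
    cols = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in dtypes)
    # STORED AS PARQUET is an assumed storage format; change as needed.
    return f"CREATE TABLE {table_name} (\n  {cols}\n)\nSTORED AS PARQUET"

# What data.dtypes would return for the question's DataFrame
# (Spark's long type is reported as bigint):
sample = [('Column1', 'string'), ('Column2', 'string'), ('Column3', 'bigint')]
print(hive_ddl(sample, 'my_table'))
```

In a real session you would call `hive_ddl(data.dtypes, 'my_table')` and run the result with `spark.sql(...)`; for complex nested types, check the generated DDL before executing it.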