I'm using pyspark to create a dataframe from a JSON file.
The structure of the JSON file is as follows:
[
  {
    "Volcano Name": "Abu",
    "Country": "Japan",
    "Region": "Honshu-Japan",
    "Location": {
      "type": "Point",
      "coordinates": [
        131.6,
        34.5
      ]
    },
    "Elevation": 571,
    "Type": "Shield volcano",
    "Status": "Holocene",
    "Last Known Eruption": "Unknown",
    "id": "4cb67ab0-ba1a-0e8a-8dfc-d48472fd5766"
  },
  {
    "Volcano Name": "Acamarachi",
    "Country": "Chile",
    "Region": "Chile-N",
    "Location": {
      "type": "Point",
"coordinates": [
-67.62,
-23.3
}]
I then read the file in using the following line of code:
myjson = spark.read.json("/FileStore/tables/sample.json")
However, instead of the expected columns, the output shows that the DataFrame contains only a single _corrupt_record column:
myjson:pyspark.sql.dataframe.DataFrame
_corrupt_record:string
Can someone let me know what I might be doing wrong?
Is the problem with the structure of the JSON file?
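For reference, here is a rough sketch of what I'm ultimately hoping to be able to run once the file parses into proper columns (the column names are just the ones from the JSON sample above):

# Rough sketch of what I'm aiming for; `spark` is the SparkSession that
# the Databricks notebook already provides.
myjson = spark.read.json("/FileStore/tables/sample.json")

# Check that the columns from the JSON show up instead of _corrupt_record
myjson.printSchema()

# Example query using a few of the fields from the JSON above
myjson.select(myjson["Volcano Name"], myjson["Country"], myjson["Elevation"]).show(5)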