I am mapping a Dataset of Row to a Dataset of a custom class.
Dataset<Row> rows = sparkSession.read().orc(path);
Dataset<customClass> dataset = rows.map(
    // parseRow is a placeholder for my code that builds a customClass from a Row
    (MapFunction<Row, customClass>) row -> parseRow(row),
    Encoders.bean(customClass.class));
And I am getting this AnalysisException:
AnalysisException: cannot resolve 'named_struct()' due to data type mismatch: input to function named_struct requires at least one argument;
I am using Spark 2.3.0 and am encoding my custom class using javaBeans.
I checked that the schema was correctly inferred by Encoders, and it was. So, technically, the map operation should work.
Has anyone ever faced this exception message? What does the named_struct function do? I found no relevant information related to Spark...
root
|-- field1: struct (nullable = true)
| |-- value: string (nullable = true)
|-- field2: string (nullable = true)
|-- field3: integer (nullable = true)
|-- field4: double (nullable = true)
|-- field5: struct (nullable = true)
| |-- value: double (nullable = true)
|-- field6: struct (nullable = true)
| |-- field61: double (nullable = true)
| |-- field62: string (nullable = true)
| |-- field63: integer (nullable = true)
| |-- field64: struct (nullable = true)
| | |-- value: string (nullable = true)
|-- field7: struct (nullable = true)
| |-- value: double (nullable = true)
|-- field8: struct (nullable = true)
| |-- value: double (nullable = true)
|-- field9: struct (nullable = true)
| |-- field91: map (nullable = true)
| | |-- key: struct
| | |-- value: struct (valueContainsNull = true)
| | | |-- value: string (nullable = true)
| | | |-- field911: struct (nullable = true)
| | | | |-- value: double (nullable = true)
| | | |-- field912: struct (nullable = true)
| | | | |-- value: double (nullable = true)
| | | |-- field913: map (nullable = true)
| | | | |-- key: struct
| | | | |-- value: struct (valueContainsNull = true)
| | | | | |-- value: integer (nullable = false)
| | | | | |-- field9131: struct (nullable = true)
| | | | | | |-- value: double (nullable = true)
| | | | | |-- field9131: struct (nullable = true)
| | | | | | |-- value: double (nullable = true)
| | | |-- field914: struct (nullable = true)
| | | | |-- value: double (nullable = true)
| | | |-- field915: string (nullable = true)
|-- field10: string (nullable = true)
|-- field11: struct (nullable = true)
| |-- field111: map (nullable = true)
| | |-- key: struct
| | |-- value: struct (valueContainsNull = true)
| | | |-- value: integer (nullable = false)
| | | |-- field1111: struct (nullable = true)
| | | | |-- value: double (nullable = true)
| | | |-- field1112: struct (nullable = true)
| | | | |-- value: double (nullable = true)
|-- field12: boolean (nullable = true)
|-- field13: struct (nullable = true)
| |-- field131: integer (nullable = false)
| |-- field132: integer (nullable = false)
|-- field14: struct (nullable = true)
| |-- field141: string (nullable = true)
[...] named_struct. What are the fields of your customClass? - Jacek Laskowski

[...] q.explain(true)? Could you add the output of System.out.println(Encoders.bean(customClass.class)) too? Can you remove all uses of Map in your customClass and start over (just to test whether it gives a better result with no maps, as they are only partially supported)? - Jacek Laskowski

[...] Dataset<customClass>. I think the problem is related to Maps: in the customClass schema, the keys of the Maps are struct and don't have values linked to them. Maybe that's why I am having this problem. - Malkolm

[...] named_struct error: one of the fields I was using was declared final, which means it didn't have a setter. This violates the JavaBean contract. - Malkolm
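
To make the last comment concrete, here is a minimal sketch of the JavaBean contract it refers to, using only the JDK's java.beans.Introspector (no Spark needed). The class names FinalFieldBean and MutableBean are hypothetical stand-ins for a wrapper type like the `value` structs in the schema, and the assumption that Encoders.bean only picks up properties with both a getter and a setter is inferred from the comment above, not confirmed against the Spark 2.3.0 source.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

class BeanContractDemo {

    // Hypothetical bean with a final field: a getter exists, but no setter can.
    public static class FinalFieldBean {
        private final String value;
        public FinalFieldBean() { this.value = null; }
        public String getValue() { return value; }
    }

    // Same bean with the field made non-final and a setter added.
    public static class MutableBean {
        private String value;
        public MutableBean() { }
        public String getValue() { return value; }
        public void setValue(String value) { this.value = value; }
    }

    // Lists the properties that have BOTH a getter and a setter.
    // Assumption: this is the set of properties a bean encoder can work with.
    static List<String> readableAndWritable(Class<?> beanClass) {
        List<String> names = new ArrayList<>();
        try {
            // Stop at Object.class so the synthetic "class" property is skipped.
            PropertyDescriptor[] props =
                Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors();
            for (PropertyDescriptor pd : props) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
        return names;
    }

    public static void main(String[] args) {
        // The final field leaves no read/write property at all.
        System.out.println(readableAndWritable(FinalFieldBean.class)); // []
        // With a setter, "value" is a full JavaBean property.
        System.out.println(readableAndWritable(MutableBean.class));    // [value]
    }
}
```

If a nested class ends up with an empty list like FinalFieldBean does, that would be consistent with a struct being built from zero fields, which is what the "named_struct requires at least one argument" message complains about. Running a check like this over each class used in customClass may help locate the offending field.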