1 vote

I would like to register a UDAF class (written in Scala or Python) and use it in Spark SQL.
For example:

// mock code:
class MyUDAF extends UserDefinedAggregateFunction {
...
}

spark.udaf.registerJavaFunction("myagg", "MyUDAF", IntegerType)

Then I could use the UDAF directly within Spark SQL, like the following:

spark.sql("select myagg(field) from mytable group by something")

Spark only provides the spark.udf.registerJavaFunction method, which registers a UDF class.

Does anyone know how to register a UDAF class?

Comment (score 2, by DNA): This blog post develops a UDAF in Java: ankithoodablog.wordpress.com/2017/09/07/…

2 Answers

0 votes

You can register it using Hive SQL's CREATE FUNCTION statement, run through spark.sql:

spark.sql("CREATE FUNCTION myagg AS 'com.mysite.MyUDAF'")
spark.sql("select myagg(field) from mytable group by something")
0 votes

You can register a UDAF the same way, by passing an instance of it to spark.udf.register (Scala):

spark.udf.register("udaf_name", new UdafClass())

Then you can call it by that name in Spark SQL.
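For example, a short end-to-end sketch of this approach, assuming the MyUDAF class and the table name mytable from the question (both placeholders):

// Register an instance of the UDAF under a name visible to Spark SQL.
spark.udf.register("myagg", new MyUDAF())

// The registered name can now be used like a built-in aggregate function.
spark.sql("select something, myagg(field) from mytable group by something").show()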