My intention is to do the equivalent of this basic SQL:
select shipgrp, shipstatus, count(*) cnt
from shipstatus group by shipgrp, shipstatus
The examples that I have seen for Spark DataFrames include aggregations over other columns, e.g.:
df.groupBy($"shipgrp", $"shipstatus").agg(sum($"quantity"))
But no other column is needed in the case shown above: I only want the row count per group. So what is the syntax and/or method call combination here?
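To make the target concrete, here is a minimal sketch of the shape of call in question, not a confirmed answer. It assumes a local SparkSession and a small made-up DataFrame (the sample rows and the app name are hypothetical), and it relies on `count()` on the grouped dataset plus a rename to get the `cnt` column from the SQL above. It is meant to be run as a script or pasted into a REPL with Spark on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, lit}

// Hypothetical local session and sample data, for illustration only.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("group-count-sketch")
  .getOrCreate()
import spark.implicits._

val df = Seq(
  ("g1", "shipped"),
  ("g1", "shipped"),
  ("g1", "pending"),
  ("g2", "shipped")
).toDF("shipgrp", "shipstatus")

// groupBy(...).count() adds a column named "count" holding the rows per group;
// rename it to match the "cnt" alias in the SQL.
val counts = df.groupBy($"shipgrp", $"shipstatus")
  .count()
  .withColumnRenamed("count", "cnt")

// Equivalent form using agg, closer to the examples that aggregate another column.
val countsViaAgg = df.groupBy($"shipgrp", $"shipstatus")
  .agg(count(lit(1)).as("cnt"))

counts.show()
```

Both forms should yield one row per (shipgrp, shipstatus) pair, with no other column involved.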
Update: A reader has suggested that this question is a duplicate of "dataframe: how to groupBy/count then filter on count in Scala", but that one is about filtering by count; there is no filtering here.
so it's not a clear duplicate. – WestCoastProjects