Given a dataframe with an index column ("Z"):
val tmp = Seq(("D", 0.1, 0.3, 0.4), ("E", 0.3, 0.1, 0.4), ("F", 0.2, 0.2, 0.5)).toDF("Z", "a", "b", "c")
+---+---+---+---+
|  Z|  a|  b|  c|
+---+---+---+---+
|  D|0.1|0.3|0.4|
|  E|0.3|0.1|0.4|
|  F|0.2|0.2|0.5|
+---+---+---+---+
Say I'm interested in the first row, where Z = "D":
tmp.filter(col("Z") === "D")
+---+---+---+---+
|  Z|  a|  b|  c|
+---+---+---+---+
|  D|0.1|0.3|0.4|
+---+---+---+---+
How do I get the min and max values of that DataFrame row, along with their corresponding column names, while keeping the index column?
Desired output if I want the top 2 max values:
+---+---+---+
|  Z|  b|  c|
+---+---+---+
|  D|0.3|0.4|
+---+---+---+
Desired output if I want the min:
+---+---+
|  Z|  a|
+---+---+
|  D|0.1|
+---+---+
What I tried:
// first convert that DataFrame to an array
val tmp = df.collect.map(_.toSeq).flatten
// returns
tmp: Array[Any] = Array(0.1, 0.3, 0.4)  // <--- I don't know why Any is returned
// take the top n values of the array
val n = 1
tmp.zipWithIndex.sortBy(-_._1).take(n).map(_._2)
But I got this error:
No implicit Ordering defined for Any.
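A likely explanation, for what it's worth: `Row.toSeq` returns a `Seq[Any]`, so the flattened array is an `Array[Any]`, and `sortBy` cannot find an `Ordering` for `Any`. The sketch below is one way around that on the driver side, assuming every non-index column is a `Double`. It reads each value with `getAs[Double]` and keeps the column name next to it. The names `valueCols`, `pairs`, `top2` and `min1` are made up for illustration, and the `SparkSession` setup can be dropped in `spark-shell`, where `spark` already exists.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Local session for a standalone run (assumption; not needed in spark-shell)
val spark = SparkSession.builder().master("local[*]").appName("row-min-max").getOrCreate()
import spark.implicits._

val df = Seq(("D", 0.1, 0.3, 0.4), ("E", 0.3, 0.1, 0.4), ("F", 0.2, 0.2, 0.5))
  .toDF("Z", "a", "b", "c")

// Every column except the index column "Z"
val valueCols = df.columns.filter(_ != "Z")

// Bring only the single row of interest to the driver
val row = df.filter(col("Z") === "D").head()

// getAs[Double] restores the static type, so an Ordering[Double] is available
val pairs: Array[(String, Double)] = valueCols.map(c => c -> row.getAs[Double](c))

val top2 = pairs.sortBy(-_._2).take(2) // two largest values with their column names
val min1 = pairs.minBy(_._2)           // smallest value with its column name
```

This still collects to the driver, which is fine for one row but does not scale to many rows.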
Is there any way to do this directly on the DataFrame instead of going through an array?
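One possible DataFrame-only direction, sketched under the assumption that all value columns share one type (`Double` here): turn each row into (name, value) pairs with `explode`, rank the values per `Z` with a window function, keep the top `n`, and pivot back to wide form. The names `longDf`, `byZDesc`, `topN` and `wide` are hypothetical, and this is only one of several ways to approach it.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{array, col, explode, first, lit, row_number, struct}

// Local session for a standalone run (assumption; not needed in spark-shell)
val spark = SparkSession.builder().master("local[*]").appName("row-top-n").getOrCreate()
import spark.implicits._

val df = Seq(("D", 0.1, 0.3, 0.4), ("E", 0.3, 0.1, 0.4), ("F", 0.2, 0.2, 0.5))
  .toDF("Z", "a", "b", "c")

val valueCols = df.columns.filter(_ != "Z")

// Long format: one row per (Z, column name, value)
val longDf = df
  .select(
    col("Z"),
    explode(array(valueCols.map(c => struct(lit(c).as("name"), col(c).as("value"))): _*)).as("kv"))
  .select(col("Z"), col("kv.name").as("name"), col("kv.value").as("value"))

val n = 2
// Rank values within each Z, largest first; use .asc instead of .desc for the minimum
val byZDesc = Window.partitionBy("Z").orderBy(col("value").desc)

val topN = longDf
  .withColumn("rank", row_number().over(byZDesc))
  .filter(col("rank") <= n)
  .drop("rank")

// Back to wide form for a single index value: the surviving names become columns
val wide = topN
  .filter(col("Z") === "D")
  .groupBy("Z")
  .pivot("name")
  .agg(first("value"))

wide.show()
```

Filtering on `Z` before the pivot matters: different rows can have different top columns, so pivoting all rows at once would leave nulls wherever a column was not among that row's top `n`.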