0
votes

I tried to round a double value in a Spark DataFrame so that it has no decimal places, but the output still shows the same value.

Below are the DataFrame column values.

+-----+-----+
| SIG1| SIG2|
+-----+-----+
| 46.0| 46.0|
| 94.0| 46.0|

The schema of the DataFrame is as follows.

scala> df.printSchema
root
 |-- SIG1: double (nullable = true)
 |-- SIG2: double (nullable = true)

The expected output is as follows:

+-----+-----+
| SIG1| SIG2|
+-----+-----+
| 46  |   46|
| 94  |   46|

I tried rounding the columns based on the following entry in the documentation:

+------------------------------------------------------------------+
|ReturnType| Signature     |                            Description|
+------------------------------------------------------------------+
|DOUBLE    |round(DOUBLE a)| Returns the rounded BIGINT value of a.|

The code used is:

val df1 = df.withColumn("SIG1", round(col("SIG1"))).withColumn("SIG2", round(col("SIG2")))

Do we need to cast the column into int/bigint or is it possible with round function itself?

Thanks in advance!

2
Please do not take this as an opportunity to downvote. As I mentioned, I have seen in the documentation that this should be possible, but it is not happening; I just want to confirm that. - Antony

2 Answers

1
votes

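The round function also returns a double, so if you want an integer type, cast the result. Note that a plain cast to int truncates rather than rounds, as the last example shows (1.9999 becomes 1).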

scala> Seq(1.9999,2.1234,3.6523).toDF().select(round('value,2)).show()
+---------------+
|round(value, 2)|
+---------------+
|            2.0|
|           2.12|
|           3.65|
+---------------+


scala> Seq(1.9999,2.1234,3.6523).toDF().select(round('value,0)).show()
+---------------+
|round(value, 0)|
+---------------+
|            2.0|
|            2.0|
|            4.0|
+---------------+


scala> Seq(1.9999,2.1234,3.6523).toDF().select(round('value)).show()
+---------------+
|round(value, 0)|
+---------------+
|            2.0|
|            2.0|
|            4.0|
+---------------+

scala> Seq(1.9999,2.1234,3.6523).toDF().select('value.cast("int")).show()
+-----+
|value|
+-----+
|    1|
|    2|
|    3|
+-----+
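
Since a plain cast truncates, one way to get rounded whole numbers is to round first and then cast. The following is only a minimal sketch under some assumptions: the object name RoundThenCast and the helper roundToLong are made up for illustration, the session runs in local mode, and an extra sample row is added so that the rounding is visible.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, round}

// Hypothetical helper object, not part of Spark.
object RoundThenCast {

  // Rounds each named double column to the nearest whole number,
  // then casts it to long so the column type is no longer double.
  def roundToLong(df: DataFrame, names: String*): DataFrame =
    names.foldLeft(df) { (acc, name) =>
      acc.withColumn(name, round(col(name)).cast("long"))
    }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("round-then-cast")
      .getOrCreate()
    import spark.implicits._

    // The two rows from the question plus one assumed row with fractional values.
    val df = Seq((46.0, 46.0), (94.0, 46.0), (1.9999, 3.6523)).toDF("SIG1", "SIG2")

    val df1 = roundToLong(df, "SIG1", "SIG2")
    df1.printSchema() // both columns are now reported as long
    df1.show()

    spark.stop()
  }
}
```

Casting to "int" instead of "long" works the same way if the values are known to fit in a 32-bit integer.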
-1
votes

You don't need to cast the column. If you want to get rid of the digits after the decimal point you can use round(colName, 0).