An assignment I have asks me to find the total of a column, but it wants the answer as a decimal, rounded to two digits. I got the total pretty easily. It took a bit to get it rounded, but now that I have that, the value is returned as a float object. I can't figure out how to convert this to a decimal. I've tried converting the float into a new DataFrame, casting the float to DoubleType, casting as part of the sum formula, and even multiple ways of summing the column. I can't get this!
The DataFrame has a column called "tolls_amount". I tried two ways of summing it, neither of which seemed to return a decimal type (though both return the correct answer):
df2 = df.withColumn("tolls_amount", df["tolls_amount"].cast(DoubleType()))
sum_number = df2[["tolls_amount"]].groupBy().sum().collect()[0][0]
df2 = df.withColumn("tolls_amount", df["tolls_amount"].cast(DoubleType()))
sum_number = df2.agg({"tolls_amount":"sum"}).collect()[0]
just_num = round(sum_number["sum(tolls_amount)"], 2)
Here's some of what I tried to get it as a decimal:
sum_number = df2[["tolls_amount"]].groupBy().sum().collect()[0][0]
sum_df = sum_number.withColumn("Total", sum_number.cast(DoubleType()))
sum_2 = sum_number.cast(DoubleType())
sum_number = df2[["tolls_amount"]].groupBy().sum().cast(DoubleType()).collect()[0][0]
I had some others, but have been removing them as they didn't work. Is there any way to accomplish this? The total of the column comes out to 2430066.7000020845, rounded to 2430066.7 with round(sum_number, 2), but both have a type of float instead of decimal.
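For reference, here is a minimal sketch of one way to turn the collected float into a decimal with two places, using Python's standard decimal module rather than anything Spark-specific. The helper name to_decimal_2dp is made up for illustration, and half-up rounding is an assumption about what the assignment expects.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_decimal_2dp(value: float) -> Decimal:
    """Convert a float to a Decimal rounded to two places (hypothetical helper)."""
    # Build the Decimal from str(value) so it starts from the float's short
    # printed form instead of its full binary expansion.
    as_decimal = Decimal(str(value))
    # quantize() fixes the number of decimal places; half-up rounding is assumed.
    return as_decimal.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# The total reported in the question, as it comes back from collect().
sum_number = 2430066.7000020845
total = to_decimal_2dp(sum_number)
print(total, type(total))
```

If the value has to be a decimal inside Spark rather than in Python, casting the aggregated column to DecimalType (from pyspark.sql.types) with a scale of 2, instead of DoubleType, may be worth trying; that variant is untested here.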