I have a DataFrame in PySpark which I read as follows:
df = (spark.table('db.table')
      .select(F.col('key').alias('key_a'),
              F.to_date('move_out_date', 'yyyyMMdd').alias('move_out_date')))
Now I want to compare the move_out_date column with a date which is 20151231. But the code below isn't working:
from pyspark.sql import functions as F
df.filter(F.datediff(F.col('move_out_date'), F.to_date('20151231', 'yyyyMMdd')) > 0)
How do you compare to_date columns with one single value?
df.filter(df.move_out_date > F.to_date(F.lit('20151231'), 'yyyyMMdd'))
– SMaZ