I'm having the world of issues performing a rolling join of two dataframes in pyspark (and python in general). I am looking to join two pyspark dataframes together by their ID & closest date backwards (meaning the date in the second dataframe cannot be greater than the one in the first)
Table_1:
Table_2:
Desired Result:
In essence, I understand an SQL Query can do the trick where I can do spark.sql("query") So anything else. I've tried several things which aren't working in a spark context. Thanks!


