I have a DataFrame with two columns holding Longitude and Latitude coordinates:
import pandas as pd
values = {'Latitude': {0: 47.021503365600005,
1: 47.021503365600005,
2: 47.021503365600005,
3: 47.021503365600005,
4: 47.021503365600005,
5: 47.021503365600005},
'Longitude': {0: 15.481974060399999,
1: 15.481974060399999,
2: 15.481974060399999,
3: 15.481974060399999,
4: 15.481974060399999,
5: 15.481974060399999}}
df = pd.DataFrame(values)
df.head()
Now I want to apply a rolling window function on the dataframe that takes the Longitude AND Latitude (two columns) of one row and another row (window size 2) in order to calculate the haversine distance.
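def haversine_distance(x):
    print(x)  # inspect what the rolling window actually passes in
    return 0.0  # placeholder; rolling.apply expects a numeric return value
df.rolling(2, axis=1).apply(haversine_distance)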
My problem is that I never get all four values Lng1, Lat1 (first row) and Lng2, Lat2 (second row). If I use axis=1, then I will get Lng1 and Lat1 of the first row. If I use axis=0, then I will get Lng1 and Lng2 of the first and second row, but Longitude only.
How can I apply a rolling window using two rows and two columns? Somewhat like this:
def haversine_distance(x):
    row1 = x[0]
    row2 = x[1]
    lng1, lat1 = row1['Longitude'], row1['Latitude']
    lng2, lat2 = row2['Longitude'], row2['Latitude']
    # do your stuff here
    return 1
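One possible workaround, sketched here under assumptions rather than offered as a built-in pandas feature: roll over the integer row positions instead of the data, then use those positions to slice the full frame inside the callback. The helper `rolling_rows`, the function `pair_delta`, and the sample coordinates are all hypothetical names and values made up for illustration, and `pair_delta` is only a stand-in for a real haversine computation.

```python
import numpy as np
import pandas as pd

def rolling_rows(frame, func, window=2):
    """Apply func to each block of `window` consecutive rows (all columns).

    Hypothetical helper: it rolls over row positions and looks the rows
    up in `frame`, because rolling.apply itself only sees one column.
    """
    positions = pd.Series(np.arange(len(frame)), index=frame.index)
    return positions.rolling(window).apply(
        # raw=True hands over a float ndarray of positions, e.g. [0., 1.]
        lambda pos: func(frame.iloc[pos.astype(int)]),
        raw=True,
    )

def pair_delta(rows):
    """Stand-in for haversine: sum of the coordinate differences."""
    row1, row2 = rows.iloc[0], rows.iloc[1]
    lng1, lat1 = row1["Longitude"], row1["Latitude"]
    lng2, lat2 = row2["Longitude"], row2["Latitude"]
    return (lat2 - lat1) + (lng2 - lng1)

# Made-up sample coordinates, not the data from the question.
sample = pd.DataFrame({
    "Latitude": [47.0, 47.5, 48.5],
    "Longitude": [15.0, 15.0, 16.0],
})
result = rolling_rows(sample, pair_delta)
```

The first entry of `result` is NaN because the window is not yet full. This calls Python once per window, so it will likely be slower than a shift-based approach on large frames.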
Currently I'm doing this calculation by joining the dataframe with itself via shift(-1), which puts all four coordinates in one row. But it should be possible with rolling as well. Another option is combining Lng and Lat into one column and applying rolling with axis=0 to that. But there must be an easier way, right?
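For reference, the shift-based approach can be sketched roughly as follows. This is a minimal illustration, not the exact code in use: the function name `haversine_km`, the column name `dist_to_next_km`, the Earth radius constant, and the sample coordinates are all assumptions introduced here.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km; inputs in degrees, scalars or Series."""
    lat1, lng1, lat2, lng2 = map(np.radians, (lat1, lng1, lat2, lng2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

# Made-up sample coordinates, not the data from the question.
track = pd.DataFrame({
    "Latitude": [47.0215, 47.0215, 47.0300],
    "Longitude": [15.4820, 15.4820, 15.4820],
})

# shift(-1) aligns every row with its successor, so all four
# coordinates are available in a single vectorized call.
nxt = track.shift(-1)
track["dist_to_next_km"] = haversine_km(
    track["Latitude"], track["Longitude"],
    nxt["Latitude"], nxt["Longitude"],
)
```

The last row has no successor, so its distance comes out as NaN. Since everything is computed on whole columns, no per-row Python function call is needed.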
shift(-1) and applying your function to each row is the most efficient way to do this. I don't know of any way to apply a function to a rolling window on multiple columns at once. – Ken Syme