0
votes

I have a pandas DataFrame like the one below, with four columns that describe two points, (lat1, lon1) and (lat2, lon2), per row:

   lat1       lon1        lat2        lon2
=========   =========   =========   =========  
30.172705   31.126725   30.188281   31.132326
30.272805   31.226725   30.288281   31.232326
30.372905   31.326725   30.388281   31.332326
30.472105   31.426725   30.488281   31.432326
30.572205   31.526725   30.588281   31.532326

What is the most efficient way to calculate the distance in km between (lat1, lon1) and (lat2, lon2) in each row using geopy?

2
Did you try anything? - Binyamin Even

2 Answers

0
votes

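What coordinate system are your lat/lon values in?

Let's try the Euclidean distance, sqrt(delta(lat)**2 + delta(lon)**2). Note that on raw lat/lon values this gives a result in degrees, not km: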

 df['dist']=((df.lat1.sub(df.lat2)**2).add(df.lon1.sub(df.lon2)**2))**0.5
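
If the coordinates are plain latitude/longitude in degrees, a spherical approximation such as the haversine formula returns kilometers directly and stays fully vectorized. The following is only a sketch under that assumption: the helper name `haversine_km` and the constant `EARTH_RADIUS_KM` are made up for illustration, and a spherical Earth model can differ from an ellipsoidal one by a fraction of a percent.

```python
import numpy as np
import pandas as pd

# Mean Earth radius in km; using a sphere is an assumption of this sketch.
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees (vectorized)."""
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lat1, lon1, lat2, lon2)
    )
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# The sample rows from the question.
df = pd.DataFrame({
    "lat1": [30.172705, 30.272805, 30.372905, 30.472105, 30.572205],
    "lon1": [31.126725, 31.226725, 31.326725, 31.426725, 31.526725],
    "lat2": [30.188281, 30.288281, 30.388281, 30.488281, 30.588281],
    "lon2": [31.132326, 31.232326, 31.332326, 31.432326, 31.532326],
})

# One call on whole columns, no Python-level loop over rows.
df["dist_km"] = haversine_km(df.lat1, df.lon1, df.lat2, df.lon2)
print(df)
```

Because the whole computation runs on NumPy arrays, this tends to scale better to large frames than a row-by-row `apply`.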
0
votes

Even though you do not seem to have done any research yourself, here you go. A quick search yielded: https://geopy.readthedocs.io/en/stable/#module-geopy.distance

With that in hand, it is relatively easy to find out how to access a DataFrame in pandas and apply operations to it (@wwnde showed that already).

Combining those two basics gives:

import pandas as pd
import numpy as np
from geopy import distance

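# Generate some random data. geopy expects (lat, lon) order, with lat in [-90, 90];
# all values are kept in that range here so they are valid for both lat and lon.
df = pd.DataFrame(np.random.randint(-90, 90, size=(100, 4)), columns=['lat1', 'lon1', 'lat2', 'lon2'])
print(df)

# Apply the distance function described in the linked docs to each row;
# .km converts the returned Distance object to a float in kilometers.
df['km'] = df.apply(lambda x: distance.distance((x['lat1'], x['lon1']), (x['lat2'], x['lon2'])).km, axis=1)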
print(df)

Additionally, I found this as the first link, but did not read it, as the solution is quite simple.

As the CoC for Stack Overflow suggests, please show what you have tried, and take a quick look at Google before asking.