0
votes

I have a data set from my vehicle tracking system, and I need to calculate distances based on latitude and longitude. I understand that the haversine formula gives the distance between rows, but I'm stuck because I need the distance grouped by two fields (Model and Mode).

My code is shown below:

import numpy as np
import pandas as pd

def haversine(lat1,lon1,lat2,lon2, to_radians = True, earth_radius =6371):
    if to_radians:
        lat1,lon1,lat2,lon2 = np.radians([lat1,lon1,lat2,lon2])

    a = np.sin((lat2-lat1)/2.0)**2+ np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return earth_radius *2 * np.arcsin(np.sqrt(a))

mydataset = pd.read_csv(x + '.txt')
print (mydataset.shape)
mydataset = mydataset.sort_values(by=['Model','timestamp']) #sort
mydataset['dist'] = np.concatenate(
    mydataset.groupby(["Model"]).apply(
        lambda x: haversine(x['Latitude'], x['Longitude'],
                            x['Latitude'].shift(), x['Longitude'].shift())).values)

With this, I am able to calculate the distance between consecutive rows for each model (after sorting).

But I would like to take it a step further and calculate the distance based on both Mode and Model. My fields are "Index, Model, Mode, Lat, Long, Timestamp".

Please advise!

Index, Model, Timestamp, Long, Lat, Mode(denote as 0 or 2), Distance Calculated
1, X, 2018-01-18 09:16:37.070, 103.87772815, 1.35653496, 0, 0.0
2, X, 2018-01-18 09:16:39.071, 103.87772815, 1.35653496, 0, 0.0
3, X, 2018-01-18 09:16:41.071, 103.87772815, 1.35653496, 0, 0.0
4, X, 2018-01-18 09:16:43.071, 103.87772052, 1.35653496, 0, 0.0008481795
5, X, 2018-01-18 09:16:45.071, 103.87770526, 1.35653329, 0, 0.0017064925312804799
6, X, 2018-01-18 09:16:51.070, 103.87770526, 1.35653329, 2, 0.0
7, X, 2018-01-18 09:16:53.071, 103.87770526, 1.35653329, 2, 0.0
8, X, 2018-01-18 09:59:55.072, 103.87770526, 1.35652828, 0, 0.0005570865824842293

I need to calculate the total journey distance for each model, and also the total journey distance for each model in each mode.

I tried changing my groupby to groupby(["Model","Mode"]), but it doesn't work. Can someone help me with this too? - ThanksForHelping
Can you add a data sample? 5 or 6 rows. - jezrael
It's difficult to comment without looking at your data. Why can't you group by Mode separately and then concatenate? - min2bro
Hi, I've updated the question with data; due to confidential info, this is all I can show. I put this together based on some other posts on Stack Overflow. As I'm still new to Python, I would appreciate it if you could explain the terms used. Sorry for any inconvenience caused. - ThanksForHelping
@ThanksForHelping Do you need two new distance columns, the first for the distance by Mode and the second for the distance by Model? - min2bro

1 Answer

0
votes

I think you need to add the DataFrame constructor to the function and then add another column name to the groupby, such as ["Model", "Mode(denote as 0 or 2)"] or ["Model", "Mode"], depending on your column names:

def haversine(lat1,lon1,lat2,lon2, to_radians = True, earth_radius =6371):
    if to_radians:
        lat1,lon1,lat2,lon2 = np.radians([lat1,lon1,lat2,lon2])

    a = np.sin((lat2-lat1)/2.0)**2+ np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return pd.DataFrame(earth_radius *2 * np.arcsin(np.sqrt(a)))


mydataset['dist'] = (mydataset.groupby(["Model", "Mode(denote as 0 or 2)"])
                              .apply(lambda x: haversine(x['Lat'],
                                                         x['Long'], 
                                                         x['Lat'].shift(),
                                                         x['Long'].shift())).values)

# replace NaNs with 0 if needed
mydataset['dist'] = mydataset['dist'].fillna(0)

print (mydataset)
   Index Model               Timestamp        Long       Lat  \
0      1     X 2018-01-18 09:16:37.070  103.877728  1.356535   
1      2     X 2018-01-18 09:16:39.071  103.877728  1.356535   
2      3     X 2018-01-18 09:16:41.071  103.877728  1.356535   
3      4     X 2018-01-18 09:16:43.071  103.877721  1.356535   
4      5     X 2018-01-18 09:16:45.071  103.877705  1.356533   
5      6     X 2018-01-18 09:16:51.070  103.877705  1.356533   
6      7     X 2018-01-18 09:16:53.071  103.877705  1.356533   
7      8     X 2018-01-18 09:59:55.072  103.877705  1.356528   

   Mode(denote as 0 or 2)  Distance Calculated      dist  
0                       0             0.000000  0.000000  
1                       0             0.000000  0.000000  
2                       0             0.000000  0.000000  
3                       0             0.000848  0.000848  
4                       0             0.001706  0.001706  
5                       2             0.000000  0.000557  
6                       2             0.000000  0.000000  
7                       0             0.000557  0.000000
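
One thing to watch in the output above: assigning `.values` from the groupby result is positional, so the distances come back in group order rather than in the original row order. That appears to be why 0.000557 lands on row index 5 (the first Mode 2 row) instead of row index 7. The following is a minimal sketch, not a definitive implementation, of an index-aligned variant that uses `groupby(...).shift()` and then sums the per-row distances to get the totals asked for in the question. The helper `add_distance`, the output column names `dist_model` and `dist_model_mode`, and the shortened column name `Mode` are assumptions made for illustration. It also assumes that, within a (Model, Mode) group, the distance is measured from the previous row of the same mode, even when rows of another mode lie in between.

```python
import numpy as np
import pandas as pd


def haversine(lat1, lon1, lat2, lon2, earth_radius=6371):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = (np.radians(v) for v in (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))


def add_distance(frame, keys, out_col):
    """Hypothetical helper: distance from the previous row within each group.

    groupby(...).shift() keeps the original index, so the result aligns
    with `frame` row by row instead of coming back in group order.
    """
    prev = frame.groupby(keys)[["Lat", "Long"]].shift()
    dist = haversine(frame["Lat"], frame["Long"], prev["Lat"], prev["Long"])
    frame[out_col] = dist.fillna(0)  # first row of each group has no predecessor
    return frame


# Sample rows from the question ("Mode" is a shortened column name).
df = pd.DataFrame({
    "Model": ["X"] * 8,
    "Timestamp": pd.to_datetime([
        "2018-01-18 09:16:37.070", "2018-01-18 09:16:39.071",
        "2018-01-18 09:16:41.071", "2018-01-18 09:16:43.071",
        "2018-01-18 09:16:45.071", "2018-01-18 09:16:51.070",
        "2018-01-18 09:16:53.071", "2018-01-18 09:59:55.072",
    ]),
    "Long": [103.87772815, 103.87772815, 103.87772815, 103.87772052,
             103.87770526, 103.87770526, 103.87770526, 103.87770526],
    "Lat": [1.35653496, 1.35653496, 1.35653496, 1.35653496,
            1.35653329, 1.35653329, 1.35653329, 1.35652828],
    "Mode": [0, 0, 0, 0, 0, 2, 2, 0],
})
df = df.sort_values(["Model", "Timestamp"])

# Per-row distances, once per Model and once per (Model, Mode).
df = add_distance(df, ["Model"], "dist_model")
df = add_distance(df, ["Model", "Mode"], "dist_model_mode")

# Total journey distance per model, and per model in each mode.
total_per_model = df.groupby("Model")["dist_model"].sum()
total_per_model_mode = df.groupby(["Model", "Mode"])["dist_model_mode"].sum()

print(df[["Model", "Mode", "dist_model", "dist_model_mode"]])
print(total_per_model)
print(total_per_model_mode)
```

Because the shifted coordinates keep the original index, no `np.concatenate` or `.values` step is needed, and the same helper works for either grouping.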