I have two dataframes, first one is creating by users manually and second one is errors from machines. I want to merge them based on time interval in first dataframe(df_a)
Here are the dataframes;
d_a = {'Station' : ['A1','A2'],
'Reason_a' : ['Electronic','Feed'],
'StartTime_a' : ['2019-01-02 02:00:00','2019-01-02 04:22:00'],
'EndTime_a' : ['2019-01-02 02:20:00', '2019-01-02 04:45:00']}
d_b = {'Station' : ['A1','A1','A1','A2','A2','A2'],
'Reason_b' : ['a','n','c','d','e','n'],
'StartTime_b' : ['2019-01-02 00:00:00.000','2019-01-02 00:05:00.000','2019-01-01 23:55:00.000','2019-01-02 04:19:53.000','2019-01-02 04:19:37.000','2019-01-02 04:23:00.000'],
'EndTime_b' : ['2019-01-02 00:19:15.000','2019-01-02 00:29:45.000','2019-01-02 00:12:12.000','2019-01-02 04:27:12.000','2019-01-02 04:47:16.000','2019-01-02 04:52:45.000']}
df_a = pd.DataFrame(d_a)
df_b = pd.DataFrame(d_b)
Any intersection point of time intervals of two dataframes considered as valid record.
condition1 = df_b start_time start after df_a start time and ends before df_a endtime
condition2 = df_b start_time starts before df_a start time but ends before df_a endtime
condition3 = df_b start_timestarts between df_a starttime and df_a end time but ends after df_a endtime
In the end I want to merge these two dataframes based on conditions. my ideal table looks like below
Station Reason_a a n c d e
A1 Electronic 1 1 1 0 0
A2 Feed 0 1 0 1 0
How should I approach this problem? Any comment would be helpful.
Thanks in advance.