I am using the Yelp dataset to get the hours that multiple businesses are open. The opening hours are stored in a list with one dictionary per business, as shown below.
{'Monday': '0:0-0:0', 'Tuesday': '8:0-18:30', 'Wednesday': '8:0-18:30', 'Thursday': '8:0-18:30', 'Friday': '8:0-18:30', 'Saturday': '8:0-14:0'}
{'Monday': '8:0-22:0', 'Tuesday': '8:0-22:0', 'Wednesday': '8:0-22:0', 'Thursday': '8:0-22:0', 'Friday': '8:0-23:0', 'Saturday': '8:0-23:0', 'Sunday': '8:0-22:0'}
{'Monday': '7:0-20:0', 'Tuesday': '7:0-20:0', 'Wednesday': '7:0-20:0', 'Thursday': '7:0-20:0', 'Friday': '7:0-21:0', 'Saturday': '7:0-21:0', 'Sunday': '7:0-21:0'}
{'Wednesday': '14:0-22:0', 'Thursday': '16:0-22:0', 'Friday': '12:0-22:0', 'Saturday': '12:0-22:0', 'Sunday': '12:0-18:0'}
{'Monday': '0:0-0:0', 'Tuesday': '6:0-22:0', 'Wednesday': '6:0-22:0', 'Thursday': '6:0-22:0', 'Friday': '9:0-0:0', 'Saturday': '9:0-22:0', 'Sunday': '8:0-22:0'}
{'Monday': '0:0-0:0', 'Tuesday': '10:0-18:0', 'Wednesday': '10:0-18:0', 'Thursday': '10:0-18:0', 'Friday': '10:0-18:0', 'Saturday': '10:0-18:0', 'Sunday': '12:0-18:0'}
{'Monday': '9:0-17:0', 'Tuesday': '9:0-17:0', 'Wednesday': '9:0-17:0', 'Thursday': '9:0-17:0', 'Friday': '9:0-17:0'}
None
{'Monday': '0:0-0:0', 'Tuesday': '6:0-21:0', 'Wednesday': '6:0-21:0', 'Thursday': '6:0-16:0', 'Friday': '6:0-16:0', 'Saturday': '6:0-17:0', 'Sunday': '6:0-21:0'}
This goes on for 150,000 elements.
As you can see, one element does not have any information and says "None" instead. I used dropna() to remove those. However, this leaves a gap in the index, which breaks the for loop I use to calculate the hours.
Here is a small example to explain what I mean.
The table starts like this:
index,0
0,0.0
1,1.0
2,3.0
3,NaN
4,4.0
5,5.0
and changes to
index,0
0,0.0
1,1.0
2,3.0
4,4.0
5,5.0
after using dropna()
As you can see, the table skips from 2 to 4.
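For reference, the gap can be reproduced with a short sketch like the one below. It assumes the data is a pandas Series; the names s, dropped and renumbered are made up for illustration. It also shows that reset_index(drop=True) renumbers the remaining rows, which might be one way to close the gap.

```python
import pandas as pd

# Hypothetical stand-in for the small table above.
s = pd.Series([0.0, 1.0, 3.0, None, 4.0, 5.0])

# dropna() removes the NaN row but keeps the original labels,
# so label 3 is simply missing afterwards.
dropped = s.dropna()
print(list(dropped.index))

# reset_index(drop=True) discards the old labels and numbers
# the remaining rows consecutively from 0.
renumbered = dropped.reset_index(drop=True)
print(list(renumbered.index))
```

With the old labels kept, looking up the missing label (dropped[3] here) fails, which matches the error my loop runs into.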
In my for loop, I calculate the total time for each week over range(1, 150000), but since row 8 and other rows no longer exist, it throws an error and the loop stops.
So my question is: how can I run my for loop so that it skips over these nonexistent rows?
Also, here is my code:
df_new = df_hours.dropna()
for i in range(1, 150000):
    dc = df_new[i]
    print(dc)
    sum_elapsed = 0
    for _, v in dc.items():
        start, end = v.split('-')
        hhs, mms = (int(v) for v in start.split(':'))
        hhe, mme = (int(v) for v in end.split(':'))
        elapsed = (hhe * 60 + mme) - (hhs * 60 + mms)
        sum_elapsed += elapsed
    print(sum_elapsed)
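One direction that might avoid the problem is to loop over the entries themselves instead of over a range of row numbers, so that missing labels never get looked up. The sketch below is only an illustration under assumptions: it uses a plain Python list with a made-up three-row sample, and weekly_minutes and records are hypothetical names. If df_new is a pandas Series, writing "for dc in df_new:" should walk the remaining values in the same way.

```python
def weekly_minutes(hours):
    """Sum the open minutes for one business, using the same arithmetic as my loop."""
    total = 0
    for span in hours.values():
        start, end = span.split('-')
        hhs, mms = (int(part) for part in start.split(':'))
        hhe, mme = (int(part) for part in end.split(':'))
        total += (hhe * 60 + mme) - (hhs * 60 + mms)
    return total


# Hypothetical sample; the middle entry is missing, like the None row above.
records = [
    {'Monday': '9:0-17:0', 'Tuesday': '9:0-17:0'},
    None,
    {'Saturday': '8:0-14:0'},
]

# Iterate over the entries directly and skip the missing ones,
# rather than indexing by a row number that may no longer exist.
totals = [weekly_minutes(r) for r in records if r is not None]
print(totals)  # [960, 360]
```

The arithmetic is unchanged from my loop, so a range such as '9:0-0:0' still contributes a negative number of minutes.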