I've seen similar questions posted but they're not exactly the same as what I've encountered. I am using Python 3.7 and Pandas 0.25.0.
Weirdly, if I download this zip file directly from this link, I am able to read it via pd.read_csv as follows:
   TeamId           TeamName       SubmissionDate    Score
0  688191  Sergey Mushinskiy  2017-05-24 12:20:34  0.06630
1  688203       DeepVoltaire  2017-05-24 12:25:03  0.06630
2  688237        RakeshNikam  2017-05-24 13:02:31  0.06512
However, if I pass the URL to pd.read_csv directly:
this_leaderboard_df = pd.read_csv('https://www.kaggle.com/c/6649/publicleaderboarddata.zip', compression='zip')
I get a BadZipFile error as follows. Why does this happen?
---------------------------------------------------------------------------
BadZipFile                                Traceback (most recent call last)
----> 1 this_leaderboard_df = pd.read_csv(this_leaderboard_link, compression='zip')
      2 this_leaderboard_df.head()
BadZipFile: File is not a zip file
pd.read_csv can't log in to this page, so it gets an HTML page with a login form instead of the zip file. – furas
You may need Selenium to control a web browser, log in to Kaggle, and click the link to download the file. – furas
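Following furas's comments, here is a minimal sketch of how one might confirm that the downloaded bytes really are a zip archive before handing them to pandas. The helper names `describe_payload` and `read_zipped_csv` are hypothetical, and the sketch assumes the archive holds a single CSV file. It does not perform the login step itself.

```python
import io
import zipfile


def describe_payload(payload: bytes) -> str:
    """Classify downloaded bytes as 'zip', 'html', or 'unknown'."""
    if zipfile.is_zipfile(io.BytesIO(payload)):
        return "zip"
    # A login page typically starts with a doctype or an <html> tag.
    head = payload[:512].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"


def read_zipped_csv(payload: bytes):
    """Parse the first member of an in-memory zip as CSV (hypothetical helper)."""
    kind = describe_payload(payload)
    if kind != "zip":
        # This is the situation behind BadZipFile: the server sent something else.
        raise ValueError(f"expected a zip archive, got {kind} content")

    import pandas as pd  # imported late so the check above works without pandas

    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        first_member = archive.namelist()[0]  # assumption: one CSV per archive
        with archive.open(first_member) as member:
            return pd.read_csv(member)
```

Here `payload` would be the raw response body from whatever authenticated download you end up using, for example a browser session driven by Selenium. If `describe_payload` returns 'html', the request most likely hit the login form rather than the file.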