I am trying to combine the values of different columns in a single data frame's row into a string, separated by a comma, so that I can create a custom SQL insert string to be executed on a MySQL database. I have 67 different columns, and I am trying to prevent writing code that addresses each column's name individually, mainly to maximize reusability of the code for different size dataframes. I could potentially have anywhere from 1 to 2000 rows to iterate through, with each row having an INSERT query.
For example, if my DataFrame includes the following:
RecDate WindDir WindSpeed OutdoorTemperature OutdoorHumidity
20160321 121 3 67.5 43.8
20160322 87 5 73.1 53.2
20160323 90 2 71.1 51.7
20160324 103 7 68.3 47.0
I am wanting to create a string for each row in the dataframe: INSERT INTO tablename VALUES (20160321, 121, 3, 67.5, 43.8) INSERT INTO tablename VALUES (20160322, 87, 5, 73.1, 53.2) INSERT INTO tablename VALUES (20160323, 90, 2, 71.1, 51.7) INSERT INTO tablename VALUES (20160324, 103, 7, 68.3, 47.0)
I have considered using the dataframe's to_sql() function, but was not able to get the code to work with my database structure.
So, my goal was to iterrate through each row, and manually creating the string in the parentheses, separated by a comma:
for index, row in df.iterrows():
print('INSERT INTO tablename VALUES (%s, %s, %s, %s, %s)' % (row['RecDate'], row['WindDir'], row['WindSpeed'], row['OutdoorTemperature'], row['OutdoorHumidity']))
To make my code "pythonic" and not as rigid, I tried to iterrate through each row, adding a comma between each column index:
for index, row in df.iterrows():
string = ''
for x in range(len(row)):
string += '%s, ' % row[x]
print('INSERT INTO tablename VALUES (%s)' % string)
I am routinely getting index errors and out of bound errors with the above code, and am not really sure what the correct route to go is. I'd appreciate an inspection of my code and thought process, and any recommendations on how I can improve the code. My goal is to be as efficient as possible, minimize the amount of code I have to write (especially when there's 67 columns!), but still make the code flexible for various uses, especially if the number of columns were to ever change.
Thank you!