I would like to import this file: https://www.data.gouv.fr/fr/datasets/elections-municipales-2014-resultats-1er-tour/

However, it has more columns than column names, and some rows are so long that the import sometimes wraps them onto the next row, filling the wrong columns. Please help me import it.

Thank you.

It will be very difficult to move forward without knowing more about the format and whether it follows any specification at all. A quick investigation shows that the number of semicolons per row varies wildly, with 28 and 39 being the most common counts. - ktiu
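A quick check of this kind could be sketched as follows. This is only an illustration in Python: the function name `separator_histogram` and the sample lines are made up, and the real file would be passed in as an open file handle using whatever encoding it actually has (not stated in this thread).

```python
from collections import Counter


def separator_histogram(lines, sep=";"):
    """Count how many lines contain each number of separators."""
    return Counter(line.rstrip("\r\n").count(sep) for line in lines)


# Hypothetical sample standing in for the real file.
sample = [
    "a;b;c\n",
    "1;2;3\n",
    "1;2;3;4;5\n",
]

hist = separator_histogram(sample)

# List the separator counts from most to least frequent.
for n_separators, n_lines in hist.most_common():
    print(n_separators, "separators:", n_lines, "line(s)")
```

Any iterable of strings works as input, so an open file can be passed directly without loading it into memory first.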
The data is a semicolon-delimited CSV. It has headers for the first 28 columns, but there are actually more columns than that. The extra columns follow a pattern: they come in groups of 11, and at first glance there are more than 1000 columns. - remsssteack
I see, that information is helpful! And yes, the file has 9907 lines, and the longest line has 138 semicolons (28 + 11 * 10). It looks like the task would be to (1) read the file once, truncating every line after the 28th semicolon, and (2) assemble a second dataset from the remaining columns. I imagine there is some information (an ID of some sort) in the first 28 columns that you will want to keep for each group of 11 columns? Which column would that be? - ktiu
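The two steps above might look roughly like this in Python. It is a sketch under assumptions, not the solution that was eventually used: the function name `split_ragged_rows` is hypothetical, the line index is used as a stand-in ID because the identifying column had not been settled at this point, and the defaults of 28 base fields and groups of 11 come from the comments above. The toy example uses smaller sizes so the result is easy to follow.

```python
import csv
import io


def split_ragged_rows(handle, n_base=28, group_size=11, sep=";"):
    """Split a ragged delimited file into a base table and a long table.

    Returns (header, base_rows, group_rows):
      header     - the first n_base names from the header line
      base_rows  - the first n_base fields of every data line
      group_rows - one row per complete group of group_size extra fields,
                   prefixed with the index of the data line it came from
    """
    reader = csv.reader(handle, delimiter=sep)
    header = next(reader)[:n_base]
    base_rows, group_rows = [], []
    for row_id, fields in enumerate(reader):
        base_rows.append(fields[:n_base])
        extra = fields[n_base:]
        # Step through the extra fields one complete group at a time;
        # a trailing incomplete group (e.g. one empty field) is dropped.
        for start in range(0, len(extra) - group_size + 1, group_size):
            group_rows.append([row_id] + extra[start:start + group_size])
    return header, base_rows, group_rows


# Toy data: 2 base columns and groups of 3 instead of 28 and 11.
sample = "dept;commune\nAin;Bourg;a;b;c;d;e;f\nAisne;Laon;g;h;i\n"
header, base, groups = split_ragged_rows(
    io.StringIO(sample), n_base=2, group_size=3
)
print(header)
print(base)
print(groups)
```

Because each group row carries the index of its source line, the two tables can be joined back together afterwards, or the index can be swapped for a real key from the base columns.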
Yes, but I need the entire dataset. A primary key for this dataset would be the combination of "libellé du département" and "libellé de la commune". But don't worry about it any more: someone found a fix for my problem on the RStudio Community website. Thank you for your help, though. I can link you to the post if you want to see his solution. It's a bit complicated for me, but he came up with the same idea. - remsssteack