I have been given an Excel spreadsheet: the column names are in the first row, garbage text is in the second row, and the actual data begins in the third row. I want to use the readxl
package to read this into a data frame, keeping the column names from the first row but discarding the second row.
Simply reading all the rows into a data frame and then deleting the first data row won't work, because the garbage in the second row of the Excel file doesn't match the data type of its column, so the column types get guessed wrongly.
I'd like a way to do this without manually editing the Excel file.
read_excel is quite robust when it comes to "nonsense" lines: it will still read the file, but any potential cleaning is up to you. – Maurits Evers
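
One possible approach to the layout described in the question is to read the file twice: once for the header row alone, and once for the data with the first two rows skipped. The following is a minimal sketch, not a definitive solution. The helper name read_skipping_second_row is hypothetical, and it assumes that every column has a non-empty name in the first row.

```r
library(readxl)

# Hypothetical helper: keep the names from row 1, drop row 2, read data from row 3.
read_skipping_second_row <- function(path, sheet = NULL) {
  # n_max = 0 reads no data rows, so this returns a zero-row tibble
  # that carries only the column names from the first row.
  header <- names(read_excel(path, sheet = sheet, n_max = 0))

  # skip = 2 skips the header row and the garbage row. Because the names
  # are supplied explicitly, the first row read is treated as data, and
  # column types are guessed from the real data only.
  read_excel(path, sheet = sheet, skip = 2, col_names = header)
}
```

Calling read_skipping_second_row("data.xlsx"), where the file name is a placeholder, should return a data frame with the original column names and no trace of the second row. Reading the file twice is slightly wasteful, but it avoids having to fix column types after the fact.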