
I am new to R and I have run into a problem. I have a folder with 50 csv files, each representing a city. I want to import each csv file into RStudio as an independent data frame, so that I can eventually plot all 50 cities in one time series plot.

There are four things I want to do to each csv file, and in the end I want these four actions automated so that they are applied to each of the 50 csv files.

  1. Skip the first 25 rows of the csv file

  2. Combine the Date and Time columns of each csv file

  3. Remove the rows where the cell in column 3 is empty

  4. Change the name of column 3 from "ug/m3" to "CO"

After skipping those rows, the first remaining row is the header.

I tried the code below on a single csv file to see if it would work. Everything works except for city[,3][!(is.na(city[,3]))].

city1 <- read.csv("path",
                        skip = 25)

city1$rtime <- strptime(paste(city1$Date, city1$Time), "%m/%d/%Y %H:%M")

colnames(city1)[3] <- "CO"

city[,3][!(is.na(city[,3]))] ## side note: help with this would be appreciated; I was wondering especially whether something goes before the comma.

I am not sure how to combine everything in an efficient manner in a function.

I would appreciate suggestions on an efficient way to perform the 4 actions (in a function, maybe) on each csv file while importing them into R.


1 Answer


Use this function for each csv you want to read:

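read_combine <- function(yourfile) {
  # Skip the 25 lines above the header row
  file <- read.csv(yourfile, skip = 25)
  # Combine Date and Time; as.POSIXct stores cleanly as a data frame column
  file$rtime <- as.POSIXct(paste(file$Date, file$Time), format = "%m/%d/%Y %H:%M")
  colnames(file)[3] <- "CO"
  # Keep only the rows where CO is not missing (the row condition goes before the comma)
  file <- file[!is.na(file$CO), ]
  # Return the cleaned data frame
  file
}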
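
On the side note in the question: in city[i, j], whatever goes before the comma selects rows and whatever goes after it selects columns. The expression city[,3][!(is.na(city[,3]))] first pulls out column 3 as a vector and then filters that vector, so the data frame itself never changes. A minimal sketch of the difference, using a small made-up data frame (the values are hypothetical, not from the question):

```r
# Made-up data purely for illustration
city <- data.frame(
  Date = c("01/02/2020", "01/02/2020", "01/02/2020"),
  Time = c("13:00", "14:00", "15:00"),
  CO   = c(0.5, NA, 0.7)
)

# Vector subsetting: returns only the non-missing CO values;
# the data frame itself is left untouched
co_values <- city[, 3][!is.na(city[, 3])]

# Row subsetting: the logical condition goes before the comma,
# and leaving the part after the comma empty keeps every column
city_clean <- city[!is.na(city[, 3]), ]
```

Note that the result has to be assigned (here to city_clean), otherwise it is only printed and then lost.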

The yourfile argument must be the path to the csv file, given as a string.
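
To run the four steps on all 50 files without calling the function by hand, one option is list.files() together with lapply(). The following is only a sketch under some assumptions: the names read_city and read_all_cities and the folder name "city_csvs" are placeholders I made up, and I assume every file has the same layout as the one described in the question.

```r
# Hypothetical helper: applies the four steps from the question to one file
read_city <- function(path) {
  # Step 1: skip the 25 lines above the header;
  # na.strings makes empty cells NA even if the column is read as text
  city <- read.csv(path, skip = 25, na.strings = c("NA", ""))
  # Step 2: combine Date and Time into one date-time column
  city$rtime <- as.POSIXct(paste(city$Date, city$Time),
                           format = "%m/%d/%Y %H:%M")
  # Step 4: rename column 3 (done by position, so the original name does not matter)
  colnames(city)[3] <- "CO"
  # Step 3: drop the rows where column 3 is missing
  city[!is.na(city[[3]]), ]
}

# Hypothetical helper: reads every csv in a folder into a named list
read_all_cities <- function(folder) {
  paths <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
  cities <- lapply(paths, read_city)
  # Name each data frame after its file, without the extension
  names(cities) <- tools::file_path_sans_ext(basename(paths))
  cities
}

# Usage ("city_csvs" is a placeholder folder name):
# cities <- read_all_cities("city_csvs")
# head(cities[["city1"]])
```

This returns a named list of data frames rather than 50 separate variables, which is usually easier to loop over when plotting all cities together.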