0
votes

I wanted to read all the files and load multiple sheets from an excel file to R, use the xlsx package.
I pasted the code below:

filelist <- list.files(pattern = "\\.xls") # list all the xlsx files
library(xlsx)
allxlsx.files <- list()  # create a list to populate with xlsx data 

for (file in filelist) {

         wb <- loadWorkbook(file)
         sheets <- getSheets(wb)
         sheet <- sheets[['_']] # get sheets with field section reading
         res <- readColumns(sheet, 1, 2, 114, 120, colClasses=c("character", "numeric"))
}

traceback() 1: readColumns(sheet, 1, 2, 114, 120, colClasses = c("character", "numeric")) at #6

Can someone enlighten me how to proceed?

1
Welcome to SO!, could you update you post using edit option to include output of traceback() - Silence Dogood
The problem is on the sheet, it will display as NULL. What is the appropriate way of coding? I want to get all sheets containing "_" on it. There could be 4 sheets or 3 sheets per file. - J.M.

1 Answers

0
votes

I think you are subsetting the sheets incorrectly. You can use grep on names of sheets to get all the file names with "_".

I have created and used a single xlsx file with hypothetical data having 5 sheets with names as below for demonstration.

> names(sheets)
[1] "Sheet_1" "Sheet2"  "Sheet_3" "Sheet4"  "sheet_4"

Getting the required sheets can be done using

sheet = sheets[grep("_",names(sheets))]

You can check it by using

> names(sheet)
[1] "Sheet_1" "Sheet_3" "sheet_4"

So your final code will look like following

filelist <-  "sheeetLoadTrial1.xlsx" # single xlsx files
library(xlsx)
allxlsx.sheets <- list()  # create a list to populate with xlsx sheet data 

for (file in filelist) {

        wb <- loadWorkbook(file)
        sheets <- getSheets(wb)
        sheet = sheets[grep("_",names(sheets))]
        for(i in c(1:length(sheet))){
                res <- readColumns(sheet[[i]], 1, 2,1,8,header = F)
                allxlsx.sheets[[i]] = res
        }
        names(allxlsx.sheets) <- names(sheet)
}

after this your final required list will be

> allxlsx.sheets
$Sheet_1
  X1 X2
1  1  2
2  2  3
3  2  4
4  2  5
5  2  6
6  2  7
7  2  8
8  2  9

$Sheet_3
  X1 X2
1  1  2
2  2  3
3  2  4
4  2  5
5  2  6
6  2  7
7  2  8
8  2  9

$sheet_4
  X1 X2
1  1  2
2  2  3
3  2  4
4  2  5
5  2  6
6  2  7
7  2  8
8  2  9

For more than one file you can just append the allxlsx.sheets to allxlsx.files list.