I'm trying to write a script that simplifies the process of producing a clean corpus from a vector or data frame for text mining and NLP. However, my script produces an error when I run it. My script is as follows:
quick_clean <- function(data, Vector = TRUE, removeNumbers = TRUE, removePunctuation = TRUE,
                        stop.words = NULL, ...) {
  if(Vector == TRUE) {
    source <- VectorSource(data)
  } else {
    source <- DataframeSource(data)
  }
  corp <- VCorpus(source)
  corp <- tm_map(corp, stripWhitespace)
  if(removePunctuation == TRUE) {
    corp <- tm_map(corp, removePunctuation)
  }
  if(removeNumbers == TRUE) {
    corp <- tm_map(corp, removeNumbers)
  }
  if(is.null(stop.words)) {
    return(corp)
  } else {
    corp <- tm_map(corp, removeWords, c(stopwords("en"), stop.words))
  }
  corp
}
When I run it, I get the following error:
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'FUN' of mode 'function' was not found
I ran the traceback, but I'm not really sure how to use this information:
7. get(as.character(FUN), mode = "function", envir = envir)
6. match.fun(FUN)
5. lapply(X, FUN, ...)
4. tm_parLapply(content(x), FUN, ...)
3. tm_map.VCorpus(corp, removePunctuation)
2. tm_map(corp, removePunctuation)
1. quick_clean(swift_vec)
I also ran Debug and got the following. Again, I'm not sure how to use this info:
Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'FUN' of mode 'function' was not found
Called from: get(as.character(FUN), mode = "function", envir = envir)
Browse[1]>
What am I doing wrong here?
You are using source as a variable. It's a problem because source is also a function name in R. Change that to something else. - Clock Slave
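Renaming source is a reasonable cleanup, but the traceback points at a different line: frames 2 and 3 show the failure inside tm_map(corp, removePunctuation). Inside quick_clean, the logical arguments removePunctuation and removeNumbers mask tm's functions of the same name, so tm_map receives TRUE rather than a function, and match.fun() then fails to find a function called FUN. Below is a minimal sketch of one possible fix, assuming the tm package is installed. The argument names remove_punct and remove_nums and the variable src are illustrative choices, not anything required by tm.

```r
library(tm)

# Sketch of quick_clean with the logical flags renamed (remove_nums and
# remove_punct are illustrative names) so they no longer mask tm's functions.
quick_clean <- function(data, Vector = TRUE, remove_nums = TRUE,
                        remove_punct = TRUE, stop.words = NULL, ...) {
  # 'src' instead of 'source', as the comment above suggests
  src <- if (isTRUE(Vector)) VectorSource(data) else DataframeSource(data)
  corp <- VCorpus(src)
  corp <- tm_map(corp, stripWhitespace)
  if (isTRUE(remove_punct)) {
    # The name now resolves to tm's removePunctuation function, not a logical
    corp <- tm_map(corp, removePunctuation)
  }
  if (isTRUE(remove_nums)) {
    corp <- tm_map(corp, removeNumbers)
  }
  if (!is.null(stop.words)) {
    corp <- tm_map(corp, removeWords, c(stopwords("en"), stop.words))
  }
  corp
}

# Example call on a small character vector
cleaned <- quick_clean(c("Hello, world 123", "Text   mining: 42 docs!"))
content(cleaned[[1]])
```

If you would rather keep the original argument names, calling the functions with an explicit namespace inside the body, as in tm_map(corp, tm::removePunctuation), should also avoid the clash.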