I have a data frame of paragraphs on which I would like to perform Latent Dirichlet allocation (LDA). To do this I need to create a term-document matrix. This example code reproduces the error:
library(qdap)
library(topicmodels)
remove(list=ls())
doc <- c(1,2,3,4)
text <- c("The Quick Brown Fox Jumped Over The Lazy Dog",
"The Cow Jumped Over The Moon",
"Moo, Moo, Brown Cow Have You Any Milk",
"The Fox Went Out One Moonshiny Night")
works.df <- data.frame(doc,text)
works.tdm <- as.tdm(text.var = works.df$text, grouping.var = works.df$doc)
works.lda <- LDA(works.tdm, k = 2, control = list(seed = 1234))
The as.tdm call fails with the following error:
works.tdm <- as.tdm(text.var=works.df$text, grouping.var=works.df$doc)
Error in .TermDocumentMatrix(x, weighting) :
  argument "weighting" is missing, with no default
I expected to get a sparse matrix in which, for example, the term "the" appears in documents 1 (frequency 2), 2 (frequency 2), and 4 (frequency 1); the term "cow" appears in documents 2 and 3 (frequency 1 in each); and so on.
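To make the expected result concrete, here is a rough base-R sketch of the counts I have in mind. It does not use qdap or tm; the tokenization (lowercase, strip punctuation, split on whitespace) and the names tokens, long and expected.tdm are my own simplifications for illustration, and the result is a dense table rather than a sparse matrix.

```r
text <- c("The Quick Brown Fox Jumped Over The Lazy Dog",
          "The Cow Jumped Over The Moon",
          "Moo, Moo, Brown Cow Have You Any Milk",
          "The Fox Went Out One Moonshiny Night")

# Simplified tokenization (an assumption, not what qdap does internally):
# lowercase, remove punctuation, split on whitespace
tokens <- strsplit(gsub("[[:punct:]]", "", tolower(text)), "\\s+")

# Long format: one row per token, tagged with its document number
long <- data.frame(term = unlist(tokens),
                   doc  = rep(seq_along(tokens), lengths(tokens)))

# Cross-tabulate into a terms-by-documents count table
expected.tdm <- table(long$term, long$doc)

# Rows for the two terms mentioned above
expected.tdm[c("the", "cow"), ]
```

In this table the row for "the" holds 2, 2, 0, 1 across documents 1 to 4, and the row for "cow" holds 0, 1, 1, 0, which is the kind of structure I was hoping as.tdm would return.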
Can anyone advise on what is missing, or whether there is a better way to achieve my task? Thanks.