I've been using R's tm package with a good deal of success on classification problems. I know how to find the most frequent terms across the entire corpus with findFreqTerms(), but I don't see anything in the documentation that returns the most frequent term in each individual document of the corpus (after I've stemmed and removed stopwords, but before I remove sparse terms). I've tried using apply() with max, but that gives me the maximum count of any term in each document, not the name of the term itself.
library(tm)
data("crude")
corpus <- tm_map(crude, removePunctuation)
corpus <- tm_map(corpus, stripWhitespace)
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument)
dtm <- DocumentTermMatrix(corpus)
# max over each row gives the highest count per document, not the term name
maxterms <- apply(dtm, 1, max)
maxterms
127 144 191 194 211 236 237 242 246 248 273 349 352
5 13 2 3 3 10 8 3 7 9 9 4 5
353 368 489 502 543 704 708
4 4 4 5 5 9 4
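For reference, the direction I've been poking at is swapping which.max in for max so I can index back into the term names. This is just a sketch (it coerces the DocumentTermMatrix to a dense matrix, so it assumes the corpus is small enough for that, and ties go to whichever term comes first), and I'm not sure it's the idiomatic tm way:

m <- as.matrix(dtm)                         # dense matrix: rows = documents, columns = terms
top <- colnames(m)[apply(m, 1, which.max)]  # term with the highest count in each document
names(top) <- rownames(m)                   # label each term with its document ID
top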
Thoughts?