1
votes

I'm trying to run a keyness analysis. Everything worked and then, for an unknown reason, it started giving me an error. I'm using data_corpus_inaugural, the corpus of US presidents' inaugural addresses that ships with the quanteda package.

My code:

> corpus_pres <- corpus_subset(data_corpus_inaugural, 
+                             President %in% c("Obama", "Trump"))
> dtm_pres <- dfm(corpus_pres, groups = "President", 
+                remove = stopwords("english"), remove_punct = TRUE)
Error: groups must have length ndoc(x)
In addition: Warning messages:
1: 'dfm.corpus()' is deprecated. Use 'tokens()' first. 
2: '...' should not be used for tokens() arguments; use 'tokens()' first. 
3: 'groups' is deprecated; use dfm_group() instead 
> 
2
Is it possible that this is some kind of quanteda issue? Even though quanteda is loaded, it cannot find textstat_keyness:
> keyness = textstat_keyness(dtm_pres, target = "Trump")
Error in textstat_keyness(dtm_pres, target = "Trump") : could not find function "textstat_keyness"
- Maayan Klimenko Feinstein
See github.com/quanteda/quanteda/blob/master/…. It should be groups = President (unquoted) in quanteda v3. - Ken Benoit

2 Answers

0
votes

In quanteda v3 "dfm() constructs a document-feature matrix from a tokens object" - https://tutorials.quanteda.io/basic-operations/dfm/dfm/

Try this:

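# tokenize first (required in quanteda v3), starting from the question's corpus_pres object
toks_pres <- tokens(corpus_pres, remove_punct = TRUE) %>% 
    tokens_remove(pattern = stopwords("en")) %>%
    tokens_group(groups = President)

# build the dfm from the grouped tokens: one document per president
pres_dfm <- dfm(toks_pres)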
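
Regarding the "could not find function" error from the comments: in quanteda v3 the textstat_* functions were moved into the separate quanteda.textstats package, so it has to be installed and loaded as well. Below is a minimal end-to-end sketch of the keyness step, assuming both packages are installed and that your data_corpus_inaugural has the President docvar as in the question. I wrote it without pipes so it does not depend on magrittr.

```r
library(quanteda)
library(quanteda.textstats)  # textstat_keyness() lives here since quanteda v3

# keep only the two presidents from the question
corpus_pres <- corpus_subset(data_corpus_inaugural,
                             President %in% c("Obama", "Trump"))

# tokenize, drop stopwords, then merge documents by president
toks_pres <- tokens(corpus_pres, remove_punct = TRUE)
toks_pres <- tokens_remove(toks_pres, pattern = stopwords("en"))
toks_pres <- tokens_group(toks_pres, groups = President)

pres_dfm <- dfm(toks_pres)

# after grouping, the document names are the president names,
# so "Trump" can be used directly as the target document
keyness <- textstat_keyness(pres_dfm, target = "Trump")
head(keyness)
```

The result is a data frame with one row per feature, which you can pass to textplot_keyness() from quanteda.textplots if you want a plot.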
0
votes

I came across the same problem when analyzing Twitter accounts, and this code works for me. It lets you search for terms across accounts.

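# group the corpus by account (users is my data frame of tweets)
twcorpus <- corpus(users) %>%
    corpus_group(groups = interaction(user_username))

# visualize where the term occurs in each account with textplot_xray()
# (in quanteda v3 it is in quanteda.textplots, and kwic() expects tokens)
textplot_xray(kwic(tokens(twcorpus), "helsin*"), scale = "relative")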