I am having a problem running a script I wrote some time ago. Back then, calling quanteda::corpus() on a readtext object returned an object of class "corpus" "list". Now the same script returns an object of class "corpus" "character", and this breaks the code that follows. What could be the reason for this, and how can I fix it?
Here is the script:
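library(readtext)
library(quanteda)
library(stringi)  # provides stri_sub()
txt <- readtext("C:/Users/aerol/Desktop/txt_sample")
corpus_txt <- corpus(txt) %>%
  corpus_reshape(to = "sentences")
# Store the original document name and the year taken from it as docvars
docvars(corpus_txt, "Treaty") <- corpus_txt$documents$`_document`
docvars(corpus_txt, "Year") <- as.integer(stri_sub(corpus_txt$documents$`_document`, -9, -6))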
The files are international treaties. All the file names follow the same format: they contain the name of the treaty and the year it was signed. The script extracts both and stores them as document variables.
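As an illustration of the year extraction, here is a minimal base R sketch. The file name is hypothetical, and the offsets only line up if the name ends in a four-digit year followed by a five-character extension, which is an assumption rather than something stated above. The `treaty` line is likewise only a guess at how a clean name could be cut out; the script itself stores the full document name.

```r
# Hypothetical file name; the real names and extension may differ.
doc_name <- "Treaty_of_Example_1998.docx"

# Take the 9th-from-last through 6th-from-last characters, i.e. the same
# span as stringi::stri_sub(doc_name, -9, -6), written in base R.
n <- nchar(doc_name)
year <- as.integer(substr(doc_name, n - 8, n - 5))

# Assumption: everything before the "_YYYY.ext" suffix is the treaty name.
treaty <- substr(doc_name, 1, n - 10)

print(year)
print(treaty)
```

With a different extension length the offsets -9 and -6 would have to shift accordingly.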
Back then, the class of corpus_txt was "corpus" "list":
> class(corpus_txt)
[1] "corpus" "list"
But now:
> class(corpus_txt)
[1] "corpus" "character"
> packageVersion("quanteda")
[1] ‘2.1.2’
As a result, I can no longer extract information from the corpus the way I did before. I have been working on this since last October, so I should have been using the same version all along.
Many thanks in advance.