Assume we have a tf-idf weighted dfm from a corpus of 10K rather small documents.
What's the quanteda way of extracting the top features, i.e., the max tf-idf values, by document?
I do want the entire corpus to be the reference when computing tf-idf. Something along the lines of
topfeatures(some_dfm_tf_idf, n = 3, decreasing = TRUE, groups = "id")
returns an appropriate list. Yet it takes quite some time for something that is basically sorted out already at this point. Given that quanteda has performed so well in everything I have done so far, I suspect I might be doing something wrong here.
Maybe this is somewhat related to this discussion on GitHub (https://github.com/quanteda/quanteda/issues/1646) and the example workaround that @Astelix shows there.