I am currently using the tm package to do some text mining. I want to export my document-term matrix (DTM) as a data frame with my corpus metadata (ID variable, etc.) attached. Here is my current workflow:
1. Import data set
2. Convert to corpus
3. Basic cleaning
4. Create TF-IDF document-term matrix
5. Transform the DTM into a data frame
6. Export the data frame with corpus metadata
Number 5 is where I am getting stuck. I feel like this should definitely be possible with the package, but I can't find any documentation. Does the metadata get lost when creating a DTM using tm?
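
For reference, here is a minimal sketch of the workflow above with one possible way to re-attach the metadata. It rests on several assumptions that are not part of my real data: the toy data frame `df`, its `source` metadata column, and the output path are all hypothetical, and it assumes a tm version (0.7-2 or later, I believe) in which `DataframeSource` expects `doc_id` and `text` columns and treats the remaining columns as document-level metadata. As far as I can tell, the DTM itself only keeps the document IDs as row names, so the sketch joins the metadata back on by ID rather than expecting the DTM to carry it.

```r
library(tm)

# Hypothetical input. DataframeSource expects columns named doc_id and text;
# any other columns (here: source) become document-level metadata.
df <- data.frame(
  doc_id = c("a1", "a2", "a3"),
  text   = c("Text mining is fun.",
             "Mining text with the tm package.",
             "Metadata should survive export."),
  source = c("blog", "forum", "blog"),
  stringsAsFactors = FALSE
)

# Steps 1-3: build the corpus and do some basic cleaning.
corpus <- VCorpus(DataframeSource(df))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)

# Step 4: TF-IDF weighted document-term matrix.
dtm <- DocumentTermMatrix(corpus, control = list(weighting = weightTfIdf))

# Step 5: dense data frame, one row per document, one column per term.
# The document IDs are the row names of the DTM, so copy them into a column.
dtm_df <- as.data.frame(as.matrix(dtm), stringsAsFactors = FALSE)
dtm_df$doc_id <- Docs(dtm)

# Document-level metadata as a data frame, plus the per-document IDs.
meta_df <- meta(corpus)
meta_df$doc_id <- unlist(meta(corpus, "id", type = "local"))

# Step 6: join metadata and term weights on the ID, then export.
out <- merge(meta_df, dtm_df, by = "doc_id")
out_path <- file.path(tempdir(), "dtm_with_meta.csv")  # hypothetical location
write.csv(out, out_path, row.names = FALSE)
```

One caveat with this sketch: if a term has the same name as a metadata column (for example a term `source`), `merge` will add `.x`/`.y` suffixes, so prefixing the metadata columns before merging may be safer. Converting with `as.matrix` also produces a dense matrix, which could be a problem for a large corpus.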