I am trying to convert an annotated NLP model of size 1.2 GB to a data frame. I am using the udpipe package for natural language processing in R with the following code:
# Additional Topic Models
library(udpipe)
# annotate and tokenize corpus
model <- udpipe_download_model(language = "english")
udmodel_english <- udpipe_load_model(model$file_model)
s <- udpipe_annotate(udmodel_english, cleaned_text_NLP)
# attempt to raise the memory limit (this option only sets the JVM heap for rJava-based packages)
options(java.parameters = "-Xmx32720m")
x <- data.frame(s)
Note that I have 32 GB of RAM and allocated all available memory to R to run the code. I also tried deleting large objects in the R environment that are not needed to run the above code. R still cannot allocate enough memory for the task, and I get the following error message:
Error in strsplit(x$conllu, "\n") :
could not allocate memory (4095 Mb) in C function 'R_AllocStringBuffer'
My question is twofold:
- What does the above error message mean?
- What workarounds are available to fix this issue?
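For context on the second point, one direction that might apply is annotating in batches. The error appears to come from the conversion step, which calls strsplit() on the entire CoNLL-U output held as one string. R limits a single character string to 2^31 - 1 bytes, so this can fail no matter how much RAM is free. The sketch below is untested against my data. It assumes cleaned_text_NLP is a character vector with one element per document; annotate_in_chunks and chunk_size are hypothetical names, not part of udpipe.

```r
# Hypothetical helper (not part of udpipe): annotate `texts` in batches of
# `chunk_size` documents so that no single CoNLL-U string becomes too large.
# `annotate_fn` must accept (texts, doc_id) and return a data frame.
annotate_in_chunks <- function(texts, annotate_fn, chunk_size = 1000) {
  if (length(texts) == 0) return(data.frame())
  # Assign each document to a batch: 1, 1, ..., 2, 2, ...
  batch_id <- ceiling(seq_along(texts) / chunk_size)
  batches <- split(seq_along(texts), batch_id)
  # Annotate each batch separately, keeping document ids unique across batches
  pieces <- lapply(unname(batches), function(idx) {
    annotate_fn(texts[idx], doc_id = paste0("doc", idx))
  })
  # Stack the per-batch data frames into one result
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Intended use with udpipe (not run here; needs the package and a loaded model):
# x <- annotate_in_chunks(
#   cleaned_text_NLP,
#   function(txt, doc_id) {
#     as.data.frame(udpipe_annotate(udmodel_english, x = txt, doc_id = doc_id))
#   }
# )
```

If cleaned_text_NLP is instead a single very long string, it would first need to be split into separate documents, since batching only helps when each batch produces a reasonably small annotation.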