I'm trying to load one of the FastText pre-trained models, which comes as a .bin file. The .bin file is 2.8 GB, and I have 8 GB of RAM plus an 8 GB swap file. Unfortunately, the model starts loading, occupies almost 15 GB, and then breaks with the following error:

Process finished with exit code 137 (interrupted by signal 9: SIGKILL)

By observing the system monitor, I can see that RAM and swap are fully occupied, so I think it breaks because it is out of memory.

I'm trying to load the file using the gensim wrapper for FastText:

from gensim.models.wrappers import FastText
model = FastText.load_fasttext_format('../model/java_ftskip_dim100_ws5')


My questions are the following:

1) Is there any way to fit this model in the current memory of my system?

2) Is it possible to reduce the size of this model? I tried quantization with the following command:

./fasttext quantize -output java_ftskip_dim100_ws5 -input unused_argument.txt

And I'm getting the following error:

terminate called after throwing an instance of 'std::invalid_argument'
  what():  For now we only support quantization of supervised models
Aborted (core dumped)

I would really appreciate your help!

1 Answer


Some expansion beyond the size-on-disk is expected – especially once you start performing operations like most_similar(). But, if you're truly getting that error from running a mere 2 lines to load the model, something else may be wrong.

You may want to try the non-wrappers gensim FastText implementation – from gensim.models import FastText – in the latest gensim, just in case there are extra memory issues with the version you're using.
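A minimal loading sketch, assuming a recent gensim (3.7 or newer) where the native loader lives in gensim.models.fasttext, and assuming your model file carries the usual .bin extension:

# Hedged sketch: load the Facebook-format .bin with gensim's native
# FastText code instead of the old wrapper. load_facebook_model() exists
# in gensim >= 3.7; older releases use FastText.load_fasttext_format().
from gensim.models.fasttext import load_facebook_model

model = load_facebook_model('../model/java_ftskip_dim100_ws5.bin')

# In-vocabulary and out-of-vocabulary lookups both go through model.wv
print(model.wv['java'][:5])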

(You may also want to check if using the original, compiled Facebook FastText implementation can load the file, and shows similar memory usage.)
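If you have the official fasttext Python bindings installed (pip install fasttext), a quick sanity check of the file under Facebook's own implementation could look like this; the path and the probe word are just assumptions:

# Hedged sketch: confirm the .bin loads under Facebook's fastText and
# watch memory usage while it does. Requires the `fasttext` pip package.
import fasttext

ft = fasttext.load_model('../model/java_ftskip_dim100_ws5.bin')
print(ft.get_dimension())               # should report 100 for this model
print(ft.get_word_vector('java')[:5])   # 'java' is only an example probe word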

I'm not aware of any straightforward ways to shrink a preexisting FastText model. (If you were training the model from your own data, there are a number of pre-training initialization options that could result in a smaller model. But those limits are not meaningful to apply to an already-trained model.)
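For illustration only, a hedged sketch of the kind of from-scratch training options that keep a gensim FastText model small; parameter names follow gensim 4's constructor (older versions use size instead of vector_size), and my_sentences is a hypothetical tokenized corpus:

# Hedged sketch: smaller-by-construction FastText training in gensim.
# None of this shrinks an already-trained .bin.
from gensim.models import FastText

model = FastText(
    vector_size=50,     # fewer dimensions than the usual 100-300
    bucket=500_000,     # fewer hashed char-ngram buckets than the 2M default
    min_count=10,       # drop rare words from the vocabulary
    min_n=4, max_n=5,   # a narrower ngram range stores fewer ngrams
)
model.build_vocab(corpus_iterable=my_sentences)   # my_sentences: hypothetical corpus
model.train(corpus_iterable=my_sentences,
            total_examples=model.corpus_count, epochs=5)
model.save('small_ft.model')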

As you've seen, Facebook has only implemented the 'quantize' trick for supervised models – and even if that transformation could be applied to the unsupervised (skipgram/CBOW) modes, the supporting gensim code would also then need extra updates to understand the changed models.

If you could load it once, in the full (non-wrappers) gensim implementation, it might be practical to truncate all included vectors to a lower dimensionality for significant RAM savings, then re-save the model. But given that these are already just-100-dimensional vectors, that might cost a lot in expressiveness.
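As a rough illustration of that last idea, a heavily hedged sketch; the internal attribute names (vectors, vectors_vocab, vectors_ngrams, vector_size) match recent gensim FastText objects but vary between versions, so inspect the loaded model before trusting any of this:

# Hedged sketch: truncate every stored vector array to fewer dimensions
# and re-save in gensim's native format. Attribute names are assumptions
# about gensim internals; verify them on your release.
from gensim.models.fasttext import load_facebook_model

NEW_DIM = 50  # hypothetical target; the original model is 100-dimensional

model = load_facebook_model('../model/java_ftskip_dim100_ws5.bin')
kv = model.wv
kv.vectors = kv.vectors[:, :NEW_DIM].copy()               # full-word vectors
kv.vectors_vocab = kv.vectors_vocab[:, :NEW_DIM].copy()   # per-word components
kv.vectors_ngrams = kv.vectors_ngrams[:, :NEW_DIM].copy() # char-ngram buckets
kv.vector_size = NEW_DIM
model.vector_size = NEW_DIM

model.save('java_ftskip_dim50_truncated.model')  # reload later with FastText.load()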