I'm working with a large text dataset for content classification. I've implemented the DistilBERT model with the tokenizer returned by DistilBertTokenizer.from_pretrained(). Tokenizing my text data is taking incredibly long, roughly 7 minutes for just 14k records, and I believe that's because it runs on my CPU. My setup looks roughly like the sketch below.
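For reference, this is approximately what my tokenization code looks like (the checkpoint name and the `texts` list are placeholders, not my exact values):

```python
from transformers import DistilBertTokenizer

# Placeholder checkpoint; I load my own fine-tuned/pretrained variant
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")

# texts is a list of ~14k strings from my dataset
texts = ["example record one", "example record two"]

# Tokenize each record with truncation/padding for the classifier
encodings = tokenizer(texts, truncation=True, padding=True)
```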
Is there any way to force the tokenizer to run on my GPU?