I have a document called words, where each line contains a single word. I want to turn each word into a list of its constituent characters, which I do with list(x), where x is the word.
This is what I am doing now, but I want a way to parallelize it:
split = []
with open('wordprob.txt', 'rt') as lines:
    for line in lines:
        # strip the trailing newline so it doesn't end up as a character
        split.append(list(line.rstrip('\n')))
I am using this approach so that I do not have to load the entire file (over 3 GB) into memory. When I tried parallelizing it by loading the whole file first, my memory usage went over 16 GB.
How can I parallelize it without loading the file into memory, just like in the loop above?
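From the multiprocessing docs I believe Pool.imap consumes its input iterable lazily, so I'm imagining something like this untested sketch (the worker function name and the chunksize are just my guesses) that keeps streaming the file line by line while farming the splitting out to worker processes:

from multiprocessing import Pool

def split_word(line):
    # strip the trailing newline before splitting the word into characters
    return list(line.rstrip('\n'))

if __name__ == '__main__':
    split = []
    with open('wordprob.txt', 'rt') as lines, Pool() as pool:
        # imap pulls lines from the file iterator in chunks as the workers
        # need them, so the whole file is never read into memory at once
        for chars in pool.imap(split_word, lines, chunksize=10000):
            split.append(chars)

Is that roughly the right idea, or does imap still end up buffering the whole file somewhere?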
Thanks!
EDIT: It was pointed out below that the resulting list will itself take up a lot of memory. Instead, how would I write each list of characters (originally from a single word) as a space-delimited string on its own line of a new file? Again, I'm looking for a parallelized solution.
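For this edited goal I'm picturing something along these lines (again untested; the output filename and chunksize are placeholders I made up), where the workers build the space-delimited strings and the main process streams them straight to the new file, so neither the input nor the output is ever held in memory all at once:

from multiprocessing import Pool

def to_spaced(line):
    # turn 'word\n' into 'w o r d'
    return ' '.join(line.rstrip('\n'))

if __name__ == '__main__':
    with open('wordprob.txt', 'rt') as src, \
         open('wordprob_split.txt', 'wt') as dst, \
         Pool() as pool:
        # imap preserves the input order and streams lines through the workers
        for out_line in pool.imap(to_spaced, src, chunksize=10000):
            dst.write(out_line + '\n')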