I have created a dictionary in python and dumped into pickle. Its size went to 300MB. Now, I want to load the same pickle.
output = open('myfile.pkl', 'rb')
mydict = pickle.load(output)
Loading this pickle takes around 15 seconds. How can I reduce this time?
Hardware Specification: Ubuntu 14.04, 4GB RAM
The code bellow shows how much time takes to dump or load a file using json, pickle, cPickle.
After dumping, file size would be around 300MB.
import json, pickle, cPickle
import os, timeit
import json
mydict= {all values to be added}
def dump_json():
output = open('myfile1.json', 'wb')
json.dump(mydict, output)
output.close()
def dump_pickle():
output = open('myfile2.pkl', 'wb')
pickle.dump(mydict, output,protocol=cPickle.HIGHEST_PROTOCOL)
output.close()
def dump_cpickle():
output = open('myfile3.pkl', 'wb')
cPickle.dump(mydict, output,protocol=cPickle.HIGHEST_PROTOCOL)
output.close()
def load_json():
output = open('myfile1.json', 'rb')
mydict = json.load(output)
output.close()
def load_pickle():
output = open('myfile2.pkl', 'rb')
mydict = pickle.load(output)
output.close()
def load_cpickle():
output = open('myfile3.pkl', 'rb')
mydict = pickle.load(output)
output.close()
if __name__ == '__main__':
print "Json dump: "
t = timeit.Timer(stmt="pickle_wr.dump_json()", setup="import pickle_wr")
print t.timeit(1),'\n'
print "Pickle dump: "
t = timeit.Timer(stmt="pickle_wr.dump_pickle()", setup="import pickle_wr")
print t.timeit(1),'\n'
print "cPickle dump: "
t = timeit.Timer(stmt="pickle_wr.dump_cpickle()", setup="import pickle_wr")
print t.timeit(1),'\n'
print "Json load: "
t = timeit.Timer(stmt="pickle_wr.load_json()", setup="import pickle_wr")
print t.timeit(1),'\n'
print "pickle load: "
t = timeit.Timer(stmt="pickle_wr.load_pickle()", setup="import pickle_wr")
print t.timeit(1),'\n'
print "cPickle load: "
t = timeit.Timer(stmt="pickle_wr.load_cpickle()", setup="import pickle_wr")
print t.timeit(1),'\n'
Output :
Json dump:
42.5809804916
Pickle dump:
52.87407804489
cPickle dump:
1.1903790187836
Json load:
12.240660209656
pickle load:
24.48748306274
cPickle load:
24.4888298893
I have seen that cPickle takes less time to dump and load but loading a file still takes a long time.
cPickle
? If not, please try it. You can just use it as a drop-in replacement. – Carstenpickle.HIGHEST_PROTOCOL
(or-1
). This will use a more compact binary-mode data format than the default ASCII-based one. – martineau