python - How to save and load numpy.array() data properly?

Question

I wonder, how to save and load numpy.array data properly. Currently I'm using the numpy.savetxt() method. For example, if I got an array markers, which looks like this:

enter image description here

I try to save it by the use of:

numpy.savetxt('markers.txt', markers)

In other script I try to open previously saved file:

markers = np.fromfile("markers.txt")

And that's what I get...

enter image description here

Saved data first looks like this:

0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00
0.000000000000000000e+00

But when I save just loaded data by the use of the same method, ie. numpy.savetxt() it looks like this:

1.398043286095131769e-76
1.398043286095288860e-76
1.396426376485745879e-76
1.398043286055061908e-76
1.398043286095288860e-76
1.182950697433698368e-76
1.398043275797188953e-76
1.398043286095288860e-76
1.210894289234927752e-99
1.398040649781712473e-76

What am I doing wrong? PS there are no other "backstage" operation which I perform. Just saving and loading, and that's what I get. Thank you in advance.

What's the output of the text file? Why not just write to a CSV file? — user554546
Do you need to save and load as human-readable text files? It will be faster (and the files will be more compact) if you save/load binary files using np.save() and np.load(). — ali_m
Thank you for your advice. It helped. However, can you explain why it is what it is, and if there is any way to allow saving data in *.txt format and loading it without headache? For example, when one want to work with matlab, java, or other tools/languages. — bluevoxel
To pass arrays to/from MATLAB you can use scipy.io.savemat and scipy.io.loadmat. — ali_m
The default for fromfile is to read the data as binary. loadtxt is the correct pairing with savetxt. Look at the function documentation. — hpaulj

xnx xnx · Accepted Answer · 2015-02-10T19:31:45

The most reliable way I have found to do this is to use np.savetxt with np.loadtxt and not np.fromfile which is better suited to binary files written with tofile. The np.fromfile and np.tofile methods write and read binary files whereas np.savetxt writes a text file. So, for example:

a = np.array([1, 2, 3, 4])
np.savetxt('test1.txt', a, fmt='%d')
b = np.loadtxt('test1.txt', dtype=int)
a == b
# array([ True,  True,  True,  True], dtype=bool)

Or:

a.tofile('test2.dat')
c = np.fromfile('test2.dat', dtype=int)
c == a
# array([ True,  True,  True,  True], dtype=bool)

I use the former method even if it is slower and creates bigger files (sometimes): the binary format can be platform dependent (for example, the file format depends on the endianness of your system).

There is a platform independent format for NumPy arrays, which can be saved and read with np.save and np.load:

np.save('test3.npy', a)    # .npy extension is added if not given
d = np.load('test3.npy')
a == d
# array([ True,  True,  True,  True], dtype=bool)

python - How to save and load numpy.array() data properly?

4 Answers