14
votes

Does anyone have an idea for updating HDF5 datasets from h5py? Suppose we create a dataset like this:

import h5py
import numpy
# Open for writing explicitly; recent h5py versions default to read-only mode.
f = h5py.File('myfile.hdf5', 'w')
dset = f.create_dataset('mydataset', data=numpy.ones((2,2),"=i4"))
new_dset_value = numpy.zeros((3,3),"=i4")

Is it possible to extend the dset to a 3x3 numpy array?

1 Answer

16
votes

You need to create the dataset as resizable from the start; this cannot be changed after the dataset has been created. To do this, pass the "maxshape" keyword to create_dataset. A value of None in the maxshape tuple means that dimension can grow without limit. (Resizable datasets use chunked storage, which h5py enables automatically when maxshape is given.) So, if f is an HDF5 file opened for writing:

dset = f.create_dataset('mydataset', (2,2), maxshape=(None,3))

creates a dataset of shape (2,2) that can be extended indefinitely along the first dimension and up to 3 along the second. You can then extend the dataset with resize:

dset.resize((3,3))
dset[:,:] = numpy.zeros((3,3),"=i4")

The first dimension can be increased as much as you like:

dset.resize((10,3))
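
Putting the pieces together, here is a minimal end-to-end sketch rather than a definitive recipe. It assumes h5py and numpy are installed, writes to a temporary directory instead of the working directory, and uses a hypothetical helper, append_rows, that is not part of h5py; it simply wraps resize along axis 0 followed by a slice assignment.

```python
import os
import tempfile

import h5py
import numpy


def append_rows(dset, rows):
    """Hypothetical helper: grow dset along axis 0 and write rows into the new space."""
    rows = numpy.asarray(rows)
    old_len = dset.shape[0]
    # With axis given, resize takes the new length of that single axis.
    dset.resize(old_len + rows.shape[0], axis=0)
    dset[old_len:] = rows
    return dset.shape


# Temporary location so the sketch does not clobber an existing file.
path = os.path.join(tempfile.mkdtemp(), "myfile.hdf5")

with h5py.File(path, "w") as f:
    # Initial 2x2 data as in the question, but declared resizable up front.
    dset = f.create_dataset(
        "mydataset",
        data=numpy.ones((2, 2), "=i4"),
        maxshape=(None, 3),
    )

    # Grow to 3x3 and overwrite with the new values.
    dset.resize((3, 3))
    dset[:, :] = numpy.zeros((3, 3), "=i4")

    # The first axis is unlimited, so more rows can be appended later.
    final_shape = append_rows(dset, numpy.full((2, 3), 7, "=i4"))
    max_shape = dset.maxshape
    last_row = dset[-1].tolist()
```

Growing the second axis beyond 3 would fail here, because maxshape caps it; choose maxshape with the largest shape you expect in mind.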