I am trying to create a resizable dataset in h5py. It should be a simple one-dimensional array with some initial values written to it, and then extended with additional values as they become available. When I try this:

ds = g2.create_dataset(wf, maxshape=(None), chunks=True, data=values)
size = ds.shape[0] + len(values)
ds.resize(size, axis=0)

I get this error:

ValueError: Unable to set extend dataset (Dimension cannot exceed the existing maximal size (new: 120 max: 60))

It seems that providing data or setting the shape overrides maxshape: the dataset refuses to resize, and the error reports a maximum size equal to that of the data initially provided or of the shape argument.

According to the h5py documentation, this is how it should be done: setting maxshape to None should allow unlimited extension, and setting chunks to True should enable automatic chunk-size selection.

I have also tried something like this, and add data separately:

ds = g2.create_dataset(wf,(100,), maxshape=(None), chunks=True, dtype='i')

It throws the same error, and by now I am not sure if I am setting dimensions incorrectly or if it has anything to do with the data type or shape.

I think you have to do a resize to add new material. Isn't there something about that in the docs? stackoverflow.com/questions/40062770/… - hpaulj
I am resizing, but that is the problem: it does not accept the new size, because maxshape seems to have been set to the size of the initial data rather than to the value passed in the maxshape argument. Thanks for the information, I did not find that post before. - TeilaRei

1 Answer

The only thing I'm doing differently is using (None,) for maxshape, not (None); that is, making sure I give it a tuple. I haven't tried it without the comma.

In [177]: f=h5py.File('test1.h5','w')
In [178]: ds = f.create_dataset('name', maxshape=(None,), chunks=True, data=np.arange(10))
In [179]: ds.shape
Out[179]: (10,)
In [180]: ds.resize((20,))
In [181]: ds[:]
Out[181]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
In [182]: ds[10:]=np.arange(10,20)
In [183]: ds[:]
Out[183]: 
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])
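
The resize-then-assign steps in the session above can be wrapped in a small helper for the "add values as they arrive" case from the question. This is only a sketch: `append_values` is a hypothetical name, not part of h5py, and the in-memory `core` driver is an assumption made so that the example leaves no file on disk.

```python
import h5py
import numpy as np


def append_values(ds, values):
    """Grow a 1-D resizable dataset and write `values` at the end.

    Hypothetical helper for illustration; not part of h5py.
    """
    values = np.asarray(values)
    old_size = ds.shape[0]
    # Grow along axis 0; this only succeeds if maxshape allows it.
    ds.resize(old_size + len(values), axis=0)
    ds[old_size:] = values


# In-memory file (assumption for this sketch; use a normal path in practice).
with h5py.File("append_demo.h5", "w", driver="core", backing_store=False) as f:
    ds = f.create_dataset("name", data=np.arange(10),
                          maxshape=(None,), chunks=True)
    append_values(ds, np.arange(10, 20))
    maxshape = ds.maxshape   # (None,) means unlimited along axis 0
    result = ds[:]           # copy the data out before the file closes
```

Checking `ds.maxshape` right after `create_dataset` is a quick way to confirm that the dataset really is resizable before trying to extend it.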

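maxshape must be given as a tuple. (None) without the comma is just None, so no maximum shape is set, the dataset keeps its initial size as the limit, and resize fails.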
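
A quick check in plain Python shows why the comma matters: parentheses on their own only group an expression and do not create a tuple. The variable names here are just for illustration.

```python
without_comma = (None)   # just None; the parentheses are grouping only
with_comma = (None,)     # a one-element tuple

print(without_comma is None)          # True
print(isinstance(with_comma, tuple))  # True
print(len(with_comma))                # 1
```

So `maxshape=(None)` reaches `create_dataset` as `maxshape=None`, which is the same as not passing the argument at all.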