I am trying to create a 78 TB HDF5 dataset by filling it in a 2D block-partitioned manner. This is very slow when the block I'm writing spans rows that have never been written to, because HDF5 has to allocate the disk space and fill in the missing entries with zeros.
Instead, I would like h5py to allocate the disk space for my dataset as soon as it's created, and never fill it. According to Table 16 in the HDF5 Dataset documentation, this is possible with the C API, but how can I do this with h5py, preferably with the high-level interface?
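
For concreteness, here is a minimal sketch of how the two C-API settings in question (early allocation, never fill) might map onto h5py's low-level interface. It assumes a contiguous, uncompressed dataset. The helper name `create_preallocated`, the file name, and the small demo shape are illustrative only and not taken from any real code.

```python
import os
import tempfile

import h5py
import numpy as np


def create_preallocated(f, name, shape, dtype=np.float64):
    """Create a dataset whose space is allocated at creation and never filled.

    Hypothetical helper: builds a dataset creation property list by hand,
    then wraps the low-level dataset id in a high-level h5py.Dataset.
    """
    dcpl = h5py.h5p.create(h5py.h5p.DATASET_CREATE)
    # Allocate all of the storage when the dataset is created.
    dcpl.set_alloc_time(h5py.h5d.ALLOC_TIME_EARLY)
    # Never write fill values; unwritten regions hold whatever is on disk.
    dcpl.set_fill_time(h5py.h5d.FILL_TIME_NEVER)

    space = h5py.h5s.create_simple(tuple(shape))
    tid = h5py.h5t.py_create(np.dtype(dtype))
    # The low-level API expects the dataset name as bytes.
    dsid = h5py.h5d.create(f.id, name.encode("utf-8"), tid, space, dcpl=dcpl)
    return h5py.Dataset(dsid)


# Small demo in a temporary directory (the real dataset would be far larger).
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "prealloc_demo.h5")
    with h5py.File(path, "w") as f:
        dset = create_preallocated(f, "data", (1000, 1000))
        # Write one 2D block; the rest of the dataset is left untouched.
        dset[200:300, 400:500] = np.ones((100, 100))
```

The returned object behaves like an ordinary high-level `Dataset`, so block writes use the usual slicing syntax. Because nothing is ever filled, reading a region that has not been written returns unspecified values.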