2
votes

I have a small Python program that creates an HDF5 file using the h5py module. I want to write a Python module that works on the data in that HDF5 file. How can I do that?

More specifically, I can receive NumPy arrays as PyArrayObject values by parsing the arguments with PyArg_ParseTuple. This way, I can read elements from a NumPy array when I am writing a Python module. How can I read HDF5 files so that I can access individual elements in the same way?

Update: Thanks for the answers below. I need to read the HDF5 file from C, not from Python (I already know how to do it from Python). For example:

import h5py
import numpy as np

# Raw string, so that "\t" is not interpreted as a tab character
f = h5py.File(r'\tmp\tmp.h5', 'w')
# this file is 2+ GB
ofmat = np.load('offsetmatrix.npy')
f['FileDataset'] = ofmat
f.close()

Now I have an HDF5 file called '\tmp\tmp.h5'. What I need to do is read the individual array elements from the HDF5 file using C (and not Python) so that I can do something with those elements. This shows how to extend NumPy arrays. How can I do the same for HDF5?


2
If you're using "PyArrayObject", it sounds like you're using the C interface... Are you writing C or Python? - Joe Kington
If you actually want to read HDF5 files from C code, why don't you use an HDF5 C library? That will be much easier than using a library designed to be used from Python code. - Sven Marnach
Thanks Sven, that's what I want to do. But I want to write a Python module. This module will do some complex calculations (which are faster in C) and then return the result to the Python script. - rchhe
You might want to think about using Cython and calling the low-level HDF5 C functions to access the data. Alternatively, pull chunks of data into a NumPy array and then do the calculations on that array using Cython or a Python extension: scipy.org/Cookbook/C_Extensions/NumPy_arrays - JoshAdel
JoshAdel, I used this HDF5 C function (hdfgroup.org/HDF5/doc/RM/RM_H5F.html#File-Open) to open the HDF5 file directly and do my calculations, which I then send back to Python. Thanks. - rchhe
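
The alternative JoshAdel mentions (pull chunks of data into a NumPy array, then compute on each chunk) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the helper names iter_row_blocks and sum_dataset are hypothetical, the sum stands in for whatever calculation a C or Cython extension would actually perform, and the dataset is assumed to be sliceable along its first axis, as h5py datasets are.

```python
def iter_row_blocks(dataset, chunk_rows):
    """Yield consecutive blocks of at most chunk_rows rows from dataset.

    Works with anything that supports len() and slicing along the first
    axis: an h5py dataset, a NumPy array, or a plain list of rows.
    """
    if chunk_rows < 1:
        raise ValueError("chunk_rows must be at least 1")
    n_rows = len(dataset)
    for start in range(0, n_rows, chunk_rows):
        # For an h5py dataset, this slice reads only these rows from disk.
        yield dataset[start:start + chunk_rows]


def sum_dataset(path, name, chunk_rows=1024):
    """Sum all elements of dataset `name` in HDF5 file `path`, block by block."""
    import h5py  # imported here so iter_row_blocks stays usable without h5py

    total = 0.0
    with h5py.File(path, "r") as f:
        for block in iter_row_blocks(f[name], chunk_rows):
            # Each block is an in-memory NumPy array; the sum is a
            # placeholder for the real calculation (e.g. a C extension call).
            total += float(block.sum())
    return total
```

With the file from the question, the call would look like sum_dataset(path, 'FileDataset'), where path is the location of the HDF5 file. Because only chunk_rows rows are in memory at a time, the 2+ GB dataset never has to be loaded in full.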

2 Answers

2
votes

h5py gives you a direct interface for reading, writing, and manipulating data stored in an HDF5 file. Have you looked at the docs?

http://docs.h5py.org/

I advise starting with these; they have pretty clear examples of how to do simple data access. If there are specific things you are trying to do that aren't covered by the methods in h5py, could you please give a more specific description of your desired usage?

1
votes

If you don't actually need a particular HDF5 structure and just need the speed and cross-platform compatibility, I'd recommend taking a look at PyTables. It has built-in support for reading and writing NumPy arrays.