Okay, I have a similar problem (although I'm not working with medical images) and found a solution, so I hope others will find it useful too.
- I assume that you need a custom function to retrieve the images in batches: they won't all fit into memory at once, the *.hdr file format is not supported, and the existing Keras helper functions don't cover regression targets. (I guess you're doing some type of segmentation, since you're using the U-Net?)
- I also assume that you need the ImageDataGenerator (IDG), because you don't want to implement the data augmentation yourself.
So, because of the first point you'll need to use the fit_generator function in conjunction with an IDG; the only problem is that the IDG does not support custom generators.
There are of course cases where you can use the IDG together with fit_generator: the IDG's flow function returns an Iterator of type NumpyArrayIterator. You can't use that one here, though, because it requires all the data to fit into working memory.
The way IDG.flow works is that you first create an instance of the IDG object and then call the flow function, which creates and returns a NumpyArrayIterator that holds a reference to your IDG object.
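To make that relationship concrete, here is a toy sketch of the pattern (illustrative names only, not the real Keras classes): the flow-style method is a factory that builds an iterator, and the iterator keeps a reference back to the generator that created it.

```python
# Illustrative sketch only (not the real Keras classes): a flow-style
# method is a factory that returns an iterator holding a reference back
# to the generator object that created it.

class ToyGenerator:
    def flow(self, data, batch_size):
        return ToyIterator(self, data, batch_size)

class ToyIterator:
    def __init__(self, generator, data, batch_size):
        self.generator = generator  # reference back to the "IDG"
        self.data = data
        self.batch_size = batch_size

    def __iter__(self):
        # yield the data in consecutive batch-sized slices
        for start in range(0, len(self.data), self.batch_size):
            yield self.data[start:start + self.batch_size]

gen = ToyGenerator()
iterator = gen.flow(list(range(7)), batch_size=3)
batches = list(iterator)  # [[0, 1, 2], [3, 4, 5], [6]]
```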
One solution is to write your own custom DataGenerator which inherits from the keras.preprocessing.image.Iterator class and implements _get_batches_of_transformed_samples (see here). Then you extend the IDG class with a flow_from_generator function which returns an instance of your custom DataGenerator. This sounds more taxing than it really is, but be sure to familiarize yourself with the IDG, NumpyArrayIterator and Iterator code first.
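In particular, the shuffling/batching bookkeeping that the Iterator base class does for you boils down to roughly this (a simplified NumPy sketch, not the actual Keras implementation):

```python
import numpy as np

def index_batches(n, batch_size, shuffle=False, seed=None):
    # build a (possibly shuffled) index array and slice it into batches;
    # each slice is what _get_batches_of_transformed_samples receives
    # as its index_array argument
    rng = np.random.RandomState(seed)
    indices = np.arange(n)
    if shuffle:
        rng.shuffle(indices)
    for start in range(0, n, batch_size):
        yield indices[start:start + batch_size]

batches = [b.tolist() for b in index_batches(n=7, batch_size=3)]
# batches == [[0, 1, 2], [3, 4, 5], [6]]
```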
Here is what this looks like:
import numpy as np
import keras

class DataGenerator(keras.preprocessing.image.Iterator):

    def __init__(self, image_data_generator, n, img_shape,
                 output_channel_num, batch_size=32, shuffle=False, seed=None):
        # init whatever you need here (file paths, image shape, ...)
        self.image_data_generator = image_data_generator
        self.img_shape = img_shape  # (x_img_size, y_img_size, input_channel_num)
        self.output_channel_num = output_channel_num
        # call the Iterator constructor:
        super(DataGenerator, self).__init__(n, batch_size, shuffle, seed)

    def _get_batches_of_transformed_samples(self, index_array):
        '''Here you retrieve the images and apply the image augmentation,
        then return the augmented image batch.
        index_array is just a list of indices that takes care of the
        shuffling for you (see the super class), so this function will be
        called with e.g. index_array=[1, 6, 8] if your batch size is 3.
        '''
        x_transformed = np.zeros((len(index_array),) + self.img_shape,
                                 dtype=np.float32)
        y_transformed = np.zeros((len(index_array),) + self.img_shape[:-1]
                                 + (self.output_channel_num,), dtype=np.float32)
        for i, j in enumerate(index_array):
            x = get_input_image_from_index(j)   # your own loading code
            y = get_output_image_from_index(j)  # your own loading code
            # draw the random parameters once and apply the same
            # transform to input and target, so they stay aligned:
            params = self.image_data_generator.get_random_transform(self.img_shape)
            x = self.image_data_generator.apply_transform(x, params)
            x = self.image_data_generator.standardize(x)
            x_transformed[i] = x
            y = self.image_data_generator.apply_transform(y, params)
            y = self.image_data_generator.standardize(y)
            y_transformed[i] = y
        return x_transformed, y_transformed


class ImageDataGeneratorExtended(keras.preprocessing.image.ImageDataGenerator):

    def flow_from_generator(self, *args, **kwargs):
        return DataGenerator(self, *args, **kwargs)
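One detail worth emphasizing: get_random_transform is called once per sample and the resulting params are then applied to both x and y, which is what keeps the input and the target aligned. A minimal NumPy illustration of that idea (with a made-up flip transform standing in for the Keras one):

```python
import numpy as np

def apply_transform(img, params):
    # made-up stand-in for the real augmentation: horizontal flip
    return img[:, ::-1] if params['flip'] else img

rng = np.random.RandomState(0)
x = rng.rand(4, 4, 1)             # fake input image
y = (x > 0.5).astype(np.float32)  # fake mask derived from x

params = {'flip': True}           # draw the parameters once...
x_aug = apply_transform(x, params)  # ...and reuse them for both,
y_aug = apply_transform(y, params)  # so the mask still matches the input
```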
Okay, I hope that helps. I've used my own version of the above code, but haven't completely tested it (although it works for me now), so take it with a grain of salt :P
For the *.hdr issue: it seems that you can use the imageio package (it supports the HDR and DICOM formats, although I've never personally used that library).
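A minimal sketch of how that would look, assuming imageio is installed and your files are readable by one of its plugins (I haven't verified this against Analyze-style *.hdr files myself):

```python
import imageio
import numpy as np

# the same imread call would be used for *.hdr / DICOM files, e.g.
#   volume = imageio.imread('scan.hdr')
# demonstrated here with a small PNG round-trip instead:
data = (np.random.rand(8, 8, 3) * 255).astype(np.uint8)
imageio.imwrite('example.png', data)
img = imageio.imread('example.png')  # returns a NumPy array
```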