1
votes

I have been struggling with this problem for a couple of days. When I start training with the TensorFlow Object Detection API, it runs one iteration and then fails with the error below. If I use the data from the raccoon detection tutorial instead, it works perfectly.

I have already tried using only one class, using multiple classes, using different images, using only images that I had checked, and keeping everything else identical to the raccoon tutorial.

Thank you for your time.

Error:

InvalidArgumentError (see above for traceback): LossTensor is inf or nan. : Tensor had NaN values [[Node: CheckNumerics = CheckNumericsT=DT_FLOAT, message="LossTensor is inf or nan.", _device="/job:localhost/replica:0/task:0/cpu:0"]]


2 Answers

0
votes

The NaN error means that some value in the tensor being checked is not a valid number (NaN or infinity). Maybe some of your images have different sizes and the input is getting invalid values because of that. It’s just a guess; I don’t even know whether you are training on images or video. But if the code works with one dataset and fails with another, the problem must be in the data.

0
votes

You might want to check that the object annotations are correct. The NaN error is most likely caused by an incorrect calculation involving the annotations, so check the following:

  1. No NaN values in annotations
  2. No bounding box is outside the image boundaries
  3. Annotations are in pixel values (i.e. not normalized)
  4. XMin < XMax and YMin < YMax
  5. There are no bounding boxes that are too small (e.g. less than 1% of the image)
  6. There is no problem due to data augmentation.

Reference: https://github.com/tensorflow/models/issues/1881
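
Checks 1 to 5 above can be sketched as a small standalone script. This is only an illustration, not part of the Object Detection API: the function name `find_box_problems`, the message strings, and the 1% area threshold are assumptions made for the example, and it assumes boxes are given as pixel coordinates together with the image width and height. Check 6 (data augmentation) depends on your pipeline config and is not covered.

```python
import math


def find_box_problems(xmin, ymin, xmax, ymax, width, height,
                      min_area_fraction=0.01):
    """Return a list of problems for one bounding box given in pixels.

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    problems = []
    coords = (xmin, ymin, xmax, ymax)

    # Check 1: no NaN (or infinite) values in the annotation.
    if any(not math.isfinite(c) for c in coords):
        problems.append("non-finite coordinate")
        return problems  # the remaining checks are meaningless here

    # Check 2: the box must lie inside the image boundaries.
    if xmin < 0 or ymin < 0 or xmax > width or ymax > height:
        problems.append("outside image")

    # Check 3: heuristic only. If every coordinate is within [0, 1],
    # the box was probably stored normalized instead of in pixels.
    if width > 1 and height > 1 and all(0.0 <= c <= 1.0 for c in coords):
        problems.append("looks normalized")

    # Check 4: XMin < XMax and YMin < YMax.
    if xmin >= xmax or ymin >= ymax:
        problems.append("degenerate box")
    else:
        # Check 5: box area as a fraction of the image area.
        area_fraction = (xmax - xmin) * (ymax - ymin) / (width * height)
        if area_fraction < min_area_fraction:
            problems.append("box too small")

    return problems


if __name__ == "__main__":
    # Example boxes for a 640x480 image (made-up values).
    boxes = [
        (10, 20, 200, 220),   # fine
        (10, 20, 700, 220),   # xmax beyond the image width
        (200, 20, 10, 220),   # xmin greater than xmax
    ]
    for box in boxes:
        print(box, find_box_problems(*box, width=640, height=480))
```

Running something like this over every row of your annotation file before generating the TFRecords makes it easier to find the specific image that triggers the NaN loss.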