My goal is to evaluate model performance on a test dataset for an object detection task. The model was trained on a dataset with 6 classes using the TensorFlow Object Detection API. Some classes have 20 object samples, while others have only one, so the data is imbalanced in both the train and test sets. Can I use mean average precision (mAP) as the evaluation metric? It seems to me that it is not appropriate for imbalanced data, but I don't know which other metrics to use. Which metrics are suitable for this case?
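To make the concern concrete, here is a minimal sketch of how I understand mAP to be computed: an average precision (AP) per class, then an unweighted mean over classes. It is not the TensorFlow Object Detection API's evaluator. The function name `average_precision`, the class names, and all scores and match flags are made up for illustration, and I assume each detection has already been matched to ground truth at some IoU threshold.

```python
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for a single class."""
    if len(scores) == 0:
        return 0.0
    # Rank detections from most to least confident.
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_true_positive, dtype=bool)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(~tp)
    recall = cum_tp / num_ground_truth
    precision = cum_tp / (cum_tp + cum_fp)
    # Precision envelope: make precision non-increasing as recall grows.
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    # Sum the rectangles under the curve, starting from recall = 0.
    recall = np.concatenate(([0.0], recall))
    return float(np.sum((recall[1:] - recall[:-1]) * precision))

# Hypothetical class with 4 ground-truth objects and 4 detections.
ap_common = average_precision([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 1], 4)

# Hypothetical class with a single ground-truth object:
# its one detection is either correct or wrong.
ap_rare_hit = average_precision([0.9], [1], 1)
ap_rare_miss = average_precision([0.9], [0], 1)

# mAP is the unweighted mean of per-class AP values.
map_if_hit = float(np.mean([ap_common, ap_rare_hit]))
map_if_miss = float(np.mean([ap_common, ap_rare_miss]))

print(f"AP common: {ap_common:.4f}")
print(f"mAP if the rare object is found:  {map_if_hit:.4f}")
print(f"mAP if the rare object is missed: {map_if_miss:.4f}")
```

In this toy setup the class with a single object can only score an AP of 0 or 1, and since every class carries equal weight in the mean, that one object moves mAP by half a point. That is the behaviour that makes me doubt mAP here.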
I would appreciate any help on this.