To evaluate different imputation methods with cross-validation, I am searching for an appropriate accuracy measure. My cross-validation sample consists of 100 univariate time series of equal length (energy measurements of buildings), and I am simulating missing data in R following a Missing at Random (MAR) approach. The goal is to find the method that, overall, most accurately imputes the simulated missing values.
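For concreteness, here is a minimal sketch of the kind of MAR simulation I mean, in base R. The series, the covariate, and the missingness probabilities are all hypothetical; the point is only that missingness depends on an observed covariate (hour of day) rather than on the value itself:

    # Simulate one hourly energy series and MAR missingness
    set.seed(42)
    n      <- 24 * 30                          # hypothetical: 30 days of hourly readings
    hour   <- rep(0:23, times = 30)
    energy <- 50 + 30 * sin(2 * pi * hour / 24) + rnorm(n, sd = 5)

    # MAR: missingness probability driven by the observed 'hour' covariate
    p_miss <- ifelse(hour >= 0 & hour <= 5, 0.20, 0.05)
    miss   <- runif(n) < p_miss

    energy_miss <- energy
    energy_miss[miss] <- NA

    mean(is.na(energy_miss))                   # realised missing rate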
It has become clear that, although widely applied, MAE, MAPE, and RMSE are not appropriate performance measures here. My data includes (near-)zero measurements, so MAPE is not useful, and because the cross-validation runs over a sample of 100 time series on different scales, RMSE and MAE are ruled out by their scale dependence (see Rob Hyndman's paper "Another look at measures of forecast accuracy", Hyndman & Koehler, 2006). In the same paper, the Mean Absolute Scaled Error (MASE) is proposed as an alternative.
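For reference, for a series $y_1, \dots, y_T$ with errors $e_t = y_t - \hat{y}_t$ on the evaluated points, Hyndman & Koehler define MASE by scaling the MAE against the in-sample MAE of the one-step naïve (random-walk) forecast:

$$\text{MASE} = \frac{\operatorname{mean}(|e_t|)}{\frac{1}{T-1}\sum_{t=2}^{T}|y_t - y_{t-1}|}$$

The denominator is what I refer to below as the "benchmark naïve MAE".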
My questions are: MASE is proposed as a forecast accuracy measure. Can it also be applied to imputation? If so, how would the benchmark naïve MAE be calculated in R? That is, what counts as the training data in this setting?
My current idea is to calculate the benchmark naïve MAE for each time series over the full, complete series, i.e. before simulating the missing data. Would that be appropriate?
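In code, the idea would look roughly like the sketch below, continuing from the simulation snippet above. The imputation step (linear interpolation via `approx`) is only a hypothetical stand-in for whichever method is being evaluated:

    # Hypothetical imputation: linear interpolation over the gaps (base R);
    # rule = 2 extends the end values so no NAs remain at the boundaries
    t_idx      <- seq_len(n)
    energy_imp <- approx(t_idx[!miss], energy_miss[!miss], xout = t_idx, rule = 2)$y

    # 1. Scaling factor: in-sample MAE of the naive one-step forecast,
    #    computed on the FULL series before any values were removed
    naive_mae <- mean(abs(diff(energy)))

    # 2. Absolute errors only at the simulated-missing positions
    abs_err <- abs(energy[miss] - energy_imp[miss])

    # 3. MASE for this series; averaging across all 100 series would give
    #    a scale-free overall score for the method
    mase <- mean(abs_err) / naive_mae
    mase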