To evaluate different imputation methods with cross-validation, I am searching for an appropriate accuracy measure. My cross-validation sample consists of 100 univariate time series of equal length (energy measurements of buildings), and I am simulating missing data in R following a Missing at Random (MAR) approach. The goal is to find the method that, overall, most accurately imputes the simulated missing values.
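For concreteness, here is a minimal sketch of the kind of MAR simulation I mean, in base R. The series, the covariate, and the missingness probabilities are all hypothetical; the point is only that missingness depends on an observed covariate (hour of day) rather than on the value itself:

    # Simulate one hourly energy series and MAR missingness
    set.seed(42)
    n      <- 24 * 30                          # hypothetical: 30 days of hourly readings
    hour   <- rep(0:23, times = 30)
    energy <- 50 + 30 * sin(2 * pi * hour / 24) + rnorm(n, sd = 5)

    # MAR: missingness probability driven by the observed 'hour' covariate
    p_miss <- ifelse(hour >= 0 & hour <= 5, 0.20, 0.05)
    miss   <- runif(n) < p_miss

    energy_miss <- energy
    energy_miss[miss] <- NA

    mean(is.na(energy_miss))                   # realised missing rate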
It has become clear that, although widely applied, MAE, MAPE, and RMSE are not appropriate performance measures here. My data includes (near-)zero measurements, so MAPE is not useful, and because the cross-validation runs over a sample of 100 time series on different scales, RMSE and MAE are ruled out by their scale dependence (see Rob Hyndman's paper "Another look at measures of forecast accuracy", Hyndman & Koehler, 2006). In the same paper, the Mean Absolute Scaled Error (MASE) is proposed as an alternative.
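For reference, for a series $y_1, \dots, y_T$ with errors $e_t = y_t - \hat{y}_t$ on the evaluated points, Hyndman & Koehler define MASE by scaling the MAE against the in-sample MAE of the one-step naïve (random-walk) forecast:

$$\text{MASE} = \frac{\operatorname{mean}(|e_t|)}{\frac{1}{T-1}\sum_{t=2}^{T}|y_t - y_{t-1}|}$$

The denominator is what I refer to below as the "benchmark naïve MAE".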
My questions are: MASE is proposed as a forecast accuracy measure. Can it also be applied to imputation? If so, how would the benchmark naïve MAE be calculated in R? That is, what counts as the training data in this setting?
My current idea is to calculate the benchmark naïve MAE for each time series over the full, complete series, i.e. before simulating the missing data. Would that be appropriate?
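In code, the idea would look roughly like the sketch below, continuing from the simulation snippet above. The imputation step (linear interpolation via `approx`) is only a hypothetical stand-in for whichever method is being evaluated:

    # Hypothetical imputation: linear interpolation over the gaps (base R);
    # rule = 2 extends the end values so no NAs remain at the boundaries
    t_idx      <- seq_len(n)
    energy_imp <- approx(t_idx[!miss], energy_miss[!miss], xout = t_idx, rule = 2)$y

    # 1. Scaling factor: in-sample MAE of the naive one-step forecast,
    #    computed on the FULL series before any values were removed
    naive_mae <- mean(abs(diff(energy)))

    # 2. Absolute errors only at the simulated-missing positions
    abs_err <- abs(energy[miss] - energy_imp[miss])

    # 3. MASE for this series; averaging across all 100 series would give
    #    a scale-free overall score for the method
    mase <- mean(abs_err) / naive_mae
    mase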