1 vote

I've created a neural network to model a certain (simple) input-output relationship. When I look at the time-series response plot in the nntrain GUI, the predictions seem quite adequate; however, when I try to do out-of-sample prediction, the results are nowhere near the function being modelled.

I've googled this problem extensively and messed around with my code to no avail, so I'd really appreciate a little insight into what I've been doing wrong.

I've included a minimal working example below.

    % Build a simple input-output relationship
    A = 1:1000;  B = 10000*sin(A);  C = A.^2 + B;
    Set = [A' B' C'];
    input  = Set(:,1:end-1);
    target = Set(:,end);

    % First 700 samples for training, the remaining 300 held out for out-of-sample testing
    inputSeries  = tonndata(input(1:700,:),false,false);
    targetSeries = tonndata(target(1:700,:),false,false);

    inputSeriesVal  = tonndata(input(701:end,:),false,false);
    targetSeriesVal = tonndata(target(701:end,:),false,false);

    % Open-loop NARX network
    inputDelays = 1:2;
    feedbackDelays = 1:2;
    hiddenLayerSize = 5;
    net = narxnet(inputDelays,feedbackDelays,hiddenLayerSize);

    [inputs,inputStates,layerStates,targets] = preparets(net,inputSeries,{},targetSeries);
    net.divideFcn  = 'divideblock';  % divide the data into contiguous blocks
    net.divideMode = 'time';         % divide by time step

    % Train the network
    [net,tr] = train(net,inputs,targets,inputStates,layerStates);
    Y = net(inputs,inputStates,layerStates);

    % Prediction attempt: close the loop and predict over the held-out range
    delay = length(inputDelays);  N = 300;
    inputSeriesPred  = [inputSeries(end-delay+1:end), inputSeriesVal];
    targetSeriesPred = [targetSeries(end-delay+1:end), con2seq(nan(1,N))];
    netc = closeloop(net);
    [Xs,Xi,Ai,Ts] = preparets(netc,inputSeriesPred,{},targetSeriesPred);
    yPred = netc(Xs,Xi,Ai);
    perf = perform(netc,targetSeriesVal,yPred);

    figure;
    plot([cell2mat(targetSeries),      nan(1,N);
          nan(1,length(targetSeries)), cell2mat(yPred);
          nan(1,length(targetSeries)), cell2mat(targetSeriesVal)]')
    legend('Original Targets','Network Predictions','Expected Outputs')

I realise a NARX net with tapped delays is probably overkill for this type of problem, but I intend to use this example as a base for a more complicated time-series problem in the future.

Kind regards, James

If you have problems making out-of-sample forecasts, you have probably overfitted. This is where cross-validation comes in. – Dennis Jaheruddin
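A minimal sketch of that idea using MATLAB's built-in validation split, reusing the variables from the code above (the ratios shown are illustrative and happen to match the toolbox defaults; they are not from the comment):

    % Illustrative: hold out contiguous validation/test blocks so training
    % stops early when validation error starts rising
    net.divideFcn  = 'divideblock';
    net.divideMode = 'time';
    net.divideParam.trainRatio = 0.70;
    net.divideParam.valRatio   = 0.15;   % validation block drives early stopping
    net.divideParam.testRatio  = 0.15;
    [net,tr] = train(net,inputs,targets,inputStates,layerStates);
    % tr.best_epoch marks the epoch where validation error was lowest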

3 Answers

0 votes

The most likely causes of poor generalization from the training data to new data are that either (1) there was not enough training data to characterize the problem, or (2) the neural network has more neurons and delays than it needs for the problem, so it is overfitting the data (i.e. it has an easy time memorizing the examples instead of having to figure out how they are related).

The fix for (1) is typically more data. The fix for (2) is to reduce the number of tap delays and/or hidden neurons, as sketched below.
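For example, a sketch of option (2) reusing the variable names from the question (the smaller hidden layer size here is only an illustration, not a recommended value):

    % Illustrative: a smaller NARX network to curb overfitting
    inputDelays = 1:2;
    feedbackDelays = 1:2;
    hiddenLayerSize = 2;   % fewer hidden neurons than the original 5
    net = narxnet(inputDelays,feedbackDelays,hiddenLayerSize);
    [inputs,inputStates,layerStates,targets] = preparets(net,inputSeries,{},targetSeries);
    [net,tr] = train(net,inputs,targets,inputStates,layerStates);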

Hope this helps!

0 votes

I'm not sure whether you have solved the problem yet, but there is at least one more solution to it.

Since you are dealing with a time series, it is better (at least in this case) to set net.divideFcn = 'dividerand'. With 'divideblock', only the first part of the time series is used for training, which may result in lost information about the long-term trends.
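Applied to the code in the question, that is a small change before training (a sketch only; everything else stays as in the original script):

    % Illustrative: assign time steps to train/val/test sets at random
    net.divideFcn  = 'dividerand';
    net.divideMode = 'time';
    [net,tr] = train(net,inputs,targets,inputStates,layerStates);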

0 votes

Increase inputDelays, feedbackDelays and hiddenLayerSize as follows:

    inputDelays = 1:30;
    feedbackDelays = 1:3;
    hiddenLayerSize = 30;

Also change the division function to:

    net.divideFcn = 'dividerand';

These changes worked for me, even though the network takes longer to train.