I am working with KNIME and trying to train a Naive Bayes classifier with test data. I tried to use 10-fold cross-validation to make my results more accurate, but I am not able to generate the PMML model: I keep getting the error "Loop end already assigned (start node has more than one end node)". This is my KNIME workflow:

workflow screenshot

1 Answer

Exactly as the error message says, you have two loop end nodes (the PMML Ensemble Loop End and X-Aggregator nodes) but only one loop start (the X-Partitioner).

What are you trying to achieve? Normally the purpose of cross-validation is to estimate how well your predictive model is likely to perform on unseen data. If what you want at the end is a single trained Naive Bayes model that you can make predictions with, I think you want to delete the PMML Ensemble Loop End and connect the normalised data set to a second Naive Bayes Learner, configured the same as the first one, in addition to the X-Partitioner input. The output of the second learner is the model that you can then use for prediction. That way the second learner is trained on the whole dataset, which should give the most accurate model, while the original learner inside the cross-validation loop is used only to estimate how good that model is going to be.
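To make the split between "estimate with cross-validation" and "train the final model on everything" concrete, here is a minimal sketch in plain Python rather than KNIME. It assumes a hand-written Gaussian Naive Bayes and made-up two-cluster data; the function names (`train_gaussian_nb`, `predict`, `cross_validate`) are hypothetical and are not what the KNIME nodes do internally.

```python
import math
import random


def train_gaussian_nb(rows, labels, var_floor=1e-9):
    """Fit per-class prior, feature means and feature variances."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        cols = list(zip(*members))
        means = [sum(c) / len(c) for c in cols]
        # var_floor keeps the variance strictly positive
        variances = [sum((v - m) ** 2 for v in c) / len(c) + var_floor
                     for c, m in zip(cols, means)]
        model[cls] = (len(members) / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one row."""
    def log_posterior(cls):
        prior, means, variances = model[cls]
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def cross_validate(rows, labels, k=10, seed=0):
    """Fraction of rows predicted correctly while held out of training."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in idx if i not in held_out]
        # The model trained inside the loop is only used for scoring
        fold_model = train_gaussian_nb([rows[i] for i in train_idx],
                                       [labels[i] for i in train_idx])
        correct += sum(predict(fold_model, rows[i]) == labels[i] for i in fold)
    return correct / len(rows)


# Made-up data: two well-separated clusters (an assumption for illustration)
rng = random.Random(1)
rows = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
        + [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(50)])
labels = ["a"] * 50 + ["b"] * 50

estimate = cross_validate(rows, labels, k=10)   # role of the loop
final_model = train_gaussian_nb(rows, labels)   # role of the second learner
print("estimated accuracy:", estimate)
```

The ten models built inside `cross_validate` are discarded once they have been scored; only `final_model`, trained on all rows, would be kept for making predictions.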

If you want to make sure both learners are using the same settings, you can use flow variables to pass the setting values from the whole-dataset learner to the one inside the loop:

  • Show flow variable ports on the whole-dataset learner and the X-Partitioner and link the output of the former to the input of the latter
  • In the Flow Variables tab of the whole-dataset learner configuration, type a name in the box for each parameter you want to pass on
  • Run the whole-dataset learner
  • In the Flow Variables tab of the learner in the loop, you should now be able to select the variable names that you created in the drop-down beside the corresponding parameter.