I'm looking for a way to produce multiple classifications for a single input. The number of outputs is fixed in advance, and the class sets for the outputs may or may not be the same. Each sample belongs to exactly one class in each class set.
My question is, what should the target data and the output layer look like? What activation, loss and training functions could be used, and how should the layer be connected to the hidden layer? I'm not necessarily looking for an optimal solution, just a working one.
My current guess at what could work is to make the target data multiple concatenated one-hot vectors and to give the output layer one softmax block per one-hot vector. I don't know how the layers would be connected in that solution, or how the net would figure out the sizes of the class sets. I think a label powerset would not work for my needs.
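To make the guess above concrete, here is a minimal plain-Python sketch of the target layout and of a softmax applied separately to each slice of the output. The class-set sizes (3 and 5), the example labels, and all function names are made up for illustration, and the sketch assumes the sizes are simply hard-coded rather than discovered by the net.

```python
import math

# Hypothetical class-set sizes: output 1 has 3 classes, output 2 has 5.
CLASS_SET_SIZES = [3, 5]

def encode_target(labels, sizes):
    """Concatenate one one-hot vector per class set."""
    target = []
    for label, size in zip(labels, sizes):
        one_hot = [0.0] * size
        one_hot[label] = 1.0
        target.extend(one_hot)
    return target

def grouped_softmax(logits, sizes):
    """Apply softmax independently to each slice of the output vector."""
    probs, start = [], 0
    for size in sizes:
        chunk = logits[start:start + size]
        peak = max(chunk)  # subtract the max for numerical stability
        exps = [math.exp(z - peak) for z in chunk]
        total = sum(exps)
        probs.extend(e / total for e in exps)
        start += size
    return probs

def grouped_cross_entropy(probs, target):
    """Sum of the per-class-set cross-entropies."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

# A sample with class 2 in the first set and class 0 in the second.
target = encode_target([2, 0], CLASS_SET_SIZES)
logits = [0.0] * sum(CLASS_SET_SIZES)  # an untrained net's raw outputs
probs = grouped_softmax(logits, CLASS_SET_SIZES)
loss = grouped_cross_entropy(probs, target)
```

In this form the sizes of the class sets are not learned at all: they are fixed by where the output vector is sliced, and the total loss is just the sum of one cross-entropy term per class set.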
I think the MATLAB patternnet function can create a net that does that, but I don't know how the resulting net works. Code for TensorFlow or Keras would be very welcome.
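In case it helps frame an answer, here is a minimal Keras sketch of one way I imagine this could be wired up: a shared hidden layer feeding one separate softmax Dense layer per class set, trained with one cross-entropy loss per output. The feature count, hidden size, class-set sizes, and random data are all placeholders I invented, and I am assuming integer class labels with the sparse categorical cross-entropy loss instead of explicit one-hot vectors.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder dimensions, chosen only for illustration.
n_features = 10
class_set_sizes = [3, 5]

# Shared hidden layer; every output head is connected to it.
inputs = keras.Input(shape=(n_features,))
hidden = layers.Dense(32, activation="relu")(inputs)

# One softmax head per class set; each head's width is its class-set size.
outputs = [
    layers.Dense(size, activation="softmax", name=f"head_{i}")(hidden)
    for i, size in enumerate(class_set_sizes)
]

model = keras.Model(inputs=inputs, outputs=outputs)

# One loss per head; Keras sums them into the total training loss.
model.compile(
    optimizer="adam",
    loss=["sparse_categorical_crossentropy"] * len(class_set_sizes),
)

# Random stand-in data: one integer label array per class set.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, n_features)).astype("float32")
y = [rng.integers(0, size, size=64) for size in class_set_sizes]

model.fit(x, y, epochs=1, batch_size=16, verbose=0)

# predict returns one probability array per head.
preds = model.predict(x[:4], verbose=0)
```

With this layout the targets are passed as a list of label arrays (one per class set) rather than one concatenated vector, and the class with the highest probability in each head can be read off with np.argmax(p, axis=1) for each p in preds.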