Which loss function is the best fit for a categorical (discrete) supervised learning problem?
Binary Crossentropy
Any L2 loss
Kullback-Leibler (KL) loss
Mean Squared Error (MSE)