- The dataset contains 5000 examples of handwritten digits, some of which are shown below. There are 10 class labels, the digits 0 through 9. Given an image, we want to assign it to one of these 10 classes.
- The next set of equations shows how feed-forward prediction and back-propagation learning work in a neural net with a single hidden layer.
- The next figure shows how the cost function decreases with the number of iterations of the numerical optimization algorithm (conjugate gradient), for a neural net with 25 units in the hidden layer.
- The next figures show the hidden features learnt for different numbers of hidden units in the neural net.
- The next figure shows the accuracy obtained on the training dataset with different numbers of hidden units in the neural net. Since a larger hidden layer gives the model more capacity, we would expect the training accuracy to increase steadily with the number of hidden units (eventually leading to overfitting), but there is a slight zig-zag pattern that can be attributed to the random initialization of the weights and to the regularization.
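
The feed-forward and back-propagation steps described above can be sketched in NumPy roughly as follows. This is a minimal illustration under stated assumptions, not the exact code behind the figures: it assumes sigmoid activations, a cross-entropy cost with an L2 penalty that skips the bias weights, and plain gradient descent in place of conjugate gradient. The names `cost_and_grads` and `predict`, the toy sizes, the step size and the synthetic data are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cost_and_grads(theta1, theta2, X, Y, lam):
    """Regularized cross-entropy cost and its gradients.

    theta1: (h, n + 1) and theta2: (k, h + 1); column 0 holds bias weights.
    X: (m, n) inputs, Y: (m, k) one-hot labels, lam: L2 penalty strength.
    """
    m = X.shape[0]

    # Feed-forward: prepend a bias unit to each layer's activations.
    a1 = np.hstack([np.ones((m, 1)), X])
    a2 = np.hstack([np.ones((m, 1)), sigmoid(a1 @ theta1.T)])
    a3 = sigmoid(a2 @ theta2.T)

    # Cross-entropy cost plus L2 penalty on the non-bias weights.
    cost = -np.sum(Y * np.log(a3) + (1 - Y) * np.log(1 - a3)) / m
    cost += lam / (2 * m) * (np.sum(theta1[:, 1:] ** 2)
                             + np.sum(theta2[:, 1:] ** 2))

    # Back-propagation: output error, then hidden-layer error.
    d3 = a3 - Y
    d2 = (d3 @ theta2[:, 1:]) * a2[:, 1:] * (1 - a2[:, 1:])
    grad1 = d2.T @ a1 / m
    grad2 = d3.T @ a2 / m
    grad1[:, 1:] += lam / m * theta1[:, 1:]
    grad2[:, 1:] += lam / m * theta2[:, 1:]
    return cost, grad1, grad2


def predict(theta1, theta2, X):
    """Return the index of the most activated output unit for each row."""
    m = X.shape[0]
    a2 = sigmoid(np.hstack([np.ones((m, 1)), X]) @ theta1.T)
    a3 = sigmoid(np.hstack([np.ones((m, 1)), a2]) @ theta2.T)
    return np.argmax(a3, axis=1)


# Toy synthetic problem (assumed sizes, much smaller than the real dataset).
rng = np.random.default_rng(0)
m, n, h, k = 50, 8, 5, 3
X = rng.normal(size=(m, n))
labels = rng.integers(0, k, size=m)
Y = np.eye(k)[labels]

# Small random initial weights break the symmetry between hidden units.
eps = 0.12
theta1 = rng.uniform(-eps, eps, size=(h, n + 1))
theta2 = rng.uniform(-eps, eps, size=(k, h + 1))

# Plain gradient descent, used here only as a simple stand-in optimizer.
costs = []
for _ in range(200):
    cost, grad1, grad2 = cost_and_grads(theta1, theta2, X, Y, lam=1.0)
    costs.append(cost)
    theta1 -= 0.5 * grad1
    theta2 -= 0.5 * grad2

train_accuracy = np.mean(predict(theta1, theta2, X) == labels)
```

Plotting `costs` against the iteration index gives a curve of the same kind as the cost figure, and rerunning the script with different values of `h` or a different random seed is one way to see how initialization and regularization can make the training accuracy fluctuate.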