- There are *5000* examples of *handwritten digits* in the dataset. Some of the examples are shown below. We have *10* *class labels* here: the digits 0 to 9. Given an image, we want to classify it as one of the 10 classes.
- The next set of equations shows how **feed-forward** and **back-propagation** learning happens in a neural net with a **single hidden layer**.
- The next figure shows how the *cost function* decreases with the *#iterations* of the numerical optimization algorithm used *(conjugate gradient)*, with *25 hidden units* in the *hidden layer*.
- The next figures show the **hidden features** learnt for different numbers of hidden units in the neural net.
- The next figure shows the accuracy on the training dataset obtained with different numbers of hidden units in the neural net. Ideally we should expect a strict increase in accuracy on the training set as hidden units are added (leading to *overfitting*), but there is a slight zig-zag pattern that can be attributed to the random initialization of the weights and the regularization.
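
The feed-forward and back-propagation steps described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the code used for the figures: it assumes sigmoid activations, a cross-entropy cost, and L2 regularization that skips the bias weights, none of which are stated in the text. It also swaps in plain gradient descent for conjugate gradient to stay dependency-free, and trains on a made-up toy dataset of 4-pixel "images" with 3 classes rather than the digits data. All function names (`init_weights`, `forward`, `cost_and_grads`, `train`, `predict`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_weights(n_in, n_out, rng, eps=0.12):
    # One row per output unit; column 0 holds the bias weight.
    # eps is an assumed range for the small random initial weights.
    return [[rng.uniform(-eps, eps) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(W1, W2, x):
    # Feed-forward pass: input -> hidden layer -> output layer.
    a1 = [1.0] + list(x)  # input activations plus bias unit
    a2 = [1.0] + [sigmoid(sum(w * a for w, a in zip(row, a1))) for row in W1]
    a3 = [sigmoid(sum(w * a for w, a in zip(row, a2))) for row in W2]
    return a1, a2, a3


def cost_and_grads(W1, W2, X, Y, lam):
    # Regularized cross-entropy cost and its gradients via back-propagation.
    m = len(X)
    G1 = [[0.0] * len(row) for row in W1]
    G2 = [[0.0] * len(row) for row in W2]
    cost = 0.0
    for x, y in zip(X, Y):
        a1, a2, a3 = forward(W1, W2, x)
        cost -= sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                    for t, p in zip(y, a3))
        d3 = [p - t for p, t in zip(a3, y)]  # output-layer error
        # Hidden-layer error: propagate d3 back through W2 (bias column
        # skipped) and multiply by the sigmoid derivative a * (1 - a).
        d2 = [sum(W2[k][j + 1] * d3[k] for k in range(len(W2)))
              * a2[j + 1] * (1 - a2[j + 1]) for j in range(len(W1))]
        for k, row in enumerate(G2):
            for j in range(len(row)):
                row[j] += d3[k] * a2[j]
        for j, row in enumerate(G1):
            for i in range(len(row)):
                row[i] += d2[j] * a1[i]
    cost /= m
    for W, G in ((W1, G1), (W2, G2)):
        for w_row, g_row in zip(W, G):
            for j in range(len(g_row)):
                g_row[j] /= m
                if j > 0:  # bias weights are not regularized
                    g_row[j] += lam / m * w_row[j]
                    cost += lam / (2 * m) * w_row[j] ** 2
    return cost, G1, G2


def train(X, Y, n_hidden, lam=0.1, lr=0.5, iters=200, seed=0):
    # Plain gradient descent stands in for conjugate gradient here.
    rng = random.Random(seed)
    W1 = init_weights(len(X[0]), n_hidden, rng)
    W2 = init_weights(n_hidden, len(Y[0]), rng)
    history = []  # cost at each iteration, e.g. for a cost-vs-iterations plot
    for _ in range(iters):
        cost, G1, G2 = cost_and_grads(W1, W2, X, Y, lam)
        history.append(cost)
        for W, G in ((W1, G1), (W2, G2)):
            for w_row, g_row in zip(W, G):
                for j in range(len(w_row)):
                    w_row[j] -= lr * g_row[j]
    return W1, W2, history


def predict(W1, W2, x):
    # Predicted class is the output unit with the largest activation.
    a3 = forward(W1, W2, x)[2]
    return a3.index(max(a3))


# Made-up toy data: six 4-pixel "images" in 3 classes, one-hot encoded.
X = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1],
     [1, 0, 0, 1], [0, 1, 1, 0], [0, 0, 1, 0]]
labels = [0, 1, 2, 0, 1, 2]
Y = [[1.0 if k == c else 0.0 for k in range(3)] for c in labels]

W1, W2, history = train(X, Y, n_hidden=3)
train_accuracy = sum(predict(W1, W2, x) == c for x, c in zip(X, labels)) / len(X)
```

Changing `seed` or `n_hidden` in `train` and recomputing `train_accuracy` is one way to see how the random initialization and the regularization strength `lam` can make training accuracy vary from run to run. A practical implementation would vectorize these loops and hand the cost and gradient to a conjugate-gradient optimizer.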

