Bias-Variance Trade-off – the Impact of Regularization on the Decision Boundaries of the SVM and Logistic Regression Classifiers

In this article, the impact of varying the regularization parameters of the logistic regression (with the L2 penalty) and SVM binary classifiers on the decision boundaries learnt during training (i.e., how they overfit or underfit) will be shown for a few datasets.

  1. The following animation shows the impact of varying the lambda parameter for the logistic regression classifier trained with polynomial features (up to degree 6, with 2 predictor variables) on a dataset (taken from Andrew Ng’s Coursera Machine Learning course). As can be seen from the change in the decision boundary contours, the model learnt overfits the training data for low values of lambda and underfits for high values (a minimal scikit-learn sketch of this setup is given after the list).
    [animation: logit_reg3]
  2. The following animation shows the impact of varying the parameters C and sigma for the support vector machine classifier (with Gaussian kernel) on another dataset (taken from Andrew Ng’s Coursera Machine Learning course). As can be seen from the change in the decision boundary contours again, the model learnt overfits the training data when the effective regularization is weak (large C and/or small sigma) and underfits when it is strong (small C and/or large sigma); a scikit-learn sketch of this setup is also given after the list.
    [animation: svm3]
  3. Next, the following apples-and-oranges image will be used for binary classification: the orange will correspond to the positive class label and the apples to the negative class label. Only two color channels (red and green) will be used as predictor variables.
    [image: ao]
  4. First, a training dataset is selected from the image to train the models. The yellow rectangles in the following two figures show the training data with positive and negative labels taken from the image, respectively. As can be noticed, a little noise is introduced into the training data (for the positive class label) to test the robustness of the classifiers (see the image-classification sketch after the list).

    [figures: yellow training rectangles for the positive and negative classes]

  5. Next, a few logistic regression models with polynomial features (up to degree 6) are learnt with the two predictors (namely the red and green channels), since the data is not linearly separable. The following animation shows the impact of varying the lambda parameter for the logistic regression classifier trained. As can be seen from the change in the decision boundary contours in the next animated figure, the model learnt overfits the training data for low values of lambda and underfits for high values.
    [animation: logit_reg1.gif]
  6. The logistic regression models learnt with different values of the regularization parameter are then used to classify the entire image data. The data points (pixels) predicted by the model as positive are marked in black (ideally the model should predict all the pixels of the orange in the image as positive). Again, as can be seen from the next animation, with high values of the regularization parameter lambda the model underfits and can’t classify the entire orange, whereas with low values of lambda it does a pretty good job of separating the orange from the apples in the image.
    [animation: logit_reg2.gif]
  7. Next, a few SVM models with a Gaussian kernel (since the decision boundary is non-linear) are learnt with the two predictors (namely the red and green channels). The following animation shows the impact of varying the regularization parameter C and the kernel bandwidth parameter sigma for the SVM classifier trained.
    [animation: svm1]
  8. The SVM models learnt with different values of the regularization / kernel parameters are then used to classify the entire image data. The data points (pixels) predicted by the model as positive are marked in black (ideally the model should predict all the pixels of the orange in the image as positive). Again, as can be seen from the next animation, when the effective regularization is too strong (small C and/or large sigma) the model underfits and can’t classify the entire orange, whereas with weaker regularization it does a pretty good job of separating the orange from the apples in the image.
    [animation: svm2.gif]
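
For steps 1 and 5, the following is a minimal Python / scikit-learn sketch of the same setup (the course exercises are originally in Octave/MATLAB, and the random data below is only a stand-in for the actual datasets): the two predictors are mapped to polynomial features up to degree 6 and an L2-regularized logistic regression is fitted for several values of lambda. Note that scikit-learn parameterizes the regularization strength as C = 1/lambda.

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    # Stand-in data: two predictors with a non-linearly separable binary target
    rng = np.random.RandomState(0)
    X = rng.randn(200, 2)
    y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.0).astype(int)

    # Grid on which the decision boundary contour is traced
    xx, yy = np.meshgrid(np.linspace(-3, 3, 300), np.linspace(-3, 3, 300))

    for lam in [1e-4, 1e-2, 1.0, 100.0]:               # small lambda = weak regularization
        model = make_pipeline(
            PolynomialFeatures(degree=6),               # polynomial features up to degree 6
            LogisticRegression(penalty='l2', C=1.0 / lam, max_iter=10000),
        )
        model.fit(X, y)
        Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
        plt.contour(xx, yy, Z, levels=[0.5])            # decision boundary for this lambda
        print('lambda = %g  training accuracy = %.3f' % (lam, model.score(X, y)))

    plt.scatter(X[:, 0], X[:, 1], c=y, s=10)
    plt.show()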
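Similarly, for steps 2 and 7, a sketch with scikit-learn's SVC, again on stand-in data: SVC writes the Gaussian kernel as exp(-gamma * ||x - x'||^2), so setting gamma = 1 / (2 * sigma^2) recovers the sigma parameterization used in the course.

    import numpy as np
    from sklearn.svm import SVC

    # Stand-in data with a non-linear class boundary
    rng = np.random.RandomState(0)
    X = rng.uniform(-2, 2, size=(300, 2))
    y = (np.sin(2 * X[:, 0]) > X[:, 1]).astype(int)

    for C in [0.01, 1.0, 100.0]:                        # small C = strong regularization
        for sigma in [0.05, 0.3, 1.0]:                  # small sigma = narrow kernel
            clf = SVC(C=C, kernel='rbf', gamma=1.0 / (2.0 * sigma ** 2))
            clf.fit(X, y)
            print('C = %g  sigma = %g  training accuracy = %.3f'
                  % (C, sigma, clf.score(X, y)))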
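Finally, steps 4, 6 and 8 can be sketched as follows; the file name, the rectangle coordinates and the noise level are illustrative assumptions, and an RBF-kernel SVM stands in for whichever of the two classifiers is being tested. Training pixels are cropped from two rectangles (positive = orange, negative = apples), only the red and green channels are used as predictors, a few positive labels are flipped as noise, and the fitted model is then applied to every pixel of the image, with predicted positives marked in black.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.image import imread
    from sklearn.svm import SVC

    img = imread('apples_oranges.jpg') / 255.0          # H x W x 3 RGB, scaled to [0, 1]
    red, green = img[..., 0], img[..., 1]

    def crop_features(r0, r1, c0, c1):
        """(red, green) values of the pixels inside a rectangular crop."""
        return np.c_[red[r0:r1, c0:c1].ravel(), green[r0:r1, c0:c1].ravel()]

    pos = crop_features(50, 120, 220, 300)              # rectangle over the orange (assumed)
    neg = crop_features(150, 220, 30, 110)              # rectangle over the apples (assumed)
    X_train = np.vstack([pos, neg])
    y_train = np.r_[np.ones(len(pos)), np.zeros(len(neg))]

    # A little label noise on the positive class, as described in step 4
    rng = np.random.RandomState(0)
    flip = rng.choice(len(pos), size=max(1, len(pos) // 20), replace=False)
    y_train[flip] = 0

    # Fit one (C, sigma) setting; sweep these to reproduce the animations
    sigma = 0.1
    clf = SVC(C=1.0, kernel='rbf', gamma=1.0 / (2.0 * sigma ** 2))
    clf.fit(X_train, y_train)

    # Classify every pixel from its (red, green) values; mark positives in black
    X_all = np.c_[red.ravel(), green.ravel()]
    mask = clf.predict(X_all).reshape(red.shape).astype(bool)
    out = img.copy()
    out[mask] = 0.0
    plt.imshow(out); plt.axis('off'); plt.show()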