Interactive Perceptron Training Toy

Monday September 7, 2015

A little while ago I contributed a simple perceptron model to the Simple Statistics JavaScript library. That makes possible things like this interactive perceptron training environment, in which you can get a “hands-on” feel for how the model works in two dimensions.

The space below starts all red, which means the model starts out predicting that any given point is “negative.” As you add points and train, the color will update to show where the model would predict positive (blue) and negative (red).

The space below is clickable! One normal (left) click will add a blue (positive; label “1”) point. Clicking on a blue point will turn it red (negative; label “0”). Clicking on a red point will remove it. So you can cycle between no point, blue point, red point, no point.

To train the model, choose a point and use your other click (control click or right click). The perceptron model updates when it makes an incorrect prediction. You'll see details about this process below the box, as it happens!

Diagnostics appear with model training:

You can follow the model fitting step by step! (To reset everything, just reload the page.)

Since we're in two dimensions, each data point has an x and y coordinate, written [x, y]. The perceptron model has two weights that correspond to the x and y directions (let's just call them a and b), and a bias term which we can call c. The model predicts positive (blue) if ax + by + c is greater than zero, and negative (red) otherwise. (You can think of the bias term as a weight that is always multiplied by one.)
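
Here's a minimal sketch of that prediction rule, together with the "update only on a wrong prediction" training step described earlier. It assumes the textbook perceptron update with a learning rate of one; the function names and the model object are made up for illustration and are not the Simple Statistics API.

```javascript
// Illustrative two-feature perceptron. The names (predict, train, and the
// a/b/c fields) are assumptions for this sketch, not a library interface.

// Predict 1 (blue) if a*x + b*y + c is greater than zero, else 0 (red).
function predict(model, point) {
  const [x, y] = point;
  return model.a * x + model.b * y + model.c > 0 ? 1 : 0;
}

// One training step: the model changes only when its prediction is wrong.
// Assumes the standard update rule with a learning rate of one.
function train(model, point, label) {
  const error = label - predict(model, point); // -1, 0, or 1
  if (error !== 0) {
    model.a += error * point[0];
    model.b += error * point[1];
    model.c += error; // the bias acts like a weight on a constant input of 1
  }
  return model;
}

const model = { a: 0, b: 0, c: 0 };
predict(model, [1, 2]);   // 0: the sum is 0, which is not greater than zero
train(model, [1, 2], 1);  // wrong prediction, so now a = 1, b = 2, c = 1
predict(model, [1, 2]);   // 1: 1*1 + 2*2 + 1 = 6, which is greater than zero
```

Starting from all-zero weights, every point is predicted negative, which matches the all-red starting state of the box.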

One thing that became particularly clear as I put this together is how essential centering and scaling the data is. With points within three of the origin in either direction, the model can usually do well in a reasonable number of training steps, especially if the points are spread around the origin. But try putting two points of opposing color on a diagonal near the same corner of the box. It takes forever to fit! The perceptron can have a hard time moving its decision boundary away from the origin.
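
To put rough numbers on this, here's a small sketch that counts how many updates it takes to fit two opposing points near a corner, before and after centering them. The points, the helper names, and the choice to cycle through the data in order are all assumptions for illustration, and the update rule is the textbook one with a learning rate of one.

```javascript
// Illustrative experiment; countUpdates and center are hypothetical helpers.

function predict(model, [x, y]) {
  return model.a * x + model.b * y + model.c > 0 ? 1 : 0;
}

// Cycle through the data until a full pass makes no mistakes,
// and return the total number of weight updates along the way.
function countUpdates(data, maxPasses = 1000) {
  const model = { a: 0, b: 0, c: 0 };
  let updates = 0;
  for (let pass = 0; pass < maxPasses; pass++) {
    let mistakes = 0;
    for (const { point, label } of data) {
      const error = label - predict(model, point);
      if (error !== 0) {
        model.a += error * point[0];
        model.b += error * point[1];
        model.c += error;
        mistakes++;
      }
    }
    updates += mistakes;
    if (mistakes === 0) return updates;
  }
  return Infinity; // gave up without fitting
}

// Subtract the mean x and mean y from every point.
function center(data) {
  const n = data.length;
  const meanX = data.reduce((sum, d) => sum + d.point[0], 0) / n;
  const meanY = data.reduce((sum, d) => sum + d.point[1], 0) / n;
  return data.map(({ point, label }) => ({
    point: [point[0] - meanX, point[1] - meanY],
    label,
  }));
}

// A blue point and a red point on a diagonal, near the same corner.
const corner = [
  { point: [2, 2], label: 1 },
  { point: [3, 3], label: 0 },
];

countUpdates(corner);          // 27 updates
countUpdates(center(corner));  // 2 updates
```

The same two points, with the same spacing, fit in a fraction of the updates once they straddle the origin. Moving them closer together in the corner makes the uncentered case slower still.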

You can also see quite plainly that the (single-layer) perceptron classifies by linear separation, so it can't handle the XOR problem.
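
As a quick check of that claim, the sketch below tries to fit an XOR-style layout (opposite corners share a color) and, for contrast, a layout that a straight line can separate. The `fits` helper, the point coordinates, and the pass limit are assumptions chosen for illustration, again using the textbook update rule with a learning rate of one.

```javascript
// Illustrative check; fits is a hypothetical helper, not a library function.

// Returns true if some full pass over the data makes no mistakes
// within maxPasses, and false otherwise.
function fits(data, maxPasses = 1000) {
  const w = { a: 0, b: 0, c: 0 };
  for (let pass = 0; pass < maxPasses; pass++) {
    let mistakes = 0;
    for (const { point: [x, y], label } of data) {
      const prediction = w.a * x + w.b * y + w.c > 0 ? 1 : 0;
      const error = label - prediction;
      if (error !== 0) {
        w.a += error * x;
        w.b += error * y;
        w.c += error;
        mistakes++;
      }
    }
    if (mistakes === 0) return true;
  }
  return false;
}

// XOR layout: blue on one diagonal, red on the other.
const xor = [
  { point: [1, 1], label: 0 },
  { point: [-1, -1], label: 0 },
  { point: [1, -1], label: 1 },
  { point: [-1, 1], label: 1 },
];

// Separable layout: only the top-right corner is blue.
const oneCorner = [
  { point: [1, 1], label: 1 },
  { point: [-1, -1], label: 0 },
  { point: [1, -1], label: 0 },
  { point: [-1, 1], label: 0 },
];

fits(oneCorner); // true
fits(xor);       // false
```

No line puts both blue XOR points on one side and both red points on the other, so a mistake-free pass never happens no matter how large the pass limit is.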