
Perceptron Lab

Please review all requirements before starting to ensure your implementation will support all of them. Your report should be submitted as a single PDF file with appendices as required below:

  1. (40%) Correctly implement the perceptron learning algorithm using the toolkit (or your own variation of the toolkit if you write one). It is not easy to tell whether you have implemented the model exactly correctly, since training sets, parameters, etc. often have a random aspect. However, it is easy to see when your results are inconsistent with reasonable results, and points will be based on how well your results agree with intuition. (A rough sketch of the weight-update loop appears after this list.)
  2. (5%) Create 2 ARFF files. Each should have 8 instances using 2 real-valued inputs (ranging between -1 and 1), with 4 instances from each of two classes. One dataset should be linearly separable and the other not. Include these two ARFF files as appendices in your report. (An example of the ARFF layout appears after this list.)
  3. (5%) For each set, train on the entire set with the Perceptron Rule. Models typically stop training when they are no longer making significant progress; most commonly, when a number of epochs (e.g. 5) have passed with no significant improvement in accuracy. You should not merely look for the first epoch with no improvement, nor should you rely solely on a fixed maximum number of training epochs. Note that the weights/accuracy do not usually change monotonically. Describe your specific stopping criteria (one possible criterion is sketched after this list).
  4. (5%) Also for each set, try different learning rates and report, in tabular format, the effect of each learning rate in terms of the final accuracy and how many epochs are completed before stopping. For these cases the learning rate should have minimal effect, unlike in the Backpropagation lab.
  5. (10%) Create a graph for each of the two datasets you created above. Plot the points for each dataset with different labels to represent the different classes. Using the final weights from training on each dataset (use a learning rate of 0.1 for this and all subsequent requirements), derive the equation of the learned decision boundary represented by the weights. Show your work (a worked example of this derivation appears after this list). Graph this line and indicate which label is predicted on each side of the decision boundary. For all graphs, always label the axes!
  6. (20%) Use the perceptron rule to learn this version of the voting task. This particular task is an edited version of the standard voting set, in which all the “don’t know” values have been replaced with the most common value for that attribute. Randomly split the data into a 70% training set and a 30% test set (the toolkit has a command-line option for this). Try it five times with different random 70/30 splits (a sketch of this experiment loop appears after this list). Create a table which, for each split, reports the final training and test set accuracy and the number of epochs required; also report the averages of these values over the 5 trials in the table. Update weights after every instance and shuffle the data order after each epoch. By looking at the weights, explain what the model has learned and how the individual input features affect the result. Which specific features are most critical for the voting task, and which are least critical? Do one graph of the average misclassification rate vs. epochs (0th through final epoch), showing the average misclassification rate for the training set. Note that later epochs will only be averaging over those runs that trained for that long. Our helps page includes some help for doing graphs; to clarify what the specific graphs should look like, examples for this and future projects can be found here. As a rough sanity check, typical Perceptron accuracies for the voting data set are 90%-98%.
  7. (15%) Do your own experiment with either the perceptron or delta rule. Include in your discussion what you learned from the experiment. Have fun and be creative! For this lab and all future labs, make sure you do something more than just try out the model on different data sets. One option for fulfilling this requirement is the following:

    Use the perceptron rule to learn the iris task or some other task with more than two possible output values. Note that the iris data set has 3 output classes, and a perceptron node only has two possible outputs. Two common ways to deal with this are:

    1. Create 1 perceptron for each output class. Each perceptron has its own training set which considers its class positive and all other classes to be negative examples. Run all three perceptrons on novel data and set the class to the label of the perceptron which outputs high. If there is a tie, choose the perceptron with the highest net value.
    2. Create 1 perceptron for each pair of output classes, where the training set only contains examples from the 2 classes. Run all perceptrons on novel data and set the class to the label with the most wins (votes) from the perceptrons. In case of a tie, use the net values to decide.

    You could implement either of these. For either approach you can train the models independently or simultaneously. For testing, you simply execute the novel instance on each model and combine the results to see which output class wins (a one-vs-rest prediction sketch appears below).
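
The following is a minimal sketch, in Python/NumPy, of the perceptron-rule training loop described in requirements 1, 3, and 6. It is not the toolkit's API; the function name, the parameters (e.g. patience, tol), and this particular notion of "significant improvement" are illustrative assumptions, so adapt them to your own implementation and describe your actual stopping criteria in the report.

    import numpy as np

    def train_perceptron(X, y, lr=0.1, patience=5, max_epochs=1000, tol=1e-3, rng=None):
        # Hypothetical sketch: X is a (n_instances, n_features) array, y holds 0/1 targets.
        rng = rng if rng is not None else np.random.default_rng()
        X = np.hstack([X, np.ones((len(X), 1))])    # constant 1 input for the bias weight
        w = np.zeros(X.shape[1])
        best_acc, stalled, epochs = 0.0, 0, 0
        for epoch in range(max_epochs):
            for i in rng.permutation(len(X)):       # shuffle instance order each epoch
                z = 1 if X[i] @ w > 0 else 0        # threshold the net value
                w += lr * (y[i] - z) * X[i]         # perceptron rule: delta_w = c (t - z) x
            acc = np.mean(((X @ w) > 0).astype(int) == y)
            epochs = epoch + 1
            if acc > best_acc + tol:                # "significant" improvement resets the counter
                best_acc, stalled = acc, 0
            else:
                stalled += 1
            if stalled >= patience:                 # e.g. 5 epochs with no significant improvement
                break
        return w, epochs, best_acc

Note that the update happens after every instance and the order is reshuffled each epoch; trying different values of lr with a loop over this function is one simple way to build the learning-rate table for requirement 4.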
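
For requirement 2, each ARFF file is just a small text file in the toolkit's standard format. The values below are purely illustrative (a hypothetical linearly separable set); your own 8 instances will differ:

    % hypothetical linearly separable example (values are illustrative only)
    @RELATION linsep
    @ATTRIBUTE x1 REAL
    @ATTRIBUTE x2 REAL
    @ATTRIBUTE class {0,1}
    @DATA
    -0.8,0.6,0
    -0.5,0.9,0
    -0.3,0.4,0
    -0.7,0.2,0
    0.4,-0.3,1
    0.6,-0.8,1
    0.8,-0.1,1
    0.3,-0.6,1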
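
For requirement 5, the decision boundary is the set of points where the net value equals the threshold (zero). With inputs x1 and x2, learned weights w1 and w2, and a bias weight w0 on a constant 1 input, the algebra looks like the following; the numeric weights at the end are made-up values used only to illustrate the steps:

    net = w1*x1 + w2*x2 + w0
    boundary:  w1*x1 + w2*x2 + w0 = 0
           =>  x2 = -(w1/w2)*x1 - (w0/w2)           (assuming w2 != 0)

    example with hypothetical weights w1 = 0.8, w2 = -0.5, w0 = 0.1:
        0.8*x1 - 0.5*x2 + 0.1 = 0   =>   x2 = 1.6*x1 + 0.2
    instances with net > 0 fall on one side of this line and are predicted as one class;
    instances with net <= 0 fall on the other side and are predicted as the other class.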
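
For requirement 6, the toolkit's command-line option handles the 70/30 split for you, but if you script the five trials yourself, the loop might look roughly like this. It reuses the hypothetical train_perceptron above and assumes X and y are NumPy arrays already loaded from the voting ARFF:

    def random_split(X, y, train_frac=0.7, rng=None):
        # hypothetical helper: shuffle the instances, then split into train/test sets
        rng = rng if rng is not None else np.random.default_rng()
        idx = rng.permutation(len(X))
        cut = int(train_frac * len(X))
        return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]

    for trial in range(5):                          # five different random 70/30 splits
        Xtr, ytr, Xte, yte = random_split(X, y)
        w, epochs, train_acc = train_perceptron(Xtr, ytr, lr=0.1)
        Xte_b = np.hstack([Xte, np.ones((len(Xte), 1))])   # same bias input used in training
        test_acc = np.mean(((Xte_b @ w) > 0).astype(int) == yte)
        print(trial, train_acc, test_acc, epochs)   # collect these rows into your results table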
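
For the iris option in requirement 7, the first approach (one perceptron per output class) might be combined at prediction time roughly as follows; models is assumed to be a list of (class_label, weight_vector) pairs obtained by training each perceptron on its own positive-vs-rest data:

    def one_vs_rest_predict(models, x):
        # pick the class whose perceptron fires; break ties with the largest net value
        x = np.append(x, 1.0)                       # constant bias input, matching training
        nets = [(x @ w, label) for label, w in models]
        firing = [pair for pair in nets if pair[0] > 0]
        candidates = firing if firing else nets     # if none fires, fall back to all net values
        return max(candidates)[1]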

Note: To help you debug this and other projects, we have included some small examples and other hints with actual learned hypotheses so that you can compare your results and ensure your code is working properly. You may also discuss and compare results with classmates.

Deliverables (zipped as a single file):

Acknowledgments

Thanks to Dr. Tony Martinez for help in designing the projects and requirements for this course.