# Exam Study Guide

The tests are closed book, but you may bring one single-sided half page of notes (8.5x11 cut in half). The spirit of this is a note page for equations or other items that are harder to memorize. It is not meant for cramming all the slides, book chapters, or course knowledge onto a single sheet; you should know most of that without needing a sheet. Your note page will be handed in with the test. You should be prepared to answer questions from the topic lists below. You may (and should) also bring a **non-programmable** calculator.

## Midterm

- Perceptron
- Delta Rule
- Linear separability and linear models with non-linear feature preprocessing – specifically the Quadric machine
- Linear regression
- Logistic regression
- Inductive Bias, need for Bias, No free lunch
- Overfitting – what causes it and how to prevent it
- Predicting future accuracy (N-fold CV, etc.)
- MLP with Backpropagation, learning, parameter selection, etc.
- Features: Approaches for selection, representation, skew, normalization and reduction
- Handling missing/unknown data
- Wrapper algorithms
- PCA
- Decision Trees, ID3
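As a quick review of the first topic above, here is a minimal sketch of perceptron learning (an illustrative example, not course code; the function names and the learning rate are assumptions):

```python
def train_perceptron(data, epochs=20, lr=0.1):
    """Train a perceptron on (features, target) pairs with targets in {0, 1}."""
    n = len(data[0][0])
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        for x, target in data:
            # Threshold activation: output 1 if the net input exceeds 0
            net = sum(w * xi for w, xi in zip(weights, x)) + bias
            output = 1 if net > 0 else 0
            # Perceptron rule: w_i += lr * (t - o) * x_i
            error = target - output
            for i in range(n):
                weights[i] += lr * error * x[i]
            bias += lr * error
    return weights, bias

def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# AND is linearly separable, so the perceptron convergence theorem applies
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(and_data)
print([predict(w, b, x) for x, _ in and_data])  # → [0, 0, 0, 1]
```

Note that the same code would never converge on XOR, which motivates the non-linear feature preprocessing and MLP topics in the list.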

## Final

The final is **comprehensive**, with heavy emphasis on topics covered since the midterm.

- Data Mining Process Model/Cycle (just high level)
- K-Nearest Neighbor algorithm (including distance weighted, regression, reduction techniques, strengths and weaknesses)
- RBF networks
- Clustering approaches (K-means, HAC)
- Bayesian learning (Bayes rule, MAP and ML hypotheses, Bayes optimal classifier, Naïve Bayes)
- Reinforcement Learning (especially Q-learning)
- Ensembles (Bagging, Boosting, Stacking, overall pros and cons)
- Genetic Algorithms (Basic algorithm, data representation, genetic operators, and parameter variations)
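For review of the k-nearest neighbor topic above, a minimal unweighted k-NN classifier (an illustrative sketch; the data and k value are made up for the example):

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote of its k nearest training points."""
    # Euclidean distance from the query to every training instance
    dists = sorted((math.dist(x, query), label) for x, label in train)
    # Majority vote among the k closest neighbors
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train = [
    ([1.0, 1.0], "a"), ([1.2, 0.8], "a"), ([0.9, 1.1], "a"),
    ([5.0, 5.0], "b"), ([5.2, 4.8], "b"), ([4.9, 5.1], "b"),
]
print(knn_classify(train, [1.1, 0.9]))  # → a
print(knn_classify(train, [5.1, 5.0]))  # → b
```

The distance-weighted and regression variants in the topic list replace the majority vote with a weighted vote or a (weighted) average of neighbor targets.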

## Acknowledgments

Thanks to Dr. Tony Martinez for help in designing the projects and requirements for this course.