# Exam Study Guide

The tests are closed book, but you may bring one single-sided half page of notes (8.5x11 cut in half). The spirit of this is a note page for equations or other items that are harder to memorize. It is not meant for cramming all the slides, book chapters, or course knowledge onto a single sheet; you should know most of that without needing a sheet. Your note page will be handed in with the test. You should be prepared to answer questions from the topic lists below. You may (and should) also bring a **non-programmable** calculator.

## Midterm

- Perceptron
- Delta Rule
- Linear separability and linear models with non-linear feature preprocessing – specifically the Quadric machine
- Linear regression
- Logistic regression
- Inductive Bias, need for Bias, No free lunch
- Overfitting – what causes it and how to prevent it
- Predicting future accuracy (N-fold CV, etc.)
- MLP with Backpropagation, learning, parameter selection, etc.
- Features: Approaches for selection, representation, skew, normalization and reduction
- Handling missing/unknown data
- Wrapper algorithms
- PCA
- Decision Trees, ID3
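As a quick review of the first topic above, here is a minimal sketch of perceptron learning (an illustrative example, not course code; the function names and the learning rate are assumptions):

```python
def train_perceptron(data, epochs=20, lr=0.1):
    """Train a perceptron on (features, target) pairs with targets in {0, 1}."""
    n = len(data[0][0])
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        for x, target in data:
            # Threshold activation: output 1 if the net input exceeds 0
            net = sum(w * xi for w, xi in zip(weights, x)) + bias
            output = 1 if net > 0 else 0
            # Perceptron rule: w_i += lr * (t - o) * x_i
            error = target - output
            for i in range(n):
                weights[i] += lr * error * x[i]
            bias += lr * error
    return weights, bias

def predict(weights, bias, x):
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

# AND is linearly separable, so the perceptron convergence theorem applies
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w, b = train_perceptron(and_data)
print([predict(w, b, x) for x, _ in and_data])  # → [0, 0, 0, 1]
```

Note that the same code would never converge on XOR, which motivates the non-linear feature preprocessing and MLP topics in the list.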

## Final

The final is **comprehensive**, with heavy emphasis on topics covered since the midterm.

- Data Mining Process Model/Cycle (just high level)
- K-Nearest Neighbor algorithm (including distance weighted, regression, reduction techniques, strengths and weaknesses)
- RBF networks
- Clustering approaches (K-means, HAC)
- Bayesian learning (Bayes rule, MAP and ML hypotheses, Bayes optimal classifier, Naïve Bayes)
- Reinforcement Learning (especially Q-learning)
- Ensembles (Bagging, Boosting, Stacking, overall pros and cons)
- Genetic Algorithms (Basic algorithm, data representation, genetic operators, and parameter variations)
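For review of the k-nearest neighbor topic above, a minimal unweighted k-NN classifier (an illustrative sketch; the data and k value are made up for the example):

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote of its k nearest training points."""
    # Euclidean distance from the query to every training instance
    dists = sorted((math.dist(x, query), label) for x, label in train)
    # Majority vote among the k closest neighbors
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train = [
    ([1.0, 1.0], "a"), ([1.2, 0.8], "a"), ([0.9, 1.1], "a"),
    ([5.0, 5.0], "b"), ([5.2, 4.8], "b"), ([4.9, 5.1], "b"),
]
print(knn_classify(train, [1.1, 0.9]))  # → a
print(knn_classify(train, [5.1, 5.0]))  # → b
```

The distance-weighted and regression variants in the topic list replace the majority vote with a weighted vote or a (weighted) average of neighbor targets.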

## Acknowledgments

Thanks to Dr. Tony Martinez for help in designing the projects and requirements for this course.