This is an applied course in the modeling and discovery of relationships among multiple variables. Topics include parametric and nonparametric regression and supervised learning techniques. Specific methods covered include linear models, logistic regression, additive models, the LASSO, kernel methods, support vector machines, neural networks, and classification and regression trees. The course also discusses the need for and implementation of dimension reduction techniques, including principal components. Emphasis is placed on model selection, residual analysis, diagnostics, and the detection of multicollinearity and nonstandard conditions. Examples are taken from financial models.
Required texts: Statistics and Data Analysis for Financial Engineering by David Ruppert (2011); An Introduction to Statistical Learning: with Applications in R by Gareth James et al.
Prerequisites: 46-921 Introduction to Probability; 46-923 Introduction to Statistical Inference.
Lecture: 100 min/wk; Recitation: 50 min/wk