15.077 Statistical Learning and Data Mining
Advanced introduction to the theory and application of statistics, data mining, and machine learning, concentrating on techniques used in management science, marketing, finance, consulting, engineering systems, and bioinformatics. First half builds the statistical foundation for the second half, with topics selected from sampling (including the bootstrap), theory of estimation, testing, nonparametric statistics, analysis of variance, categorical data analysis, regression analysis, MCMC, EM, Gibbs sampling, and Bayesian methods. Second half focuses on data mining, supervised learning, and multivariate analysis. Topics selected from logistic regression; principal components and dimension reduction; discrimination and classification analysis, including trees (CART), partial least squares, nearest neighbors, regularized methods, support vector machines, boosting and bagging, clustering, independent component analysis, and nonparametric regression. Uses statistical software packages, such as R and MATLAB, for data analysis and data mining. Includes a term project.
15.077 will not be offered this semester. The instructor is R. E. Welsch.
Lectures are held from 4:00 PM to 5:30 PM on Mondays and Wednesdays in E51-315.
This class counts for a total of 12 credits.
© Copyright 2015 Yasyf Mohamedali