Education
Course Highlight
Machine Learning
STATS 306B: Unsupervised learning
(topics)
Unsupervised vs supervised learning
Clustering methods
K-means clustering, K-medoid clustering; choosing the number of clusters. The gap statistic. Silhouette statistic. Prediction strength
Agglomerative hierarchical clustering. Application to DNA microarrays
Vector quantization, tree-structured VQ
Hybrid clustering
Gaussian mixtures; the EM algorithm. Model-based clustering
Unsupervised problem cast as a supervised problem
Self-organizing maps
Principal components; principal surfaces
Factor analysis. Independent components analysis
Multidimensional scaling, ISOMAP, locally linear embedding
STATS 315A: Supervised learning
(topics)
Gaussian discriminant analysis
Naive Bayes
Support vector machines
Model selection and feature selection
Least angle regression and the Lasso
SVM path algorithms
Cross-validation, bootstrap
Basis expansions and regularization
Fitting curves to data
Generalized additive models
STATS 315B: Tree-based learning methods and ensemble methods
(topics)
Classification & regression trees (CART)
Multivariate adaptive regression splines (MARS)
Bagging
Boosting and additive trees (MART)
Neural networks
Prototype & near-neighbor methods
STATS 315C: Learning from matrix-valued data
(topics)
Biplots and heatmaps
ANOVA models, Rasch models, correspondence analysis
Clustering, biclustering, spectral clustering
SVD, nonnegative matrix factorization, and generalizations
PageRank, TrustRank and generalizations
Prediction on graphs
Tensor methods for three-way data
Matrix resampling and downsampling
Random matrix theory and Tracy-Widom laws
Graph-based algorithms
CS 221: Artificial intelligence
CS 369M: Algorithms for modern massive data set analysis
(topics)
Randomized algorithms for matrix problems
Data analysis and machine learning uses of matrix computations
Algorithmic approaches to graph partitioning problems
Novel data-motivated matrix factorizations
Relationship to numerical, statistical, large-scale computational issues
Other coursework
EE 364A: Convex optimization
STATS 305: Intro to statistical modeling
STATS 306A: Discrete data modeling
(topics)
Discrete distributions: Bernoulli, Binomial, Poisson, Multinomial
Related continuous distributions: Beta, Dirichlet
Chi-square tests
Logistic regression
Log-linear models for contingency tables
Generalized linear models
Bradley-Terry and related models
Rasch and related models
Predicting ordered and unordered categorical values
STATS 324: Multivariate analysis
STATS 362: Monte Carlo sampling
STATS 352: Spatial statistics
Statistical theory (STATS 300A, B, C); Probability theory (STATS 310A, B, C)
