Machine Learning

This course moves from supervised prediction to unsupervised discovery. We begin with linear models, add tree-based methods, then study how to evaluate and tune models before finishing with clustering techniques for unlabeled data.

Linear modeling
  • Simple and multiple regression
  • Inference, diagnostics, regularization, and logistic regression
  • A running house-price example
Decision trees
  • Tree structure, feature-space partitioning, and leaf predictions
  • Gini, entropy, stopping rules, and pruning
  • Practical considerations before moving to ensembles
Model evaluation & feature selection
  • Train/validation/test strategy, bias-variance, and learning curves
  • Metrics, cross-validation, feature selection, and tuning
  • Robust workflows for imbalance, thresholds, and leakage prevention
Clustering techniques
  • Hierarchical clustering, K-means, and K-medoids
  • Gaussian mixtures, DBSCAN, and OPTICS
  • Choosing the number of clusters and assessing cluster quality