Ante Hoc Explainability

Ante hoc methods are transparent by design: instead of fitting a black box and explaining it afterwards, we choose a model class whose reasoning can be read off directly.

This is often the cleanest option when interpretability is a hard requirement.

Decision trees are among the most intuitive explainable models.

  • each internal node asks a question about one feature,
  • each branch corresponds to an answer,
  • each root-to-leaf path forms a decision rule.

That makes trees especially attractive when we want explanations that non-specialists can inspect.

The lecture also highlights practical advantages: trees handle mixed feature types well, require little preprocessing, and place the most informative splits near the top.

A tree explanation can often be written as an explicit rule such as:

    if duration > 24 months and savings = low:
        predict high risk
    else:
        predict lower risk

This kind of rule-based explanation is much closer to natural reasoning than a long vector of hidden parameters.
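Rules like the one above can be read directly off a fitted tree. The sketch below uses scikit-learn's `export_text` on synthetic data; the feature names (`duration`, `savings`) and the labelling rule are illustrative stand-ins, not the lecture's dataset.

```python
# Sketch: extracting human-readable rules from a fitted decision tree.
# The data, feature names, and labelling rule are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
# Two toy features: loan duration in months, and savings level (0 = low, 1 = high).
X = np.column_stack([rng.integers(6, 60, 200), rng.integers(0, 2, 200)])
# Synthetic labels that follow the rule from the text:
# long duration AND low savings -> high risk (1).
y = ((X[:, 0] > 24) & (X[:, 1] == 0)).astype(int)

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
# Print the root-to-leaf paths as if/else rules.
print(export_text(tree, feature_names=["duration", "savings"]))
```

A depth-2 tree suffices here because the target is a conjunction of two single-feature conditions, which is exactly the kind of structure a tree expresses as one root-to-leaf path.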

Generalized linear models (GLMs) remain interpretable because their coefficients have a direct meaning in the chosen link space. In logistic regression, for example,

\[ \log\frac{p(x)}{1-p(x)} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p. \]

Each coefficient tells us how the log-odds change when a predictor increases by one unit, holding the others fixed. If \(\beta_j = 0.5\), then the odds are multiplied by \(e^{0.5} \approx 1.65\).
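This coefficient-to-odds reading can be checked numerically. The sketch below simulates data from a logistic model with a true \(\beta_1 = 0.5\) (an assumed value matching the example above), fits an essentially unpenalized logistic regression, and recovers the coefficient and its odds ratio.

```python
# Sketch: recovering a logistic-regression coefficient and its odds ratio.
# The true beta_1 = 0.5 is an assumption chosen to match the text's example.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
x = rng.normal(size=(5000, 1))
# Simulate labels from the logistic model: log-odds = 0.5 * x.
p = 1 / (1 + np.exp(-0.5 * x[:, 0]))
y = rng.binomial(1, p)

# Large C makes the L2 penalty negligible, approximating plain maximum likelihood.
model = LogisticRegression(C=1e9).fit(x, y)
beta_1 = model.coef_[0, 0]
print(f"estimated beta_1 = {beta_1:.2f}, odds ratio = {np.exp(beta_1):.2f}")
```

The estimated odds ratio lands near \(e^{0.5} \approx 1.65\), matching the hand calculation above.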

This is why GLMs are still widely used in domains where interpretation matters as much as prediction.

Interpretability is not the same as correctness. Even simple models can be misleading when important variables are omitted or when relationships reverse after aggregation.

Simpson's paradox is a classic warning:

  • a trend can appear in the pooled data,
  • the opposite trend can appear inside each subgroup,
  • and the difference is caused by a confounding variable.

A classic toy example is treatment comparison:

  • treatment A looks better than treatment B inside each subgroup,
  • but treatment B looks better after pooling everyone together,
  • because the subgroup sizes were imbalanced.
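The reversal in the list above can be shown with a few lines of arithmetic. The counts below (successes, patients) follow the often-cited kidney-stone treatment example; they are illustrative of the pattern, not part of the lecture's data.

```python
# Sketch of Simpson's paradox with illustrative (success, total) counts
# in the style of the classic kidney-stone treatment comparison.
groups = {
    "small stones": {"A": (81, 87),   "B": (234, 270)},
    "large stones": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, total):
    return successes / total

# Within each subgroup, treatment A has the higher success rate.
for name, g in groups.items():
    print(name, f"A: {rate(*g['A']):.0%}", f"B: {rate(*g['B']):.0%}")

# Pool the subgroups: the imbalanced subgroup sizes flip the comparison.
pooled = {t: tuple(sum(g[t][i] for g in groups.values()) for i in (0, 1))
          for t in ("A", "B")}
print("pooled", f"A: {rate(*pooled['A']):.0%}", f"B: {rate(*pooled['B']):.0%}")
```

Here A wins inside both subgroups (93% vs 87%, and 73% vs 69%), yet B wins after pooling (83% vs 78%), because A was given mostly to the harder subgroup.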

So even ante hoc models need domain knowledge and careful feature design.

Interpretable-by-design models are especially strong when:

  • regulations demand simple explanations,
  • the dataset is moderate in size,
  • domain experts need rule-level visibility,
  • or we care as much about reasoning quality as predictive power.

Their main limitation is flexibility: the most interpretable model is not always the most accurate one.

In this lesson we covered:

  1. Why ante hoc models are explainable by construction
  2. How decision trees produce human-readable rules
  3. Why GLM coefficients remain interpretable
  4. Why confounding and Simpson's paradox still require caution

Next: We turn to global post-hoc methods for understanding complex models after they are trained.