Ante Hoc Explainability

Ante hoc methods are transparent by design: instead of fitting a black box and explaining it afterwards, we choose a model class whose reasoning can be read off directly.

This is often the cleanest option when interpretability is a hard requirement.

Decision trees are among the most intuitive explainable models.

  • each internal node asks a question about one feature,
  • each branch corresponds to an answer,
  • each root-to-leaf path forms a decision rule.

That makes trees especially attractive when we want explanations that non-specialists can inspect.

The lecture also highlights practical advantages: trees handle mixed feature types well, require little preprocessing, and place the most informative splits near the top.

A tree explanation can often be written as an explicit rule such as:

    if duration > 24 months and savings = low:
        predict high risk
    else:
        predict lower risk

This kind of rule-based explanation is much closer to natural reasoning than a long vector of hidden parameters.
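Rules like the one above can be read directly off a fitted tree. The sketch below uses scikit-learn's `export_text` on synthetic data; the feature names (`duration`, `savings`) and the labelling rule are illustrative stand-ins, not the lecture's dataset.

```python
# Sketch: extracting human-readable rules from a fitted decision tree.
# The data, feature names, and labelling rule are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
# Two toy features: loan duration in months, and savings level (0 = low, 1 = high).
X = np.column_stack([rng.integers(6, 60, 200), rng.integers(0, 2, 200)])
# Synthetic labels that follow the rule from the text:
# long duration AND low savings -> high risk (1).
y = ((X[:, 0] > 24) & (X[:, 1] == 0)).astype(int)

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
# Print the root-to-leaf paths as if/else rules.
print(export_text(tree, feature_names=["duration", "savings"]))
```

A depth-2 tree suffices here because the target is a conjunction of two single-feature conditions, which is exactly the kind of structure a tree expresses as one root-to-leaf path.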

Generalized linear models (GLMs) remain interpretable because their coefficients have a direct meaning in the chosen link space. In logistic regression, for example,

\[ \log\frac{p(x)}{1-p(x)} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p. \]

Each coefficient tells us how the log-odds change when a predictor increases by one unit, holding the others fixed. If \(\beta_j = 0.5\), then the odds are multiplied by \(e^{0.5} \approx 1.65\).
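This coefficient-to-odds reading can be checked numerically. The sketch below simulates data from a logistic model with a true \(\beta_1 = 0.5\) (an assumed value matching the example above), fits an essentially unpenalized logistic regression, and recovers the coefficient and its odds ratio.

```python
# Sketch: recovering a logistic-regression coefficient and its odds ratio.
# The true beta_1 = 0.5 is an assumption chosen to match the text's example.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
x = rng.normal(size=(5000, 1))
# Simulate labels from the logistic model: log-odds = 0.5 * x.
p = 1 / (1 + np.exp(-0.5 * x[:, 0]))
y = rng.binomial(1, p)

# Large C makes the L2 penalty negligible, approximating plain maximum likelihood.
model = LogisticRegression(C=1e9).fit(x, y)
beta_1 = model.coef_[0, 0]
print(f"estimated beta_1 = {beta_1:.2f}, odds ratio = {np.exp(beta_1):.2f}")
```

The estimated odds ratio lands near \(e^{0.5} \approx 1.65\), matching the hand calculation above.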

This is why GLMs are still widely used in domains where interpretation matters as much as prediction.

Interpretability is not the same as correctness. Even simple models can be misleading when important variables are omitted or when relationships reverse after aggregation.

Simpson's paradox is a classic warning:

  • a trend can appear in the pooled data,
  • the opposite trend can appear inside each subgroup,
  • and the difference is caused by a confounding variable.

A classic toy example is treatment comparison:

  • treatment A looks better than treatment B inside each subgroup,
  • but treatment B looks better after pooling everyone together,
  • because the subgroup sizes were imbalanced.
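The reversal in the list above can be shown with a few lines of arithmetic. The counts below (successes, patients) follow the often-cited kidney-stone treatment example; they are illustrative of the pattern, not part of the lecture's data.

```python
# Sketch of Simpson's paradox with illustrative (success, total) counts
# in the style of the classic kidney-stone treatment comparison.
groups = {
    "small stones": {"A": (81, 87),   "B": (234, 270)},
    "large stones": {"A": (192, 263), "B": (55, 80)},
}

def rate(successes, total):
    return successes / total

# Within each subgroup, treatment A has the higher success rate.
for name, g in groups.items():
    print(name, f"A: {rate(*g['A']):.0%}", f"B: {rate(*g['B']):.0%}")

# Pool the subgroups: the imbalanced subgroup sizes flip the comparison.
pooled = {t: tuple(sum(g[t][i] for g in groups.values()) for i in (0, 1))
          for t in ("A", "B")}
print("pooled", f"A: {rate(*pooled['A']):.0%}", f"B: {rate(*pooled['B']):.0%}")
```

Here A wins inside both subgroups (93% vs 87%, and 73% vs 69%), yet B wins after pooling (83% vs 78%), because A was given mostly to the harder subgroup.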

So even ante hoc models need domain knowledge and careful feature design.

Interpretable-by-design models are especially strong when:

  • regulations demand simple explanations,
  • the dataset is moderate in size,
  • domain experts need rule-level visibility,
  • or we care as much about reasoning quality as predictive power.

Their main limitation is flexibility: the most interpretable model is not always the most accurate one.

In this lesson we covered:

  1. Why ante hoc models are explainable by construction
  2. How decision trees produce human-readable rules
  3. Why GLM coefficients remain interpretable
  4. Why confounding and Simpson's paradox still require caution

Next: We turn to global post-hoc methods for understanding complex models after they are trained.