Glossary

Short definitions for the terms that show up across the chapters.

ATE

Average treatment effect: the average, over a population, of the difference between each unit's outcome under treatment and its outcome under control.
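
In potential-outcomes notation the definition is a single expectation. The symbols are an assumed convention, not taken from the chapters: Y(1) and Y(0) are a unit's outcomes under treatment and under control, and the expectation runs over the target population.

```latex
\mathrm{ATE} = \mathbb{E}\left[\, Y(1) - Y(0) \,\right]
```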

AIPW

Augmented inverse probability weighting: a doubly robust estimator that combines an outcome model with a propensity model.
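
A minimal sketch of the AIPW estimate of the ATE, assuming a binary treatment and that the outcome-model predictions and propensity scores have already been fitted elsewhere. The function name and the argument names (`mu1`, `mu0`, `e`) are hypothetical.

```python
def aipw_ate(y, t, mu1, mu0, e):
    """AIPW estimate of the ATE from already-fitted nuisance values.

    y   : observed outcomes
    t   : treatment indicators (1 = treated, 0 = control)
    mu1 : outcome-model predictions under treatment
    mu0 : outcome-model predictions under control
    e   : propensity scores, each strictly between 0 and 1
    """
    scores = []
    for yi, ti, m1, m0, ei in zip(y, t, mu1, mu0, e):
        # Inverse-probability-weighted residual that corrects the outcome model.
        correction = ti * (yi - m1) / ei - (1 - ti) * (yi - m0) / (1 - ei)
        # Outcome-model effect prediction plus the correction.
        scores.append(m1 - m0 + correction)
    return sum(scores) / len(scores)
```

When the outcome model fits perfectly, the correction terms vanish and the estimate is the average model-predicted effect. When the outcome model predicts zero everywhere, the estimate reduces to plain inverse probability weighting.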

Bandit

A repeated decision problem where the learner balances exploring options with exploiting the best-known option.

Collider

A variable caused by two or more other variables; conditioning on it can create a spurious association between its causes.

Confounder

A variable that affects both the treatment and the outcome, making naive comparisons biased.

Counterfactual

The outcome that would have happened under a different action or treatment.

CATE

Conditional average treatment effect: the average treatment effect among units that share particular covariate values or contexts.

CUPED

A variance-reduction method that uses pre-experiment measurements to make randomized experiments more precise.
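
A minimal sketch of the usual CUPED adjustment, assuming one pre-experiment covariate `x` per unit and an outcome `y`. The function name is hypothetical. In practice the coefficient is typically estimated on data pooled across arms so that the adjustment does not depend on assignment.

```python
def cuped_adjust(y, x):
    """Return CUPED-adjusted outcomes: y - theta * (x - mean(x))."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # theta = cov(y, x) / var(x); the 1/n factors cancel in the ratio.
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    theta = cov_xy / var_x
    # Centering x keeps the mean of the adjusted outcome equal to mean(y).
    return [yi - theta * (xi - mean_x) for xi, yi in zip(x, y)]
```

The adjusted outcome has the same mean as the original but lower variance whenever the pre-experiment covariate is correlated with the outcome.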

Difference-in-differences

A design that compares outcome changes over time between treated and comparison groups, relying on a parallel-trends assumption.
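
In the simplest two-group, two-period case the estimate is a difference of two before-and-after changes. The notation is an assumed convention: each \bar{Y} is a group's mean outcome in a period.

```latex
\hat{\tau}_{\mathrm{DiD}} =
  \left( \bar{Y}_{\mathrm{treated,\,post}} - \bar{Y}_{\mathrm{treated,\,pre}} \right)
  - \left( \bar{Y}_{\mathrm{comparison,\,post}} - \bar{Y}_{\mathrm{comparison,\,pre}} \right)
```

As an illustration with made-up numbers: if the treated group's mean moves from 10 to 14 and the comparison group's mean moves from 9 to 11, the estimate is (14 - 10) - (11 - 9) = 2.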

DAG

Directed acyclic graph: a diagram of causal assumptions using arrows and no cycles.

Doubly robust

An approach that combines an outcome model with a treatment or behavior-policy model; under the right assumptions, correctly specifying either one of the two models can be enough for consistency.

E-value

A sensitivity metric: how strong unmeasured confounding would need to be to explain away an observed association.
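
A minimal sketch of the E-value for a point estimate, assuming the association is expressed as a risk ratio. The function name is hypothetical. The formula is RR + sqrt(RR * (RR - 1)), applied after inverting estimates below 1.

```python
import math

def e_value(risk_ratio):
    """E-value for a point estimate on the risk-ratio scale."""
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    # Protective estimates (RR < 1) are inverted first so the formula applies.
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1))
```

For a risk ratio of 2, this gives 2 + sqrt(2), roughly 3.41. An unmeasured confounder would need risk-ratio associations of about that size with both treatment and outcome to fully explain away the estimate.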

Estimand

The precise causal quantity being targeted, including treatment, control, population, outcome, time horizon, and effect definition.

Guardrail metric

A metric that protects against harm or unacceptable regressions while optimizing a primary outcome.

Intent-to-treat

Analyzing units by their assigned condition, whether or not they complied with it. This preserves the comparability that randomization provides.

Instrumental variable

A variable that shifts treatment but affects the outcome only through that treatment, enabling causal estimates under strong exclusion and relevance assumptions.

LATE

Local average treatment effect: the causal effect for compliers whose treatment changes because of an instrument or encouragement.
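
With a binary instrument, a common way to estimate the LATE is the Wald ratio: the instrument's effect on the outcome divided by its effect on treatment uptake. The notation is an assumed convention: Z is the instrument or encouragement, D is the treatment actually received, and Y is the outcome. The ratio carries the complier interpretation only under the usual instrument assumptions, including monotonicity.

```latex
\widehat{\mathrm{LATE}} =
  \frac{\bar{Y}_{Z=1} - \bar{Y}_{Z=0}}{\bar{D}_{Z=1} - \bar{D}_{Z=0}}
```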

MDP

Markov decision process: the standard model for sequential decisions with states, actions, transitions, rewards, and policies.

Posterior

The updated distribution of beliefs after combining prior information with observed data.
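
One of the simplest concrete cases is the conjugate Beta-Binomial update, sketched here under the assumption of a Beta prior on a success rate and independent binary observations. The function names are hypothetical.

```python
def beta_binomial_update(prior_alpha, prior_beta, successes, failures):
    """Posterior Beta parameters after observing binary outcomes."""
    # Conjugacy: successes add to alpha, failures add to beta.
    return prior_alpha + successes, prior_beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)
```

Starting from a uniform Beta(1, 1) prior and observing 7 successes and 3 failures gives a Beta(8, 4) posterior with mean 2/3.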

Posterior predictive check

A model check that simulates replicated data from the fitted model and compares those simulations with important features of the observed data.

Positivity

The requirement that every unit type relevant to the analysis has a nonzero chance of receiving each treatment condition being compared.

Propensity score

The probability of receiving treatment given observed covariates.

Randomization

Assigning treatment by a chance mechanism so treatment groups are comparable in expectation.

Reward

The feedback signal a reinforcement learning (RL) agent receives after acting; the agent tries to maximize cumulative reward over time.

Sample ratio mismatch

A diagnostic failure where the observed allocation across experiment arms differs from the planned allocation, often signaling assignment, eligibility, or logging problems.
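
A minimal sketch of a common check for this mismatch, assuming a two-arm experiment and a chi-square goodness-of-fit test against the planned split. The function name and the `planned_treatment_share` parameter are hypothetical.

```python
import math

def srm_p_value(n_control, n_treatment, planned_treatment_share=0.5):
    """Chi-square goodness-of-fit p-value for a two-arm allocation."""
    total = n_control + n_treatment
    expected_treatment = total * planned_treatment_share
    expected_control = total - expected_treatment
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_treatment - expected_treatment) ** 2 / expected_treatment)
    # With one degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))
```

A very small p-value suggests the observed split is unlikely under the planned allocation. Experimenters often use a strict threshold here, since a flagged mismatch calls the whole experiment's results into question.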

SRM

Sample ratio mismatch: an experiment diagnostic where observed arm counts do not match planned allocation ratios.

SUTVA

Stable Unit Treatment Value Assumption: each unit's potential outcome depends only on its own treatment, with no hidden versions of treatment or spillovers across units.

Off-policy evaluation

Estimating how a target policy would perform using data generated by a different behavior policy.
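
A minimal sketch of one basic off-policy estimator, inverse propensity scoring, assuming each logged interaction records the reward along with the probability that the behavior policy and the target policy each assign to the logged action. The function name and the log layout are hypothetical.

```python
def ips_value(logs):
    """Inverse-propensity-scored estimate of a target policy's value.

    logs: list of (reward, behavior_prob, target_prob) tuples, where both
    probabilities refer to the action that was actually logged and
    behavior_prob is strictly positive.
    """
    # Reweight each reward by how much more (or less) often the target
    # policy would have taken the logged action than the behavior policy did.
    weighted = [r * (p_target / p_behavior) for r, p_behavior, p_target in logs]
    return sum(weighted) / len(weighted)
```

The estimate is unbiased only when the behavior policy gives nonzero probability to every action the target policy can take, and it can have high variance when the two policies differ sharply.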

On-policy evaluation

Estimating a policy's performance from data collected while that same policy is being run.

Policy evaluation

Estimating the expected value, reward, or outcome of following a policy before deciding whether to keep, change, or deploy it.

Regression discontinuity

A design that estimates a local causal effect around a cutoff where treatment assignment changes discontinuously.

Washout

A buffer period between conditions in a crossover experiment to reduce carryover effects.
