# Answer Key: Onboarding Experiment Readout

Use this as a calibration guide, not a single correct answer.

## Core Readout

The treatment arm appears directionally better on 7-day activation in the toy data, and paid conversion moves in the same direction. Support tickets do not obviously worsen enough to dominate the decision, but the dataset is intentionally too small to support a production launch claim.

## What A Strong Answer Should Say

- Start with validity before lift: confirm randomization unit, exposure logging, missingness, and sample ratio before interpreting effects.
- Report the primary metric as an absolute difference, not only a percent lift.
- Treat support tickets as a guardrail, not as a secondary metric to celebrate if it happens to improve.
- Check acquisition-channel patterns, but call them exploratory unless segments were pre-specified.
- Recommend either a larger pre-registered rollout or a cautious ship only if operational risk is low and the effect is consistent with prior evidence.

## Common Mistakes

- Saying "the new onboarding worked" without mentioning uncertainty.
- Ignoring support tickets because activation improved.
- Over-interpreting channel-level differences from tiny cells.
- Treating the toy dataset as if it had passed SRM, power, and instrumentation checks.

## Instructor Notes

A good discussion question is: "What would you need to see before shipping this to all new workspaces?" Strong answers mention sample size, exposure logs, pre-specified guardrails, and a decision rule.