# Worked Example: Onboarding Checklist Experiment

## Decision

Ship, revise, or abandon a guided onboarding checklist for new self-serve workspaces.

## Causal Question

Among new self-serve workspace admins, what is the intent-to-treat effect of showing a guided setup checklist versus the existing blank workspace on activation within 7 days?

## Estimand

- Primary estimand: ITT
- Unit of analysis: workspace
- Randomization unit: workspace
- Exposure definition: assignment to checklist experience at first workspace creation
- Analysis population: eligible new self-serve workspaces, excluding internal and fraud-blocked accounts

## Hypothesis

The checklist will increase activation because it reduces ambiguity about the first three setup actions: inviting a teammate, connecting a data source, and completing the first project.

## Metrics

- Primary: activation within 7 days
- Guardrail: support tickets per workspace in first 7 days
- Guardrail: workspace invite rate
- Guardrail: trial-to-paid conversion within 30 days
- Secondary: completion of each checklist step
- Exploratory: effect by acquisition channel and team size

## Design

- Assignment: 50/50 random assignment at workspace creation
- Blocking: acquisition channel
- Runtime: 14 days minimum, then continue until planned sample size is reached
- Minimum detectable effect: 1.5 percentage point absolute lift in activation
- Power: 80% at alpha 0.05 using historical baseline activation of 28%
- Sequential rule: no early stopping for efficacy; stop only for severe guardrail harm or instrumentation failure

## Validity Checks

- SRM: observed assignment must remain within expected tolerance by channel
- Logging completeness: assignment and exposure events must be present for at least 99% of eligible workspaces
- Interference: workspace-level randomization avoids treated and control admins sharing the same onboarding state
- Non-compliance: users may ignore the checklist; ITT remains the launch-relevant estimate
- Missing data: workspaces without reliable activation logging are excluded only by pre-specified data-quality rules

## Decision Rule

Ship if activation improves by at least 1 percentage point, guardrails do not materially worsen, SRM and logging checks pass, and 30-day paid conversion is not directionally negative.

Do not ship if activation is negative, support burden rises materially, or trial-to-paid conversion declines.

Rerun if assignment, exposure, or metric logging fails.

Iterate if activation improves but the effect is concentrated in one channel or comes with a small support burden increase.

## Hostile Review

The biggest risk is mistaking shallow checklist completion for real activation. The 30-day paid conversion guardrail helps, but it arrives later than the primary readout.

The most likely ambiguous result is a modest activation gain with slightly higher support tickets. In that case, the decision should depend on ticket type: confusion tickets suggest the checklist needs revision; setup success tickets may be acceptable.

The result should not be generalized to enterprise onboarding because enterprise teams have sales assistance and different setup workflows.
