{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "7fb27b941602401d91542211134fc71a",
   "metadata": {},
   "source": [
    "# Causal Inference Course Kit: Feature Adoption and Retention\n",
    "\n",
    "Use this notebook to separate a naive product analytics association from a causal identification plan. The point is to name the assumptions before naming the effect.\n",
    "\n",
    "Dataset path on the education site: `/course-kits/causal-inference/feature_adoption_retention.csv`.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "acae54e37e7d407bbb7b55eff062a284",
   "metadata": {},
   "source": [
    "## 1. Load the data\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9a63283cbaf04dbcab1f6479b197f3a8",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "# Course dataset hosted on the education site.\n",
    "DATA_URL = \"https://education.dooperator.ai/course-kits/causal-inference/feature_adoption_retention.csv\"\n",
    "\n",
    "# Read the CSV directly from the URL and preview the first five rows.\n",
    "df = pd.read_csv(DATA_URL)\n",
    "df.head()"
   ]
  },
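  {
   "cell_type": "markdown",
   "id": "8dd0d8092fe74a7c96281538738b07e2",
   "metadata": {},
   "source": [
    "## 2. Core summary: the naive association\n",
    "\n",
    "Group workspaces by whether they used collaboration in week 1 and compare 30-day retention rates. The gap between the two rows is an association only; nothing here adjusts for confounders.\n"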
   ]
  },
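  {
   "cell_type": "code",
   "execution_count": null,
   "id": "72eea5119410473aa328ad9291626812",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Naive comparison by week-1 collaboration use: workspace count and mean of\n",
    "# `retained_30d` (a retention rate if that column is coded 0/1).\n",
    "naive = df.groupby(\"used_collaboration_week1\").agg(\n",
    "    workspaces=(\"workspace_id\", \"count\"),\n",
    "    retention_rate=(\"retained_30d\", \"mean\"),\n",
    ")\n",
    "naive"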
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8edb47106e1a46a883d545849b8ab81b",
   "metadata": {},
   "source": [
    "## 3. Diagnostic check\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "10185d26023b46108eb7d9f57d49d2b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Covariate means by group, with the retention rate alongside for reference.\n",
    "balance = df.groupby(\"used_collaboration_week1\").agg(\n",
    "    avg_team_size=(\"team_size\", \"mean\"),\n",
    "    avg_baseline_sessions=(\"baseline_sessions\", \"mean\"),\n",
    "    avg_invites_sent=(\"invites_sent\", \"mean\"),\n",
    "    retention_rate=(\"retained_30d\", \"mean\"),\n",
    ")\n",
    "balance"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8763a12b2bbd4a93a75aff182afb95dc",
   "metadata": {},
   "source": [
    "## 4. Decision memo prompt\n",
    "\n",
    "Write two paragraphs: first state the naive association, then explain why it is not yet credible as a causal estimate. End by proposing the next design: either covariate adjustment with a sensitivity analysis, or a randomized prompt experiment.\n",
    "\n",
    "Keep the memo decision-oriented: say what you would do next, which assumption could break the recommendation, and what evidence would change your mind.\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
