# Exercise: Logged Support Triage Policy

Use `support_triage_logs.csv`, loaded as a table named `support_triage_logs` so the SQL below runs against it.

## Goal

Practice auditing logged decision data before trusting offline policy evaluation.

## Questions

1. Which actions were most common under the behavior policy?
2. Which state-action pairs look weakly supported, either because few tickets were logged or because the behavior policy rarely chose that action?
3. Do high-urgency tickets receive different actions than low-urgency tickets?
4. Which guardrails would you inspect before improving the policy?
5. Which tickets should use a fallback policy rather than a learned recommendation?

## SQL Starter

Action distribution:

```sql
SELECT
  action,
  COUNT(*) AS tickets,
  AVG(behavior_probability) AS avg_behavior_probability,
  AVG(resolved_24h) AS resolved_24h_rate,
  AVG(escalated) AS escalation_rate
FROM support_triage_logs
GROUP BY action
ORDER BY tickets DESC;
```

Support by urgency:

```sql
SELECT
  urgency,
  action,
  COUNT(*) AS tickets,
  MIN(behavior_probability) AS min_behavior_probability,
  AVG(resolved_24h) AS resolved_24h_rate
FROM support_triage_logs
GROUP BY urgency, action
ORDER BY urgency, action;
```
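
Action share and escalation by urgency:

The counts above do not show directly whether high-urgency tickets are routed differently, because urgency levels can differ in volume. A minimal sketch, assuming your SQL engine supports window functions over aggregates (PostgreSQL, SQLite 3.25+, and MySQL 8+ do), converts counts into within-urgency shares and adds the escalation guardrail. It uses only columns already referenced in the starter queries; `action_share` is a name introduced here for illustration.

```sql
SELECT
  urgency,
  action,
  COUNT(*) AS tickets,
  -- Share of this action among all tickets at the same urgency level.
  -- Multiplying by 1.0 forces floating-point division.
  1.0 * COUNT(*) / SUM(COUNT(*)) OVER (PARTITION BY urgency) AS action_share,
  AVG(escalated) AS escalation_rate
FROM support_triage_logs
GROUP BY urgency, action
ORDER BY urgency, action_share DESC;
```

Compare `action_share` across urgency levels for the same action. A large gap suggests the behavior policy conditioned on urgency, so pooled outcome rates per action should not be read as causal effects.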

## Interpretation Prompt

Write a policy-readiness note:

- what the logs can support
- where the candidate policy should be constrained
- which guardrails must be monitored
- what staged rollout you would recommend
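
To ground the note's point about where the candidate policy should be constrained, one option is to list the urgency-action cells with thin evidence. The sketch below flags cells by ticket count or by minimum logged probability. The thresholds (30 tickets, probability 0.05) are illustrative assumptions, not values given by the exercise, so pick ones you can justify.

```sql
SELECT
  urgency,
  action,
  COUNT(*) AS tickets,
  MIN(behavior_probability) AS min_behavior_probability
FROM support_triage_logs
GROUP BY urgency, action
-- Hypothetical cutoffs: fewer than 30 logged tickets,
-- or at least one ticket logged with probability below 0.05.
HAVING COUNT(*) < 30
    OR MIN(behavior_probability) < 0.05
ORDER BY tickets ASC;
```

Cells returned here are candidates for a fallback policy or human review. Urgency-action combinations that were never logged do not appear in the result at all, and they have the weakest support of any.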

## Worked-Solution Standard

A strong answer treats high estimated reward as insufficient on its own. It should also discuss data support, behavior-policy probabilities, escalation guardrails, and human review for high-risk tickets.
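
One way to make the behavior-policy probability discussion concrete is an inverse-propensity diagnostic. The sketch below assumes `behavior_probability` is the logged probability of the action actually taken, and it considers a hypothetical deterministic candidate policy that always picks a given action, so each matching ticket gets weight `1 / behavior_probability`. The effective sample size uses the standard formula (sum of weights squared, divided by the sum of squared weights). The column aliases are introduced here for illustration.

```sql
SELECT
  action,
  COUNT(*) AS tickets,
  -- Largest single importance weight; very large values signal unstable estimates.
  MAX(1.0 / behavior_probability) AS max_inverse_propensity,
  -- Effective sample size: (sum of w)^2 / (sum of w^2), with w = 1 / behavior_probability.
  SUM(1.0 / behavior_probability) * SUM(1.0 / behavior_probability)
    / SUM(1.0 / (behavior_probability * behavior_probability)) AS effective_sample_size
FROM support_triage_logs
WHERE behavior_probability > 0  -- guard against division by zero
GROUP BY action
ORDER BY effective_sample_size ASC;
```

An `effective_sample_size` far below `tickets` means a few low-probability tickets dominate the estimate for that action, which is a reason to discount a high estimated reward.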