# Idle Recap Cache — Pre-Warmed Session Summaries

**Status: Implemented** (2026-02-28)

## Problem

Running `/recap` or `/hist` requires an LLM roundtrip (10-20K tokens, 5-15s of latency). When a user returns to a session after a break, they often want a quick "where was I?" summary without waiting.

## Implementation

Idle recap pre-warming lives in `helm/state_poller.py`, wired into the existing 15s poll loop.

### Trigger

The state poller (every 15s) checks each active session:
1. **Idle >= 2 minutes** (from `last_event_at` in watch.db)
2. **Recap is stale** (new user turns since last generation, via `is_recap_stale()`)
3. **Cooldown not active** (5min per session to avoid re-kicking)

When all three conditions are met, `generate_recap()` is kicked off as a fire-and-forget async task.
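
The three-condition gate can be sketched as a pure predicate. This is a minimal sketch, not the actual `_refresh_stale_recaps()` code; the constant names and the in-memory cooldown dict are illustrative assumptions.

```python
import time

# Thresholds from the design above; names are illustrative.
IDLE_THRESHOLD_S = 120   # 2 min idle before considering a recap
COOLDOWN_S = 300         # 5 min per-session cooldown

_last_kick: dict[str, float] = {}  # sid -> wall time of last kick (assumed state)

def should_prewarm(sid: str, last_event_at: float, now: float,
                   recap_stale: bool) -> bool:
    """Return True only when all three trigger conditions hold."""
    if now - last_event_at < IDLE_THRESHOLD_S:
        return False                 # still mid-conversation
    if not recap_stale:
        return False                 # nothing new since last generation
    if now - _last_kick.get(sid, 0.0) < COOLDOWN_S:
        return False                 # cooldown still active
    _last_kick[sid] = now
    return True
```

In the real poll loop a `True` result would schedule `generate_recap()` via something like `asyncio.create_task(...)` so the 15s poll tick is never blocked.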

### Cache Lifecycle

```
Session idle 2+ min → check turn count → stale? → generate recap → write cache
User reactivates   → cached recap is already fresh
User clicks recap  → served instantly from cache
Next Stop hook     → background refresh updates recap with latest turns
```

### Staleness Rules

- **Fresh**: turn count at generation == current turn count → no recompute
- **Stale**: new turns since last generation → kick background refresh
- **MIN_TURNS guard**: sessions with < 5 user turns skip generation entirely

### Output

Single HTML page + metadata per session:

```
~/.coord/helm_state/{sid}.recap.html  — full rendered page
~/.coord/helm_state/{sid}.recap.json  — metadata (generated_at, turn_count, goal, result)
```

### Anti-Thrashing

| Guard | Value | Purpose |
|-------|-------|---------|
| Idle threshold | 2 min | Don't recap mid-conversation |
| Cooldown | 5 min | Don't re-kick same session repeatedly |
| MIN_TURNS | 5 | Skip trivial sessions |
| Turn-count check | `is_recap_stale()` | No recompute if nothing happened |

### Key Files

| File | Role |
|------|------|
| `helm/state_poller.py` | `_refresh_stale_recaps()` — idle detection + kick logic |
| `helm/recap.py` | `generate_recap()` — LLM call, HTML render, cache write |
| `helm/api.py` | `/recap/{sid}` endpoint — serves cached HTML |

### Cost

- Model: **Haiku** (~800 tokens output, ~$0.001 per recap)
- Only fires once per idle period per session (cooldown prevents repeated calls)
- Turn-count staleness check is pure file I/O (no LLM call)
