# Helm → Jobs Integration: Prep & Architecture

## Vision

The codebase improves itself. TODO items are discovered, classified, prioritized,
and executed autonomously during idle time. Jobs already does this for external
content. Internal codebase work should use the same system.

## Architecture

```
                    DISCOVERY                    EXECUTION
External:   Finnhub/YouTube/Serper  ──→  jobs runner  ──→  Python handlers (in-process)
Internal:   Helm autodo scan        ──→  jobs runner  ──→  script / LLM / vario / Claude session
                                              │
                                         jobs DB (tracking)
                                              │
                                    ┌─────────┼──────────┐
                                 Doctor    Jobs UI    Jobs UI
                               (health)  (External) (Internal tab)
```

- **Helm** = discovery source. Scans TODO.md files, runs convention checks, detects doc drift → writes work_items to the jobs DB.
- **Jobs runner** = executes. Claims items, calls handler. Internal handler dispatches by execution tier.
- **Helm auto-respond** = orthogonal. Approves plans / answers questions for ALL waiting sessions.
- **Doctor** = monitors for free. Stuck items, failure classification, circuit breaker.

## Not Jobs (Sidekick Tasks)

Badge/tree LLM gen, learning extraction, iTerm2 updates, recap refresh,
auto-respond keystrokes — all event-driven reactions, not queued work. Stay in helm.

**Boundary**: lifecycle (pending→done) + retry/tracking/scheduling → job.
One-shot event reaction → sidekick.

## Execution Tiers

| Tier | When | Examples | Cost |
|---|---|---|---|
| **Script** | Deterministic, no LLM | Broken links, port mismatch, lint | Free |
| **LLM call** | Reasoning, no tools | Classify item, score violation, draft fix | ~$0.01 |
| **Vario** | Judgment, consensus helps | Design review, naming, eval criteria | ~$0.05-0.20 |
| **Claude session** | Multi-file editing, tools | Refactor, complex bugs, reports | ~$0.50-2.00 |

The `classify` stage picks the tier. The `execute` stage dispatches:

```python
match data.get("execution_tier", "claude_session"):
    case "script":         return await run_check_script(data)
    case "llm_call":       return await run_llm_fix(data)
    case "vario":          return await run_vario_review(data)
    case "claude_session": return await fork_and_monitor(data)
    case unknown:          raise ValueError(f"unknown execution tier: {unknown}")
```

## Schema Mapping (helm Todo → jobs work_item)

Most fields go in `data` JSON: title, description, tier, type, risk, scope,
auto_start, source, fingerprint, max_minutes, pause_policy, effort, project.

Direct columns: `item_key` ← id, `priority` ← priority, `status` ← status,
`source_file` ← source_file (V3 schema has this).

New fields needed: `recurrence` (daily/weekly), `last_completed`.
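The mapping above can be sketched as a single conversion function. This is a hypothetical sketch: `todo_to_work_item` is an invented name, and the exact V3 column set may differ; the field split follows the lists above.

```python
import json

# Hypothetical sketch of the helm Todo -> jobs work_item mapping described above.
# Direct columns follow the V3 schema; everything else rides in the `data` JSON.
def todo_to_work_item(todo: dict) -> dict:
    data_fields = [
        "title", "description", "tier", "type", "risk", "scope",
        "auto_start", "source", "fingerprint", "max_minutes",
        "pause_policy", "effort", "project",
    ]
    return {
        "item_key": todo["id"],
        "priority": todo.get("priority", 0),
        "status": todo.get("status", "pending"),
        "source_file": todo.get("source_file"),
        "data": json.dumps({k: todo[k] for k in data_fields if k in todo}),
    }
```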

## Tasks

### Task 1: Add `internal` flag to jobs config
Add `internal: bool` to job config in `jobs.yaml`. UI reads it to route to
correct tab. One field in config parsing + one filter in queries.
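Roughly this shape, assuming a dataclass-style config object (the real parser in jobs may differ; `JobConfig` and `jobs_for_tab` are illustrative names):

```python
from dataclasses import dataclass

# Sketch of the one-field config change plus the UI-side filter.
@dataclass
class JobConfig:
    name: str
    internal: bool = False  # new flag; default keeps existing jobs external

def parse_job(name: str, raw: dict) -> JobConfig:
    return JobConfig(name=name, internal=bool(raw.get("internal", False)))

def jobs_for_tab(jobs: list[JobConfig], internal_tab: bool) -> list[JobConfig]:
    # Route jobs to the External or Internal tab
    return [j for j in jobs if j.internal == internal_tab]
```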

### Task 2: Add recurrence support to tracker
- `recurrence` + `last_completed` in `data` JSON or new columns
- After done: if recurrent, reset → pending, bump `last_completed`
- Guard: skip if completed within recurrence interval
- **Files**: `jobs/lib/tracker.py`, `jobs/runner.py`
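The reset-and-guard logic above, sketched with both fields stored in the `data` JSON (function names and the ISO-timestamp format are assumptions):

```python
from datetime import datetime, timedelta, timezone

# Sketch of the recurrence reset + guard from the bullets above.
INTERVALS = {"daily": timedelta(days=1), "weekly": timedelta(weeks=1)}

def due_again(data: dict, now: datetime) -> bool:
    """Guard: skip if completed within the recurrence interval."""
    recurrence = data.get("recurrence")
    if not recurrence:
        return False
    last = data.get("last_completed")
    if last is None:
        return True
    return now - datetime.fromisoformat(last) >= INTERVALS[recurrence]

def on_done(item: dict, now: datetime) -> dict:
    """After done: recurrent items reset to pending and bump last_completed."""
    if item["data"].get("recurrence"):
        item["status"] = "pending"
        item["data"]["last_completed"] = now.isoformat()
    else:
        item["status"] = "done"
    return item
```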

### Task 3: Add eligibility gating to runner
Pre-claim filter on `data` JSON fields + idle time:
```yaml
eligibility:
  require_user_idle_minutes: 15
  data_filter: { risk: [low, null], auto_start: true }
```
- **Files**: `jobs/runner.py`, `jobs/jobs.yaml`
- Reads `~/.coord/last_active.json` (already written by helm hooks)
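A minimal sketch of the pre-claim check. The shape of `last_active.json` (`{"ts": <epoch seconds>}`) is an assumption; the filter semantics match the YAML above, where a `null` in the allowed list matches a missing field.

```python
import json
from pathlib import Path

# Sketch of eligibility gating: idle-time threshold + data_filter match.
def user_idle_minutes(path: Path, now: float) -> float:
    last = json.loads(path.read_text()).get("ts", 0)  # assumed file shape
    return (now - last) / 60

def eligible(item_data: dict, idle_minutes: float, rules: dict) -> bool:
    if idle_minutes < rules.get("require_user_idle_minutes", 0):
        return False
    for key, allowed in rules.get("data_filter", {}).items():
        allowed = allowed if isinstance(allowed, list) else [allowed]
        # YAML null arrives as Python None, which matches a missing field
        if item_data.get(key) not in allowed:
            return False
    return True
```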

### Task 4: Write `autodo_scan` discovery strategy
Register in `jobs/lib/discovery.py`. Reuse planner logic from
`helm/autodo/planner.py` — scan TODO.md files, classify via LLM,
return as work items.
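The registration shape might look like this. The decorator-based registry and the helm planner calls are assumptions about `jobs/lib/discovery.py` and helm internals; the planner is stubbed here so the pattern is visible.

```python
# Hypothetical sketch of strategy registration in jobs/lib/discovery.py.
DISCOVERY_STRATEGIES: dict = {}

def strategy(name: str):
    def register(fn):
        DISCOVERY_STRATEGIES[name] = fn
        return fn
    return register

@strategy("autodo_scan")
def autodo_scan(config: dict) -> list[dict]:
    # Real version would reuse helm/autodo/planner.py: scan TODO.md files,
    # classify via LLM. Stubbed output stands in for the planner here.
    todos = [{"id": "todo-1", "title": "Fix broken link"}]
    return [{"item_key": t["id"], "data": t} for t in todos]
```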

### Task 5: Write `codebase_maintenance` handler
`jobs/handlers/codebase_maintenance.py` with three stages:
- `classify`: in-process LLM → pick execution tier
- `execute`: dispatch by tier (Claude session tier imports fork/poll from helm)
- `review`: in-process LLM → accept/needs_work/reject
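The three stages above, as a runnable sketch. The stage-dispatch protocol (one call per stage, `next` field naming) is an assumption about the jobs runner; the LLM calls and tier dispatch are stubbed.

```python
import asyncio

# Sketch of jobs/handlers/codebase_maintenance.py; stubs stand in for the
# real in-process LLM calls and tier dispatch.
async def classify_tier(data: dict) -> str:
    return "script" if data.get("type") == "lint" else "claude_session"

async def dispatch_by_tier(data: dict) -> str:
    return f"ran via {data['execution_tier']}"

async def review_result(data: dict) -> str:
    return "accept"  # real version: accept / needs_work / reject via LLM

async def process(item: dict, stage: str) -> dict:
    data = item["data"]
    if stage == "classify":
        data["execution_tier"] = await classify_tier(data)
        return {"next": "execute"}
    if stage == "execute":
        return {"next": "review", "result": await dispatch_by_tier(data)}
    if stage == "review":
        return {"status": await review_result(data)}
    raise ValueError(f"unknown stage: {stage}")
```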

### Task 6: Add job entry to jobs.yaml
```yaml
codebase_maintenance:
  internal: true
  emoji: "🔧"
  discovery: { strategy: autodo_scan, interval: 3600 }
  handler: handlers.codebase_maintenance.process
  stages: [classify, execute, review]
  eligibility: { require_user_idle_minutes: 15 }
```

### Task 7: Add "Internal" tab to jobs UI
Filter `internal: true` jobs. Reuse existing components. Add: session link,
report link, review status columns.

### Task 8: Prototype with doc-health
End-to-end: discovery → claim → classify → execute → review → recurrence reset.
Verify in Internal tab.

### Task 9: Migrate queue.yaml → jobs DB
Script: read `queue.yaml` → insert work_items. Delete `helm/autodo/todo.py`,
`helm/todos/queue.yaml`. Keep `prompts.py`, `review.py`. Absorb `engine.py`
into handler.
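The migration script is a one-shot insert. A sketch over already-parsed YAML entries, assuming the `work_items` table layout from the schema mapping above (the real jobs DB layout may differ):

```python
import json
import sqlite3

# Sketch of the queue.yaml -> work_items migration; idempotent across re-runs.
def migrate(entries: list[dict], db: sqlite3.Connection) -> int:
    db.execute(
        "CREATE TABLE IF NOT EXISTS work_items "
        "(item_key TEXT PRIMARY KEY, priority INTEGER, status TEXT, data TEXT)"
    )
    rows = [
        (e["id"], e.get("priority", 0), e.get("status", "pending"),
         json.dumps({k: v for k, v in e.items()
                     if k not in ("id", "priority", "status")}))
        for e in entries
    ]
    # INSERT OR IGNORE: safe to re-run if the script dies partway through
    db.executemany("INSERT OR IGNORE INTO work_items VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return len(rows)
```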

## Future: Doctor's Role (post-integration)

Doctor already classifies external job failures. Post-integration questions:
- Monitor internal failures the same way?
- Check stuck Claude sessions (instead of helm)?
- Auto-classify: session timeout vs code bug vs user-returned?
- Feed failed items back into discovery as new TODOs?

Separate design exercise. Punt until integration works.

## Related Docs

- `2026-02-26-unified-work-system-v3.md` — Jobs cleanup + autodo schema
- `2026-02-26-jobs-action-plan.md` — Tier 1-4 improvements
- `2026-02-26-jobs-vario-critique-v2.md` — Multi-model consensus
