# Preflight Skill Design

**Date**: 2026-03-02
**Status**: Approved

## Problem

We have 14 batch-job principles in `~/.claude/principles/batch-jobs.md` and a rich learning DB, but nothing that *applies* them proactively before kicking off long-running work. The knowledge exists — the application is missing.

Concrete anti-patterns this catches:
- Running a 500-item batch with no checkpoint — crash = start over
- stderr suppressed (`capture_output=True`, no logging) — "Failed but stderr was suppressed"
- Sequential batch stages — waiting for slowest stage across all items before starting next
- No progress indication — staring at a blank terminal
- No sample validation — discovers output is wrong after processing everything

## Solution

A `/preflight` skill that fires before any operation expected to take >30s. It:
1. Runs a quick `recall` pass for domain-specific gotchas from the learning DB
2. Reads the actual code about to execute
3. Scores it against 7 operational readiness dimensions
4. Produces a terse pass/fail verdict table

## The 7 Dimensions

| # | Dimension        | Pass                                                    | Fail                                                     |
|---|------------------|---------------------------------------------------------|----------------------------------------------------------|
| 1 | Cache            | Intermediate results saved; re-runs skip completed work | Re-fetches/re-computes everything from scratch           |
| 2 | Checkpoint       | Per-item/per-stage persistence; crash resumes           | All-or-nothing — crash = start over                      |
| 3 | Restartable      | Idempotent — safe to re-run without side effects        | Duplicates, double-sends, or corrupted state on re-run   |
| 4 | Progress         | Shows done/total, current stage, ETA or rate            | Silent until done (or failed)                            |
| 5 | Errors visible   | Failures logged with context, stderr not suppressed     | Silent swallowing, bare except, stderr to /dev/null      |
| 6 | Pipeline flow    | Stages pipelined — next starts as items complete        | Sequential batch — all items finish stage N before N+1   |
| 7 | Validate first   | Small sample (3-10) run + verified before full batch    | Runs all N, discovers output wrong at the end            |

## Output Format

```
## Preflight: [operation name]

**From past learnings:**
- [domain-specific gotcha from learning DB]
- [another if applicable]

| # | Check          | Verdict |
|---|----------------|---------|
| 1 | Cache          | ✅       |
| 2 | Checkpoint     | ❌ no per-item persistence — crash loses all progress |
| 3 | Restartable    | ✅       |
| 4 | Progress       | ⚠️ logs per-item but no total/ETA |
| 5 | Errors visible | ❌ subprocess stderr suppressed |
| 6 | Pipeline flow  | ✅ async stages with per-item handoff |
| 7 | Validate first | ⚠️ no explicit sample mode |

2 reds, 2 warnings. Fix before running at scale.
```

Verdicts: ✅ pass, ⚠️ partial, ❌ fail. Expand only on non-green.

## Procedure

1. **Identify the operation** — from context, current code, or explicit argument
2. **Recall** — `learn find -s "<operation context>" --limit 3` for domain gotchas
3. **Read the code** — Glob + Read the script/handler/pipeline (not guessing)
4. **Score each dimension** — by examining actual code
5. **Produce verdict table** — terse, expand only on reds/warnings
6. **Recommend** — if reds exist, suggest fixes. Don't block — user decides.

## Trigger Mechanism (3 layers)

1. **Skill description** (in system prompt): "Use before running any operation expected to take >30s — batch jobs, pipelines, data processing. Operational readiness checklist with recall integration."
2. **skill-triggers.md**: Natural language patterns for when to invoke
3. **CLAUDE.md convention**: "Before any operation >30s, invoke /preflight"

## Relationship to Other Skills

- **recall**: Preflight includes a recall pass — no need to invoke separately before execution
- **jobs**: Jobs skill monitors running jobs. Preflight fires *before* you start one
- **autonomous**: Autonomous hands off work. Preflight audits before handoff

## Not in Scope

- Doesn't run or profile the code
- Doesn't auto-fix (just flags)
- Doesn't apply to quick operations (<30s) unless explicitly invoked
