# Draft Demo Polish — Results Summary

**Date**: 2026-03-09
**Document**: "How to Do Great Work" by Paul Graham (11,517 words)
**Report**: https://static.localhost/reports/tmp/draft-demo-polish-20260309.html

## What Changed

### Critical Fix: Role Extraction Prompt
The extraction prompt labeled 87% of segments as "explanation": PG's bold claims,
analogies, and definitions were all collapsed into a single role. Three prompt changes fixed this:

1. **Disambiguation section** — explicit tests for claim vs explanation vs evidence
2. **Distribution check** — "if >50% of segments share one role, reconsider"
3. **Self-contained labels** — labels must be readable without the document context

**Result**: 44% claim / 22% explanation / 34% spread across other roles (10 of 12 roles used)
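
The distribution check lives in the prompt itself, but the same rule can be mirrored as a post-hoc sanity check on the extracted roles. The following is a minimal sketch, not the code in `draft/core/roles.py`: the function names, the flat list-of-role-strings input, and the sample data are all assumptions made for illustration.

```python
from collections import Counter


def role_distribution(roles):
    """Return each role's share of the segments as a fraction in [0, 1]."""
    total = len(roles)
    if total == 0:
        return {}
    return {role: count / total for role, count in Counter(roles).items()}


def dominant_role(roles, threshold=0.5):
    """Return (role, share) if the most common role exceeds threshold, else None."""
    shares = role_distribution(roles)
    if not shares:
        return None
    role, share = max(shares.items(), key=lambda item: item[1])
    return (role, share) if share > threshold else None


# Illustrative data only: proportions taken from this summary, with every
# role other than "claim" and "explanation" lumped into a placeholder bucket.
before_fix = ["explanation"] * 87 + ["other"] * 13
after_fix = ["claim"] * 44 + ["explanation"] * 22 + ["other"] * 34

print(dominant_role(before_fix))  # flags "explanation" as dominant
print(dominant_role(after_fix))   # no role above the 50% threshold
```

A check like this could run after extraction and trigger a retry or a warning when one role dominates, rather than relying on the model to police its own output.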

### UI Improvements
- Document pane now shows H1 title, author name, and source URL
- Right-margin annotations show claim/analogy/concession labels alongside text
- `clean_text_for_display()` strips markdown artifacts (table borders, pipe characters)
- CLI `--author` flag for the role map
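
As a rough illustration of the text cleaning described above, the sketch below drops table border rows and removes pipe characters. It is a guess at the general behavior, not the actual `clean_text_for_display()` in `draft/core/mapper.py`; the name `clean_text_sketch` and the exact rules are hypothetical.

```python
import re

# Characters that make up a table border or separator row, e.g. |---|:--:| or +----+
_BORDER_CHARS = set("|+-: ")


def _is_table_border(line: str) -> bool:
    """True for rows built only from border characters, with a dash and a pipe or plus."""
    stripped = line.strip()
    if not stripped or not set(stripped) <= _BORDER_CHARS:
        return False
    return "-" in stripped and ("|" in stripped or "+" in stripped)


def clean_text_sketch(text: str) -> str:
    """Hypothetical cleaner: drop table border rows and strip pipe characters."""
    cleaned = []
    for line in text.splitlines():
        if _is_table_border(line):
            continue
        # Replace each pipe (and the whitespace around it) with a single space.
        line = re.sub(r"\s*\|\s*", " ", line)
        # Collapse any remaining runs of spaces or tabs.
        line = re.sub(r"[ \t]{2,}", " ", line).strip()
        cleaned.append(line)
    return "\n".join(cleaned)


sample = "| Gym | Score |\n|-----|-------|\n| Roles | 77.0 |"
print(clean_text_sketch(sample))
```

Plain horizontal rules such as `---` are left alone here because the border test requires a pipe or plus sign; whether the real function makes the same choice is not stated in this summary.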

### Files Modified
- draft/core/roles.py — prompt engineering
- draft/core/mapper.py — doc header, margin notes, text cleaning
- draft/cli.py — --author flag, text cleaning, source URL inference

## Gym Scores

| Gym      | Best Model    | Score  | Runner-up     | Score  |
|----------|---------------|--------|---------------|--------|
| Roles    | gemini-flash  | 77.0   | sonnet        | 73.5   |
| Style    | sonnet        | 77.3   | grok-fast     | 70.0   |
| Claims   | sonnet        | 77.2   | grok-fast     | 75.9   |

**Recommendation**: gemini-flash for roles (fast, free), sonnet for style+claims (quality).
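
To make the gaps behind this recommendation explicit, the short sketch below computes the margin between the best model and the runner-up for each gym. The scores are copied from the table above; the `GYM_SCORES` structure and the function names are illustrative assumptions, not part of the project.

```python
# Scores copied from the Gym Scores table:
# gym -> (best model, best score, runner-up, runner-up score)
GYM_SCORES = {
    "Roles": ("gemini-flash", 77.0, "sonnet", 73.5),
    "Style": ("sonnet", 77.3, "grok-fast", 70.0),
    "Claims": ("sonnet", 77.2, "grok-fast", 75.9),
}


def margins(scores):
    """Return the best-minus-runner-up gap per gym, rounded to one decimal."""
    return {
        gym: round(best - runner, 1)
        for gym, (_, best, _, runner) in scores.items()
    }


def recommended_models(scores):
    """Return the top-scoring model per gym."""
    return {gym: best_model for gym, (best_model, *_rest) in scores.items()}


print(margins(GYM_SCORES))
print(recommended_models(GYM_SCORES))
```

The margins work out to 3.5 points for Roles, 7.3 for Style, and 1.3 for Claims, so the Claims choice is the closest call of the three.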

## Demo Readiness

| Dimension     | Status    | Notes                                       |
|---------------|-----------|---------------------------------------------|
| Role Map      | Ready     | Diverse roles, good labels, margin notes    |
| Style Eval    | Ready     | Specific issues+strengths, calibrated scores|
| Claims        | Ready     | Detection works, evidence linking ~55%      |
| CLI           | Ready     | map, style, improve, claims all functional  |
| UI Server     | Running   | draft.localhost:7980                        |

**Overall**: Demo-ready. The role map is the strongest showcase: the color-coded
segments and margin annotations are visually striking. Style eval provides credible,
actionable feedback. Claims analysis works, but evidence linking (~55%) is the weakest area.
