Session Bench v5

E2E: 0ms for 42 sessions (2 alive, 2 panes) | Correctness: 6/6 | 5 trials/op

E2E List: 0ms | Sessions: 42 | Alive: 2 | JSONL: 353 files / 630MB | Correct: 6/6

How Session Management Works

Claude Code runs in iTerm2 terminal panes. Each session has data in 4 places:

Data Source | What | Updated When | Access Cost
sessions.yaml | Registry mapping session ID → iTerm pane ID, working directory, start time | On session start (via SessionStart hook) | ~1ms (YAML file read)
it2api | iTerm2 CLI: lists live panes, reads terminal buffer, sets tab colors | Real-time (queries iTerm2 directly) | ~300ms per call (Python subprocess startup)
JSONL logs | Full conversation transcripts: every message, tool call, version info | Continuously as Claude works | ~0.1ms per file (seek tail) or ~60ms (rg search all 630MB)
Hub DB | SQLite: badge text (what the session is working on), set by sidekick | On each user prompt (via hook) | ~1ms (SQLite query)

Alive = session's iTerm pane still exists (confirmed by it2api). Stale = registered in sessions.yaml but pane was closed. Currently 2 alive, 40 stale of 42 total.
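
The alive/stale split described above can be sketched as a set-membership check. This is a minimal illustration, not the tool's actual code: `RegistryEntry`, `split_alive_stale`, and the sample IDs are hypothetical names, and the live pane IDs are assumed to have already been obtained from it2api.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    """One sessions.yaml record (hypothetical shape: only the fields used here)."""
    session_id: str
    pane_id: str


def split_alive_stale(entries, live_pane_ids):
    """Return (alive, stale): alive entries are those whose pane still exists."""
    live = set(live_pane_ids)  # set gives O(1) membership checks
    alive = [e for e in entries if e.pane_id in live]
    stale = [e for e in entries if e.pane_id not in live]
    return alive, stale


if __name__ == "__main__":
    registry = [
        RegistryEntry("s1", "pane-A"),
        RegistryEntry("s2", "pane-B"),
        RegistryEntry("s3", "pane-C"),
    ]
    alive, stale = split_alive_stale(registry, ["pane-A", "pane-Z"])
    print(len(alive), "alive,", len(stale), "stale")
```

A live pane with no registry entry (here `pane-Z`) is ignored by this split; the report checks that case separately under "No unregistered Claude panes".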

session_tool list cross-references all 4 sources: loads yaml, queries it2api for pane liveness + state, reads JSONL for version + wait state, loads hub badges. Buffer reads (~300ms each) are parallelized via ThreadPool(12).
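
A rough sketch of how buffer reads might be fanned out over a 12-worker thread pool. The real it2api invocation is not shown in this report, so `read_buffer_stub` is a hypothetical stand-in that only sleeps; `read_buffers_parallel` is likewise an illustrative name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_buffer_stub(pane_id):
    """Hypothetical stand-in for one it2api buffer read (simulated latency only)."""
    time.sleep(0.05)
    return f"buffer of {pane_id}"


def read_buffers_parallel(pane_ids, read_one=read_buffer_stub, max_workers=12):
    """Read every pane's buffer concurrently and return {pane_id: buffer}."""
    pane_ids = list(pane_ids)  # materialize so we can iterate twice
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so zip pairs them correctly.
        return dict(zip(pane_ids, pool.map(read_one, pane_ids)))


if __name__ == "__main__":
    start = time.perf_counter()
    buffers = read_buffers_parallel(["pane-A", "pane-B", "pane-C"])
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(sorted(buffers), f"{elapsed_ms:.0f}ms")
```

Threads suit this workload because each read spends its time waiting on a subprocess rather than running Python code.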

Correctness

Check | Expected | Actual | Detail
Total sessions matches sessions.yaml | 42 | 42 | -
Alive count matches it2api cross-ref | 2 (yaml entries with live panes) | 2 | it2api: 2 panes total
No alive session has state=no_pane | 0 | 0 | -
All dead sessions are no_pane | 0 exceptions | 0 exceptions | -
--alive-only returns only alive sessions | 2 | 2 | -
No unregistered Claude panes | 0 | 0 | -

E2E Operations

Operation | What | Median | Min–Max (ms) | ± (ms) | N | Notes
session_tool list (42 sessions, 2 alive) | Full E2E: load yaml, query 2 panes, read JSONL versions, detect quiescence + wait states, with parallel enrichment | 856ms | 853–870 | ±8 | 5 | -
session_tool list --alive-only (2 alive) | Same as list but filters to 2 alive sessions after enrichment | 874ms | 858–1003 | ±63 | 5 | -
session_tool cleanup --dry-run | Load yaml + it2api list → find stale entries. No buffer reads | 401ms | 390–417 | ±11 | 5 | -
find --state idle | Metadata-only: yaml + it2api list + title-based state. No rg, no buffer reads | 411ms | 401–424 | ±10 | 5 | 2 results
find 'model' | 3-phase: metadata match → rg across all JSONL → parallel buffer reads for state | 923ms | 898–1240 | ±144 | 5 | 2 results
find 'benchmark' --all | Same 3-phase but includes dead sessions (no pane). More JSONL matches to process | 880ms | 855–949 | ±37 | 5 | 21 results
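
For reference, a minimal sketch of how per-operation figures like the ones above could be collected over 5 trials. The report does not say how its ± column is computed; the sample standard deviation used here is an assumption, and `time_op` and `summarize` are hypothetical names.

```python
import statistics
import time


def summarize(samples_ms):
    """Reduce a list of timings (ms) to the columns used in the report."""
    return {
        "median": statistics.median(samples_ms),
        "min": min(samples_ms),
        "max": max(samples_ms),
        # Assumption: spread is the sample standard deviation.
        "spread": statistics.stdev(samples_ms) if len(samples_ms) > 1 else 0.0,
        "n": len(samples_ms),
    }


def time_op(fn, trials=5):
    """Call fn() `trials` times and summarize the wall-clock timings in ms."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return summarize(samples)


if __name__ == "__main__":
    # Illustrative numbers only, not the raw trial data behind the report.
    print(summarize([853, 855, 856, 860, 870]))
    print(time_op(lambda: sum(range(100_000))))
```

The median is less sensitive than the mean to a single slow trial, such as the 1240ms maximum for find 'model'.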

Component Costs

Component | What | Median | Min–Max (ms) | N | Notes
it2api list-sessions | List all iTerm2 panes, one subprocess spawn | 303ms | 300–331 | 5 | -
it2api get-buffer (1 pane) | Read terminal buffer for one pane; ~300ms is Python process startup | 310ms | 294–368 | 5 | -
get-buffer x2 sequential | Read 2 pane buffers one at a time | 627ms | 602–642 | 5 | -
get-buffer x2 ThreadPool(12) | Read 2 pane buffers via ThreadPoolExecutor(12); current production path | 423ms | 373–426 | 5 | -
rg content search (630MB) | ripgrep 'model' across 353 JSONL files | 55ms | 54–92 | 5 | -
JSONL version (1 session) | Direct path lookup from cwd + seek last 8KB for version field | 0ms | 0–0 | 5 | -
JSONL version (42 sessions) | Direct path + seek tail for all 42 sessions | 5ms | 4–5 | 3 | Found version in 23/42
tab color cycle (1 pane) | set_tab_color → use_tab_color=true → use_tab_color=false. 3 it2api calls | 1004ms | 974–1028 | 5 | -
Hub DB badge load | SQLite query for session badges | 0ms | 0–1 | 5 | 0 badges
claude --version | Get installed CC version; cached for 5min in production | 51ms | 51–53 | 3 | -
YAML + it2api cross-ref | Load sessions.yaml + it2api list → match entries to panes. Minimum cost for liveness check | 327ms | 299–343 | 5 | 2 alive of 42

Scaling: How session_tool list Scales with N Alive Sessions

Fixed costs: yaml+it2api list (327ms) + JSONL version reads (5ms) + hub DB (~5ms).
Variable cost: buffer reads for alive sessions — 310ms per call via it2api subprocess.

N alive | Sequential | ThreadPool(12) | MCPretentious (projected)
1 | 647ms | 647ms | 338ms
2 | 957ms | 647ms | 339ms
5 | 1888ms | 647ms | 342ms
10 | 3440ms | 647ms | 348ms
15 | 4992ms | 957ms | 354ms
25 | 8096ms | 1268ms | 366ms

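ThreadPool(12): Currently implemented. Buffer reads run in parallel in batches of up to 12, so the projected read cost is ceil(N/12) × 310ms: flat through N = 12, then one more 310ms step per additional batch. The projection assumes a parallel batch costs the same as a single read; the measured 2-pane batch took 423ms, not 310ms.
MCPretentious: NOT implemented. A persistent WebSocket connection is projected to cut each buffer read from 310ms to 1.2ms (about 255x faster). This is the one remaining optimization.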
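
The projections in the scaling table appear consistent with a simple fixed-plus-variable cost model, sketched below. The constants are the rounded figures quoted in this section, so results can differ from the table by a few milliseconds. Treating the MCPretentious column as N sequential 1.2ms reads is an inference from the table, not something the report states.

```python
import math

# Rounded inputs quoted in this report (ms).
FIXED_MS = 327 + 5 + 5  # yaml + it2api list, JSONL version reads, hub DB
READ_MS = 310           # one it2api buffer read (subprocess startup dominates)
MCP_READ_MS = 1.2       # projected per-read cost over a persistent connection


def sequential_ms(n_alive):
    """Buffer reads issued one after another."""
    return FIXED_MS + n_alive * READ_MS


def threadpool_ms(n_alive, workers=12):
    """Idealized pool: each batch of up to `workers` reads costs one read."""
    return FIXED_MS + math.ceil(n_alive / workers) * READ_MS


def mcp_ms(n_alive):
    """Projected cost if every read takes MCP_READ_MS (assumed sequential)."""
    return FIXED_MS + n_alive * MCP_READ_MS


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 15, 25):
        print(n, sequential_ms(n), threadpool_ms(n), round(mcp_ms(n)))
```

The model shows where the thread pool stops helping: its cost is a step function of ceil(N/12), while the per-read cost itself is unchanged.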

Optimization Status

Optimization | Impact
ThreadPoolExecutor(12) for buffer reads | N×310ms → ceil(N/12)×310ms
Direct JSONL path from cwd (no rglob) | ~1000ms → 5ms for 42 sessions
seek(-8KB) tail read (no readlines) | Avoids loading 45MB JSONL files
Cached claude --version (5min TTL) | ~50ms saved per call after first
MCPretentious for buffer reads (not implemented) | 310ms → 1.2ms per read (255x)
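
The seek-tail read listed above can be sketched as follows. This is an illustration under assumptions: the JSON key is taken to be `version` (the report only says "version field"), and `tail_version` is a hypothetical helper name.

```python
import json
import os


def tail_version(path, tail_bytes=8192, field="version"):
    """Return the last `field` value found in the final `tail_bytes` of a JSONL file."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        f.seek(max(0, size - tail_bytes))  # jump near the end; never before byte 0
        chunk = f.read()
    lines = chunk.split(b"\n")
    if size > tail_bytes:
        # The first piece is usually a record cut in half by the seek; drop it.
        lines = lines[1:]
    for raw in reversed(lines):  # newest records first
        if not raw.strip():
            continue
        try:
            record = json.loads(raw)
        except ValueError:  # malformed or undecodable line
            continue
        if isinstance(record, dict) and field in record:
            return record[field]
    return None  # no version within the tail window
```

Only the last 8KB is read regardless of file size, which is why the cost stays near zero even for 45MB transcripts. A session whose most recent records carry no version field returns None, which may explain why the 42-session component benchmark found a version in only 23 of 42.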

Generated: 2026-02-06 12:01:44 | v5