Session Bench v5

E2E: 856ms median for session_tool list over 42 sessions (2 alive, 2 panes) | Correctness: 6/6 | 5 trials/op

How Session Management Works

Claude Code runs in iTerm2 terminal panes. Each session has data in 4 places:

Data Source	What	Updated When	Access Cost
sessions.yaml	Registry mapping session ID → iTerm pane ID, working directory, start time	On session start (via SessionStart hook)	~1ms (YAML file read)
it2api	iTerm2 CLI — lists live panes, reads terminal buffer, sets tab colors	Real-time (queries iTerm2 directly)	~300ms per call (Python subprocess startup)
JSONL logs	Full conversation transcripts — every message, tool call, version info	Continuously as Claude works	~0.1ms per file (seek tail) or ~60ms (rg search all 630MB)
Hub DB	SQLite — badge text (what the session is working on), set by sidekick	On each user prompt (via hook)	~1ms (SQLite query)

Alive = session's iTerm pane still exists (confirmed by it2api). Stale = registered in sessions.yaml but pane was closed. Currently 2 alive, 40 stale of 42 total.

session_tool list cross-references all 4 sources: loads yaml, queries it2api for pane liveness + state, reads JSONL for version + wait state, loads hub badges. Buffer reads (~300ms each) are parallelized via ThreadPool(12).
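
The parallel buffer reads described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `read_buffer` is a hypothetical stand-in for the ~300ms it2api call (simulated here with a short sleep), and the pane IDs in the usage note are made up.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def read_buffer(pane_id: str) -> str:
    """Hypothetical stand-in for one it2api buffer read."""
    time.sleep(0.05)  # simulated latency; the real call costs ~300ms
    return f"buffer of {pane_id}"

def read_buffers_parallel(pane_ids, max_workers=12):
    """Read all pane buffers concurrently and return {pane_id: buffer}."""
    if not pane_ids:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so zip lines them up with the IDs
        return dict(zip(pane_ids, pool.map(read_buffer, pane_ids)))
```

Calling `read_buffers_parallel(["pane-1", "pane-2"])` takes about as long as one read rather than two, because both reads wait on I/O at the same time.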

Correctness

	Check	Expected	Actual	Detail
✓	Total sessions matches sessions.yaml	42	42
✓	Alive count matches it2api cross-ref	2 (yaml entries with live panes)	2	it2api: 2 panes total
✓	No alive session has state=no_pane	0	0
✓	All dead sessions are no_pane	0 exceptions	0 exceptions
✓	--alive-only returns only alive sessions	2	2
✓	No unregistered Claude panes	0	0
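
Three of the checks above are simple invariants over the listing, and could be expressed along these lines. The record keys `pane_id`, `alive` and `state` are assumptions made for illustration; only the `no_pane` and `idle` state names come from this report.

```python
def check_sessions(sessions, live_pane_ids):
    """Return (check name, passed) pairs for a session listing.

    `sessions` is a list of dicts with the hypothetical keys
    `pane_id`, `alive` and `state`; `live_pane_ids` is the set of
    pane IDs reported as live.
    """
    alive = [s for s in sessions if s["alive"]]
    dead = [s for s in sessions if not s["alive"]]
    expected_alive = [s for s in sessions if s["pane_id"] in live_pane_ids]
    return [
        ("alive count matches live panes", len(alive) == len(expected_alive)),
        ("no alive session has state=no_pane",
         all(s["state"] != "no_pane" for s in alive)),
        ("all dead sessions are no_pane",
         all(s["state"] == "no_pane" for s in dead)),
    ]
```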

E2E Operations

Operation	What	Median	Min–Max	±	N	Distribution
session_tool list (42 sessions, 2 alive)	Full E2E: load yaml, query 2 panes, read JSONL versions, detect quiescence + wait states — parallel enrichment	856ms	853–870	±8	5	856ms
session_tool list --alive-only (2 alive)	Same as list but filters to 2 alive sessions after enrichment	874ms	858–1003	±63	5	874ms
session_tool cleanup --dry-run	Load yaml + it2api list → find stale entries. No buffer reads	401ms	390–417	±11	5	401ms
find --state idle	Metadata-only: yaml + it2api list + title-based state. No rg, no buffer reads	411ms	401–424	±10	5	411ms — 2 results
find 'model'	3-phase: metadata match → rg across all JSONL → parallel buffer reads for state	923ms	898–1240	±144	5	923ms — 2 results
find 'benchmark' --all	Same 3-phase but includes dead sessions (no pane). More JSONL matches to process	880ms	855–949	±37	5	880ms — 21 results
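
For readers who want to reproduce the columns above, a timing harness could look like the sketch below. It assumes the ± column is a sample standard deviation, which this report does not state; `bench` and `summarize` are illustrative names, not part of session_tool.

```python
import statistics
import time

def summarize(samples_ms):
    """Reduce per-trial timings (ms) to the statistics shown in the table."""
    return {
        "median": statistics.median(samples_ms),
        "min": min(samples_ms),
        "max": max(samples_ms),
        "stdev": statistics.stdev(samples_ms),  # needs at least 2 samples
        "n": len(samples_ms),
    }

def bench(fn, trials=5):
    """Time `fn` over `trials` runs and summarize the wall-clock results."""
    samples_ms = []
    for _ in range(trials):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return summarize(samples_ms)
```

With only 5 trials per operation, a single slow run moves the max and the spread considerably while the median stays stable, which is why the median is the headline figure.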

Component Costs

Component	What	Median	Min–Max	N	Bar
it2api list-sessions	List all iTerm2 panes — one subprocess spawn	303ms	300–331	5	303ms
it2api get-buffer (1 pane)	Read terminal buffer for one pane — ~300ms is Python process startup	310ms	294–368	5	310ms
get-buffer x2 sequential	Read 2 pane buffers one at a time	627ms	602–642	5	627ms
get-buffer x2 ThreadPool(12)	Read 2 pane buffers via ThreadPoolExecutor(12) — current production path	423ms	373–426	5	423ms
rg content search (630MB)	ripgrep 'model' across 353 JSONL files	55ms	54–92	5	55ms
JSONL version (1 session)	Direct path lookup from cwd + seek last 8KB for version field	0ms	0–0	5	0ms
JSONL version (42 sessions)	Direct path + seek tail for all 42 sessions	5ms	4–5	3	5ms — Found version in 23/42
tab color cycle (1 pane)	set_tab_color → use_tab_color=true → use_tab_color=false. 3 it2api calls	1004ms	974–1028	5	1004ms
Hub DB badge load	SQLite query for session badges	0ms	0–1	5	0ms — 0 badges
claude --version	Get installed CC version — cached for 5min in production	51ms	51–53	3	51ms
YAML + it2api cross-ref	Load sessions.yaml + it2api list → match entries to panes. Minimum cost for liveness check	327ms	299–343	5	327ms — 2 alive of 42
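
The "seek last 8KB" version lookup can be sketched as follows. It assumes each JSONL record is a JSON object that may carry a top-level `version` key; the exact log schema is not given in this report, so treat the key name and the function name as illustrative.

```python
import json
import os

def read_tail_version(path, tail_bytes=8192):
    """Return the most recent `version` found in the last `tail_bytes` of a JSONL file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(max(0, size - tail_bytes))  # jump near the end instead of reading it all
        tail = f.read()
    # Walk the lines newest-first so the latest version wins.
    for raw in reversed(tail.splitlines()):
        try:
            record = json.loads(raw)
        except ValueError:
            continue  # the first line in the window is usually cut mid-record
        if isinstance(record, dict) and "version" in record:
            return record["version"]
    return None
```

The cost depends on `tail_bytes` rather than on file size, which is what keeps the per-file read well under a millisecond even for very large transcripts.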

Scaling: How session_tool list Scales with N Alive Sessions

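Fixed costs: yaml + it2api list (327ms) + JSONL version reads (5ms) + hub DB (budgeted at ~5ms; measured 0–1ms), about 337ms in total.
Variable cost: buffer reads for alive sessions, at 310ms per call via it2api subprocess.
The table is a projection from these component medians, not a measurement: at 2 alive, the measured median for session_tool list is 856ms against a projected 647ms.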

N alive	Sequential	ThreadPool(12)	MCPretentious (projected)
1	647ms	647ms	338ms
2	957ms	647ms	339ms
5	1888ms	647ms	342ms
10	3440ms	647ms	348ms
15	4992ms	957ms	354ms
25	8096ms	1268ms	366ms

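ThreadPool(12): Currently implemented. Runs up to 12 buffer reads concurrently, so up to 12 alive sessions cost about one read (~310ms); beyond that, reads proceed in batches of 12.
MCPretentious: NOT implemented. A persistent WebSocket connection would avoid the per-call subprocess startup, cutting each buffer read from 310ms to a projected 1.2ms (roughly 255x faster). This is the one remaining optimization.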
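
The projections in the table follow from a simple cost model, sketched below using the rounded medians quoted in this section. Because the inputs are rounded, the results can differ from the table by a few milliseconds at larger N; the function and constant names are illustrative.

```python
import math

FIXED_MS = 327 + 5 + 5   # yaml + it2api list, JSONL version reads, hub DB
READ_MS = 310            # one it2api buffer read
MCP_READ_MS = 1.2        # projected read over a persistent WebSocket

def projected_list_ms(n_alive, strategy, workers=12):
    """Project session_tool list latency (ms) for `n_alive` sessions."""
    if strategy == "sequential":
        variable = n_alive * READ_MS
    elif strategy == "threadpool":
        # reads run in batches of `workers`; each batch costs about one read
        variable = math.ceil(n_alive / workers) * READ_MS
    elif strategy == "mcp":
        variable = n_alive * MCP_READ_MS
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return FIXED_MS + variable
```

For example, `projected_list_ms(15, "threadpool")` needs two batches and gives 957, matching the 15-session row.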

Optimization Status

	Optimization	Impact
✓	ThreadPoolExecutor(12) for buffer reads	N×310ms → ceil(N/12)×310ms
✓	Direct JSONL path from cwd (no rglob)	~1000ms → 5ms for 42 sessions
✓	seek(-8KB) tail read (no readlines)	Avoids loading 45MB JSONL files
✓	Cached claude --version (5min TTL)	~50ms saved per call after first
○	MCPretentious for buffer reads	310ms → 1.2ms per read (255x)
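
The 5-minute cache for `claude --version` amounts to a small time-to-live cache. The sketch below shows one way to write it; the class name and the injectable clock are illustrative choices, not the tool's actual implementation.

```python
import time

class TTLCache:
    """Cache the result of `fetch` and refresh it once `ttl_s` seconds have passed."""

    def __init__(self, fetch, ttl_s=300.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl_s = ttl_s
        self._clock = clock        # injectable so expiry can be tested without sleeping
        self._value = None
        self._expires_at = None    # None means nothing has been fetched yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl_s
        return self._value
```

In use, `fetch` would be a function that runs `claude --version` in a subprocess and returns its output, so only the first call in each 5-minute window pays the ~50ms cost.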