Assessment of async badge update system
| Event | Handler |
|---|---|
| UserPromptSubmit | handler.py user_prompt |
| PostToolUse | handler.py tool_use |
Hooks are correctly wired in ~/.claude/settings.json
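For context, a minimal sketch of how such a handler entry point is typically shaped, assuming Claude Code invokes it as `handler.py user_prompt` / `handler.py tool_use` and passes the hook payload as JSON on stdin (the function bodies here are placeholders, not the real implementation):

```python
#!/usr/bin/env python3
"""Illustrative sketch of the hook entry point; the real handler.py may differ."""
import json
import sys


def handle_user_prompt(payload: dict) -> None:
    # Real handler: post_event() to the hub, then spawn_badge_worker().
    pass


def handle_tool_use(payload: dict) -> None:
    # Real handler: same pattern for PostToolUse events.
    pass


def main() -> int:
    mode = sys.argv[1] if len(sys.argv) > 1 else "user_prompt"
    payload = json.loads(sys.stdin.read() or "{}")  # hook data arrives as JSON on stdin
    if mode == "user_prompt":
        handle_user_prompt(payload)
    elif mode == "tool_use":
        handle_tool_use(payload)
    return 0  # a non-zero exit would be reported as a hook error


if __name__ == "__main__":
    sys.exit(main())
```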
Debounce: 120 seconds between badge updates.
Lock file: /tmp/sidekick-badge-lock
Prevents badge-update spam during rapid successive prompts.
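A sketch of how the 120-second debounce could be implemented against that lock file, assuming an mtime-based check (the real handler may track this differently):

```python
import os
import time

LOCK_FILE = "/tmp/sidekick-badge-lock"
DEBOUNCE_SECONDS = 120


def should_update_badge(now: float | None = None) -> bool:
    """Return True at most once per debounce window, tracked via the lock file's mtime."""
    now = time.time() if now is None else now
    try:
        last = os.path.getmtime(LOCK_FILE)
    except OSError:
        last = 0.0  # no lock file yet: first update is allowed
    if now - last < DEBOUNCE_SECONDS:
        return False  # a badge update ran recently; skip this one
    with open(LOCK_FILE, "a"):
        pass  # create the file if it does not exist
    os.utime(LOCK_FILE, (now, now))  # mark this update
    return True
```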
Hub posting: events are POSTed to http://localhost:7895.
Timeout: 2 seconds (non-blocking).
Gracefully handles the hub being down.
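A sketch of the non-blocking post, assuming a plain JSON POST; the /events path and payload shape are assumptions:

```python
import json
import urllib.error
import urllib.request

HUB_URL = "http://localhost:7895"


def post_event(event: dict, timeout: float = 2.0) -> None:
    """Best-effort POST to the hub; silently no-ops if the hub is down."""
    req = urllib.request.Request(
        HUB_URL + "/events",  # illustrative path; the real endpoint may differ
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        urllib.request.urlopen(req, timeout=timeout).close()
    except (urllib.error.URLError, OSError):
        pass  # hub down or slow: never block or fail the hook
```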
Model: gemini/gemini-2.5-flash-lite
Now async via a subprocess; the hook returns immediately.
subprocess.Popen([...badge_worker.py...], start_new_session=True)
✓ Fixed 2026-02-05
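A sketch of the fire-and-forget spawn behind the Popen line above; the worker path and the argument passed to it are illustrative:

```python
import subprocess
import sys

# Illustrative path; the real badge_worker.py location is elided above.
BADGE_WORKER = "supervisor/sidekick/hooks/badge_worker.py"


def spawn_badge_worker(session_summary: str) -> None:
    """Launch the badge worker detached so the hook returns immediately."""
    subprocess.Popen(
        [sys.executable, BADGE_WORKER, session_summary],
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # detach from the hook's process group
    )
```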
14 tests in supervisor/sidekick/hooks/test_handler.py
✓ Added 2026-02-05
Basic format tests added (header, label).
No visual/screenshot expectations yet.
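For illustration only, a format test of the kind described might look like this; the function under test and its output shape are hypothetical stand-ins for what test_handler.py actually checks:

```python
# Hypothetical badge-format test; the real assertions differ.
def build_badge_text(project: str, summary: str) -> str:
    """Stand-in for the real badge formatter under test."""
    return f"{project}: {summary}"


def test_badge_has_header_and_label():
    text = build_badge_text("sidekick", "running tests")
    header, _, label = text.partition(": ")
    assert header == "sidekick"      # header present
    assert label == "running tests"  # label present
```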
| Component | Status | Notes |
|---|---|---|
| Hook wiring | OK | — |
| Hub posting | OK | — |
| Debounce | OK | — |
| LLM call | OK | Async subprocess (haiku, free) |
| Tests | OK | 14 tests passing |
| Expectations | PARTIAL | Basic format tests only; no visual checks yet |
Claude Code Session
│
▼ (Hook: UserPromptSubmit)
┌──────────────────────────────────────┐
│ handler.py user_prompt │
│ ├─→ post_event() to Hub [2s timeout]│ ✅ Non-blocking
│ └─→ spawn_badge_worker() │ ✅ Fire-and-forget
└──────────────────────────────────────┘
│ (returns immediately)
▼
┌──────────────────────────────────────┐
│ badge_worker.py (subprocess) │
│ ├─→ litellm.completion(haiku) │ ~2s (free)
│ └─→ badge $'...' │ ✅ Sets badge
└──────────────────────────────────────┘
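A sketch of the worker step in the diagram, assuming litellm and a badge CLI on PATH; the prompt, model string, and argument handling are illustrative:

```python
#!/usr/bin/env python3
"""Illustrative badge worker: summarize the session context, then set the badge."""
import subprocess
import sys

import litellm


def main() -> int:
    context = sys.argv[1] if len(sys.argv) > 1 else ""
    resp = litellm.completion(
        model="gemini/gemini-2.5-flash-lite",  # model named earlier in this doc
        messages=[{"role": "user", "content": f"Summarize in at most 6 words: {context}"}],
    )
    label = (resp.choices[0].message.content or "").strip()
    # `badge` is assumed to be the sidekick CLI that sets the badge text.
    subprocess.run(["badge", label], check=False)
    return 0


if __name__ == "__main__":
    sys.exit(main())
```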
| File | Purpose |
|---|---|
| supervisor/sidekick/hooks/handler.py | Main hook handler |
| supervisor/sidekick/CLAUDE.md | Documentation |
| ~/.claude/settings.json | Hook wiring config |
| /tmp/sidekick-badge-lock | Debounce lock file |
| Approach | Latency | Notes |
|---|---|---|
| Direct Anthropic SDK | ~1.0s | Best possible |
| Subprocess + SDK (current) | ~2.1s | Non-blocking |
| litellm | ~2.3s | +0.5s import overhead |
| claude --print | ~5.0s | Heavy CLI startup |
| Hot worker (future) | ~0.8s | No startup cost |
See: lib/llm/benchmarks/CALLING_OVERHEAD.md
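For reference, the "Direct Anthropic SDK" row in the table above corresponds to a call shaped like this; the model name and prompt are illustrative:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

resp = client.messages.create(
    model="claude-3-5-haiku-latest",
    max_tokens=32,
    messages=[{"role": "user", "content": "Summarize this session in at most 6 words."}],
)
print(resp.content[0].text)
```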
A persistent LLM worker would cut latency to ~0.8s:
┌─────────────────────────────────────┐
│ llm-worker (always running) │
│ - anthropic client pre-initialized │
│ - Listens on unix socket or HTTP │
└─────────────────────────────────────┘
▲ POST /complete
handler.py → instant response
Options: add it to the hub, run it as a standalone worker, or run it as a Unix-socket daemon.
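A minimal sketch of the standalone-worker option, assuming an HTTP endpoint at POST /complete and a client initialized once at startup; the port and model are illustrative:

```python
#!/usr/bin/env python3
"""Illustrative hot LLM worker: the client is created once, so per-call startup cost disappears."""
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

import anthropic

CLIENT = anthropic.Anthropic()  # pre-initialized; reused across requests


class CompleteHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/complete":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        resp = CLIENT.messages.create(
            model="claude-3-5-haiku-latest",  # illustrative model choice
            max_tokens=64,
            messages=[{"role": "user", "content": body.get("prompt", "")}],
        )
        out = json.dumps({"text": resp.content[0].text}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(out)))
        self.end_headers()
        self.wfile.write(out)


if __name__ == "__main__":
    # Illustrative port; handler.py would POST {"prompt": ...} to /complete.
    HTTPServer(("127.0.0.1", 7896), CompleteHandler).serve_forever()
```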