# Generic YouTube Portal — Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Generalize the HG YouTube portal to support any channel, with a unified DB and channel-filtered views.

**Architecture:** One unified `youtube.db` in `lib/semnet/data/`. Portal generator parameterized by channel. Pool resolver as testable unit for URL selection. HG becomes an instance of generic code.

**Tech Stack:** Python, SQLite, Click CLI, lunr.js (client-side search), YAML config

---

### Task 1: Schema — add `channel` column to chunks table

**Files:**
- Modify: `lib/semnet/schema.py`

**Step 1:** Add `channel TEXT DEFAULT ''` to `CHUNKS_SCHEMA` and add an index on `channel`.

**Step 2:** Add migration in `ensure_schema()` — ALTER TABLE ADD COLUMN if missing (safe for existing DBs).
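
The migration in Step 2 could look roughly like the sketch below. It assumes the table is named `chunks` and that `ensure_schema()` has a `sqlite3.Connection` available; the helper name `ensure_channel_column` and the index name `idx_chunks_channel` are hypothetical, not taken from `schema.py`.

```python
import sqlite3


def ensure_channel_column(conn: sqlite3.Connection) -> bool:
    """Add chunks.channel if it is missing; return True if a migration ran.

    Hypothetical helper: the table and index names are assumptions.
    """
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(chunks)")}
    migrated = "channel" not in columns
    if migrated:
        # Existing rows read back the declared default ('').
        conn.execute("ALTER TABLE chunks ADD COLUMN channel TEXT DEFAULT ''")
    # IF NOT EXISTS makes this safe to run on every startup.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_chunks_channel ON chunks(channel)"
    )
    conn.commit()
    return migrated
```

Checking `PRAGMA table_info` first keeps the call idempotent, so it can run against both fresh and existing DBs.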

**Step 3:** Run existing tests to verify no breakage: `pytest lib/semnet/tests/ -v`

**Step 4:** Commit: `feat(semanticnet): add channel column to chunks schema`

---

### Task 2: Create `lib/semnet/data/` and pool config

**Files:**
- Create: `lib/semnet/data/.gitkeep`
- Create: `lib/semnet/pool/__init__.py`
- Create: `lib/semnet/pool/config.yaml`
- Create: `lib/semnet/pool/resolve.py`
- Create: `lib/semnet/pool/tests/test_resolve.py`

**Pool config shape:**
```yaml
channels:
  a16z:
    title: "a16z"
    sources:
      - type: youtube_channel
        url: https://www.youtube.com/@a16z
    max_age_days: 730
    min_score: 40
    limit: 50
    sort: relevance
    tags: [tech, vc, startups]

  healthygamer:
    title: "HealthyGamer"
    sources:
      - type: youtube_channel
        url: https://www.youtube.com/@HealthyGamerGG
    sort: recency
    tags: [mental-health, psychology]
```

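**Resolver:** Reads discovery data and scores from the jobs DB, applies the channel's filters (`max_age_days`, `min_score`), orders the results by `sort`, and returns a ranked list capped at `limit`. CLI: `python -m lib.semnet.pool resolve a16z`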
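
A minimal sketch of the filter-and-rank core, kept free of DB access so it can be unit-tested with mock data. The `Candidate` shape, the `resolve` signature, and the sort semantics are assumptions: `relevance` is taken to mean score descending, `recency` to mean publish date descending, and a missing config key to mean "no filter".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical shape of one discovered video from the jobs DB."""
    url: str
    published_at: datetime
    score: float


def resolve(
    candidates: Iterable[Candidate],
    *,
    now: datetime,
    max_age_days: Optional[int] = None,
    min_score: Optional[float] = None,
    limit: Optional[int] = None,
    sort: str = "recency",
) -> List[Candidate]:
    """Filter by age and score, rank by `sort`, then cap at `limit`."""
    cutoff = now - timedelta(days=max_age_days) if max_age_days is not None else None
    kept = [
        c
        for c in candidates
        if (cutoff is None or c.published_at >= cutoff)
        and (min_score is None or c.score >= min_score)
    ]
    if sort == "relevance":
        kept.sort(key=lambda c: c.score, reverse=True)  # assumed: highest score first
    elif sort == "recency":
        kept.sort(key=lambda c: c.published_at, reverse=True)  # newest first
    else:
        raise ValueError(f"unknown sort: {sort!r}")
    return kept if limit is None else kept[:limit]
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside, keeps the age filter deterministic under test.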

**Step 1:** Write test for resolve with mock data.

**Step 2:** Implement resolver.

**Step 3:** Verify: `pytest lib/semnet/pool/tests/ -v`

**Step 4:** Commit: `feat(semanticnet): pool resolver for channel URL management`

---

### Task 3: Generic portal generator

**Files:**
- Create: `lib/semnet/portal/__init__.py`
- Create: `lib/semnet/portal/generate.py` (extracted from `projects/healthygamer/generate_portal.py`)

**Key changes from HG version:**
- `generate(channel: str, *, title: str | None = None, output_dir: Path | None = None)` — parameterized
- Reads from unified DB, filtered by `WHERE channel = ?`
- Title defaults to channel config title
- Output dir defaults to `projects/{channel}/portal/`
- No HG branding — uses channel title
- Drops review queue seeding (HG-specific)
- CLI: `python -m lib.semnet.portal generate a16z`
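
The two channel-dependent pieces of `generate()` listed above, the filtered read and the default output directory, might be sketched as follows. The helper names `load_chunks` and `resolve_output_dir` and the `id`/`text` columns are hypothetical; only the `channel` column and the `projects/{channel}/portal/` default come from this plan.

```python
import sqlite3
from pathlib import Path
from typing import List, Optional, Tuple


def load_chunks(conn: sqlite3.Connection, channel: str) -> List[Tuple[int, str]]:
    """Return (id, text) rows for one channel from the unified DB."""
    # Parameter binding avoids interpolating the channel name into SQL.
    return conn.execute(
        "SELECT id, text FROM chunks WHERE channel = ? ORDER BY id",
        (channel,),
    ).fetchall()


def resolve_output_dir(channel: str, output_dir: Optional[Path] = None) -> Path:
    """Use the caller's directory if given, else projects/{channel}/portal."""
    if output_dir is not None:
        return output_dir
    return Path("projects") / channel / "portal"
```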

**Step 1:** Create `portal/generate.py` with parameterized `generate()`.

**Step 2:** Verify that it generates the same portal output as the existing HG generator when run on HG data.

**Step 3:** Commit: `feat(semanticnet): generic portal generator`

---

### Task 4: Generic pipeline

**Files:**
- Create: `lib/semnet/pipeline.py` (extracted from `projects/healthygamer/pipeline.py`)

**Key changes:**
- Takes channel name as parameter, looks up config
- Uses unified DB at `lib/semnet/data/youtube.db`
- Sets `channel` column on stored chunks
- CLI: `python -m lib.semnet.pipeline process a16z --limit 50`
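
Setting the `channel` column at write time could look like the sketch below. The `store_chunks` helper and the single `text` column are hypothetical simplifications; the real chunk rows will carry more fields.

```python
import sqlite3
from typing import Iterable


def store_chunks(conn: sqlite3.Connection, channel: str, texts: Iterable[str]) -> int:
    """Insert chunks tagged with their channel; return how many were written."""
    rows = [(text, channel) for text in texts]
    # One executemany call stamps every row with the same channel value.
    conn.executemany("INSERT INTO chunks (text, channel) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)
```

Tagging rows on insert, rather than backfilling later, means the portal's `WHERE channel = ?` filter works as soon as the first pipeline run finishes.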

**Step 1:** Create `pipeline.py` with channel-parameterized `process`.

**Step 2:** Commit: `feat(semanticnet): generic pipeline with channel parameter`

---

### Task 5: Make HG an instance

**Files:**
- Modify: `projects/healthygamer/generate_portal.py` → thin wrapper
- Modify: `projects/healthygamer/pipeline.py` → thin wrapper
- Keep: `projects/healthygamer/adapter.py` (taxonomy + prompts, still needed for extraction)

**Step 1:** Replace the body of `generate_portal.py` with an import of, and a call to, the generic generator.

**Step 2:** Replace the body of `pipeline.py` with an import of, and a call to, the generic pipeline.

**Step 3:** Verify HG portal still generates correctly.

**Step 4:** Commit: `refactor(healthygamer): use generic portal + pipeline`

---

### Task 6: a16z adapter + first test

**Files:**
- Create: `projects/a16z/` directory
- Create: `projects/a16z/adapter.py` (minimal — just points at content dir)
- Optional: `projects/a16z/portal/` (generated output)

**Step 1:** Create a16z adapter pointing at `jobs/data/a16z/` content.

**Step 2:** Run pool resolver: `python -m lib.semnet.pool resolve a16z`

**Step 3:** Run pipeline: `python -m lib.semnet.pipeline process a16z --limit 5`

**Step 4:** Generate portal: `python -m lib.semnet.portal generate a16z`

**Step 5:** Commit: `feat(a16z): first channel portal via generic pipeline`
