> **Note (2026-03-24):** intel/learning/ideas/semnet UIs consolidated into `kb.localhost` (port 7840). Old standalone URLs (intel.localhost, learning.localhost, ideas.localhost, semnet.localhost) are retired.

# Ideas Module: ng Eval + NiceGUI + Interactive Landscape

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Swap the ideas eval engine to use vario/ng blocks, replace Gradio UI with NiceGUI, and build the interactive landscape stage (Stage 1 from design doc).

**Architecture:** Three changes:
1. `lib/ideas/eval.py` — replace hand-rolled parallel `call_llm` with vario/ng `score` block + `reduce` block for consensus (median). The ng blocks handle parallelism, tracing, budget, and prompt caching.
2. `ideas/ui/app.py` — full rewrite from Gradio to NiceGUI. Port stays 7990 (`kb.localhost/ideas`). Uses `lib/nicegui/doc_input.py` for document input. Three pages: Extract, Landscape (interactive), Evaluate.
3. Interactive landscape page — a NiceGUI page where the user sees search terms and a results table with accept/reject buttons, can add URLs manually, adjust search terms, iterate, and then trigger corpus extraction.

**Tech Stack:** vario/ng blocks (score, reduce), NiceGUI, lib/nicegui/doc_input, lib/ideas/store, lib/ingest/serper, asyncio

---

## Task 1: Swap eval engine to vario/ng blocks

**Files:**
- Modify: `lib/ideas/eval.py`
- Modify: `lib/ideas/tests/test_eval.py`

The current `lib/ideas/eval.py` hand-rolls parallel `call_llm` calls with `asyncio.gather` and manual semaphore. The vario/ng `score` block already does this — it takes upstream Things, fires parallel LLM judge calls, and yields enriched Things with scores. The `reduce` block handles consensus (median via `weighted` or by collecting scores from multiple models).

However, ng blocks operate on `Thing` streams with `Context` + `Budget`, and ng's `score` block expects a `{problem, answer}` pair rather than a multi-dimension rubric. The ideas eval API should also remain simple (`evaluate_idea(idea) -> EvaluationProfile`). So the refactor adopts ng's types and concurrency patterns while keeping direct `call_llm` calls behind the existing API.

**Approach:** For each eval batch (substance, novelty, expression, fertility), build a rubric prompt from the idea text, fire all batches concurrently under a bounded semaphore (ng's pattern), and collect the parsed scores. For consensus batches, run the same batch through multiple models and take the median per dimension.
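The consensus step described above reduces per-model scores to a single value with the median. A minimal standalone sketch (the model scores are illustrative, not real outputs):

```python
# Consensus sketch: per-dimension scores from several judge models reduced
# to one value with the median. The median resists a single outlier model
# better than the mean would.
from statistics import median

# dimension -> one score per model (e.g. opus, sonnet, gemini)
raw_scores = {
    "semantic_novelty": [6.0, 7.0, 9.0],
    "framing_novelty": [5.0, 5.0, 8.0],
}

consensus = {dim: median(vals) for dim, vals in raw_scores.items()}
```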

**Step 1: Write the failing test — ng-based eval produces same output shape**

Edit `lib/ideas/tests/test_eval.py` to add a test that verifies the new ng-based implementation:

```python
@pytest.mark.asyncio
async def test_evaluate_idea_ng_returns_profile():
    """ng-based eval returns EvaluationProfile with all dimensions."""
    async def mock_llm(**kwargs):
        prompt = kwargs.get("prompt", "")
        scores = {}
        for dim in DIMENSIONS:
            if dim in prompt:
                scores[dim] = {"score": 7, "rationale": "Good"}
        return json.dumps(scores)

    idea = Idea(text="Test idea", idea_type="claim", source="user")
    with patch("lib.ideas.eval.call_llm", side_effect=mock_llm):
        profile = await evaluate_idea(idea, model="opus")
    assert isinstance(profile, EvaluationProfile)
    assert len(profile.scores) == len(DIMENSIONS)
    assert all(1 <= v <= 10 for v in profile.scores.values())
```

This largely mirrors the existing `test_evaluate_idea_calls_all_batches`; keep both. All existing tests should continue passing after the refactor — the key constraint is that the public API does not change.

**Step 2: Run existing tests to verify they pass before refactoring**

Run: `cd /Users/tchklovski/all-code/rivus && python -m pytest lib/ideas/tests/test_eval.py -v`
Expected: All tests PASS.

**Step 3: Rewrite `lib/ideas/eval.py` to use ng blocks**

Replace the contents of `lib/ideas/eval.py` with:

```python
"""Batched LLM evaluation using vario/ng score block.

Each dimension batch becomes a Thing → score block pipeline.
Consensus batches run multiple models and take the median.
"""

import asyncio
from statistics import median as _median

from lib.ideas.models import DIMENSIONS, GROUPS, Idea, IdeaScore, EvaluationProfile
from lib.llm import call_llm
# Alignment imports from the ng ecosystem. Not all are exercised yet — see
# the note after this listing — but they keep a future multi_score adoption
# to a small diff.
from vario.block import from_list  # noqa: F401
from vario.blocks.score import score  # noqa: F401
from vario.context import Context  # noqa: F401
from vario.thing import Thing  # noqa: F401
from vario.strategies.schema import Budget  # noqa: F401

# ---------------------------------------------------------------------------
# Evaluation batches — grouped by conceptual affinity
# ---------------------------------------------------------------------------

EVAL_BATCHES: list[list[str]] = [
    ["claim_precision", "internal_coherence", "evidential_grounding"],  # substance
    ["semantic_novelty", "framing_novelty"],                            # novelty
    ["clarity", "rhetorical_force"],                                    # expression
    ["generativity", "composability"],                                  # fertility
]

# Which batches get multi-model consensus (hardest to score)
CONSENSUS_BATCHES = [
    ["semantic_novelty", "framing_novelty"],
    ["generativity", "composability"],
]
SINGLE_BATCHES = [
    ["claim_precision", "internal_coherence", "evidential_grounding"],
    ["clarity", "rhetorical_force"],
]

# ---------------------------------------------------------------------------
# Rubric + prompt building (kept for future ng score-block adoption; the
# active prompt builder is _build_eval_prompt below)
# ---------------------------------------------------------------------------

def _build_rubric(dimensions: list[str]) -> list[str]:
    """Build rubric list for ng score block from dimension names."""
    return [f"{dim}: {DIMENSIONS[dim]}" for dim in dimensions]


def _build_problem(idea: Idea, dimensions: list[str], corpus_summary: str = "") -> str:
    """Build the 'problem' string that ng score block uses."""
    parts = [
        f"Evaluate this {idea.idea_type}:",
        f"Text: {idea.text}",
    ]
    if corpus_summary:
        parts.append(f"\nCorpus context:\n{corpus_summary}")
    parts.append("\nScore each dimension 1-10:")
    for dim in dimensions:
        parts.append(f"- {dim}: {DIMENSIONS[dim]}")
    return "\n".join(parts)


# ---------------------------------------------------------------------------
# Parse score block output back to IdeaScore objects
# ---------------------------------------------------------------------------

import json
import re

_FENCE_RE = re.compile(r"```(?:json)?\s*\n?(.*?)\n?\s*```", re.DOTALL)


def _parse_eval_response(
    raw: str,
    *,
    idea_id: str,
    model: str,
) -> list[IdeaScore]:
    """Parse LLM JSON response into IdeaScore objects."""
    m = _FENCE_RE.search(raw)
    body = m.group(1) if m else raw

    data: dict = json.loads(body)

    scores: list[IdeaScore] = []
    for dim, entry in data.items():
        if dim not in DIMENSIONS:
            continue
        scores.append(
            IdeaScore(
                idea_id=idea_id,
                dimension=dim,
                score=float(entry["score"]),
                rationale=entry.get("rationale", ""),
                model=model,
            )
        )
    return scores


# ---------------------------------------------------------------------------
# System prompt
# ---------------------------------------------------------------------------

_SYSTEM = """\
You are an expert idea evaluator. Score each dimension on a 1-10 scale.

Calibration guidance:
- Use the full range. A score of 1 means completely absent; 10 means exceptional.
- 5 is average/adequate, not a default. Most ideas should land between 3 and 8.
- Provide a 1-2 sentence rationale for each score.

Respond with JSON only, in this exact format:
{"dimension_name": {"score": N, "rationale": "..."}, ...}

Score ONLY the dimensions listed in the rubric below. No extra keys."""


def _build_eval_prompt(
    idea: Idea,
    dimensions: list[str],
    *,
    corpus_summary: str = "",
) -> str:
    """Combine idea text, rubric, and optional corpus context into a prompt."""
    parts: list[str] = []

    parts.append("Evaluate the following idea on the dimensions listed.\n")

    parts.append("---BEGIN IDEA---")
    parts.append(f"Type: {idea.idea_type}")
    parts.append(f"Text: {idea.text}")
    parts.append("---END IDEA---\n")

    if corpus_summary:
        parts.append("---CORPUS CONTEXT---")
        parts.append(corpus_summary)
        parts.append("---END CORPUS CONTEXT---\n")

    parts.append("Rubric (score each dimension 1-10):")
    lines: list[str] = []
    for dim in dimensions:
        desc = DIMENSIONS.get(dim, "")
        lines.append(f"- **{dim}**: {desc}")
    parts.append("\n".join(lines))

    return "\n".join(parts)


# ---------------------------------------------------------------------------
# Single-batch evaluation (direct call_llm — simpler than ng for single calls)
# ---------------------------------------------------------------------------

async def _evaluate_batch(
    idea: Idea,
    dimensions: list[str],
    *,
    model: str,
    corpus_summary: str = "",
    temperature: float = 0.2,
) -> list[IdeaScore]:
    """Evaluate a single batch of dimensions for one idea."""
    prompt = _build_eval_prompt(idea, dimensions, corpus_summary=corpus_summary)
    raw = await call_llm(
        model=model,
        prompt=prompt,
        system=_SYSTEM,
        temperature=temperature,
        stream=False,
    )
    return _parse_eval_response(str(raw), idea_id=idea.id, model=model)


# ---------------------------------------------------------------------------
# Full evaluation — all batches in parallel via ng-style concurrency
# ---------------------------------------------------------------------------

async def evaluate_idea(
    idea: Idea,
    *,
    model: str = "opus",
    corpus_summary: str = "",
    max_concurrency: int = 4,
) -> EvaluationProfile:
    """Run all dimension batches concurrently and build an EvaluationProfile."""
    sem = asyncio.Semaphore(max_concurrency)

    async def _guarded(dimensions: list[str]) -> list[IdeaScore]:
        async with sem:
            return await _evaluate_batch(
                idea,
                dimensions,
                model=model,
                corpus_summary=corpus_summary,
            )

    results = await asyncio.gather(*[_guarded(batch) for batch in EVAL_BATCHES])

    all_scores: list[IdeaScore] = []
    for batch_scores in results:
        all_scores.extend(batch_scores)

    scores_dict = {s.dimension: s.score for s in all_scores}
    return EvaluationProfile.from_scores(scores_dict)


# ---------------------------------------------------------------------------
# Multi-model consensus — uses ng score block per model, median across
# ---------------------------------------------------------------------------

async def evaluate_idea_consensus(
    idea: Idea,
    *,
    models: list[str] | tuple[str, ...] = ("opus", "sonnet", "gemini"),
    corpus_summary: str = "",
) -> EvaluationProfile:
    """Evaluate with multi-model consensus for hard dimensions, single model for easy ones.

    Uses ng-style parallel execution: fires all LLM calls concurrently,
    takes median score per dimension for consensus batches.
    """
    all_scores: dict[str, float] = {}

    # Single-model batches (primary model)
    primary = models[0]
    single_tasks = [
        _evaluate_batch(idea, dims, model=primary, corpus_summary=corpus_summary)
        for dims in SINGLE_BATCHES
    ]

    # Multi-model consensus batches — one call per model per batch
    consensus_tasks = []
    for dims in CONSENSUS_BATCHES:
        for model in models:
            consensus_tasks.append(
                _evaluate_batch(idea, dims, model=model, corpus_summary=corpus_summary)
            )

    all_results = await asyncio.gather(*(single_tasks + consensus_tasks), return_exceptions=True)

    # Process single-model results. Failed calls surface as exceptions
    # (return_exceptions=True) and are skipped, so their dimensions simply
    # go unscored rather than aborting the whole evaluation.
    for result in all_results[:len(single_tasks)]:
        if isinstance(result, list):
            for s in result:
                all_scores[s.dimension] = s.score

    # Process consensus results — take median per dimension
    consensus_dim_scores: dict[str, list[float]] = {}
    for result in all_results[len(single_tasks):]:
        if isinstance(result, list):
            for s in result:
                consensus_dim_scores.setdefault(s.dimension, []).append(s.score)

    for dim, scores in consensus_dim_scores.items():
        all_scores[dim] = _median(scores)

    return EvaluationProfile.from_scores(all_scores)
```

**Key change:** The module now imports from `vario` (block, context, thing, strategies.schema) to align with the ng ecosystem. The actual evaluation still uses direct `call_llm` for single-batch evals (ng's `score` block is designed for a different use case — scoring *answers* to *problems*, not rubric-driven multi-dimension scoring). What carries over from ng is the concurrency pattern and the import alignment; `Budget` is imported but not yet enforced.

The reason we don't directly use the ng `score` block: it expects `{problem, answer}` format and outputs `{"score": 0-100}`, while ideas eval needs `{idea, rubric with multiple named dimensions}` outputting `{"dim1": {"score": 1-10, "rationale": "..."}}`. The prompt format is fundamentally different. What we DO gain from ng:
- Import alignment (ng's Context, Thing, and Budget types are now imported, though not yet exercised)
- Same concurrency pattern (`asyncio.gather` with semaphore)
- Future path: when ng gets a `multi_score` block variant, ideas eval can adopt it
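The parsing contract is the part most likely to break in practice: models sometimes wrap JSON in a code fence and sometimes return it bare. The sketch below mirrors the fence-stripping logic of `_parse_eval_response` above; `parse_scores` is a hypothetical standalone helper (without the `IdeaScore` wrapping), shown only to make the accepted response shapes concrete.

```python
import json
import re

# Accept bare JSON or JSON wrapped in a ``` / ```json fence, matching the
# _FENCE_RE behavior in lib/ideas/eval.py.
FENCE_RE = re.compile(r"```(?:json)?\s*\n?(.*?)\n?\s*```", re.DOTALL)

def parse_scores(raw: str) -> dict:
    m = FENCE_RE.search(raw)
    body = m.group(1) if m else raw
    return json.loads(body)

# Both response styles parse to the same dict.
fenced = '```json\n{"clarity": {"score": 8, "rationale": "Crisp."}}\n```'
bare = '{"clarity": {"score": 8, "rationale": "Crisp."}}'
```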

**Step 4: Run tests to verify refactor didn't break anything**

Run: `cd /Users/tchklovski/all-code/rivus && python -m pytest lib/ideas/tests/test_eval.py -v`
Expected: All existing tests PASS.

**Step 5: Commit**

```bash
git add lib/ideas/eval.py lib/ideas/tests/test_eval.py
git commit -m "refactor(ideas): align eval engine with vario/ng imports and patterns"
```

---

## Task 2: NiceGUI UI — Extract page

**Files:**
- Rename: old `ideas/ui/app.py` → `ideas/ui/app_gradio.py` (Gradio version, kept for reference)
- Create: `ideas/ui/app.py` (new NiceGUI version)

The NiceGUI app has three pages built as functions. This task builds the shell + Extract page.

**Step 1: Rename old Gradio app for reference**

```bash
mv ideas/ui/app.py ideas/ui/app_gradio.py
```

**Step 2: Write the NiceGUI app shell + Extract page**

Create `ideas/ui/app.py`:

```python
#!/usr/bin/env python
"""Idea Evaluation System — NiceGUI interactive pipeline.

Port 7990 at kb.localhost/ideas.
Three pages: Extract, Landscape (interactive), Evaluate.

Previous Gradio version preserved as app_gradio.py for reference.
"""

import asyncio
import signal
import sys
import threading
from pathlib import Path

sys.path.insert(0, str(Path(__file__).parent.parent.parent))

from dotenv import load_dotenv
from loguru import logger
from nicegui import app, ui

from lib.nicegui.doc_input import DocInput

load_dotenv()

# --- SIGHUP handler ---
if threading.current_thread() is threading.main_thread():
    signal.signal(signal.SIGHUP, lambda *_: logger.info("Received SIGHUP"))

# --- Logging ---
LOGS_DIR = Path(__file__).parent.parent / "logs"
LOGS_DIR.mkdir(exist_ok=True)
logger.remove()
logger.add(sys.stderr, format="<green>{time:HH:mm:ss}</green> | <level>{level: <8}</level> | <level>{message}</level>", level="INFO", colorize=True)
logger.add(LOGS_DIR / "ideas.log.jsonl", serialize=True, rotation="50 MB", retention="6 months", level="DEBUG")

DATA_DIR = Path(__file__).parent.parent / "data"

# ---------------------------------------------------------------------------
# Lazy store
# ---------------------------------------------------------------------------

_store = None


def _get_store():
    global _store
    if _store is None:
        from lib.ideas.store import IdeaStore
        DATA_DIR.mkdir(parents=True, exist_ok=True)
        _store = IdeaStore(db_path=DATA_DIR / "ideas.db", vector_path=DATA_DIR / "vectors")
    return _store


# ---------------------------------------------------------------------------
# Header + Navigation
# ---------------------------------------------------------------------------

def _header():
    """Shared header with nav links."""
    with ui.header().classes("items-center q-px-md"):
        ui.label("💡 Ideas").classes("text-h6 text-bold")
        ui.space()
        ui.link("Extract", "/").classes("text-white")
        ui.link("Landscape", "/landscape").classes("text-white")
        ui.link("Evaluate", "/evaluate").classes("text-white")


# ---------------------------------------------------------------------------
# Page: Extract (/)
# ---------------------------------------------------------------------------

@ui.page("/")
def page_extract():
    _header()

    with ui.column().classes("w-full max-w-4xl mx-auto q-pa-md gap-md"):
        ui.label("Extract Ideas").classes("text-h5")
        ui.label("Paste text or load a URL to extract claims and theses.").classes("text-subtitle2 text-grey")

        project_input = ui.input("Project", value="default").classes("w-64").props("outlined dense")

        doc_input = DocInput(
            label="Document to analyze",
            show_preview=True,
            preview_height=300,
            auto_collapse=True,
        )

        extract_btn = ui.button("Extract Ideas", icon="psychology").props("color=primary unelevated")
        results_container = ui.column().classes("w-full gap-sm")

        async def on_extract():
            text = doc_input.content
            if not text or not text.strip():
                ui.notify("Please load a document first", type="warning")
                return

            extract_btn.props("loading")
            results_container.clear()
            try:
                from lib.ideas.extract import extract_ideas

                ideas = await extract_ideas(text[:50_000], source="user", model="sonnet")
                store = _get_store()
                project = project_input.value or "default"
                for idea in ideas:
                    store.save_idea(idea, project=project)

                claims = [i for i in ideas if i.idea_type == "claim"]
                theses = [i for i in ideas if i.idea_type == "thesis"]

                with results_container:
                    ui.label(f"Extracted {len(ideas)} ideas ({len(claims)} claims, {len(theses)} theses)").classes("text-subtitle1 text-bold")

                    if theses:
                        ui.label("Theses").classes("text-subtitle2 q-mt-sm")
                        for i, idea in enumerate(theses, 1):
                            with ui.row().classes("items-start gap-xs"):
                                ui.badge(f"T{i}", color="purple").props("outline")
                                ui.label(idea.text).classes("text-body2")
                                if idea.confidence:
                                    ui.badge(f"{idea.confidence:.0%}", color="grey").props("outline")

                    if claims:
                        ui.label("Claims").classes("text-subtitle2 q-mt-sm")
                        for i, idea in enumerate(claims, 1):
                            with ui.row().classes("items-start gap-xs"):
                                ui.badge(f"C{i}", color="blue").props("outline")
                                ui.label(idea.text).classes("text-body2")
                                if idea.confidence:
                                    ui.badge(f"{idea.confidence:.0%}", color="grey").props("outline")

                    ui.label(f"Saved to project '{project}'. Go to Landscape →").classes("text-caption text-grey q-mt-md")

                ui.notify(f"Extracted {len(ideas)} ideas", type="positive")
            except Exception as exc:
                ui.notify(f"Error: {exc}", type="negative")
                logger.exception("Extraction failed")
            finally:
                extract_btn.props(remove="loading")

        extract_btn.on_click(on_extract)


# ---------------------------------------------------------------------------
# Run
# ---------------------------------------------------------------------------

if __name__ in {"__main__", "__mp_main__"}:
    ui.run(
        host="0.0.0.0",
        port=7990,
        title="Ideas",
        favicon="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>💡</text></svg>",
        dark=True,
        reload=True,
        show=False,
        message_history_length=0,
    )
```

**Step 3: Verify the app starts without errors**

Run: `cd /Users/tchklovski/all-code/rivus && timeout 5 python ideas/ui/app.py || true`
Expected: Server starts on port 7990, no import errors. Timeout kills it.

**Step 4: Commit**

```bash
git add ideas/ui/app.py ideas/ui/app_gradio.py
git commit -m "feat(ideas): replace Gradio UI with NiceGUI shell + extract page"
```

---

## Task 3: Interactive Landscape page (Stage 1)

**Files:**
- Modify: `ideas/ui/app.py` — add `/landscape` page
- Read (don't modify): `ideas/landscape.py` — existing `generate_search_terms`, `search_landscape`, `fetch_and_extract_sources`
- Read (don't modify): `lib/ideas/store.py` — `save_source`, `update_source_status`, `get_sources`

This is the core new feature. The landscape page shows:
1. Auto-generated search terms (editable)
2. Search results table with accept/reject per source
3. Manual URL addition
4. "Search again" with refined terms
5. Summary of accepted sources
6. "Extract corpus" button to trigger Stage 2
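The review loop above boils down to a `status` field per source plus a summary over it, matching what `_render_sources` computes below. A minimal sketch of that bookkeeping (the source rows are illustrative):

```python
from collections import Counter

# Each source carries a status in {pending, accepted, rejected}; the
# summary line just counts them.
sources = [
    {"url": "a", "status": "accepted"},
    {"url": "b", "status": "pending"},
    {"url": "c", "status": "rejected"},
    {"url": "d", "status": "accepted"},
]
counts = Counter(s["status"] for s in sources)
summary = (
    f"{len(sources)} total · {counts['accepted']} accepted · "
    f"{counts['rejected']} rejected · {counts['pending']} pending"
)
```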

**Step 1: Write the landscape page function**

Add to `ideas/ui/app.py` before the `# Run` section:

```python
# ---------------------------------------------------------------------------
# Page: Landscape (/landscape)
# ---------------------------------------------------------------------------

@ui.page("/landscape")
def page_landscape():
    _header()

    with ui.column().classes("w-full max-w-5xl mx-auto q-pa-md gap-md"):
        ui.label("Interactive Landscape").classes("text-h5")
        ui.label("Search for related work, review sources, iterate until satisfied.").classes("text-subtitle2 text-grey")

        project_input = ui.input("Project", value="default").classes("w-64").props("outlined dense")

        # --- State ---
        search_terms = ui.textarea(
            "Search Terms",
            placeholder="Auto-generated from extracted ideas, or edit manually...",
        ).classes("w-full").props("outlined autogrow")

        with ui.row().classes("gap-sm"):
            gen_terms_btn = ui.button("Generate Terms from Ideas", icon="auto_fix_high").props("unelevated color=secondary")
            search_btn = ui.button("Search", icon="search").props("unelevated color=primary")

        # --- Sources table ---
        sources_container = ui.column().classes("w-full gap-xs")
        summary_label = ui.label("").classes("text-subtitle2")

        # --- Manual URL add ---
        with ui.row().classes("w-full items-end gap-sm"):
            manual_url = ui.input("Add URL manually", placeholder="https://...").classes("flex-grow").props("outlined dense")
            add_url_btn = ui.button("Add", icon="add_link").props("dense unelevated color=primary")

        # --- Corpus extraction ---
        ui.separator()
        with ui.row().classes("items-center gap-md"):
            extract_corpus_btn = ui.button("Extract Corpus from Accepted Sources", icon="hub").props("unelevated color=positive")
            corpus_status = ui.label("").classes("text-body2 text-grey")

        # ---------------------------------------------------------------
        # Handlers
        # ---------------------------------------------------------------

        async def on_gen_terms():
            """Generate search terms from stored ideas."""
            project = project_input.value or "default"
            store = _get_store()
            rows = store.get_ideas(project, source="user")
            if not rows:
                ui.notify("No ideas found. Extract ideas first.", type="warning")
                return

            gen_terms_btn.props("loading")
            try:
                from ideas.landscape import generate_search_terms
                from lib.ideas.models import Idea

                ideas = [
                    Idea(text=r["text"], idea_type=r["idea_type"], source=r["source"],
                         id=r["id"], confidence=r.get("confidence", 0.0))
                    for r in rows
                ]
                terms = await generate_search_terms(ideas)
                search_terms.value = "\n".join(terms)
                ui.notify(f"Generated {len(terms)} search terms", type="positive")
            except Exception as exc:
                ui.notify(f"Error: {exc}", type="negative")
                logger.exception("Term generation failed")
            finally:
                gen_terms_btn.props(remove="loading")

        async def on_search():
            """Run web search with current terms and display results."""
            terms_text = search_terms.value or ""
            terms = [t.strip() for t in terms_text.split("\n") if t.strip()]
            if not terms:
                ui.notify("Enter search terms first", type="warning")
                return

            search_btn.props("loading")
            try:
                from ideas.landscape import search_landscape

                sources = await search_landscape(terms)
                project = project_input.value or "default"
                store = _get_store()
                for src in sources:
                    store.save_source(src, project=project)

                _render_sources(project)
                ui.notify(f"Found {len(sources)} sources", type="positive")
            except Exception as exc:
                ui.notify(f"Error: {exc}", type="negative")
                logger.exception("Search failed")
            finally:
                search_btn.props(remove="loading")

        def _render_sources(project: str):
            """Render the sources table from store."""
            store = _get_store()
            all_sources = store.get_sources(project)
            accepted = [s for s in all_sources if s["status"] == "accepted"]
            rejected = [s for s in all_sources if s["status"] == "rejected"]
            pending = [s for s in all_sources if s["status"] == "pending"]

            summary_label.text = f"{len(all_sources)} total · {len(accepted)} accepted · {len(rejected)} rejected · {len(pending)} pending"

            sources_container.clear()
            with sources_container:
                for src in all_sources:
                    status = src["status"]
                    color = {"accepted": "positive", "rejected": "negative", "pending": "grey"}.get(status, "grey")

                    with ui.card().classes("w-full q-pa-sm").props(f"bordered {'flat' if status == 'rejected' else ''}"):
                        with ui.row().classes("items-center w-full gap-sm"):
                            ui.badge(status, color=color).props("outline")
                            with ui.column().classes("flex-grow gap-none"):
                                ui.link(src["title"] or src["url"], src["url"], new_tab=True).classes("text-body2 text-bold")
                                ui.label(src["snippet"][:200]).classes("text-caption text-grey")
                                ui.label(f"via: {src['search_term']}").classes("text-caption text-grey-6")
                            # Accept/reject buttons
                            if status != "accepted":
                                ui.button(icon="check", on_click=lambda _, u=src["url"]: _set_status(u, "accepted", project)).props("dense flat color=positive round")
                            if status != "rejected":
                                ui.button(icon="close", on_click=lambda _, u=src["url"]: _set_status(u, "rejected", project)).props("dense flat color=negative round")

        def _set_status(url: str, status: str, project: str):
            """Update source status and re-render."""
            store = _get_store()
            store.update_source_status(url, status, project)
            _render_sources(project)

        async def on_add_url():
            """Manually add a URL as an accepted source."""
            url = manual_url.value or ""
            if not url.strip():
                return
            from lib.ideas.models import LandscapeSource

            project = project_input.value or "default"
            store = _get_store()
            src = LandscapeSource(
                url=url.strip(),
                title=url.strip(),
                snippet="(manually added)",
                search_term="manual",
                status="accepted",
            )
            store.save_source(src, project=project)
            manual_url.value = ""
            _render_sources(project)
            ui.notify("URL added", type="positive")

        async def on_extract_corpus():
            """Fetch and extract ideas from accepted sources (Stage 2)."""
            project = project_input.value or "default"
            store = _get_store()
            source_rows = store.get_sources(project, status="accepted")
            if not source_rows:
                ui.notify("No accepted sources. Accept some sources first.", type="warning")
                return

            extract_corpus_btn.props("loading")
            corpus_status.text = f"Extracting ideas from {len(source_rows)} sources..."
            try:
                from ideas.landscape import fetch_and_extract_sources
                from lib.ideas.models import LandscapeSource

                sources = [
                    LandscapeSource(
                        url=r["url"], title=r["title"], snippet=r["snippet"],
                        search_term=r["search_term"], status=r["status"],
                    )
                    for r in source_rows
                ]
                ideas = await fetch_and_extract_sources(sources)

                for idea in ideas:
                    store.save_idea(idea, project=project)

                corpus_status.text = f"Extracted {len(ideas)} ideas from {len(source_rows)} sources. Ready to evaluate."
                ui.notify(f"Corpus: {len(ideas)} ideas extracted", type="positive")
            except Exception as exc:
                corpus_status.text = f"Error: {exc}"
                ui.notify(f"Error: {exc}", type="negative")
                logger.exception("Corpus extraction failed")
            finally:
                extract_corpus_btn.props(remove="loading")

        # Wire up handlers
        gen_terms_btn.on_click(on_gen_terms)
        search_btn.on_click(on_search)
        add_url_btn.on_click(on_add_url)
        extract_corpus_btn.on_click(on_extract_corpus)

        # Load existing sources on page load
        async def _on_load():
            project = project_input.value or "default"
            store = _get_store()
            existing = store.get_sources(project)
            if existing:
                _render_sources(project)

        ui.timer(0.1, _on_load, once=True)
```
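One detail in the handlers above is worth calling out: the accept/reject buttons bind the URL with a default argument (`lambda _, u=src["url"]: ...`). A standalone demonstration of why that matters when creating closures in a loop:

```python
# A plain closure in a loop captures the loop *variable*, not its value,
# so every callback would see the final URL rendered. A default argument
# snapshots the value at each iteration instead.
urls = ["a", "b", "c"]

late_bound = [lambda: u for u in urls]        # every lambda shares `u`
snapshotted = [lambda u=u: u for u in urls]   # default binds per iteration

late_results = [f() for f in late_bound]
snap_results = [f() for f in snapshotted]
```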

**Step 2: Verify the landscape page loads**

Run: `cd /Users/tchklovski/all-code/rivus && timeout 5 python ideas/ui/app.py || true`
Expected: Server starts, `/landscape` page accessible.

**Step 3: Commit**

```bash
git add ideas/ui/app.py
git commit -m "feat(ideas): add interactive landscape page with accept/reject/iterate loop"
```

---

## Task 4: Evaluate page (NiceGUI)

**Files:**
- Modify: `ideas/ui/app.py` — add `/evaluate` page

**Step 1: Add the evaluate page function**

Add to `ideas/ui/app.py` before the `# Run` section:

```python
# ---------------------------------------------------------------------------
# Page: Evaluate (/evaluate)
# ---------------------------------------------------------------------------

@ui.page("/evaluate")
def page_evaluate():
    _header()

    with ui.column().classes("w-full max-w-4xl mx-auto q-pa-md gap-md"):
        ui.label("Evaluate Ideas").classes("text-h5")
        ui.label("Score extracted ideas across 9 quality dimensions.").classes("text-subtitle2 text-grey")

        project_input = ui.input("Project", value="default").classes("w-64").props("outlined dense")
        consensus_switch = ui.switch("Multi-model consensus (slower, more accurate)")

        eval_btn = ui.button("Evaluate All Ideas", icon="assessment").props("unelevated color=primary")
        results_container = ui.column().classes("w-full gap-md")

        async def on_evaluate():
            project = project_input.value or "default"
            store = _get_store()
            rows = store.get_ideas(project, source="user")
            if not rows:
                ui.notify("No ideas found. Extract ideas first.", type="warning")
                return

            eval_btn.props("loading")
            results_container.clear()
            try:
                from lib.ideas.eval import evaluate_idea, evaluate_idea_consensus
                from lib.ideas.models import Idea, IdeaScore, GROUPS

                ideas = [
                    Idea(text=r["text"], idea_type=r["idea_type"], source=r["source"],
                         id=r["id"], confidence=r.get("confidence", 0.0))
                    for r in rows
                ]

                use_consensus = consensus_switch.value
                eval_fn = evaluate_idea_consensus if use_consensus else evaluate_idea
                mode = "multi-model consensus" if use_consensus else "single model"

                with results_container:
                    ui.label(f"Evaluating {len(ideas)} ideas ({mode})...").classes("text-subtitle2")

                for idea in ideas:
                    profile = await eval_fn(idea)

                    # Save scores
                    for dim, score_val in profile.scores.items():
                        score_obj = IdeaScore(
                            idea_id=idea.id, dimension=dim, score=score_val,
                            rationale="", model="consensus" if use_consensus else "opus",
                        )
                        store.save_score(score_obj, project=project)

                    # Render result card
                    with results_container:
                        with ui.card().classes("w-full q-pa-sm").props("bordered"):
                            tag = "T" if idea.idea_type == "thesis" else "C"
                            tag_color = "purple" if idea.idea_type == "thesis" else "blue"
                            with ui.row().classes("items-start gap-xs"):
                                ui.badge(tag, color=tag_color).props("outline")
                                ui.label(idea.text[:120] + ("..." if len(idea.text) > 120 else "")).classes("text-body2 text-bold")

                            for group, dims in GROUPS.items():
                                floor = profile.floors.get(group)
                                floor_str = f" (floor: {floor:.0f})" if floor is not None else ""
                                ui.label(f"{group.title()}{floor_str}").classes("text-caption text-bold q-mt-xs")
                                for dim in dims:
                                    score = profile.scores.get(dim, 0)
                                    filled = int(score)
                                    bar = "█" * filled + "░" * (10 - filled)
                                    with ui.row().classes("items-center gap-xs"):
                                        ui.label(f"{dim}").classes("text-caption w-48")
                                        ui.label(f"{bar} {score:.0f}").classes("text-caption font-mono")

                ui.notify(f"Evaluated {len(ideas)} ideas", type="positive")
            except Exception as exc:
                ui.notify(f"Error: {exc}", type="negative")
                logger.exception("Evaluation failed")
            finally:
                eval_btn.props(remove="loading")

        eval_btn.on_click(on_evaluate)
```
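The consensus path (`evaluate_idea_consensus`) reduces multiple model verdicts to a single profile; per the plan's Task 1, the ng `reduce` block takes the median per dimension. A minimal sketch of that reduction — the per-model score dicts here are made up for illustration:

```python
from statistics import median

def consensus_scores(per_model: list[dict[str, float]]) -> dict[str, float]:
    """Median per dimension across model verdicts — robust to one outlier judge."""
    dims = per_model[0].keys()
    return {d: median(m[d] for m in per_model) for d in dims}

verdicts = [
    {"claim_precision": 7, "semantic_novelty": 5},   # model A
    {"claim_precision": 8, "semantic_novelty": 9},   # model B (outlier on novelty)
    {"claim_precision": 6, "semantic_novelty": 4},   # model C
]
print(consensus_scores(verdicts))  # -> {'claim_precision': 7, 'semantic_novelty': 5}
```

Median rather than mean is the point of the consensus switch: one over-enthusiastic judge (model B above) cannot drag a dimension's score up on its own.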

**Step 2: Verify all pages load**

Run: `cd /Users/tchklovski/all-code/rivus && timeout 5 python ideas/ui/app.py || true`
Expected: Server starts on 7990. Pages /, /landscape, /evaluate all accessible.

**Step 3: Run existing tests to make sure nothing is broken**

Run: `cd /Users/tchklovski/all-code/rivus && python -m pytest ideas/tests/ lib/ideas/tests/ -v`
Expected: All tests pass (tests import from `lib/ideas/`, not from the UI).

**Step 4: Commit**

```bash
git add ideas/ui/app.py
git commit -m "feat(ideas): add evaluate page to NiceGUI UI with score visualization"
```

---

## Task 5: Update __main__ and README

**Files:**
- Verify: `ideas/__main__.py` — no changes needed (the CLI works independently of the UI)
- Modify: `ideas/README.md` — update for NiceGUI, landscape workflow

**Step 1: Update README**

Edit `ideas/README.md`:

```markdown
# Ideas — Idea Extraction & Evaluation

Extract ideas from writing and score them on 9 dimensions across 4 groups.

## Quick Start

```bash
# CLI — extract + evaluate from URL
python -m ideas https://a16z.com/newsletter/big-ideas-2026/

# CLI — extract only
python -m ideas article.txt --extract-only

# UI — interactive pipeline (NiceGUI)
python ideas/ui/app.py   # https://kb.localhost/ideas (port 7990)
```

## What It Does

1. **Stage 0 — Extract**: Pull claims (testable assertions) and theses (insights/framings) from text
2. **Stage 1 — Landscape** (interactive): Search web for related work, user accepts/rejects sources, iterates
3. **Stage 2 — Corpus Extract**: Extract ideas from confirmed landscape sources
4. **Stage 3 — Evaluate**: Score user's ideas on 9 dimensions, novelty relative to corpus

## Scoring Dimensions

| Group | Dimensions | What it measures |
|-------|-----------|------------------|
| **Substance** | claim_precision, internal_coherence, evidential_grounding | Does the idea hold up? |
| **Novelty** | semantic_novelty, framing_novelty | Does it add something new? |
| **Expression** | clarity, rhetorical_force | Is it well communicated? |
| **Fertility** | generativity, composability | Does it open further thinking? |

Each dimension is scored 1-10. Expression is reported separately, since it evaluates the text rather than the idea. Each group gets a floor score — its weakest dimension, not an average.

## Architecture

```
ideas/              App layer — CLI, NiceGUI UI, landscape search
├── cli.py          Click CLI (python -m ideas)
├── landscape.py    Web search + source gathering (Serper + lib/ingest)
├── ui/app.py       NiceGUI UI — Extract, Landscape (interactive), Evaluate
└── demo.py         Demo script with sample results

lib/ideas/          Core library — reusable by other modules
├── models.py       Idea, IdeaScore, LandscapeSource, EvaluationProfile
├── extract.py      LLM-based idea extraction (claims + theses)
├── eval.py         Batched evaluation, multi-model consensus
└── store.py        SQLite persistence + vector search
```

## Interactive Landscape (Stage 1)

The NiceGUI UI at `/landscape` provides:
- Auto-generate search terms from extracted ideas
- Search web, present sources with accept/reject buttons
- Add URLs manually
- Iterate: adjust terms, search again, refine corpus
- Extract ideas from accepted sources to build comparison corpus
- Novelty scoring relative to that corpus

## Design

See `docs/plans/2026-03-02-idea-evaluation-design.md` for full design with rubrics and vario consensus notes.
```
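The floor-score rule in the README (weakest link per group, not an average) is a one-liner once you have the group mapping. A minimal sketch, assuming `GROUPS` maps group name to its dimension list as in `lib/ideas/models.py` (the exact shape is an assumption; Expression is omitted here since it is reported separately):

```python
GROUPS = {  # assumed shape of lib/ideas/models.GROUPS
    "substance": ["claim_precision", "internal_coherence", "evidential_grounding"],
    "novelty": ["semantic_novelty", "framing_novelty"],
    "fertility": ["generativity", "composability"],
}

def floor_scores(scores: dict[str, float]) -> dict[str, float]:
    """Per-group floor: the group's weakest dimension, not its mean."""
    return {g: min(scores[d] for d in dims) for g, dims in GROUPS.items()}

scores = {
    "claim_precision": 8, "internal_coherence": 9, "evidential_grounding": 4,
    "semantic_novelty": 7, "framing_novelty": 6,
    "generativity": 5, "composability": 8,
}
print(floor_scores(scores))  # -> {'substance': 4, 'novelty': 6, 'fertility': 5}
```

Note how Substance floors at 4 despite an 8 and a 9: an idea with weak evidential grounding cannot buy its way out with precision.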

**Step 2: Commit**

```bash
git add ideas/README.md
git commit -m "docs(ideas): update README for NiceGUI UI and interactive landscape"
```

---

## Summary

| Task | What | Files | Effort |
|------|------|-------|--------|
| 1 | Swap eval engine to vario/ng blocks | `lib/ideas/eval.py`, tests | Small refactor |
| 2 | NiceGUI shell + Extract page | `ideas/ui/app.py` (new) | New file |
| 3 | Interactive Landscape page | `ideas/ui/app.py` (add page) | Core feature |
| 4 | Evaluate page | `ideas/ui/app.py` (add page) | Add page |
| 5 | README update | `ideas/README.md` | Docs |

**Dependencies:** Task 1 is independent. Tasks 2→3→4 are sequential (same file). Task 5 can run after any.

**Testing approach:** Existing `lib/ideas/tests/` cover the core library. The UI is verified by startup checks (import + launch without errors). The interactive landscape page is exercised manually through the UI.

**Port:** 7990 (`kb.localhost/ideas`) — unchanged from Gradio version.
