Rivus Demos

A small team + AI agents building an intelligence amplification platform. The thesis: most knowledge work — research, due diligence, market analysis, writing — is bottlenecked by human attention, not human intelligence. Rivus removes the bottleneck.
Each tool below compounds on the others — the intel system feeds the supply chain map, the multi-model engine powers analysis, and shared libraries mean each new capability takes hours instead of weeks. Working systems on real data, not prototypes.

🔱 Vario — Multi-Model Reasoning Engine Done

Vario is the reasoning layer that powers everything below. The premise: no single model is best at everything, and the interesting answers emerge from disagreement, not consensus. Vario orchestrates multiple frontier models through composable pipelines — produce candidates, score them, refine the best, reduce to a final answer. Think of it as a compiler for multi-model workflows.
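The produce, score, refine, reduce shape can be sketched with plain stand-in functions. This is illustrative only, not Vario's actual blocks or API:

```python
# Minimal sketch of a produce -> score -> refine -> reduce pipeline.
# Every function here is a stand-in: real blocks would call models.

def produce(question: str, n: int = 3) -> list[str]:
    """Generate n candidate answers (stand-in for n model calls)."""
    return [f"answer-{i} to {question}" for i in range(n)]

def score(candidates: list[str]) -> list[str]:
    """Rank candidates best-first (stand-in for a model-graded rubric)."""
    return sorted(candidates, key=len, reverse=True)

def reduce_top(ranked: list[str], k: int = 1) -> list[str]:
    """Keep only the top-k candidates."""
    return ranked[:k]

def refine(candidate: str) -> str:
    """Improve the surviving candidate (stand-in for a revision pass)."""
    return candidate + " (refined)"

def pipeline(question: str) -> str:
    return refine(reduce_top(score(produce(question)))[0])
```

Because every stage consumes and returns plain values, the stages can be recombined freely; that recomposability is the "compiler" part of the framing.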
Here we demo model_debate, one of Vario's simplest recipes: three models argue a question, a judge picks the winner. Even this basic recipe produces insight no single model generates alone.
"Is enterprise software more likely to be disrupted or strengthened by AI agents?"
Winner's insight: "Incumbents will report strong numbers while losing the architectural battle." — $0.21 total cost.
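The debate recipe itself fits in a few lines. This is a hypothetical sketch, with `ask()` standing in for a real chat-completion call and canned answers in place of live models; none of these names are Vario's actual API:

```python
# Hypothetical sketch of a debate-then-judge recipe. `ask()` and the
# model names are stand-ins, not Vario's actual API.

CANNED = {
    "model-a": "Disrupted: agents collapse per-seat pricing.",
    "model-b": "Strengthened: incumbents own the data and the workflows.",
    "model-c": "Both: strong quarterly numbers, lost architectural battle.",
    "judge":   "model-c: holds the strongest synthesis.",
}

def ask(model: str, prompt: str) -> str:
    """Stand-in for a chat-completion call to `model`."""
    return CANNED[model]

def model_debate(question: str, debaters: list[str], judge: str) -> str:
    # Each debater answers independently.
    answers = {m: ask(m, question) for m in debaters}
    # The judge reads the full transcript and picks a winner.
    transcript = "\n".join(f"{m}: {a}" for m, a in answers.items())
    return ask(judge, f"Question: {question}\nArguments:\n{transcript}\nPick the strongest.")

verdict = model_debate(
    "Is enterprise software more likely to be disrupted or strengthened by AI agents?",
    debaters=["model-a", "model-b", "model-c"],
    judge="judge",
)
```

The judging prompt sees every argument at once, which is where the cross-model disagreement becomes legible to a single decision.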
Vario v2 (ng) is designed and now in development. v1 materializes full candidate lists between stages — fine for 3 candidates, unusable for 100. ng streams Things through async generators, so pipelines scale without memory cliffs. It adds provenance tracking (every Thing knows which model made it and why), composable stop conditions (halt when the score plateaus, the budget is exhausted, or a winner separates from the field), and 6 new block types (enrich from web/DB, execute code, decompose problems, classify, steer, plan). 15 pipeline topologies are designed and 5 blocks implemented, with 221 tests. The current demo runs on v1; every other demo on this page will run better on ng.
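A sketch of the streaming idea, under stated assumptions: `Thing`'s fields, the stage names, and the stop condition here are illustrative, not ng's actual types. Only the shape follows the description above: async generators between stages, and a halt that never materializes the full list.

```python
# Illustrative sketch of streaming Things through async-generator stages
# with a composable stop condition. Names and fields are assumptions.
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator

@dataclass
class Thing:
    text: str
    score: float = 0.0
    provenance: list[str] = field(default_factory=list)  # which stages touched it

async def produce(n: int) -> AsyncIterator[Thing]:
    for i in range(n):
        yield Thing(text=f"candidate-{i}", provenance=["produce"])

async def score(things: AsyncIterator[Thing]) -> AsyncIterator[Thing]:
    async for t in things:
        t.score = float(len(t.text))  # stand-in for a model-graded score
        t.provenance.append("score")
        yield t

async def until_budget(things: AsyncIterator[Thing], budget: int) -> list[Thing]:
    # Stop condition: halt once `budget` Things have been consumed.
    kept: list[Thing] = []
    async for t in things:
        kept.append(t)
        if len(kept) >= budget:
            break
    return kept

winners = asyncio.run(until_budget(score(produce(100)), budget=5))
```

Because `produce` only advances when a downstream stage pulls, asking for 5 of 100 candidates does roughly 5 candidates' worth of work. That laziness is the memory-cliff fix the v2 design describes.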

Doc Deep Analysis WIP

Two goals: understand at scale and produce better writing. Feed it any document and it does two things. First, it maps the structure — every paragraph tagged by rhetorical role (claim, evidence, hedge, rebuttal), style evaluated against ~215 craft principles. Second, it extracts every distinct idea and scores it on 9 dimensions across 4 groups: Substance (precision, coherence, evidential grounding), Novelty (semantic, framing), Expression (clarity, rhetorical force), Fertility (generativity, composability).
Floor scoring means each group is only as strong as its weakest dimension — a brilliant idea with no evidence scores low on Substance, no matter how well-expressed. The system doesn't just extract what was said — it judges whether it's worth saying, and whether it's said well.
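Floor scoring is simple to state precisely: a group's score is the minimum over its dimensions. A minimal sketch, where the dimension names follow the text but the integer scale is an assumption:

```python
# Floor scoring: a group is only as strong as its weakest dimension,
# so the group score is the minimum. Scale (1-10 integers) is assumed.

GROUPS = {
    "substance":  ["precision", "coherence", "evidential_grounding"],
    "novelty":    ["semantic", "framing"],
    "expression": ["clarity", "rhetorical_force"],
    "fertility":  ["generativity", "composability"],
}

def floor_scores(dims: dict[str, float]) -> dict[str, float]:
    return {group: min(dims[d] for d in names) for group, names in GROUPS.items()}

# A brilliant but ungrounded idea: high precision and coherence, no evidence.
idea = {
    "precision": 9, "coherence": 8, "evidential_grounding": 2,
    "semantic": 7, "framing": 6,
    "clarity": 9, "rhetorical_force": 8,
    "generativity": 7, "composability": 5,
}
print(floor_scores(idea))
# -> {'substance': 2, 'novelty': 6, 'expression': 8, 'fertility': 5}
```

The missing evidence caps Substance at 2 even though the idea averages well, which is exactly the behavior a mean-based score would hide.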
Tested on Ben Evans' "How will OpenAI compete?" — 45 ideas extracted, each scored. Novelty correctly scored modest: Evans synthesizes known strategy, doesn't originate theory. The chatbot-as-browser analogy scored highest on expression. The system got the judgment right.
Why this matters: Run this on thousands of documents via the jobs pipeline and you get a scored corpus of ideas across an entire field — what's novel, what's rehashed, what's well-argued vs. hand-wavy. Use it to produce better Drafts — the analysis feeds directly back into writing assistance, showing where your own arguments are weakest.
WIP: Landscape search (querying the web for related work to calibrate novelty against what's already published) is in progress. Claims evaluation — using multi-model consensus to assess evidence-claim alignment — is next.

Intel: Building the VC + Semiconductor Worlds WIP

The goal: build structured, auditable knowledge graphs of entire industries — then reason over them. We're doing this for two worlds right now: venture capital (9,500 firms, 5,400 partners from SEC filings, portfolio mapping, partner dossiers) and semiconductors (9,800 companies, 41,000 supplier/customer/competitor relationships, 16 role categories from foundry to metrology).
This isn't just data collection — it's collect, audit, reconcile at scale. Multiple sources (SEC Form D, Form ADV, Crunchbase, firm websites, web search) feed into a unified registry. Entities are deduplicated, cross-referenced, and scored. When the data is wrong, the system catches it.
Caught a real bug: ticker MU was mapped to "Micron Memory Japan, G.K." (a subsidiary) instead of "Micron Technology, Inc." (the $150B parent), leaving 214 correctly mapped supply chain relationships invisible in the UI — automated QA catching what humans miss.
Evaluation rubrics are built from literature review of what actually predicts founder success. Founders are scored on 3 dimensions today (prior success, network quality, technical depth — expanding to 7), calibrated against research on founder/startup outcomes. Companies are scored via TFTF ("Too Fast To Follow") — 6 dimensions measuring velocity, compounding advantage, moat depth, talent magnetism, capital efficiency, and founder intensity. The rubrics aren't made up — they're grounded in what the research says matters.
Where it's headed: For VCs — company assessment and portfolio tracking: score founders before taking a meeting, monitor portfolio health across dimensions, spot patterns in what predicts success. For startups — partner and firm research: know who you're pitching, what they've invested in, what their thesis actually is vs. what they say it is. More depth and freshness than Crunchbase or PitchBook — because it's pulling live from the web, SEC filings, and firm sites rather than waiting for a database update cycle. The TenOneTen dossier below shows what the output looks like for a single firm.

Supply Chain Bottleneck Map Done

A direct output of the semiconductor world above. Supply chain risk is invisible until something breaks. This maps dependency graphs across 9,800 companies and 41,000 relationships, automatically flagging single-source bottlenecks. The kind of analysis McKinsey charges six figures for — generated in seconds from a structured database.
ASML identified as critical single-source for EUV lithography — no alternative supplier exists. If ASML has a disruption, the entire advanced semiconductor supply chain stops. This isn't hypothetical — it's the analysis every semiconductor investor should have.
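Flagging single-source bottlenecks is a one-pass graph query once the relationships are structured. A toy sketch, with four edges standing in for the 41,000 real ones:

```python
# Toy sketch of single-source bottleneck detection on a supplier graph.
# Edge data is illustrative; the real map has 41,000 relationships.
from collections import defaultdict

# (supplier, customer, product) edges
edges = [
    ("ASML", "TSMC", "EUV lithography"),
    ("ASML", "Samsung", "EUV lithography"),
    ("ShinEtsu", "TSMC", "silicon wafers"),
    ("SUMCO", "TSMC", "silicon wafers"),
]

def single_source_bottlenecks(edges):
    suppliers_by_product = defaultdict(set)
    for supplier, _, product in edges:
        suppliers_by_product[product].add(supplier)
    # A product with exactly one distinct supplier is a bottleneck.
    return {p: s.pop() for p, s in suppliers_by_product.items() if len(s) == 1}

print(single_source_bottlenecks(edges))  # -> {'EUV lithography': 'ASML'}
```

Silicon wafers drop out because two suppliers exist; EUV lithography is flagged because every edge traces back to one company.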

Full System Flyover Done

The demos above are modules. This shows what happens when they compose. One question — "Is Micron a good investment?" — threads through 8 systems: company intel, supply chain analysis, document reasoning, multi-model debate, idea evaluation, job orchestration, health monitoring, and session awareness. The whole is greater than the sum of its parts.

TenOneTen VC Dossier Done

You're about to pitch a VC. Instead of skimming their website, you run their firm through your own intelligence platform. Partners profiled, portfolio mapped, investment thesis extracted, individual dossiers generated — from cold start to full dossier, autonomously. Walk into the meeting having analyzed them with the tool you're pitching.
TenOneTen Ventures (LA, data science founders) — 6 partner profiles, portfolio companies mapped, individual HTML dossiers.

Project Priorities Updated

How do you decide what to build next when you have 15 projects competing for attention? Score each on 10 dimensions (payoff, defensibility, excitement, VC narrative...) and let the math settle arguments. Newest addition: Market signal intelligence — real-time news intake assessed for stock/sector impact via multi-model reasoning and the supply chain graph.
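The scoring math is a weighted average over dimensions. A sketch with invented weights and two example projects; the numbers and weightings are illustrative, not the real scores:

```python
# Illustrative priority scoring: weighted average across dimensions.
# Weights, dimension subset, and project scores are all invented.

def priority(scores: dict[str, float], weights: dict[str, float]) -> float:
    total_w = sum(weights[d] for d in scores)
    return sum(scores[d] * weights[d] for d in scores) / total_w

weights = {"payoff": 3, "defensibility": 2, "excitement": 1, "vc_narrative": 2}
projects = {
    "market_signal_intel": {"payoff": 9, "defensibility": 7, "excitement": 8, "vc_narrative": 9},
    "doc_analysis":        {"payoff": 7, "defensibility": 6, "excitement": 9, "vc_narrative": 6},
}
ranked = sorted(projects, key=lambda p: priority(projects[p], weights), reverse=True)
print(ranked)  # -> ['market_signal_intel', 'doc_analysis']
```

The weights are where the arguments actually live; once agreed, the ranking follows mechanically, which is the point of letting the math settle it.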