170K Lines of Self-Improving AI Infrastructure — Built in 8 Weeks

A compound intelligence system that runs multi-model reasoning, learns from its own mistakes, and operates autonomously. Looking for a technical co-founder to scale it.

170K+ lines of code · 1,237 commits · 19 reasoning strategies · 664+ sessions reviewed

What Exists

A full-stack AI reasoning system with three layers: ingestion, reasoning, and self-management. Not scaffolding — production infrastructure processing real data.

INGEST: Web scraping, transcription, document parsing, YouTube pipelines
REASON: Multi-provider LLM abstraction, 19-strategy engine, vector search (Qdrant), analytical lenses
OUTPUT: Dossiers, financial analysis, supply chain graphs, presentations
SELF-MANAGEMENT: Session review, principle extraction, sandbox evaluation, pipeline health
KNOWLEDGE: 25K+ learned instances, 40+ expert workflows, domain-specific patterns

Key technical decisions: Async Python throughout. Multi-provider LLM abstraction (Claude, GPT, Gemini, Grok — hot-swappable). SQLite + Qdrant for hybrid search. Gradio for rapid UIs. Playwright for autonomous web interaction. Redis for real-time data. Self-healing pipelines with LLM-assisted error triage.
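The hot-swappable provider abstraction can be pictured as a minimal sketch. Names like `LLMProvider`, `ProviderRegistry`, and `EchoProvider` are illustrative stand-ins, not the system's actual API; a stub provider is used so the sketch runs without vendor SDKs or API keys:

```python
import asyncio
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Uniform async interface so callers never depend on a vendor SDK."""
    @abstractmethod
    async def complete(self, prompt: str) -> str: ...

class EchoProvider(LLMProvider):
    """Stand-in provider so this sketch runs offline."""
    async def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

class ProviderRegistry:
    """Maps names to providers; swapping vendors is one register() call."""
    def __init__(self) -> None:
        self._providers: dict[str, LLMProvider] = {}

    def register(self, name: str, provider: LLMProvider) -> None:
        self._providers[name] = provider

    def get(self, name: str) -> LLMProvider:
        return self._providers[name]

registry = ProviderRegistry()
registry.register("claude", EchoProvider())  # real code would wrap the Anthropic SDK here
answer = asyncio.run(registry.get("claude").complete("ping"))
```

The design choice this illustrates: callers hold only the abstract interface, so strategies and pipelines stay vendor-agnostic.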


The Interesting Problems

These are not wrapper-level engineering challenges. This is applied research running in production.

Multi-model consensus

How do you synthesize 4-8 model outputs into a single high-quality answer? When models disagree, which one is right? How do you detect when all models are confidently wrong?
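A simplest-possible baseline for this, assuming exact-match voting (a real system would compare semantic embeddings rather than strings, and the function name is hypothetical):

```python
from collections import Counter

def consensus(outputs: list[str]) -> tuple[str, float, bool]:
    """Pick the modal answer, report the agreement ratio, and flag
    low-consensus cases for review."""
    normalized = [o.strip().lower() for o in outputs]
    counts = Counter(normalized)
    answer, votes = counts.most_common(1)[0]
    agreement = votes / len(normalized)
    # Low agreement means no single model is trustworthy on its own.
    # Agreement of 1.0 can still be uniformly wrong, which only an
    # external check (retrieval, tool use) can catch.
    needs_review = agreement < 0.5
    return answer, agreement, needs_review

ans, agree, review = consensus(["Paris", "paris", "Lyon", "Paris"])
```

This also makes the hard part concrete: the voting is trivial, but detecting confident unanimous error requires signals outside the model pool.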

Automated principle extraction

Session transcripts go in, behavioral principles come out. Each principle must be specific, testable, and actually improve future performance. 25K+ extracted so far.

Strategy selection

19 reasoning strategies built from 10 composable stages and 9 analytical lenses. The system must pick the right strategy for each problem type and adapt when the initial choice underperforms.
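The composability idea can be sketched as function composition. The stages here are toy placeholders, not the system's real reasoning steps:

```python
from typing import Callable

Stage = Callable[[str], str]

def make_strategy(stages: list[Stage]) -> Stage:
    """Compose stages left-to-right into a single strategy."""
    def run(problem: str) -> str:
        state = problem
        for stage in stages:
            state = stage(state)
        return state
    return run

# Toy stages standing in for real reasoning steps.
decompose: Stage = lambda s: s + " | decomposed"
critique: Stage = lambda s: s + " | critiqued"

strategy = make_strategy([decompose, critique])
result = strategy("problem")
```

With 10 stages and 9 lenses, the combinatorial space is large; the 19 shipped strategies are the curated subset that earned their keep.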

Sandbox replay evaluation

How do you measure whether a learned principle actually helps? Replay past sessions with and without the principle. Quantify the improvement. Kill principles that don't work.

Pipeline staleness detection

20+ autonomous pipelines running 24/7. Detect when outputs degrade, when upstream data changes, when models update. Version-aware freshness with automatic remediation.
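Version-aware freshness can be expressed as a predicate over an artifact's provenance. Field names here are illustrative, not the real pipeline schema:

```python
def is_stale(
    artifact: dict,
    now: float,
    max_age_s: float,
    current_model: str,
    current_schema: int,
) -> bool:
    """An output is stale if it is too old, was produced by a
    different model version, or predates the current upstream schema."""
    too_old = now - artifact["produced_at"] > max_age_s
    model_changed = artifact["model"] != current_model
    schema_changed = artifact["schema"] < current_schema
    return too_old or model_changed or schema_changed

artifact = {"produced_at": 0.0, "model": "m-1", "schema": 3}
stale = is_stale(
    artifact, now=100.0, max_age_s=3600.0,
    current_model="m-2", current_schema=3,
)
```

Detecting quality degradation (as opposed to provenance drift) is the harder half, since it needs a scored baseline rather than a metadata check.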

Domain knowledge packaging

How do you take accumulated intelligence from one deployment and safely transfer it to benefit others in the same vertical — without leaking proprietary data?



The Vision

Domain-specific reasoning as a product. The infrastructure for compound intelligence is built. It works in three verticals today. The next step: package it for customers. Every enterprise vertical — finance, legal, biotech, supply chain — gets a reasoning engine that accumulates domain knowledge and gets measurably better over time.

This is not a chatbot. This is not a prompt chain. This is infrastructure for AI systems that actually improve. The technical foundations exist. Now it needs to become a product.


The Role

Technical Co-founder

You would own a major axis — product, infrastructure, or go-to-market engineering — with full authority to shape the technical direction. This is a system with massive leverage: one engineer's work already compounded into 170K+ lines of working infrastructure. A second strong technical mind multiplies that further.

What you bring: Deep systems engineering experience. Comfort with ambiguity and research-grade problems. The ability to ship fast without cutting corners. An opinion about how AI reasoning systems should work — and the skill to build it.

What you get: A working system, not a slide deck. Real compound improvement, not hand-waving about "AI." A co-founder who builds — 1,237 commits in 8 weeks. And a problem space where the right technical decisions create lasting, defensible value.

Let's talk architecture.

I'll walk you through the codebase. You tell me what you'd build next.

Start the conversation