The Bet

Each session starts from zero. Solved problems get re-solved. The same bugs get re-fixed. Jarvis exists to end this.

The Problem

Without Jarvis, every development session starts from zero: LLM reasoning is repeated across sessions, and context is lost the moment the terminal window closes. The core principle driving this ecosystem is Constitution Principle VIII: "Solved problems stay solved."

By ensuring that agents are sandboxed and that past learnings are systematically recalled, we transform isolated coding sessions into a compounding knowledge base.
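
The ingest-then-recall loop can be sketched in a few lines. This is a minimal illustration, not the Jarvis implementation: the `KnowledgeStore` class, the bag-of-words `embed` function, and the `min_score` threshold are all hypothetical stand-ins for a real vector store and embedding model.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class KnowledgeStore:
    """Hypothetical in-memory store; a database table would play this role."""

    def __init__(self) -> None:
        self._entries: List[Tuple[Counter, str]] = []

    def ingest(self, problem: str, solution: str) -> None:
        """Record a solved problem at the end of a run."""
        self._entries.append((embed(problem), solution))

    def recall(self, problem: str, min_score: float = 0.5) -> Optional[str]:
        """Return the closest past solution, or None if nothing is similar enough."""
        query = embed(problem)
        best = max(self._entries, key=lambda e: cosine(query, e[0]), default=None)
        if best is None or cosine(query, best[0]) < min_score:
            return None
        return best[1]


store = KnowledgeStore()
store.ingest("pytest fails with missing fixture db_session",
             "import the db fixtures in conftest.py")
recalled = store.recall("pytest fails missing fixture db_session again")
unknown = store.recall("docker build runs out of disk")
```

A new session that hits a similar problem gets the old fix back instead of re-deriving it; an unrelated problem returns `None` and falls through to normal reasoning.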

The Sandwich Architecture

This is the architectural philosophy of Jarvis. Safety instructions are soft constraints that evaporate at scale; mechanical enforcement is the only reliable solution. The deterministic layer is orchestrated by a Blueprint engine that sequences tasks.

HUMAN LAYER (conductor)

Approves plans, reviews PRs, sets policy. The only layer that can merge.

↓ approves ↑ reviews

DETERMINISTIC LAYER (built by AI)

Merge gates · File locks
Branch protection · Test suites
Pre-commit hooks · CI/CD
Mechanical enforcement — not prompts

↓ sandboxed ↑ proposes

AI LAYER (builder, sandboxed)

Writes code · Proposes PRs · Reviews
CANNOT merge or bypass gates
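
A minimal sketch of what "mechanical enforcement, not prompts" could look like for the merge path. The `PullRequest` fields, the role strings, and the `GateError` type are assumptions made for illustration; a real setup would rely on branch protection and CI rather than an in-process check.

```python
from dataclasses import dataclass, field
from typing import Set


class GateError(Exception):
    """Raised when an action is attempted without satisfying every gate."""


@dataclass
class PullRequest:
    author_role: str                      # "ai" or "human"
    tests_passed: bool = False            # set by CI, not by the author
    approved_by: Set[str] = field(default_factory=set)


def approve(pr: PullRequest, reviewer: str, reviewer_role: str) -> None:
    """Only the human layer can record an approval."""
    if reviewer_role != "human":
        raise GateError("only the human layer can approve")
    pr.approved_by.add(reviewer)


def merge(pr: PullRequest, actor_role: str) -> str:
    """Deterministic gate: these checks run in code, whatever any prompt said."""
    if actor_role != "human":
        raise GateError("AI layer cannot merge")
    if not pr.tests_passed:
        raise GateError("test suite has not passed")
    if not pr.approved_by:
        raise GateError("no human approval recorded")
    return "merged"


pr = PullRequest(author_role="ai")
try:
    merge(pr, actor_role="ai")
    ai_merge_outcome = "merged"
except GateError as exc:
    ai_merge_outcome = str(exc)

pr.tests_passed = True
approve(pr, reviewer="alice", reviewer_role="human")
human_merge_outcome = merge(pr, actor_role="human")
```

The AI layer can open the PR, but the merge call fails for it unconditionally; no instruction in its context can change that.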

The OpenClaw Inbox Incident

A cautionary tale that motivates the sandwich architecture. Summer Yue told the agent to "confirm before acting". However, when the context reached its token limit, auto-summarization dropped the safety instruction.

The agent deleted 200+ emails and couldn't be stopped remotely. The root cause was a flat architecture — there was no deterministic layer between "agent decides" and "action executes".

The Lesson: Safety instructions are soft constraints that evaporate at scale.
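
One way to picture the missing layer between "agent decides" and "action executes" is an executor that queues destructive actions until someone confirms them out of band. This is a hedged sketch: `GatedExecutor`, the `DESTRUCTIVE` set, and the action names are hypothetical and do not describe the system involved in the incident.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical action names; which actions count as destructive is a policy choice.
DESTRUCTIVE = frozenset({"delete_email"})


@dataclass
class GatedExecutor:
    handlers: Dict[str, Callable[[str], str]]
    pending: List[Tuple[str, str]] = field(default_factory=list)

    def request(self, action: str, target: str) -> str:
        """Agent-facing entry point: destructive actions are queued, never run."""
        if action in DESTRUCTIVE:
            self.pending.append((action, target))
            return "queued for confirmation"
        return self.handlers[action](target)

    def confirm(self, index: int) -> str:
        """Human-facing entry point: runs one queued action after explicit approval."""
        action, target = self.pending.pop(index)
        return self.handlers[action](target)


deleted: List[str] = []


def delete_email(message_id: str) -> str:
    deleted.append(message_id)
    return f"deleted {message_id}"


executor = GatedExecutor(handlers={
    "delete_email": delete_email,
    "read_email": lambda message_id: f"read {message_id}",
})

read_result = executor.request("read_email", "msg-1")
delete_result = executor.request("delete_email", "msg-1")
```

Because the gate lives in the executor rather than in the prompt, losing the "confirm before acting" instruction to summarization would not change what the agent is able to do.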

The Stripe Minions Insight

"Putting LLMs in contained boxes compounds into system-wide reliability."

This insight validates the sandwich approach. Stripe proved this works: they forked Goose for Minions. A smaller box equals better agent performance (the Toolshed principle).
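
The "smaller box" idea can be sketched as a shared tool registry filtered by role profile. The tool names, profile names, and the `TOOLSHED` structure below are assumptions for illustration only.

```python
from typing import Dict, List, Set

# Hypothetical registry: tool name -> profiles allowed to use it.
TOOLSHED: Dict[str, Set[str]] = {
    "read_file": {"coder", "reviewer"},
    "write_file": {"coder"},
    "run_tests": {"coder", "reviewer"},
    "post_review_comment": {"reviewer"},
}


def register_tool(name: str, profiles: Set[str]) -> None:
    """Adding a tool once makes it available to every agent with a matching profile."""
    TOOLSHED[name] = set(profiles)


def tools_for(profile: str) -> List[str]:
    """Return only the tools this profile may see, so each agent gets a small box."""
    return sorted(name for name, profiles in TOOLSHED.items() if profile in profiles)


reviewer_tools = tools_for("reviewer")
register_tool("search_code", {"coder", "reviewer"})
reviewer_tools_after = tools_for("reviewer")
```

A reviewer agent never sees `write_file`, and a newly registered tool reaches every matching profile without per-agent configuration.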

Economic Reasoning

Phase 1 (NOW): Subscriptions

Claude Pro $20/mo, ChatGPT $20/mo. Cheaper for single-user heavy dev work.

Phase 3: OpenRouter

When org-level multi-tenant use justifies per-token billing plus a 5.5% fee. This is explicitly planned, not speculative.
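
The trade-off between the two phases comes down to simple arithmetic. The sketch below assumes the 5.5% fee is charged on top of the underlying provider spend and that the two subscriptions total $40 per month; how the fee is actually applied, and what real monthly token spend looks like, are not stated here.

```python
SUBSCRIPTIONS_USD = 20.0 + 20.0   # Claude Pro + ChatGPT, per month
ROUTER_FEE = 0.055                # 5.5% fee, assumed to apply on top of provider spend


def per_token_monthly_cost(provider_spend_usd: float, fee: float = ROUTER_FEE) -> float:
    """Total monthly bill under per-token pricing, fee included."""
    return provider_spend_usd * (1.0 + fee)


def break_even_provider_spend(subscriptions_usd: float = SUBSCRIPTIONS_USD,
                              fee: float = ROUTER_FEE) -> float:
    """Provider spend at which per-token billing costs the same as the subscriptions."""
    return subscriptions_usd / (1.0 + fee)


def cheaper_option(provider_spend_usd: float) -> str:
    """Compare the two billing models for a given monthly provider spend."""
    if per_token_monthly_cost(provider_spend_usd) > SUBSCRIPTIONS_USD:
        return "subscriptions"
    return "per-token"
```

Under these assumptions, a single heavy user whose equivalent token spend exceeds roughly $38 per month is better off on flat subscriptions, which matches the Phase 1 reasoning.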

Strategic Rationale

graph TD
    BP["🔵 Blueprint Engine
━━━━━━━━━━━━━━━━━━━
Stripe: 'Putting LLMs into
contained boxes compounds
into system-wide reliability'
━━━━━━━━━━━━━━━━━━━
Blueprints are CODE, not prompts
Det. nodes guarantee lint/test/git
Agent nodes sandboxed with tool subsets"]
    MA["🟡 Mastra as Foundation
━━━━━━━━━━━━━━━━━━━
Already has .then() .branch()
.parallel() .dowhile()
Typed state flow, suspend/resume
━━━━━━━━━━━━━━━━━━━
New engine = ~4000 LOC duplicated
Blueprint DSL = ~500 LOC on top"]
    GO["🔴 Goose as Agent Harness
━━━━━━━━━━━━━━━━━━━
Stripe PROVED this works:
they forked Goose for Minions
━━━━━━━━━━━━━━━━━━━
Rust = fast, memory-safe
HTTP SSE = language-agnostic
NO FORK: HTTP client only"]
    TS["🟣 Toolshed with Profiles
━━━━━━━━━━━━━━━━━━━
Stripe: 'Smaller box =
better agent performance'
━━━━━━━━━━━━━━━━━━━
Fewer tools = less confusion
Add tool → entire fleet gets it
Profile filter = right tools per role"]
    OL["🟢 Out-Loop Human Interface
━━━━━━━━━━━━━━━━━━━
Stripe: 'Developer attention
is most constrained resource'
━━━━━━━━━━━━━━━━━━━
Engineer at START + END only
AskUserQuestion REMOVED (T08)
Physically impossible to pause"]
    KS["🟩 Knowledge as Memory
━━━━━━━━━━━━━━━━━━━
Constitution Principle VIII:
'Solved Problems Stay Solved'
━━━━━━━━━━━━━━━━━━━
Every run ingests learnings
Every new run recalls patterns
PostgreSQL + pgvector · 139 tests"]
    TM["🟠 Two Modes (In + Out Loop)
━━━━━━━━━━━━━━━━━━━
Not everything is autonomous
━━━━━━━━━━━━━━━━━━━
IN-LOOP: explore, architect, debug
OUT-LOOP: known patterns, grunt work
Engineer provides THINKING
AI does GRUNT WORK"]
    CTX["⚪ AGENTS.md Canonical
━━━━━━━━━━━━━━━━━━━
Stripe standardized one format
synced to all agent tools
━━━━━━━━━━━━━━━━━━━
AGENTS.md → .goosehints
AGENTS.md → .cursor/rules/*.mdc
One source of truth"]
    BP --> MA & GO
    GO --> TS
    TS --> OL
    OL --> KS
    KS --> TM
    TM --> CTX
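
To make "Blueprints are CODE, not prompts" concrete, here is a small sketch of a blueprint as a sequence of deterministic and agent nodes. It is written in plain Python as a language-neutral illustration and does not use the Mastra API; `DetNode`, `AgentNode`, `run_blueprint`, and the example steps are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Union


@dataclass(frozen=True)
class DetNode:
    """Deterministic step (lint, test, git): plain code that returns pass/fail."""
    name: str
    run: Callable[[dict], bool]


@dataclass(frozen=True)
class AgentNode:
    """Agent step: receives the shared state and only its own tool subset."""
    name: str
    tools: FrozenSet[str]
    run: Callable[[dict, FrozenSet[str]], None]


def run_blueprint(nodes: List[Union[DetNode, AgentNode]], state: dict) -> List[str]:
    """Run nodes in order and stop at the first failing deterministic node."""
    trace: List[str] = []
    for node in nodes:
        if isinstance(node, AgentNode):
            node.run(state, node.tools)
            trace.append(f"agent:{node.name}")
        else:
            ok = node.run(state)
            trace.append(f"det:{node.name}:{'ok' if ok else 'fail'}")
            if not ok:
                break
    return trace


def implement(state: dict, tools: FrozenSet[str]) -> None:
    # Stand-in for an LLM call; a real agent would only be handed `tools`.
    state["code"] = "def add(a, b):\n    return a + b\n"


def lint(state: dict) -> bool:
    return "\t" not in state["code"]


def unit_test(state: dict) -> bool:
    scope: dict = {}
    exec(state["code"], scope)  # run the generated code in an isolated namespace
    return scope["add"](2, 3) == 5


blueprint = [
    AgentNode("implement", frozenset({"read_file", "write_file"}), implement),
    DetNode("lint", lint),
    DetNode("test", unit_test),
]
trace = run_blueprint(blueprint, {})
```

The agent node can produce whatever it likes, but the lint and test steps always run afterwards because the sequence is fixed in code rather than requested in a prompt.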