Roadmap — Pending Stages
Each stage adds a capability. What we've built is the organism's core. What's ahead is making it battle-ready, observable, and usable.
Done
| Stage | Name | What It Unlocked |
|---|---|---|
| 0 | Conception | Heartbeat |
| 1 | Amoeba | Single cell (brain + senses + muscles + memory) |
| 2 | Hydra | Multi-agent (inbox + scheduling) |
| 3a | Embryo (self-building) | Axiom creates agents |
| 3b | Embryo (governance) | Build/Fuse mode |
| 3c | Embryo (self-sufficiency) | Per-agent data + grants |
| 4a | Ephemeral Pages | Short-link web pages for interactions |
| 4b | Pure Core + Plugins | Channels extracted as plugins |
| 5 | Onboarding Conversation | Axiom grows the organism through chat — no forms, no wizards |
| 5a | MiniWhatsApp + plugin | Local WhatsApp-lookalike for real external messaging tests |
| 5c | Claude CLI Architect | Self-coding agents — organism writes new TS tools into sandboxed agent folders |
| 6 | Mesh Visualization | Force-directed graph of agents + message flow, aggregated across recent traces |
Total so far: 74 tests passing, 18 catalog skills, complete organism core, self-extension via Claude CLI.
Pending — In Suggested Priority Order
✅ Stage 5: Onboarding Conversation (DONE — see stage doc)
What gets built:
- Axiom interviews new customers through /talk
- Builds their organism through natural conversation
- Creates agents, assigns skills, sets up grants
- No admin panel, no forms — setup IS a conversation
Flow:
Customer: "Hi, I'm Naveen, I run a small household"
Axiom: "Welcome. Who's in your household?"
Customer: "My maid Radha speaks Telugu, my driver Agil speaks Tamil"
Axiom: [creates passive agents] "Got it. What do you need them for?"
Customer: "Daily task delegation via WhatsApp"
Axiom: [plans agent architecture]
"I'll set up an attendance tracker and a task router. Sound good?"
What you get:
- Best-in-class demo for prospects
- Validates the self-building organism capability
- Exercises every system (agents, skills, grants, passive, plugins, pages)
- Real usage patterns → feedback for improvement
Dependencies: None (all primitives exist)
Complexity: Medium, mostly instruction-writing for Axiom + guided flow logic
Time to build: 1-2 days
✅ Stage 5a: MiniWhatsApp — Local Test Harness (DONE — see stage doc)
What gets built:
A standalone web app living OUTSIDE orbita-core (one directory up):
/Users/sajithmr/1box/
├── orbita-core/
└── miniwhatsapp/          ← new project
    ├── server.ts          HTTP + WebSocket server
    ├── public/
    │   └── index.html     Chat UI (browser app)
    └── package.json
MiniWhatsApp features:
- User registers with name + mobile number (any string, no verification)
- Session stored in browser localStorage (persistent)
- Chat UI showing conversation list + active chat
- Send message → broadcasts over WebSocket
- Receive message → appears instantly in UI
- Multiple browsers/incognito tabs = multiple "WhatsApp users"
REST/WebSocket API (for the Orbita plugin):
- POST /api/send: Orbita sends outbound: { to: "+91...", text: "..." }
- WS /api/events: Orbita listens: { from: "+91...", text: "..." } events
- GET /api/users: list registered users
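The payload shapes in the endpoint list can be written down as types with a small validator. This is a minimal sketch: the interface names and the helper functions `parseInboundEvent` and `buildOutbound` are hypothetical, and only the `to`, `from`, and `text` fields come from the list above.

```typescript
// Hypothetical type names; only the field names come from the endpoint list.
interface OutboundMessage { to: string; text: string }  // body of POST /api/send
interface InboundEvent { from: string; text: string }   // event on WS /api/events

// Build the JSON body Orbita would POST to /api/send.
function buildOutbound(to: string, text: string): OutboundMessage {
  return { to, text };
}

// Parse a raw WebSocket frame; returns null for anything that is not a well-formed event.
function parseInboundEvent(raw: string): InboundEvent | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return null;
  const candidate = value as Record<string, unknown>;
  if (typeof candidate.from !== "string" || typeof candidate.text !== "string") return null;
  return { from: candidate.from, text: candidate.text };
}
```

Rejecting malformed frames at the edge keeps the plugin from writing garbage into agent inboxes.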
Orbita side — plugins/miniwhatsapp.plugin.ts:
- Opens WebSocket to miniwhatsapp
- Maintains contact book: agent name → miniwhatsapp number
- Watches matching agent inboxes → POSTs to /api/send
- Receives WS events → writes to agent inbox via inbox.send
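The contact book and the two routing directions described above might look roughly like this. It is a sketch under assumptions: the `Inbox` interface stands in for Orbita's real inbox API (only the name `inbox.send` appears in this document), and `ContactBook`, `routeInbound`, and `toOutbound` are hypothetical names.

```typescript
// Hypothetical stand-in for Orbita's inbox; the real signature of inbox.send is not shown here.
interface Inbox { send(agent: string, text: string): void }

// Two-way lookup between agent names and MiniWhatsApp numbers (one number per agent).
class ContactBook {
  private byAgent = new Map<string, string>();
  private byNumber = new Map<string, string>();

  link(agent: string, phone: string): void {
    this.byAgent.set(agent, phone);
    this.byNumber.set(phone, agent);
  }
  numberFor(agent: string): string | undefined { return this.byAgent.get(agent); }
  agentFor(phone: string): string | undefined { return this.byNumber.get(phone); }
}

// Inbound: a WS event becomes an inbox message for the linked agent.
function routeInbound(book: ContactBook, inbox: Inbox, event: { from: string; text: string }): boolean {
  const agent = book.agentFor(event.from);
  if (agent === undefined) return false; // unknown sender: dropped in this sketch
  inbox.send(agent, event.text);
  return true;
}

// Outbound: an agent's message becomes a POST /api/send body, or null if the agent is unlinked.
function toOutbound(book: ContactBook, agent: string, text: string): { to: string; text: string } | null {
  const to = book.numberFor(agent);
  return to === undefined ? null : { to, text };
}
```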
What you get:
- Full end-to-end external messaging without Meta API
- Test as many concurrent users as you have browser windows
- Real HTTP + WebSocket transport (no in-memory shortcut)
- Confidence that the plugin architecture handles real external systems
- When real WhatsApp comes, it's just a new plugin with different transport — core unchanged
Dependencies: None (pure dev tool)
Complexity: Low-Medium, it's just a small web app + plugin
Time to build: 1-2 days (miniwhatsapp) + 1 day (plugin)
Why this is genius: The hardest part of "Stage 5b: Real WhatsApp" is integration testing. You can't iterate easily against Meta's API. With MiniWhatsApp, you can:
- Test message routing in minutes
- Demo to colleagues without Meta account
- CI-friendly (run in tests)
- Debug the plugin pattern with real transport
✅ Stage 5c: Claude CLI Architect (Self-Coding Agents) (DONE — see stage doc)
The big idea: A new agent mode that uses Claude CLI (not the API) to generate agents WITH their tool implementations — including TypeScript code. Bring-your-own-agent + build-your-own-agent, all from natural language.
What gets built:
- src/cortex/cli-runtime.ts: spawns a claude CLI process for agents needing full computer access
- Architect-mode agents: agents with mode: "cli" in config.json run via Claude CLI instead of the API
- skills/architect/SKILL.md: Paperclip-style instructions that teach Claude CLI:
  - Orbita's architecture (link to manifesto)
  - Existing agents and their skills (dynamic)
  - Skill catalog (what tools exist)
  - Coding conventions (where files go, how to register)
  - How to create: folder + instruction.md + config.json + src/tools/X.skill.ts
- add_coded_skill tool: Architect writes new TypeScript tools, registers them in the catalog, commits
- Build mode gate: CLI mode only works in BUILD mode (structural changes = code changes)
- Feed-the-brain: the CLI agent receives full context:
  - manifesto.md (architecture)
  - list_agents output (what exists)
  - catalog.listNames() output (available skills)
  - Existing src/tools/ structure (example implementations)
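The mode switch and the build mode gate from the list above could be expressed as a single check. This is a sketch only: the `AgentConfig` shape, the `"api"` mode value, the `"FUSE"` system mode name, and `selectRuntime` are assumptions; the document confirms only `mode: "cli"` in config.json and that CLI mode requires BUILD mode.

```typescript
type AgentMode = "api" | "cli";      // "api" is an assumed name for the default mode
type SystemMode = "BUILD" | "FUSE";  // assumed spelling of the Build/Fuse governance modes

// Hypothetical subset of an agent's config.json.
interface AgentConfig { name: string; mode: AgentMode; skills: string[] }

// Decide which runtime an agent gets, refusing CLI agents outside BUILD mode.
function selectRuntime(config: AgentConfig, systemMode: SystemMode): AgentMode {
  if (config.mode === "cli" && systemMode !== "BUILD") {
    throw new Error(`Agent "${config.name}" needs CLI mode, which is only allowed in BUILD mode`);
  }
  return config.mode;
}
```

Putting the gate in the runtime selector, rather than in each agent, means a CLI agent cannot start at all unless the system is in BUILD mode.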
Example flow:
User: "I need a GST tax calculator agent for India —
GST on invoices, HSN code lookup, GSTR filing prep"
Axiom (API mode): recognizes need for CODED skills
→ delegates to architect agent (CLI mode)
Architect (CLI): reads manifesto + existing agents
→ creates agents/gst-calculator/ folder
→ writes instruction.md (persona)
→ creates skills/calculate-gst.skill.md
→ writes src/tools/calculate-gst.skill.ts (actual TS code)
→ creates skills/lookup-hsn.skill.md
→ writes src/tools/lookup-hsn.skill.ts
→ updates build-catalog.ts to register new skills
→ updates factories.ts
→ writes agents/gst-calculator/config.json with skills
→ runs tests to verify
→ reports: "gst-calculator agent ready with 2 coded skills"
What you get:
- True self-extension — organism writes new code for itself
- BYOA/BYOS — customers describe domain needs, system generates the code
- Architecture-aware generation — controlled, follows conventions (not ad-hoc)
- Dogfood loop — organism can improve its own core too
- Paperclip pattern realized — this is what Paperclip proved works
- Scales horizontally — each domain (GST, HIPAA, payroll) becomes its own agent with coded tools
Why our controlled CLI instead of just running claude manually?
| Manual claude in terminal | Orbita-controlled CLI |
|---|---|
| No knowledge of Orbita architecture | Auto-injects manifesto + conventions |
| Doesn't know existing agents/skills | Sees live catalog, avoids duplicates |
| Ad-hoc file placement | Enforces folder/naming conventions |
| No audit trail | Every CLI run traced with traceId |
| Could break the system | Gated by BUILD mode, tests run after |
| Expert-only | Any user can say "build me a X agent" |
| No tests enforced | Auto-runs test suite after generation |
Governance:
- CLI mode is structural → BUILD mode only
- Every file created is audited to the trace log
- Axiom approves the plan before CLI agent executes
- Automatic rollback if tests fail
Quality Pipeline (mandatory gates)
Every agent/tool generated by the CLI Architect must pass ALL gates before being activated:
Gate 1: Static Validation
- TypeScript compiles (tsc --noEmit on new files)
- Imports resolve to existing modules
- Exports match the expected factory signature
- No disallowed imports (nothing that bypasses inbox/data/loader)
- Skill name in markdown matches tool name in code
- config.json shape is valid
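One of the Gate 1 checks, the disallowed-import rule, can be approximated with a text scan. The deny-list below is purely illustrative (the document does not say which modules are disallowed), and `findDisallowedImports` is a hypothetical helper. A regex scan is a heuristic; a real gate would more likely walk the TypeScript AST.

```typescript
// Illustrative deny-list only; the real list of disallowed modules is not given in this document.
const DISALLOWED_IMPORTS = ["mongodb", "child_process", "fs"];

// Return each denied module that the source imports or requires (first occurrence only).
function findDisallowedImports(source: string, denyList: string[] = DISALLOWED_IMPORTS): string[] {
  // Matches: from "x", import "x", require("x") with single or double quotes.
  const importPattern = /(?:from\s+|import\s+|require\(\s*)["']([^"']+)["']/g;
  const hits: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = importPattern.exec(source)) !== null) {
    const specifier = match[1].replace(/^node:/, ""); // treat "node:fs" the same as "fs"
    if (denyList.indexOf(specifier) !== -1 && hits.indexOf(specifier) === -1) {
      hits.push(specifier);
    }
  }
  return hits;
}
```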
Gate 2: Orbita Rules Validation (QA via Claude CLI)
A second CLI pass acts as QA Architect with a focused SKILL.md:
You are the Orbita QA Reviewer. Validate this generated agent against:
- DNA laws (8 laws)
- Inbox-only communication (no direct calls)
- Data namespace isolation (only touches own + granted)
- Trace continuity (preserves traceId)
- No reserved names
- Skill convention compliance
- Does NOT conflict with existing skills (no duplicates, no overrides)
- Does NOT break any existing agent (integration analysis)
Output: PASS / FAIL with specific violations.
Gate 3: Dependency Management
If new tools need new npm packages:
- Architect lists required packages (dependencies array in plan)
- package.json updated
- npm install runs automatically
- Installation success verified
- Axiom decides: live reload (if supported) OR restart required notice
Gate 4: Isolated Agent Test
Every new agent MUST ship with a unit test. The architect generates:
- tests/agents/<agent-name>.test.ts
- Runs in isolated sandbox: copy agent folder to /tmp/orbita-test-<uuid>/
- Tests each skill with mock services
- Tests happy path + error path for each tool
- MongoDB: in-memory test DB (mongodb-memory-server)
- Must pass 100% before agent is activated
Gate 5: Integration Smoke Test
- Start a test runtime with ONLY the new agent + Axiom
- Send a sample message to trigger a skill
- Verify end-to-end: inbox → skill → tool → result
- Trace inspected for trace continuity
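The end-to-end check in Gate 5 amounts to two assertions over the recorded trace: every step carries the same traceId, and the steps appear in the order inbox → skill → tool → result. A minimal sketch, assuming a hypothetical `TraceStep` record (the real trace format is not shown in this document):

```typescript
type Stage = "inbox" | "skill" | "tool" | "result";

// Hypothetical trace record; the real trace schema may differ.
interface TraceStep { traceId: string; stage: Stage }

const EXPECTED_ORDER: Stage[] = ["inbox", "skill", "tool", "result"];

// Returns a list of problems; an empty list means the smoke test's trace is continuous.
function checkTraceContinuity(steps: TraceStep[]): string[] {
  if (steps.length === 0) return ["trace is empty"];
  const problems: string[] = [];
  const traceId = steps[0].traceId;
  steps.forEach((step, i) => {
    if (step.traceId !== traceId) {
      problems.push(`step ${i} (${step.stage}) has traceId ${step.traceId}, expected ${traceId}`);
    }
  });
  const seen = steps.map((s) => s.stage).join(" > ");
  const expected = EXPECTED_ORDER.join(" > ");
  if (seen !== expected) problems.push(`stage order was "${seen}", expected "${expected}"`);
  return problems;
}
```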
What Happens on Failure
Any gate failing → rollback:
- Delete agent folder
- Revert package.json
- Revert catalog registration
- Log what failed + why to audit trail
- Report to user: "Agent generation failed at Gate X: <reason>. Try again with different approach?"
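The pipeline-plus-rollback behaviour can be sketched as a loop that stops at the first failing gate. Everything here is illustrative: `Gate`, `GateResult`, and `runQualityPipeline` are hypothetical names, the gates are synchronous for brevity (real gates such as tsc or npm install would be asynchronous), and the rollback steps are left to a caller-supplied callback.

```typescript
interface GateResult { passed: boolean; reason?: string }
interface Gate { name: string; run: () => GateResult }
interface PipelineOutcome { activated: boolean; failedGate?: string; message: string }

// Run gates in order; on the first failure, roll back and skip the remaining gates.
function runQualityPipeline(
  gates: Gate[],
  rollback: (failedGate: string, reason: string) => void,
): PipelineOutcome {
  for (let i = 0; i < gates.length; i++) {
    const gate = gates[i];
    const result = gate.run();
    if (!result.passed) {
      const reason = result.reason ?? "no reason given";
      rollback(gate.name, reason); // e.g. delete agent folder, revert package.json and catalog
      return {
        activated: false,
        failedGate: gate.name,
        message: `Agent generation failed at Gate ${i + 1} (${gate.name}): ${reason}`,
      };
    }
  }
  return { activated: true, message: "All gates passed" };
}
```

Because the loop returns on the first failure, later and more expensive gates (the isolated test, the smoke test) never run against an agent that already failed static validation.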
Agent Test Portability
A generated agent must be testable in isolation:
# Copy an agent (from any Orbita tenant) and test it standalone:
orbita test-agent tenant/agents/gst-calculator/
→ creates /tmp/orbita-test-xxx/
→ copies agent folder + required tools
→ runs the agent's test suite
→ reports pass/fail
→ cleans up
This means:
- Portable agents — someone shares an agent folder, you test it before using
- CI-friendly — every agent tested independently
- Trust marketplace — future agent marketplace can verify submissions this way
- Safe installs — drop an agent in, test it, only then activate
Dependencies: None technical, but needs claude CLI installed on the host
Complexity: High, covering spawning CLI, context injection, code generation safety, quality gates, isolation
Time to build: 5-7 days (vs 4-6 without quality gates)
Where it slots in: Before Stage 7 (Immune System), because self-coding unblocks rapid domain expansion.
🟡 Stage 5b: Real WhatsApp Plugin
What gets built:
- plugins/whatsapp.plugin.ts: real Meta Business API integration
- Contact book (agent name → WhatsApp number) in plugin's own MongoDB collection
- Webhook receiver for incoming messages
- Template management (Meta's approved templates)
- 24-hour session window handling
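The 24-hour session window affects what the plugin is allowed to send. As the rule is commonly described, a business may send free-form messages only within 24 hours of the recipient's last inbound message, and must use an approved template otherwise. The exact terms should be checked against Meta's current documentation. The sketch below encodes just that decision; `chooseSendKind` and the `SendKind` values are hypothetical names.

```typescript
const SESSION_WINDOW_MS = 24 * 60 * 60 * 1000;

type SendKind = "free_form" | "template";

// Decide how an outbound message must be sent, given when the contact last wrote to us.
// lastInboundAt is null when the contact has never messaged the business.
function chooseSendKind(lastInboundAt: Date | null, now: Date): SendKind {
  if (lastInboundAt === null) return "template";
  const elapsed = now.getTime() - lastInboundAt.getTime();
  // Inside the window: free-form text is allowed. Outside (or clock skew): fall back to a template.
  return elapsed >= 0 && elapsed < SESSION_WINDOW_MS ? "free_form" : "template";
}
```

In the flow below, this would mean Axiom's first message to Radha goes out as a template, and replies within a day of her answer can be free-form.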
Flow:
Axiom creates passive agent "radha" (language: te)
Plugin admin UI: "radha = +91 9988776655"
Axiom sends to radha's inbox
WhatsApp plugin formats for Telugu, sends via Meta
Real Radha receives WhatsApp
Real Radha replies → webhook fires → plugin writes to radha's inbox
What you get:
- First real external channel works
- Proves plugin pattern end-to-end
- Demo to prospects who care about real messaging
- Foundation for email/SMS/Telegram plugins later
Dependencies: Meta Business API account (someone outsources this)
Complexity: Medium, covering API integration + webhooks + templates
Time to build: 2-3 days (most time is Meta setup, not code)
✅ Stage 6: Trace Mesh Visualization (DONE — see stage doc)
What gets built:
- /trace upgraded to a full-graph visualization (Vue Flow)
- Multiple traces overlaid: see the mesh
- Real-time updates (SSE or WebSocket)
- Click agent → see their activity
- Click trace → see full graph
- Performance metrics (tokens, time, cost per agent/trace)
- Filter by agent, time window, trace type
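Overlaying multiple traces into one mesh is essentially an aggregation of agent-to-agent hops into weighted edges. A minimal sketch, assuming a hypothetical `TraceHop` record with a sender, a receiver, and a traceId (the real trace schema is not shown here):

```typescript
// Hypothetical input: one record per message passed between agents.
interface TraceHop { traceId: string; from: string; to: string }

// One edge of the mesh: how many messages used it, across how many distinct traces.
interface MeshEdge { from: string; to: string; messages: number; traces: number }

function aggregateMesh(hops: TraceHop[]): MeshEdge[] {
  const edges = new Map<string, { edge: MeshEdge; traceIds: Set<string> }>();
  for (const hop of hops) {
    const key = `${hop.from}->${hop.to}`;
    let entry = edges.get(key);
    if (entry === undefined) {
      entry = { edge: { from: hop.from, to: hop.to, messages: 0, traces: 0 }, traceIds: new Set<string>() };
      edges.set(key, entry);
    }
    entry.edge.messages += 1;
    entry.traceIds.add(hop.traceId);
    entry.edge.traces = entry.traceIds.size;
  }
  // Map preserves insertion order, so edges come back in first-seen order.
  const result: MeshEdge[] = [];
  edges.forEach((entry) => result.push(entry.edge));
  return result;
}
```

The `messages` count could drive edge thickness in the graph, and filters by agent or time window would apply to `hops` before aggregation.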
What you get:
- True observability — see inside the living organism
- Debugging mastery — any issue traceable visually
- Customer-facing dashboard (eventually)
- Performance tuning data (where are the slow agents?)
- Cost tracking per agent
Dependencies: None
Complexity: Medium, Vue Flow + backend streaming
Time to build: 2-3 days
🟠 Stage 7: Immune System (Self-Healing)
What gets built:
- src/sentinel/ module
- Agent crash detection & auto-restart
- Stuck task escalation (task pending too long → notify manager)
- Plugin failure handling (retry, circuit breaker, fallback)
- Rate limiting per agent / per channel
- Dead letter queue for failed inbox messages
- Anomaly detection (unusual patterns → alert)
- Budget enforcement (LLM cost caps)
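Of the items above, the circuit breaker for plugin failures is the most mechanical, so here is a minimal sketch of the standard pattern. It is not taken from any Orbita code: the class name, thresholds, and the choice to pass time in explicitly (which keeps it testable) are all assumptions.

```typescript
type BreakerState = "closed" | "open" | "half_open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";
  private readonly maxFailures: number;
  private readonly cooldownMs: number;

  constructor(maxFailures: number, cooldownMs: number) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  // May a call to the plugin be attempted at time `now` (milliseconds)?
  canAttempt(now: number): boolean {
    if (this.state === "open" && now - this.openedAt >= this.cooldownMs) {
      this.state = "half_open"; // cooldown elapsed: let trial calls through
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(now: number): void {
    this.failures += 1;
    // A failed trial call reopens immediately; otherwise open after maxFailures failures.
    if (this.state === "half_open" || this.failures >= this.maxFailures) {
      this.state = "open";
      this.openedAt = now;
    }
  }

  current(): BreakerState { return this.state; }
}
```

While the breaker is open, outbound messages for that plugin could be parked in the dead letter queue instead of being retried in a tight loop.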
What you get:
- Production-ready resilience
- Organism survives partial failures
- Can deploy with confidence
- Alerting for operators
- Cost control
Dependencies: Some production load (need real failures to handle)
Complexity: Medium-high, defensive code with many edge cases
Time to build: 3-5 days
🟠 Stage 8: Multi-Tenant Isolation
What gets built:
- Control plane API (provision/start/stop tenants)
- Per-tenant Docker containers (or shared instance with strict isolation)
- Tenant-scoped everything (DB prefix, agents, plugins)
- Per-tenant usage tracking (cost, storage)
- Tenant-level config (timezone, language defaults, budget caps)
- Provisioning workflow: new customer → new isolated organism
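For the shared-instance option, "tenant-scoped everything" could start with a single helper that every data access goes through. The naming scheme below (`<tenant>__<collection>`) and the tenant id format are invented for illustration; the document only says a DB prefix is used.

```typescript
// Assumed tenant id format: lowercase letters, digits, hyphens; 1 to 32 characters.
const TENANT_ID_PATTERN = /^[a-z0-9][a-z0-9-]{0,31}$/;

// Build the tenant-scoped collection name, rejecting ids that could collide or escape the prefix.
function tenantCollection(tenantId: string, collection: string): string {
  if (!TENANT_ID_PATTERN.test(tenantId)) {
    throw new Error(`Invalid tenant id: "${tenantId}"`);
  }
  return `${tenantId}__${collection}`;
}
```

Because hyphens are allowed in the id but underscores are not, the double underscore separator cannot appear inside a tenant id, so two tenants can never map to the same collection name.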
What you get:
- Can sell to multiple customers simultaneously
- Each customer's organism fully isolated
- SaaS-ready
- Billing/metering foundation
Dependencies: Stages 5-7 should be solid first
Complexity: High, infrastructure + orchestration
Time to build: 5-7 days
🔵 Stage 9: Adult (Production Polish)
What gets built:
- Security audit (token strength, RBAC, input validation)
- Backup & restore (automatic MongoDB backups)
- Upgrade path (framework version migration)
- Extensive docs for developers building plugins
- Plugin marketplace (community-shared plugins)
- CLI tool (orbita create-tenant, orbita list-agents, etc.)
- Zero-downtime deployment
- Compliance (data retention, audit log exports)
What you get:
- Enterprise-ready product
- Sellable at scale
- Long-term sustainable
Dependencies: Stages 5-8 complete
Complexity: High
Time to build: Ongoing
Minor Improvements (Can Slot In Anytime)
🟢 Clean up legacy docs
Old stage docs reference add_contact, OrgManager, etc. that no longer exist. Update them to match current architecture.
🟢 Example plugin library
Add plugins/examples/ with templates: email plugin, SMS plugin, webhook-forwarder plugin.
🟢 Agent templates library
Pre-built passive agent templates (Driver, Nurse, Vendor, Student) that customers can use as starting points.
🟢 CLI for agent creation
orbita add-person Naveen --role=Leader --language=en for power users.
🟢 Getting started guide
Step-by-step tutorial for first-time users. "In 10 minutes, your first organism."
Recommendation
If I were building this for market fit, I'd do:
1. Onboarding Conversation (Stage 5) ← the demo
2. MiniWhatsApp + Plugin (Stage 5a) ← local test harness, builds confidence
3. Claude CLI Architect (Stage 5c) ← self-coding agents, BYOA/BYOS
4. Trace Mesh Viz (Stage 6) ← operator confidence
5. Real WhatsApp Plugin (Stage 5b) ← natural extension of 5a
6. Immune System (Stage 7) ← production hardening
7. Multi-tenant (Stage 8) ← SaaS scaling
Why this order?
- Stage 5 wins prospects' attention (the demo)
- Stage 5a is cheap confidence (real transport, no Meta account needed)
- Stage 6 gives operators trust (see inside the organism)
- Stage 5b becomes far easier after 5a works (same plugin pattern; the Meta transport, webhooks, and templates are what change)
- Stages 7-8 are production concerns: address them when production exists
But it's your call. You know your situation.