Anthropy Works

Progress and Architecture Status

Anthropy Works is an AI-native MSP operations product and runtime platform. It coordinates managed environments, services, trusted workers, runtime evidence, workflows, policy, memory, and narrated operator activity across distributed business systems.

Development Chronology

Full Build Chronology

The detailed phase log below still carries the build-by-build evidence. This section shows the foundation (F) phases that had to come before the R hardening arc, the current larger product arc, effort ranges, and the current phase percentage, which should move every time the build status is updated.

These hour ranges are live planning estimates, not promises. The report must always show the clearest current range, even when the range gets worse, and must change whenever scope, evidence, validation, blockers, or production risk changes.

Foundation: Control Plane Genesis

Complete

This foundation phase created the first working Anthropy control plane: app shell, login, organizations, audit history, nodes, agent check-in, Docker inventory, OpenClaw discovery, and safe deployment planning. It had to come first because Anthropy needed a real place to see machines, queue work, require approval, and route operations through an agent before higher-level orchestration could mean anything.

The game changer is that Anthropy started with the minimum operating substrate in place: identity, visibility, jobs, safety, and agent-run execution instead of direct infrastructure commands.

Human hours: 250-450
AI agent hours: 10-24
Estimate basis: Anchored to Phase 0-13 repo history: 15 commits, 64 files, about 10,000 added lines, covering the first app, auth, nodes, agents, Docker/OpenClaw inventory, jobs, safety, and deployment validation.

Foundation: Workflow and Capability Engine

Complete

This foundation phase added the capability catalog, connection records, workflow definitions, workflow rules, permission controls, and the first governed external action. It had to come next because once Anthropy could see and route work, it needed a reusable way to describe what work is allowed, what it depends on, and when humans must approve it.

The game changer is that Anthropy gained a reusable policy-driven work model, so future integrations can plug into shared governance instead of becoming isolated automation hacks.

Human hours: 180-320
AI agent hours: 8-18
Estimate basis: Anchored to Phase 14-18.6 repo history: 8 commits, 34 files, about 6,900 added lines, covering capabilities, connections, workflows, policy enforcement, permission UI, external action execution, and regression hardening.

Foundation: Operator Product and Access Layer

Complete

This foundation phase explored the operator product shape and access layer: workflow approvals, reliability recovery, tenancy clarity, browser tests, production-readiness gates, deployment/backup/observability runbooks, command-center UI, user assignments, OpenClaw versioning, and access evidence. It had to come before deeper execution architecture because the platform needed product and access evidence before it could define what safe execution should eventually support.

The game changer is that Anthropy proved ingredients for a future operator experience, but this was still foundation/prototype evidence, not a finished workflow ordinary MSP operators could rely on.

Human hours: 600-1,000
AI agent hours: 35-80
Estimate basis: Anchored to Phase 19-25.3 repo history: 36 commits, 76 files, about 25,900 added lines, covering approvals, recovery, production readiness, Cloudflare reporting, deploy/backup/observability, tenant segmentation, command-center UX, OpenClaw lifecycle planning, and SSH access workflows.

Foundation: Execution Plane Architecture

Complete

This foundation phase established the deeper operating-system architecture: platform interpretation, architecture canon, execution workers, execution jobs, worker capability and trust topology, environments, services, bindings, access dispatch, worker result relay, scoped secret delivery, transcript relay, and worker assignment scope. It had to come right before R because the platform needed clear execution contracts before the constitutional safety hardening could lock those contracts down.

The game changer is that Anthropy became structurally ready for distributed execution: not by enabling dangerous production work, but by defining the contracts that make production work governable later.

Human hours: 300-550
AI agent hours: 16-36
Estimate basis: Anchored to Phase 26-29H repo history: 12 commits after the access-layer arc, 47 files, about 9,700 added lines, covering the interpretation contract, architecture canon, execution-worker model, worker topology, environment/service topology, service bindings, access dispatch, dry-run grants, transcript relay, and assignment/fallback policy.

Constitutional Law

Complete

This phase built the platform's rulebook: who can act, what they can touch, what must be approved, and what evidence has to be kept. It turned dangerous infrastructure work into governed jobs with audit trails, redaction, worker identity, credential grants, rollback paths, and hard stops before risky action.

The game changer is that Anthropy can now grow without every new feature inventing its own safety rules; the law is shared, enforceable, and already underneath the work.

Human hours: 340-560
AI agent hours: 14-30
Estimate basis: Anchored to the completed R1-R27 arc: the original R1-R20 hardening (21 commits, ~10,700 added lines, migrations, CI, runbooks) plus a 7-commit R21-R27 audit-remediation pass that closed the doc-vs-code execution-edge gaps an independent reconciliation audit found — governed worker execution edge activated, deploy-path safety invariant restored, fail-safe RBAC, host-key observation producer, dependency/blast-radius edge model, ServiceBinding dispatch, and record-first alerting.

Operational Consciousness

90% done

This phase teaches Anthropy to notice what is happening operationally and explain it in human terms. It adds queues, health signals, review surfaces, blocked-work visibility, emergency-disable awareness, and the first operator view of what needs attention.

The game changer is that the platform is moving from simply having safety controls to being able to show people where risk, attention, and readiness actually live.

Human hours: 550-1,000
AI agent hours: 60-140
Estimate basis: Updated after the completed G1-G9-W1 governance and console arc and the major work that followed it: the agent-foundation spine (the entitlement keystone, the governed door with idempotency / plan-apply / AIP-151 operations / events, agent tool descriptors, and action preview), situational-awareness read models (attention, entity status, per-customer operator view), a two-plane tenancy and responsibility model, and a world-class operator console rebuilt to a calm Focus + Summon architecture and wired to that spine — Customers, Members with email-invite onboarding, Connections, Capabilities, MSP management, and System > Configuration, on a reusable CRUD kit. Agent-executed OpenClaw deployment is real and verified live in dev/control-plane with reversible-only teardown, and the worker execution edge is activated in stub/dry-run. The full backend suite is 879 tests green. G is held below complete because alert delivery, live transcript fanout, real external-provider execution with OAuth, resident-LM behavior, live worker SSH, and PRODUCTION remain out of scope.

G6 - Governance OS Trust + Operational Object Foundation

Governance OS stabilization and durable review memory
Deterministic operational trust walkthroughs for ambiguity, safe stops, and escalation clarity
Operator cognition doctrine: concrete operational narration over raw governance semantics
Initial runtime object model for gateways, nodes, and jobs
First read-only access gateway inspection flow
Current status: complete through G6D

G7 - Operator Console + System Introspection

System introspection registries (backend actions, operational objects, capability lifecycle, setup flows)
UI page/panel/action registries (what exists and what is safe)
Deterministic Operator Console shell (typed commands → registered actions)
Refusal behavior for out-of-scope questions (not a general chatbot)
Console-guided steering of existing UI state (navigation + panels)
Console can start persisted setup-flow sessions (no provider calls, no SSH, no mutation)
Introspection-aware command grammar resolves registered pages, panels, setup flows, fields, and session commands
Console-guided setup sessions can list, inspect, resume, update non-secret fields, and advance persisted sessions without owning canonical state
TEST internal-alpha environment readiness covers local/LAN access, Redis sessions, Postgres setup-session persistence, multi-user sanity, isolation, restart/recovery discipline, and walkthrough readiness
Current status: complete through G7G
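
A minimal Python sketch of the deterministic console idea listed above: typed commands resolve only to registered actions, and anything else is refused rather than answered. The registry contents, the ConsoleResult type, and the run_command function are hypothetical names for illustration, not Anthropy's actual code; only the access_gateway_setup flow name comes from this report.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ConsoleResult:
    handled: bool
    message: str


# Hypothetical registry: normalized command text -> registered action.
# A real registry would be built from the introspection registries.
REGISTRY: Dict[str, Callable[[], str]] = {
    "open governance": lambda: "navigate:governance",
    "start access_gateway_setup": lambda: "setup_session:started",
}


def run_command(text: str) -> ConsoleResult:
    """Resolve typed text to a registered action, or refuse."""
    # Normalize case and whitespace so matching stays deterministic.
    key = " ".join(text.lower().split())
    handler = REGISTRY.get(key)
    if handler is None:
        # Out-of-scope input is refused; there is no general chatbot fallback.
        return ConsoleResult(False, "Out of scope: not a registered console command.")
    return ConsoleResult(True, handler())
```

The design choice illustrated here is that refusal is the default path: a command only does something if a registry entry exists for it.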

G8 - Shared Operational Alpha + STAGING Readiness

Shared operational alpha readiness inventory, isolation tests, readiness UI/runbook, and operator walkthrough
Kitsune and Operator Console stabilization for alpha walkthrough use
First walkthrough blocker pass: console gateway setup no longer opens the real Connect gateway modal
TEST remains local/LAN development and validation
STAGING is the hosted trusted-tester environment at staging.anthropy.works
STAGING UI now forces an explicit environment label (banner + required badge) so testers cannot confuse it with TEST or future PRODUCTION
Mobile Kitsune cockpit is reset and stabilized for phones without breaking desktop panel semantics
PRODUCTION does not exist yet
Current status: complete through G8-H1

G9 - Workflow Cognition + Gateway Lifecycle

Workflow Cognition Architecture shifts Kitsune from panel polish toward object lifecycle clarity
Access gateway lifecycle spine: identify, configure, validate, monitor, investigate, control, retire
Gateway setup sessions behave as visible setup drafts before profile creation or validation
Gateway profile handoff makes the setup-to-object transition more explicit
Guided gateway profile editing separates cosmetic edits from validation-sensitive changes
Evidence revalidation, validation workspace, operator timeline, and read-only validation evidence make gateway trust easier to inspect
Current status: G9-W1 complete

Agent Foundation - The Spine + The Governed Door

Entitlement keystone: capability/action registry, can_user_perform, GET /me/entitlements, and a role matrix answer what a user may actually do in one place
One governed door (POST /actions/{id}/invoke) executes low-risk record/state mutations; medium, high, and destructive actions stay refused behind dedicated confirmation
Industry patterns built in: Stripe-style idempotency keys, Terraform-style plan/apply change-sets, Google AIP-151 long-running operations, and an event stream on door mutations
Agent tool descriptors (GET /agent/tools) and standardized action preview (GET /actions/{id}/preview) make the same operations legible to a future agent
Situational awareness: GET /me/attention, per-entity status, and a per-customer operator view answer three questions: is this OK, what needs me, and what can I do?
Two-plane tenancy and responsibility: explicit Node.hardware_owner attribution and an org-user functional-health surface
Current status: complete and green (879 backend tests)
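
The governed-door behavior listed above can be sketched in a few lines of Python. This is an illustrative in-memory stand-in, assuming a simple risk label per action and a replay cache keyed by idempotency key; the GovernedDoor class, the action ids, and the result shape are hypothetical and are not the actual POST /actions/{id}/invoke implementation.

```python
from typing import Dict, Tuple

# Risk tiers named in this report: low, medium, high, destructive.
# Only "low" may execute through the door in this sketch.
EXECUTABLE_RISK = "low"


class GovernedDoor:
    """Hypothetical in-memory stand-in for the governed door."""

    def __init__(self, risk_by_action: Dict[str, str]) -> None:
        self.risk_by_action = risk_by_action
        self.applied_count = 0
        self._results: Dict[Tuple[str, str], dict] = {}

    def invoke(self, action_id: str, idempotency_key: str) -> dict:
        risk = self.risk_by_action.get(action_id)
        if risk is None:
            return {"status": "refused", "reason": "unregistered action"}
        if risk != EXECUTABLE_RISK:
            # Medium, high, and destructive actions stay refused here.
            return {"status": "refused",
                    "reason": f"{risk}-risk action needs dedicated confirmation"}
        cache_key = (action_id, idempotency_key)
        if cache_key in self._results:
            # Same key replayed: return the stored result, do not re-apply.
            return self._results[cache_key]
        self.applied_count += 1
        result = {"status": "applied", "action": action_id,
                  "sequence": self.applied_count}
        self._results[cache_key] = result
        return result


door = GovernedDoor({"customer.rename": "low", "node.teardown": "destructive"})
first = door.invoke("customer.rename", "key-1")
retry = door.invoke("customer.rename", "key-1")  # replay of the same request
```

In this sketch a retried request with the same key returns the original result and the mutation is applied only once, which is the property idempotency keys are meant to provide.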

Execution Edge - Activated and Made Safe (R21-R27)

Governed worker execution edge activated end to end in stub/dry-run: promote-to-worker, dispatch with a scoped credential-grant stub, worker job claim, and result relay
Deploy-path safety invariant restored: reversible-only teardown (no rm -rf, no -v); control-push defers to the agent once a host has a live agent
Agent-executed OpenClaw deployment with a per-instance Cloudflare tunnel and DNS, verified live in dev/control-plane
Fail-safe RBAC (first-class viewer default-deny), host-key trust-on-first-use producer, a dependency/blast-radius edge model, ServiceBinding-scoped dispatch, and record-first alerting
Live worker SSH remains flag-blocked; no medium+ door execution, no real external-provider execution, no PRODUCTION
Current status: complete
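
As a rough illustration of the fail-safe, default-deny RBAC idea listed above, the Python sketch below grants an action only when a role explicitly lists it. The role names, action names, and matrix contents are hypothetical; can_user_perform is named in this report, but its signature here is an assumption.

```python
from typing import Dict, FrozenSet

# Hypothetical role matrix: each role lists the actions it may perform.
ROLE_MATRIX: Dict[str, FrozenSet[str]] = {
    "msp_admin": frozenset({"customer.read", "customer.update"}),
    "msp_viewer": frozenset({"customer.read"}),  # read-only by construction
}


def can_user_perform(role: str, action: str) -> bool:
    """Default-deny: unknown roles and unlisted actions are refused."""
    return action in ROLE_MATRIX.get(role, frozenset())
```

Because the lookup falls back to an empty set, a missing role or a newly added action is denied until someone grants it explicitly, which is the fail-safe direction.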

Operator Console - World-Class Surface Wired to the Spine

Every screen rebuilt to a calm Focus + Summon architecture (full-width room + summoned detail sheet; the old 3-pane surface retired app-wide)
Entitlements decide what is shown; create/edit/deactivate route through the governed door
Reusable CRUD kit across Customers/Orgs, Members/Users, Connections, Capabilities, and MSP-overlord management, each with an object-graph Related lens
The Members screen onboards operators by email invite / magic link with no operator passwords
System > Configuration: Security, Notifications, Branding, Integrations & API, and AI keys / model routing, per-MSP with secrets encrypted at rest
Apple Liquid Glass with three appearance modes; universal Apple-Mail list rows and one universal + add affordance
Current status: active — remaining per-screen mutations still being migrated onto the door

G10 - Resident LM Integration Candidate

Anthropy Works-only knowledge boundary
Retrieval from introspection registries
No external/general knowledge answers
Tenant/user/org isolation
Capability-aware responses
UI-native action orchestration through deterministic registries only
Candidate future direction only; not active until explicitly approved

G11 - Multi-Agent / Autonomous Ops Candidate

Only after deterministic console, setup flows, staging review, and future LM boundaries are stable
No direct LM-to-runtime authority
All action through registries, workflows, jobs, risk gates
Candidate future direction only; no autonomy is enabled

Bounded Intelligence

Substrate ready

This phase gives Anthropy useful reasoning inside clear limits: understanding intent, choosing safe next steps, and explaining options without pretending it can do everything. It is where assistants become more than screens and start helping operators think through governed work.

The game changer is intelligence that is boxed in by the constitutional layer, so AI can help plan and decide without becoming an ungoverned executor.

Human hours: 400-800
AI agent hours: 30-80
Estimate basis: Total range is for the intelligence itself and is unchanged. The enabling substrate is already built and green: a 20-commit agent-foundation spine (entitlements, agent tool descriptors, action preview, the governed door with plan/apply and idempotency, situational-awareness read models), a 269-line deterministic Operator Console, and AI provider-key + model-routing config (the ModelRoute model + the System > Configuration AI page). That substrate de-risks the boxed-in / policy-aware portion but is not the reasoning. Intent interpretation, bounded planning, decision explanations, evaluation loops, and AI safety-regression coverage are not started — there are zero model calls in the codebase today. Roughly a tenth of the phase is de-risked by substrate; the intelligence is 0% built, by design.

Built - the rails an LM will ride (reused from R/G)

Entitlements, agent tool descriptors (/agent/tools), and action preview let an agent read exactly what it may do and what an action would change
The governed door (plan/apply + idempotency) means any agent-proposed change runs through the same constitutional gate as a human
Situational-awareness read models (attention, entity status, operator view) give an agent honest machine truth to reason over
The deterministic Operator Console (typed command to registered action, scope refusal) is the precursor surface - same rails, no model
AI provider-key storage and model-routing config exist (ModelRoute, primary/fallback), consumed by nothing yet

Not started - the intelligence itself

No LLM/model calls anywhere in the codebase; no intent interpretation by a model
No bounded planning, no reasoned decision explanations, no option weighing
No evaluation loops and no AI-behavior safety-regression coverage

Next steps

Wire a resident LM (the G10 candidate) that reads entitlements/tools/preview and proposes plans only - never LM-to-runtime
Constrain it to Anthropy registries (no general knowledge) with tenant/user/org isolation
Route every proposed action back through the governed door (plan/apply + confirmation) so reasoning can never bypass the law
Stand up an evaluation harness and an AI safety-regression suite before any medium-or-higher action is ever proposed
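
The "proposes plans only" rule in these next steps can be sketched as a plan/apply pair in Python, where planning never mutates anything and applying requires a named human confirmation. This is a minimal sketch under stated assumptions: the Plan type, plan_update, apply_plan, and the stale-plan check are hypothetical illustrations, not the platform's change-set format.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Plan:
    action_id: str
    # field name -> (current value, proposed value)
    changes: Dict[str, Tuple[object, object]]


def plan_update(record: dict, proposed: dict, action_id: str) -> Plan:
    """Compute a change-set without touching the record."""
    changes = {
        key: (record.get(key), value)
        for key, value in proposed.items()
        if record.get(key) != value
    }
    return Plan(action_id, changes)


def apply_plan(record: dict, plan: Plan, confirmed_by: Optional[str]) -> dict:
    """Apply a plan only after confirmation; return a new record."""
    if not confirmed_by:
        raise PermissionError("plan requires human confirmation before apply")
    for name, (old, _new) in plan.changes.items():
        if record.get(name) != old:
            # The record changed since planning; the plan must be redone.
            raise ValueError(f"stale plan: {name} changed since planning")
    updated = dict(record)
    updated.update({name: new for name, (_old, new) in plan.changes.items()})
    return updated
```

An agent in this sketch would only ever call plan_update; apply_plan stays behind the confirmation check, so reasoning cannot bypass the gate.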

Scalable Productization

Getting started

This phase turns the platform into something customers can actually adopt, operate, and pay for. It adds onboarding, setup flows, billing shape, support workflows, polished governance UX, and repeatable day-to-day product paths.

The game changer is that the deep operating-system substrate becomes a usable product instead of a powerful internal architecture project.

Human hours: 800-1,600
AI agent hours: 80-180
Estimate basis: Total range is unchanged, but the phase has moved past Not started: a real first product slice is built and green. Done: email-invite / magic-link onboarding (a 149-line mailer + an accept-invite page + 11 tests, no operator passwords), persisted setup-flow sessions and guided schema-first add wizards, a five-page System > Configuration surface (a 701-line tested router + migration, per-MSP, secrets encrypted), a reusable 854-line CRUD kit, the world-class Focus + Summon operator UX, and billing shape (org plan_tier / billing_ref / billing_status fields). The productization-defining bulk remains: an actual billing and payments engine, support tooling, hosted production operations, packaging, reliability hardening, and an end-to-end customer path (blocked on the same provider-OAuth gap as A and E). The built slice is roughly 15-20% of the phase; the remaining 80-85% is the heavier part.

Built - the first product slice

Email-invite / magic-link onboarding: create-company plus a public accept-invite page, no operator passwords (mailer + 11 tests)
Persisted setup-flow sessions and guided schema-first add wizards (the /hosts/new pattern, per-step validation)
System > Configuration: five pages (Security & Access, Notifications, Branding, Integrations & API, AI), per-MSP, secrets encrypted at rest (701-line tested router + migration)
A reusable CRUD kit (854 lines) and the world-class Focus + Summon operator UX in Apple Liquid Glass
Billing shape: org plan_tier / billing_ref / billing_status fields (Stripe-Connect-patterned) - fields only, no engine

Not started - the productization core

No billing or payments engine, checkout, metering, or invoicing - only the data fields exist
No support tooling or help desk of any kind
No hosted production operations, packaging, documentation, or reliability hardening
No end-to-end customer path: a customer still cannot connect a real account and have governed work performed (blocked on provider execution + OAuth)

Next steps

Stand up a billing engine on the existing fields: plan tiers to metering to a real payment provider
Add the first real provider OAuth + execution so onboarding actually leads to work being done (the shared unlock with A and E)
Build minimal operator support and audit-trail tooling
Hosted operations: backups, monitoring, and runbooks for a real STAGING-to-first-customer path

Ecosystem Emergence

Not started

This phase opens Anthropy into a larger ecosystem of providers, runtimes, agents, customer services, and partner workflows. It is where the platform can coordinate across many outside systems while keeping ownership, policy, memory, and evidence portable.

The game changer is that Anthropy becomes a category platform: not one tool for one workflow, but a governed operating layer that other tools and teams can safely build around.

Human hours: 2,000-5,000+
AI agent hours: 200-600+
Estimate basis: Estimated long-horizon range for provider ecosystems, runtime portability at scale, partner workflows, enterprise governance, multi-agent operations, and platform-category maturity.
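
For readers who want a combined figure, the per-phase human-hour ranges above can be added endpoint by endpoint. The Python sketch below does only that arithmetic; the report itself states no total, simple summation is an assumption (it ignores overlap and correlation between phases), and the last range is open-ended, so the upper sum is a floor rather than a ceiling.

```python
from typing import List, Tuple

# (phase, low, high) human-hour ranges as listed in this chronology.
HUMAN_HOUR_RANGES: List[Tuple[str, int, int]] = [
    ("Foundation: Control Plane Genesis", 250, 450),
    ("Foundation: Workflow and Capability Engine", 180, 320),
    ("Foundation: Operator Product and Access Layer", 600, 1000),
    ("Foundation: Execution Plane Architecture", 300, 550),
    ("Constitutional Law", 340, 560),
    ("Operational Consciousness", 550, 1000),
    ("Bounded Intelligence", 400, 800),
    ("Scalable Productization", 800, 1600),
    ("Ecosystem Emergence", 2000, 5000),  # listed as 2,000-5,000+ (open-ended)
]


def total_range(ranges: List[Tuple[str, int, int]]) -> Tuple[int, int]:
    """Sum the low endpoints and the high endpoints separately."""
    low = sum(lo for _name, lo, _hi in ranges)
    high = sum(hi for _name, _lo, hi in ranges)
    return low, high


low, high = total_range(HUMAN_HOUR_RANGES)
print(f"Human hours, all phases: {low:,}-{high:,}+")
```

Like the per-phase ranges, any such total is a live planning estimate and should be recomputed whenever a phase range changes.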

Ground Truth

How To Read This Build Honestly

Anthropy Works has been built in the right order for a high-risk AI operations platform: foundation first, execution contracts next, constitutional safety after that, and now operational visibility. That order matters because infrastructure automation cannot jump safely from a demo into customer trust without identity, scope, jobs, grants, audit, transcripts, rollback, and human review.

Environment Doctrine

TEST means local machine or trusted LAN development. STAGING means the hosted online trusted-tester environment at staging.anthropy.works with rollback, backup, access-control, and isolation discipline. PRODUCTION does not exist yet and must be created only in a future dedicated phase.

Actually Usable Today

There is now a real, drivable operator console over a complete governed backend. An MSP admin can sign in, manage MSPs, customers, and staff (onboarded by email invite with no passwords), see operational health and what needs attention, configure the system, and perform low-risk changes through one audited door. What is still missing for real customer use is connecting real external accounts and having governed work performed in them.

What Is Strong

The platform has unusually strong safety sequencing for its stage. Only low-risk changes flow through one governed, audited door; risky and production execution stay blocked while the system builds the law, evidence, governance narratives, operational intelligence, and escalation semantics needed before autonomy can be trusted.

What Is Not Proven Yet

Live worker SSH, real external-provider execution with live OAuth and credential acquisition, live transcript fanout, external alert delivery, customer onboarding, billing, resident-LM behavior, and a PRODUCTION environment are not done. Agent-executed OpenClaw deployment is real in dev/control-plane but is not a production customer rollout. The report should not be read as SaaS readiness.

What Must Be Done Well

Leadership and builders must keep scope disciplined, validate every phase exit, update estimates when evidence changes, and resist turning future capability into current claims. Optimism cannot outrun validation.

How Confidence Should Feel

Sponsors should be confident that the architecture is progressing in the right order, but not assume the remaining phases are automatic. The right stance is grounded confidence: real progress, live estimates, and explicit uncertainty where production risk still exists.

Production Target

The destination is a system real customers can be placed on without heroic support: ordinary MSP operators should be able to onboard, understand, approve, recover, and operate it without depending on the builder. That is not the current state.

When It Gets Real

The report must name maturity plainly: evidence, prototype, internal tool, sponsor-reviewable alpha, operator-ready beta, or production customer-ready. It only gets called production customer-ready when ordinary MSP operators can run the workflow for real customers with documented support, recovery, and rollback.
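
One way to keep maturity claims honest is to treat the labels above as an ordered ladder and refuse any claim above the current rung. The Python sketch below assumes the labels are listed in increasing order of maturity; the Maturity enum and the may_claim helper are hypothetical names for illustration.

```python
from enum import IntEnum


class Maturity(IntEnum):
    # Assumed order: the sequence in which the report lists the labels.
    EVIDENCE = 1
    PROTOTYPE = 2
    INTERNAL_TOOL = 3
    SPONSOR_REVIEWABLE_ALPHA = 4
    OPERATOR_READY_BETA = 5
    PRODUCTION_CUSTOMER_READY = 6


def may_claim(current: Maturity, claimed: Maturity) -> bool:
    """A report may claim any level up to, but not above, the current one."""
    return claimed <= current


# The report describes the current build as a sponsor-reviewable alpha.
current = Maturity.SPONSOR_REVIEWABLE_ALPHA
```

With the current build at sponsor-reviewable alpha, a claim of operator-ready beta or production customer-ready would be rejected by this check.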

Current Status

AI-Native Orchestration Operating System

Anthropy Works coordinates environments, services, workers, runtimes, providers, workflows, policy, memory, and narrated activity across distributed business systems. The current build pairs an agent-foundation spine — one entitlement keystone that decides what a user or agent may do, and one governed door that executes low-risk mutations with idempotency, plan/apply change-sets, long-running operations, and an event stream — with a world-class operator console rebuilt on top of it. It is no longer read-only: it is a coherent, drivable sponsor-reviewable alpha. It is not yet operator-ready beta or PRODUCTION: live worker SSH stays blocked, medium and higher-risk actions stay behind confirmation, external provider execution is a single demo read with no live OAuth, and PRODUCTION does not exist.

What Is Current

The agent-foundation spine answers what a user or agent may do in one place: a capability/action registry, can_user_perform, the GET /me/entitlements keystone, and a role matrix.
One governed door (POST /actions/{id}/invoke) executes low-risk record/state mutations with Stripe-style idempotency keys, Terraform-style plan/apply change-sets, Google AIP-151 long-running operations, and an event stream; medium, high, and destructive actions stay refused behind dedicated confirmation.
Situational-awareness read models answer three questions (is this OK, what needs me, and what can I do): GET /me/attention, per-entity status, and a per-customer operator view that the Customers screen renders directly.
A two-plane tenancy and responsibility model attributes each instance to who runs it and exposes an org-user functional-health surface.
R21-R27 closed the doc-vs-code execution-edge gaps an independent audit found: the governed worker execution edge is activated in stub/dry-run, the deploy path is agent-executed and reversible-only, MSP-viewer read-only is a first-class default-deny policy, host-key trust is fed by live SSH observations, dependency/blast-radius has a real edge model, ServiceBinding dispatch is live, and alerting is recorded first.
Agent-executed OpenClaw deployment with a per-instance Cloudflare tunnel and DNS is real and verified live in dev/control-plane, with reversible-only teardown.
The operator console is rebuilt to world-class: every screen recomposed to a calm Focus + Summon architecture and wired to the spine, on a reusable CRUD kit across Customers/Orgs, Members/Users, Connections, Capabilities, and MSP management, with Apple Liquid Glass.
The Members screen onboards operators by email invite / magic link with no operator passwords; System > Configuration ships Security, Notifications, Branding, Integrations & API, and AI keys / model routing pages, per-MSP with secrets encrypted at rest.
The full backend suite is 879 tests green.
Architecture Canon defines Anthropy as the durable orchestration layer for identity, policy, object graph, capability graph, workflow truth, narrative history, and portable memory contracts.
ExecutionWorker and ExecutionJob records define worker identity, assignment scope, capability support, job status, route narrative, transcript, retry, and redaction metadata.
Worker Capability + Trust Topology defines capabilities, trust levels, locality, ownership scope, runtime relationships, and execution eligibility checks.
Environment + Service Topology defines environments as business operating spaces and services as the useful capabilities organizations and users understand.
Service Binding Contracts define how services attach to workers, runtimes, providers, capabilities, workflows, and assistants.
Worker-Side Access Dispatch defines how access_test and host_discovery route through eligible workers with local_dev_execution_mode as a governed bootstrap fallback.
Worker Access Executor Stub defines worker_stub_execution and validates worker-style access results before they become transcript, technical detail, access, or discovery evidence.
Scoped Secret Delivery defines one-job CredentialGrant metadata and worker_ssh_dry_run, which stops before opening any network connection.
Phase R11 adds a controlled test_managed_secret_provider harness that resolves synthetic in-memory material for dev/test lease validation, captures redaction fingerprints, closes the lease, and remains blocked in production.
Phase R12 adds a Vault-compatible provider pilot that resolves scoped leases through an explicitly configured endpoint in dry-run/test mode, captures redaction values, closes the lease, and keeps worker-side SSH disabled.
Phase R13 adds worker_ssh_preflight so workers can prove host-key trust, target route policy, command allowlist, grant/lease evidence, and live-mode gates before stopping without opening SSH.
Phase R14 persists HostKeyTrustRecord evidence, enforces host-key review transitions, detects changed fingerprints, and audits host-key observation, verification, rejection, disablement, and preflight blocking.
Phase R15 adds MSP-only host-key trust review APIs with safe status/risk/next-action text, audit summaries, and verify/reject/disable actions.
Phase R16 hardens DB-backed worker transcript read models with ordered operator/technical/audit streams, MSP-admin technical/quarantine access, replay/out-of-order/post-terminal quarantine rules, and safe preflight evidence summaries.
Phase R17 adds explicit live worker SSH feature flags, environment/scope/worker gates, emergency disable guardrails, safe operator status, and runbook/audit scaffolding while keeping live worker SSH disabled.
Phase R18 adds a feature-flagged worker_ssh_live_nonproduction harness for narrow read-only validation after all worker, scope, host-key, route, command, grant, lease, provider, and emergency-disable gates pass; production live SSH remains blocked.
Phase R19 completes the first constrained non-production worker SSH pilot shape and transcript transport soak for ordered events, timeout/disconnect behavior, emergency disable, host-key change blocking, provider/lease validation, replay quarantine, and no-secret-leak checks.
Phase R20 completes the operational readiness and rollback drill for reconnect-style transcript polling, replay resilience, partial transcript continuity, emergency disable during transcript flow, feature-flag disable, provider disable, worker revoke, host-key invalidation, lease revocation, and operator where/why/next narratives.
Phase G1 adds MSP-facing governance status and queues for worker health, host-key review, transcript quarantine, failed/blocked execution, provider/lease failures, rollback attention, emergency-disable visibility, and live SSH gate status.
Phase G2 adds a read-only Governance/Mainstage UI so MSP operators can inspect governance queues and alert integration contracts without taking action from that surface.
Phase G3 adds plain-language operational narratives grounded in existing governance, job, workflow, audit, and access evidence.
Phase G4 adds computed operational intelligence with evidence correlation, provenance, confidence, and uncertainty fields while remaining read-only.
Phase G5 adds escalation and delegation semantics that explain why Anthropy stopped, what human decision is needed, what remains unsafe, and what future bounded remediation would require.
Phase G6 adds Governance OS stabilization, integration branch CI protection, review memory, operational trust validation, operator cognition doctrine, and the first gateway/node/job operational object inspection foundation.
Phase G7B adds deterministic system introspection registries for backend actions, operational objects, capability lifecycle vocabulary, and setup-flow definitions without secret leakage.
Phase G7C adds persisted SetupFlowSession records and deterministic setup-flow state machines (starting with access_gateway_setup) without provider calls, SSH execution, or runtime mutation.
Phase G7D adds a deterministic Operator Console shell (typed commands mapped to registered actions) that can steer the current UI state and start setup-flow sessions without becoming a chatbot or an execution terminal.
Phase G7E adds introspection-aware command grammar so the console resolves registered pages, panels, setup flows, setup fields, and setup-session commands from Anthropy registries instead of growing a pile of hardcoded phrases.
Phase G7F completes console-guided setup-session steering: list, inspect, resume, update non-secret fields, and advance persisted setup-flow sessions while the setup-flow API remains the source of truth.
Phase G7G documents and validates TEST internal-alpha readiness for the trusted local/LAN stack, Redis sessions, Postgres setup-session persistence, multi-user sanity, isolation, restart/recovery discipline, backup basics, and walkthrough use.
Phase G8 completes shared operational alpha readiness and the first walkthrough-blocker fix pass while keeping RBAC, auth semantics, setup-session ownership, provider execution, OpenClaw mutation, and LM behavior unchanged.
Phase G8-H1 completes hosted STAGING environment doctrine and visible environment labeling for staging.anthropy.works while PRODUCTION remains explicitly nonexistent.
Phase G9-W1 starts gateway workflow cognition: setup drafts, profile handoff, guided editing, evidence revalidation, validation evidence workspace, operator timeline, and real read-only validation evidence for access gateway review.
Worker authentication and assignment checks now reject wrong, disabled, revoked, stale, unhealthy, insufficiently trusted, missing-capability, or wrong-scope workers before grant use, transcript relay, or result relay is trusted.
Redis-backed sessions replace process-local session memory, and login plus bootstrap agent registration have basic Redis-backed rate limits.
Worker Transcript Relay defines validated worker progress events with operator, technical, audit, and quarantine read-model separation.
Worker Assignment Scope defines platform, organization, environment, service, and user worker boundaries for access/discovery job eligibility.
Production Fallback Policy records explicit local_dev_execution_mode fallback decisions with where, why, next, requested scope, selected worker, and fallback_used fields.
Phase R1 hardens production config validation, production cookie security, public status minimization, and tracked env-file guidance.
Phase R2 removes the default host private SSH key mount from Docker Compose, keeps local demos on saved/uploaded credential references, hardens SSH ControlPath placement, and improves SSH transcript/result redaction.
Phase R3 persists org scope on tenant-sensitive records and moves high-risk list/read paths toward database-level scope filtering.
local_dev_execution_mode is gated in production unless explicitly enabled, and blocked routes stop before remote access begins.
Assistant semantics are now explicit: assistants are user-facing work surfaces, not workers, runtimes, or providers.
Mainstage semantics are now documented separately for MSP topology and routing, org environments and service health, and user workspaces and outcomes.

What This Means

Anthropy is the durable orchestration layer across MSP, organization, and user ownership layers.
Workers are governed execution participants, not invisible implementation detail.
Capabilities are intent-level graph concepts, not single provider buttons.
Runtimes and providers can execute or cache state, but they do not own Anthropy identity, policy, workflow truth, memory contract, object graph, or narrative history.
Environment and service topology is the next customer-facing model; raw infrastructure remains an MSP/operator concern.
Production execution still requires scoped dispatch, policy evaluation, typed confirmation where risky, audit, redaction, production live fanout implementation, production-reviewed provider auth, external alerting/escalation integrations, and final production worker-side SSH runbooks.

Phase Status

Recent Architecture Milestones

The governance and gateway-cognition arc is complete through G9-W1. Since then, the constitutional layer has been hardened through R27: the governed worker execution edge was activated in stub/dry-run, the deploy-path safety invariant was restored, and fail-safe RBAC, a host-key trust producer, a dependency/blast-radius model, ServiceBinding dispatch, and record-first alerting landed. On top of that, the agent-foundation spine was built (the entitlement keystone and one governed door with idempotency, plan/apply, long-running operations, and events), and the operator console was rebuilt to a world-class standard and wired to that spine. Resident-LM behavior, live worker-side SSH, real external-provider execution with OAuth, and autonomy remain later phases that require explicit approval.

Operator Console

World-Class Operator Surface

Every operator screen has been rebuilt to a calm Focus + Summon architecture and wired to the spine: Customers, Members (with email-invite onboarding and no operator passwords), Connections, Runtime, Capabilities, Procedures, Fabric, MSP management, and System > Configuration, all on a reusable CRUD kit in Apple Liquid Glass. Remaining per-screen mutations are still being migrated onto the governed door.

Active

Agent Foundation 3

Situational Awareness

Read models for what needs attention (/me/attention), per-entity status, and a per-customer operator view, plus a two-plane tenancy and responsibility model, give every screen one honest answer to three questions: "is this OK?", "what needs me?", and "what can I do?".

Complete

Agent Foundation 2

The Governed Door

One door (POST /actions/{id}/invoke) now executes low-risk record/state mutations with Stripe-style idempotency keys, Terraform-style plan/apply change-sets for risky actions, Google AIP-151 long-running operations, and an event stream. Medium, high, and destructive actions are refused at the door and routed to dedicated confirmation.

Complete

Agent Foundation 1

Entitlement Keystone

A capability/action registry, can_user_perform, the GET /me/entitlements keystone, and a role matrix make what a user may actually do a single authoritative answer that the UI and future agents both read.

Complete

Phase R27

Operational Alerting (record-first)

Added record-first outbound operational alerting off the governance queues; external delivery channels remain intentionally narrow.

Complete

Phase R26

ServiceBinding-Scoped Dispatch

Activated the built-but-dormant service-scoped routing branch by creating ServiceBindings and passing service scope through execution dispatch.

Complete

Phase R25

Dependency / Blast-Radius Model

Added a CMDB dependency edge model and impact service so the platform can answer what depends on a given host, provider, or runtime at the data layer instead of faking it in the UI.

Complete
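To make the R25 idea concrete, the question "what depends on this host, provider, or runtime" can be answered by walking dependency edges in reverse. The following is a minimal sketch under stated assumptions: the edge shape (dependent, dependency), the function names, and the example identifiers are all hypothetical and are not taken from the Anthropy schema.

```python
from collections import defaultdict, deque


def build_reverse_index(edges):
    """Index dependency -> set of direct dependents.

    Each edge is a hypothetical (dependent, dependency) pair.
    """
    reverse = defaultdict(set)
    for dependent, dependency in edges:
        reverse[dependency].add(dependent)
    return reverse


def blast_radius(edges, target):
    """Return everything that directly or transitively depends on target."""
    reverse = build_reverse_index(edges)
    impacted = set()
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for dependent in reverse.get(node, ()):
            # Skip nodes already counted and guard against cycles back to target.
            if dependent not in impacted and dependent != target:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


# Illustrative data only: a service depends on a runtime, which depends on a host.
EXAMPLE_EDGES = [
    ("svc-dispatch", "runtime-a"),
    ("runtime-a", "host-1"),
    ("svc-search", "host-2"),
]
```

With the example edges, asking for the blast radius of host-1 yields runtime-a and svc-dispatch, while host-2 only impacts svc-search.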

Phase R24

Host-Key Observation Producer

Fed host keys observed during live SSH access tests into the trust recorder (trust-on-first-use), so the host-key review surface is populated from real evidence instead of staying empty.

Complete

Phase R23

Fail-Safe RBAC

Made MSP-viewer read-only a first-class default-deny policy for non-admin MSP users instead of enforcement-by-omission, so a mutating endpoint cannot accidentally grant a viewer write access.

Complete

Phase R22

Deploy-Path Safety Invariant

Resolved the deploy contradiction an independent audit flagged: teardown is reversible-only (docker compose down, no -v, never rm -rf) and control-push refuses once a host has a live agent, restoring the rule that the agent executes deployment.

Complete

Phase R21

Governed Worker Execution Edge

Activated the previously test-only worker execution machinery against real reach in stub/dry-run: promote-to-worker, an MSP worker roster, operator dispatch with a scoped credential-grant stub, worker job claim, and result relay — driven end to end against the live dev API. Live worker SSH stays flag-blocked.

Complete

Phase G9-W1

Gateway Workflow Cognition

Turns access gateway work into a coherent operational lifecycle with setup drafts, profile handoff, guided editing, revalidation, evidence workspace, operator timeline, and read-only validation evidence while preserving no-execution boundaries.

Complete

Phase G8-H1

STAGING Environment Doctrine

Normalizes TEST, hosted STAGING at staging.anthropy.works, and future PRODUCTION doctrine with isolation, rollback, host-collision, credential, and access-control boundaries.

Complete

Phase G8-F1

Shared Alpha Walkthrough Fixes

Fixed the first walkthrough blockers by keeping console-started gateway setup inside deterministic setup-session steering and aligning documented setup-field commands with the grammar.

Complete

Phase G8

Shared Operational Alpha Readiness

Prepared the internal alpha for coherent evaluation with readiness inventory, isolation coverage, readiness UI/runbook, walkthrough guidance, Kitsune stabilization, and preserved MSP/Org/User/RBAC boundaries.

Complete

Phase G7G

TEST Internal Alpha Environment

Documented and validated the trusted local/LAN TEST alpha environment, including restart discipline, Redis sessions, Postgres setup persistence, multi-user sanity, isolation, backup basics, and walkthrough readiness.

Complete

Phase G7F

Console-Guided Setup Sessions

Completed console-guided setup-session steering for list, inspect, resume, non-secret field updates, and deterministic state advancement while setup-flow APIs remain canonical.

Complete

Phase G7E

Introspection-Aware Command Grammar

Replaced one-off phrase growth with registry-backed deterministic command grammar for pages, panels, UI actions, setup flows, setup fields, and setup-session commands.

Complete

Phase G7D

Deterministic Operator Console Shell

Added a bounded Operator Console shell that steers registered UI actions and setup-flow session starts without becoming a chatbot or execution terminal.

Complete

Phase G7C

Deterministic Setup-Flow Sessions

Added persisted SetupFlowSession records and deterministic setup-flow state machines with audited lifecycle events, redaction, and user/org isolation.

Complete

Phase G7B

System Introspection Registries

Added deterministic registries for backend actions, operational objects, capability lifecycle vocabulary, setup flows, pages, panels, and UI actions without exposing secrets.

Complete

Phase G6D

Runtime Observation + Object Inspection

Stabilized the initial gateway, node, and job operational object vocabulary and added the first lightweight read-only access gateway inspection flow.

Complete

Phase G6C

Operational Trust Validation

Added deterministic local/dev walkthrough scenarios and reviewer guidance so humans could evaluate ambiguity handling, safe stops, escalation clarity, and Governance Mainstage coherence.

Complete

Phase G6B

Governance Review Memory

Added durable informational review memory with source fingerprints and stale/historical states while preserving source conditions, risk, escalation, and delegate-back boundaries.

Complete

Phase G6A.1

Integration Branch CI Protection

Protects Governance OS integration discipline with branch/status validation and no added runtime authority.

Complete

Phase G6A

Governance OS Stabilization

Stabilizes Governance OS v1 after independent review while keeping execution, remediation, provider operations, production worker SSH, and OpenClaw deployment disabled.

Complete

Phase G5

Escalation + Delegation Semantics

Adds read-only escalation and delegation semantics that explain why Anthropy stopped, who must decide, what safe options remain, what is unsafe, and what would be required before future bounded remediation could be considered.

Complete

Phase G4

Operational Intelligence Read Model

Adds computed, read-only operational intelligence that correlates governance items with safe evidence, provenance, confidence, and unresolved uncertainty while keeping structured records as source of truth.

Complete

Phase G3

Operational Narrative Layer

Adds plain-language governance narratives grounded in existing queue, job, workflow, audit, and access evidence so operators can understand where Anthropy stopped, why it matters, and what human input is needed.

Complete

Phase G2

Governance Mainstage Read UI

Adds a read-only Governance/Mainstage surface for MSP operators to inspect governance queues, status, and alert integration contracts without enabling acknowledgement, remediation, alert delivery, production SSH, or OpenClaw deployment.

Complete

Phase G1

Operational Governance Surface

Adds MSP-facing governance status and queue APIs for worker health, host-key review, transcript quarantine, failed or blocked execution, provider and lease failures, rollback attention, emergency-disable visibility, live SSH gate status, and audit-only acknowledgement while production worker SSH remains disabled.

Complete

Phase C1

Post-Remediation Critical Cleanup

Closes bounded follow-up findings by binding default local Postgres/Redis ports to loopback, scoping worker transcript read routes by job org/private scope, adding worker-authenticated transcript event ingest through the existing relay validator, and documenting human-approved git-history purge requirements.

Complete

Phase R20

Readiness Review + Rollback Drill

Completes the constrained operational readiness drill for reconnect-style transcript polling, replay resilience, emergency disable during transcript flow, feature-flag disable, provider disable, worker revoke, host-key invalidation, lease revocation, operator narratives, and no-secret-leak checks while production live SSH remains blocked.

Complete

Phase R19

Non-Production SSH Pilot + Transcript Soak

Completes the first constrained non-production worker SSH pilot shape and transcript soak for read-only commands, timeout/disconnect handling, emergency disable, host-key behavior, provider/lease validation, replay quarantine, and no-secret-leak checks while production live SSH remains blocked.

Complete

Phase R18

Non-Production Live SSH Harness

Adds a feature-flagged worker_ssh_live_nonproduction harness for narrow read-only validation after worker, scope, host-key, route, command, grant, lease, provider, and emergency-disable gates pass; production live SSH remains blocked.

Complete

Phase R17

Live SSH Guardrails

Adds explicit live worker SSH feature flags, environment/scope/worker gates, emergency disable guardrails, safe status/read models, and runbook/audit scaffolding while live worker SSH remains disabled by default.

Complete

Phase R16

Transcript + Preflight Read Models

Hardens DB-backed worker transcript read models, MSP-gated operator/technical/quarantine access, replay/out-of-order/post-terminal quarantine rules, and safe worker_ssh_preflight evidence summaries while live worker SSH remains disabled.

Complete

Phase R15

Host-Key Review API

Adds MSP-only host-key trust list/detail read models, safe operator status/risk/next-action text, audit summaries, and verify/reject/disable review actions while live worker SSH remains disabled.

Complete

Phase R14

Host-Key Trust Persistence

Persists host-key trust evidence, enforces review transitions, audits observed/verified/changed/rejected/disabled/preflight-blocked events, and feeds durable trust evidence into worker_ssh_preflight.

Complete

Phase R13

Worker SSH Preflight Boundary

Added worker_ssh_preflight with host-key trust states, target route gating, command allowlist checks, lease evidence checks, and live-mode rejection while keeping worker-side SSH disabled.

Complete

Phase R12

Vault-Compatible Provider Pilot

Added a Vault-compatible managed provider adapter pilot that validates provider config and vault:// references, resolves scoped leases in dry-run/test mode, captures redaction values, closes leases, and keeps worker-side SSH disabled.

Complete

Phase R11

Managed Secret Testbed + Lease Retrieval

Added a dev/test-only managed secret provider harness that resolves synthetic in-memory material, captures redaction values, closes the lease, and proves WorkerSecretLease validation without enabling real provider retrieval or worker-side SSH.

Complete

Phase R10

Managed Provider Adapter + Transcript Transport Prep

Added managed secret provider adapter configuration, provider selection rules, safe not-implemented behavior for real providers, and DB-first transcript transport planning without enabling real provider retrieval or worker-side SSH.

Complete

Phase R9

Secret Provider Contract

Defined a provider-pluralistic SecretProvider contract and WorkerSecretLease shape so worker dry-runs can validate scoped retrieval rules without enabling real worker-side SSH or production secret delivery.

Complete

Phase R8

Alembic Drift Cleanup

Aligned historical timestamp nullability, workflow timestamp indexes, and CredentialGrant grant_id uniqueness/index shape so migration validation can fail on future unclassified drift.

Complete

Phase R7

CI + Deployment Validation

Added GitHub Actions validation scaffolding, phase-oriented validation layers, secret scanning, migration graph/head checks, deployment gates, and explicit reporting for known Alembic drift.

Complete

Phase R6

Redis Sessions + Rate Limits

Moved API sessions into Redis with TTL records and added Redis-backed rate limits for login and bootstrap agent registration without changing execution architecture.

Complete

Phase R5

Worker Authentication + Assignment

Hardened ExecutionWorker identity and assignment with registration status, hashed worker secret metadata, explicit scope fields, active/disabled/revoked/heartbeat checks, and scope-aware grant/result/transcript validation.

Complete

Phase R4

Durable CredentialGrant Persistence

Persisted CredentialGrant as a scoped, temporary, auditable permission record and made worker dry-run result relay validate durable grant status, scope, job, worker, credential reference, and allowed use before accepting evidence.

Complete

Phase R3

Tenancy + Scope Persistence

Persisted org scope on tenant-sensitive infrastructure, access, execution, and audit records; improved DB-level scope filtering; and added cross-org leakage tests without changing execution architecture.

Complete

Phase R2

Local Dev SSH Risk Containment

Removed the default host private SSH key mount, kept local demos on saved/uploaded credentials, hardened SSH control sockets, strengthened redaction, and contained password SSH risk without enabling worker SSH.

Complete

Phase R1

Secret + Production Config Hardening

Hardened production config validation, secure cookie behavior, minimal public status, env-file guidance, and report claims without changing access execution or unblocking deployment.

Complete

Phase 29H

Worker Assignment Scope + Fallback Policy

Added explicit worker assignment scope, job scope, production fallback decisions, and scope validation for transcript events, credential grants, and worker results without enabling real worker-side SSH.

Complete

Phase 29G

Worker Transcript Relay

Added validated worker progress events, operator/technical/audit read-model separation, quarantine handling, dry-run event emission, and grant lifecycle audit scaffolding.

Complete

Phase 29F

Scoped Secret Delivery + SSH Dry Run

Defined one-job CredentialGrant scaffolding and worker_ssh_dry_run, proving grant references, allowlists, redaction, and stop-before-network behavior without enabling production worker-side SSH.

Complete

Phase 29E

Worker Access Executor Stub

Added worker_stub_execution and API result relay validation for worker-style access and discovery results without enabling live worker SSH or raw secret delivery.

Complete

Phase 29D

Worker-Side Access Dispatch Contract

Defined access_test and host_discovery dispatch through eligible workers, with governed local development fallback, redaction, audit, transcript, and blocked-route behavior.

Complete

Phase 29C

Service Binding Contracts

Defined the binding model that attaches services to workers, runtimes, providers, capabilities, workflows, and assistant surfaces.

Complete

Phase 29B

Environment + Service Topology

Defined environments as business operating spaces and services as customer-facing capabilities across MSP, organization, and user layers.

Complete

Phase 29A

Worker Capability + Trust Topology

Defined worker capabilities, trust levels, locality, ownership scope, runtime relationships, and execution eligibility checks.

Complete

Phase 28

Execution Plane

Introduced ExecutionWorker and ExecutionJob records so operational work can be modeled as routed, auditable jobs with worker identity, transcripts, retries, and redaction.

Complete

Phase 27

Architecture Canon

Established Anthropy as the orchestration layer for identity, policy, object graph, capability graph, workflow truth, narrative history, and portable memory contracts.

Complete

Current Evidence

What The Build Supports

The entitlement keystone

GET /me/entitlements is the single authoritative answer to what the signed-in user may actually do, computed from a capability/action registry, can_user_perform, and a role matrix. The operator console reads it to decide what to show, and a future agent reads the same truth — no second, drifting copy of the rules.
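As a rough illustration of how an action registry and a role matrix could combine into one entitlement answer, here is a minimal sketch. Only the name can_user_perform and the idea of a single GET /me/entitlements answer come from this report; the role names, action IDs, risk tiers, and the entitlements_for helper are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical action registry: action id -> risk tier (illustrative names only).
ACTION_REGISTRY = {
    "customer.rename": "low",
    "connection.disable": "medium",
    "runtime.teardown": "destructive",
}

# Hypothetical role matrix: role -> action ids that role may perform.
ROLE_MATRIX = {
    "msp_admin": {"customer.rename", "connection.disable", "runtime.teardown"},
    "msp_viewer": set(),  # default-deny: viewers hold no mutating actions
}


@dataclass(frozen=True)
class User:
    user_id: str
    role: str


def can_user_perform(user: User, action_id: str) -> bool:
    """Return True only if the action is registered and the role grants it."""
    if action_id not in ACTION_REGISTRY:
        return False  # unknown actions are denied, never guessed
    return action_id in ROLE_MATRIX.get(user.role, set())


def entitlements_for(user: User) -> dict:
    """Build the kind of payload an entitlements endpoint might return."""
    return {
        "user_id": user.user_id,
        "actions": {a: can_user_perform(user, a) for a in sorted(ACTION_REGISTRY)},
    }
```

Because both the console and any future agent would read the same entitlements_for output in this sketch, there is no second place where the rules could drift.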

The governed door

Every governed change goes through one door: POST /actions/{id}/invoke. It executes low-risk record and state mutations today with a Stripe-style Idempotency-Key (a retried mutation replays the stored response), Terraform-style plan/apply change-sets for risky actions, Google AIP-151 long-running operations, and an event stream. Medium, high, and destructive actions are refused in code and routed to dedicated confirmation; viewers are denied and audited.
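The idempotent-replay and risk-refusal behavior described above can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the actual endpoint: the GovernedDoor class, its fields, the response shapes, and the action IDs are hypothetical, and a real service would persist stored responses rather than hold them in a dict.

```python
from dataclasses import dataclass, field

LOW_RISK = "low"


@dataclass
class GovernedDoor:
    """Hypothetical in-memory stand-in for POST /actions/{id}/invoke."""

    risk_by_action: dict
    state: dict = field(default_factory=dict)
    responses: dict = field(default_factory=dict)  # (action_id, key) -> stored response
    applied: int = 0

    def invoke(self, action_id: str, idempotency_key: str, payload: dict) -> dict:
        cache_key = (action_id, idempotency_key)
        if cache_key in self.responses:
            # Retried mutation: replay the stored response without re-executing.
            return self.responses[cache_key]

        risk = self.risk_by_action.get(action_id)
        if risk is None:
            response = {"status": "refused", "reason": "unknown_action"}
        elif risk != LOW_RISK:
            # Medium, high, and destructive actions are refused at the door.
            response = {"status": "refused", "reason": "requires_dedicated_confirmation"}
        else:
            self.state.update(payload)
            self.applied += 1
            response = {"status": "applied", "applied_count": self.applied}

        self.responses[cache_key] = response
        return response
```

In this sketch a second call with the same key returns the first response unchanged, even if the retried payload differs, which is the property that makes client retries safe.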

Situational awareness

GET /me/attention, per-entity status, and a per-customer operator view answer three questions from machine truth: "is this OK?", "what needs me?", and "what can I do?". A two-plane tenancy and responsibility model attributes each instance to who runs it, so the same object reads correctly as platform topology, org service health, or personal work.

World-class operator console

Canonical local review URL: http://localhost:3000. Every screen is rebuilt to a calm Focus + Summon architecture (a full-width room plus a summoned detail sheet) and wired to the spine: Customers, Members, Connections, Runtime, Capabilities, Procedures, Fabric, MSP management, and System > Configuration, on a reusable CRUD kit in Apple Liquid Glass. Members onboards operators by email invite with no passwords.

Agent-executed deployment

An OpenClaw instance can be deployed by the agent with its own per-instance Cloudflare tunnel and DNS, verified live in dev/control-plane: it comes up healthy at its own hostname, and teardown is reversible-only (no volume deletion) and removes the tunnel and DNS. This is real deploy capability, not a production customer rollout — PRODUCTION does not exist yet.

Spatial product direction

The command workspace direction is organized around roster, inspector, activity, map context, and plain-English status patterns. The report describes architecture status, not a scripted UI walkthrough.

G6D access gateway inspection

Canonical local review URL: http://localhost:3000. Start or restart with cd /Users/shayne/AI/Projects/Anthropy/Works/apps/web && npm run dev. Sign in with the seeded local/dev MSP admin admin@anthropy.works / changeme-dev-only, then open Access.

Exact UI path: open Access, choose Access gateways, then select a gateway profile. The workspace should show what the gateway is, what host routes depend on it, the latest read-only reachability check, uncertainty, and explicit refused/blocked actions.

Known limitation: this is a read-only inspection flow grounded in existing jumpbox/host access records. It is not a generalized inspector framework, replay engine, simulation product, remediation surface, provider operation, alert delivery path, or OpenClaw deployment path.

G7E/G7F operator console steering

Try deterministic typed commands such as "what can you do", "open access gateways", "close inspector window", "start access gateway setup", "show setup sessions", or "inspect setup session". The console resolves commands through Anthropy registries and routes to registered UI actions and setup-flow APIs only; it is not a chatbot, does not call external services, and does not execute SSH or provider operations.

G9-W1 gateway workflow cognition

Access gateway work now has a clearer lifecycle path: setup draft, profile handoff, guided profile editing, evidence revalidation, validation evidence workspace, operator timeline, and read-only validation evidence.

The review target is whether an operator can understand what gateway is being configured, what changed, what Anthropy checked, what evidence exists, what remains uncertain, and what action stays blocked. This does not enable provider execution, OpenClaw mutation, remediation, resident LM behavior, autonomy, or PRODUCTION.

G6C local review workflow

Canonical local review URL: http://localhost:3000. Start or restart with cd /Users/shayne/AI/Projects/Anthropy/Works/apps/web && npm run dev. If port 3000 is occupied, stop the listener with PID="$(lsof -ti tcp:3000)"; [ -z "$PID" ] || kill $PID. Sign in with the seeded local/dev MSP admin admin@anthropy.works / changeme-dev-only, then open Governance. Fixture command: cd /Users/shayne/AI/Projects/Anthropy/Works && python3 scripts/load-g6c-operational-walkthrough-fixtures.py | python3 -m json.tool.

Exact UI path: open Governance, use Governance mainstage, scroll to G6C walkthrough, then use G6C scenarios. Each scenario should show Operational situation, Evidence available, Expected narrative, Safe-stop reason, Evidence distinctions, and Boundaries held.

Scenario order: Stale credentials after prior successful access; Revoked credentials with escalation ambiguity; Transcript success claim contradicted by current telemetry; Secret-redacted transcript missing causal context; Observe-only imported node that appears to need action; Quarantined transcript requiring blocked delegate-back; Stale governance review invalidated by changed source conditions; Partial evidence with unresolved ambiguity; Remediation path exists but execution authority is unavailable; Tenant-safe evidence boundary where cross-tenant inference must not occur.

Known limitation: G6C is a human walkthrough aid inside Governance Mainstage. It is not a replay engine, simulation product, score, remediation surface, provider operation, alert delivery path, or OpenClaw deployment path.

Access verification model

The model supports redacted SSH credentials, jumpbox and host profiles, read-only discovery, transcripts, and access-verified node candidates. It does not imply installation, host mutation, unrestricted remote command execution, or production trust in local_dev_execution_mode.

Execution routing evidence

The governed worker execution edge is now activated end to end in stub/dry-run: promote-to-worker, operator dispatch with a scoped credential-grant stub, worker job claim, and result relay, driven against the live dev API. Execution jobs record worker identity, route narrative, redacted technical detail, and tenant scope. Live worker-side SSH remains flag-blocked; production dispatch remains future work.

Workflow policy

Workflow policy exists around capability assignment, connection status, connection usage policy, risk rules, and confirmation pauses. Workflow execution must pass those checks before a step is allowed to proceed.

Capability governance

Capability records, packs, permission rules, allowed actions, risk levels, and confirmation requirements are part of the control model. They are governance evidence, not a promise of broad provider execution.

Environment and service model

Environments are business spaces and services are customer-facing capabilities. Service bindings attach workers, runtimes, providers, capabilities, workflows, and assistants underneath as governed implementation topology.

Access dispatch

The API remains the control plane for access. Workers are future execution hands. Local development SSH is visible, governed, read-only, and not presented as production execution.

Worker result relay

worker_stub_execution proves the worker-side access result envelope and API relay validation. Malformed, mismatched, secret-looking, and non-allowlisted results are rejected before becoming access evidence.
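The four rejection classes named above (malformed, mismatched, secret-looking, non-allowlisted) could be checked along these lines. This is a sketch only: the envelope field names, the secret patterns, and the validate_worker_result function are assumptions for illustration, and the real relay schema is not shown in this report. The result types access_test and host_discovery are taken from the Phase 29D description.

```python
import re

# Assumed allowlist and envelope fields; the real schema is not in this report.
ALLOWED_RESULT_TYPES = {"access_test", "host_discovery"}
REQUIRED_FIELDS = {"job_id", "worker_id", "result_type", "summary"}

# Illustrative heuristics for "secret-looking" text, not an exhaustive scanner.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"(?i)\b(password|passwd|secret|token)\s*[=:]\s*\S+"),
]


def validate_worker_result(envelope, expected_job_id, expected_worker_id):
    """Return (accepted, reason); reject before the result becomes evidence."""
    if not isinstance(envelope, dict) or not REQUIRED_FIELDS.issubset(envelope):
        return False, "malformed"
    if envelope["job_id"] != expected_job_id or envelope["worker_id"] != expected_worker_id:
        return False, "mismatched"
    if envelope["result_type"] not in ALLOWED_RESULT_TYPES:
        return False, "not_allowlisted"
    text = str(envelope["summary"])
    if any(pattern.search(text) for pattern in SECRET_PATTERNS):
        return False, "secret_looking"
    return True, "accepted"
```

Checking structure and identity before content means a result from the wrong worker is rejected without its body ever being inspected or stored.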

Secret delivery dry run

CredentialGrant is now a durable control-plane record scoped to one job, worker, credential reference, allowed use, expiration, and tenant scope. worker_ssh_dry_run verifies the shape, records grant consumption, and stops before any network connection opens.
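A grant scoped this tightly can be validated with a single comparison over every scope field. The sketch below assumes a field layout and a consume_grant helper that are hypothetical; only the scoping dimensions (job, worker, credential reference, allowed use, expiration, tenant scope) and the stop-before-network behavior come from the report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CredentialGrant:
    """Hypothetical field layout for a one-job grant record."""

    grant_id: str
    job_id: str
    worker_id: str
    credential_ref: str
    allowed_use: str
    org_id: str
    expires_at: datetime
    consumed: bool = False


def consume_grant(grant, *, job_id, worker_id, credential_ref, use, org_id, now):
    """Validate every scope field, then mark the grant consumed exactly once."""
    if grant.consumed:
        return "rejected:already_consumed"
    if now >= grant.expires_at:
        return "rejected:expired"
    expected = (grant.job_id, grant.worker_id, grant.credential_ref,
                grant.allowed_use, grant.org_id)
    if (job_id, worker_id, credential_ref, use, org_id) != expected:
        return "rejected:scope_mismatch"
    grant.consumed = True
    # Dry run: consumption is recorded and nothing opens a network connection.
    return "consumed:stopped_before_network"


# Illustrative values only.
EXAMPLE_NOW = datetime(2025, 1, 1, tzinfo=timezone.utc)
EXAMPLE_GRANT = CredentialGrant(
    "g1", "j1", "w1", "cred-ref-demo", "ssh_dry_run", "org-1",
    expires_at=EXAMPLE_NOW + timedelta(minutes=5),
)
```

Passing the clock in as now keeps the expiry check deterministic in tests; a production check would read a trusted server clock instead.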

Managed provider adapters

Secret provider selection now has safe adapter configuration for local development, Vault, KMS, managed secret services, external brokers, and stubbed references. Disabled, missing, unsupported, production-local, and not-yet-implemented providers fail clearly without releasing raw secrets.

Secret lease testbed

A controlled test_managed_secret_provider can resolve synthetic in-memory material in dev/test only, capture redaction values and fingerprints, close the lease, and prove the result path rejects leaked synthetic material. It is blocked in production and is not customer secret delivery.

Vault-compatible pilot

The Vault-compatible adapter validates safe provider config, accepts vault://mount/path#field references, resolves one scoped lease in dry-run/test mode, captures redaction values, closes the lease, and rejects leaked resolved material. It does not enable worker-side SSH.
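For illustration, a reference of the form vault://mount/path#field can be split with the standard library alone. This is a sketch, not the adapter's parser: the VaultRef type, the parse_vault_ref name, and the specific rejection rules (no query string, no inline credentials) are assumptions, as is the reading of the URL authority as the mount.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class VaultRef(NamedTuple):
    mount: str
    path: str
    field: str


def parse_vault_ref(ref: str) -> VaultRef:
    """Parse vault://mount/path#field and reject incomplete or unsafe references."""
    parts = urlsplit(ref)
    if parts.scheme != "vault":
        raise ValueError("reference must use the vault:// scheme")
    mount = parts.netloc
    path = parts.path.strip("/")
    field = parts.fragment
    if not (mount and path and field):
        raise ValueError("reference must include mount, path, and field")
    if parts.query or "@" in mount or ":" in mount:
        # Assumed rule: a reference names a secret, it never carries one.
        raise ValueError("reference must not carry a query string or inline credentials")
    return VaultRef(mount, path, field)
```

For example, parse_vault_ref("vault://secret/ssh/demo-host#private_key") yields mount "secret", path "ssh/demo-host", and field "private_key", while a reference without a #field suffix raises ValueError.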

SSH preflight boundary

worker_ssh_preflight checks host-key trust, target route policy, command allowlists, grant and lease evidence, and the live-mode gate. It can explain why execution is blocked, but it stops before opening SSH.
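One way to picture a preflight that explains every block without connecting is a function that collects reasons instead of failing on the first one. The sketch below is an assumption-laden illustration: PreflightInput, evaluate_preflight, the reason strings, and the allowlisted commands are hypothetical and do not describe the actual worker_ssh_preflight implementation.

```python
from dataclasses import dataclass

# Assumed allowlist; the report only says commands are checked against one.
COMMAND_ALLOWLIST = {"uptime", "hostname", "uname -a"}


@dataclass(frozen=True)
class PreflightInput:
    host_key_state: str       # e.g. "verified", "unknown", "changed"
    route_allowed: bool
    command: str
    grant_valid: bool
    lease_valid: bool
    live_mode_requested: bool


def evaluate_preflight(req: PreflightInput) -> dict:
    """Collect every blocking reason; this function never opens a connection."""
    blocked = []
    if req.host_key_state != "verified":
        blocked.append("host_key_" + req.host_key_state)
    if not req.route_allowed:
        blocked.append("route_blocked")
    if req.command not in COMMAND_ALLOWLIST:
        blocked.append("command_not_allowlisted")
    if not req.grant_valid:
        blocked.append("grant_invalid")
    if not req.lease_valid:
        blocked.append("lease_invalid")
    if req.live_mode_requested:
        blocked.append("live_mode_disabled")
    return {"passed": not blocked, "blocked_reasons": blocked, "ssh_opened": False}
```

Returning the full list of reasons, rather than the first failure, is what lets an operator narrative say everything that must change before execution could be considered.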

Host-key trust records

HostKeyTrustRecord persists target fingerprints, review state, changed-fingerprint evidence, audit events, and MSP-only review read models. Verified trust can satisfy preflight; bootstrap, unknown, changed, rejected, and disabled states still block production trust, and live worker SSH remains disabled.
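The trust states listed above can be modeled as a small state holder in which only a human review produces verified trust and any fingerprint change drops it. This is a simplified sketch: the HostKeyTrust class, its transition rules, and the audit strings are assumptions for illustration and are not the HostKeyTrustRecord implementation.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed mapping from review action to resulting state.
REVIEW_RESULT = {"verify": "verified", "reject": "rejected", "disable": "disabled"}


@dataclass
class HostKeyTrust:
    target: str
    fingerprint: Optional[str] = None
    state: str = "unknown"
    audit: list = field(default_factory=list)

    def observe(self, fingerprint: str) -> None:
        """Record an observed key: first sighting bootstraps, a different key flags a change."""
        if self.fingerprint is None:
            self.fingerprint = fingerprint
            self.state = "bootstrap"
            self.audit.append("observed")
        elif fingerprint != self.fingerprint:
            # Keep the previously recorded fingerprint; log the new one as evidence.
            self.state = "changed"
            self.audit.append("changed:" + fingerprint)

    def review(self, decision: str) -> None:
        """Apply a human review decision; unknown decisions are rejected."""
        if decision not in REVIEW_RESULT:
            raise ValueError("unsupported review decision: " + decision)
        self.state = REVIEW_RESULT[decision]
        self.audit.append(self.state)

    def satisfies_preflight(self) -> bool:
        return self.state == "verified"
```

In this sketch a first observation alone never satisfies preflight, which mirrors the statement that bootstrap and unknown states still block trust.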

Transcript transport prep

Worker transcript transport is planned around DB-persisted events as the source of truth, with Redis reserved for future live fanout. R16 adds ordered read models, MSP-gated technical/quarantine access, replay and post-terminal quarantine, and a safe preflight evidence summary without implying production live worker execution.
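The quarantine rules mentioned here (replay, out-of-order, and post-terminal events) can be expressed against a per-job sequence counter. The following is a minimal sketch under stated assumptions: the TranscriptStream class, the sequence-number scheme starting at 1, and the terminal event kinds are hypothetical and stand in for whatever the DB-backed read models actually use.

```python
from dataclasses import dataclass, field

# Assumed event kinds that end a transcript.
TERMINAL_KINDS = {"completed", "failed"}


@dataclass
class TranscriptStream:
    """Hypothetical per-job stream; a real system would persist both lists."""

    accepted: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)
    last_seq: int = 0
    terminal: bool = False

    def ingest(self, seq: int, kind: str, message: str) -> str:
        event = {"seq": seq, "kind": kind, "message": message}
        if self.terminal:
            reason = "post_terminal"
        elif seq <= self.last_seq:
            reason = "replay"
        elif seq != self.last_seq + 1:
            reason = "out_of_order"
        else:
            self.accepted.append(event)
            self.last_seq = seq
            self.terminal = kind in TERMINAL_KINDS
            return "accepted"
        # Quarantined events are kept for review but never join the trusted stream.
        self.quarantined.append({**event, "reason": reason})
        return "quarantined:" + reason
```

Keeping quarantined events alongside a reason, instead of dropping them, preserves the evidence an MSP reviewer would need to decide whether a worker misbehaved.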

Governance surface

G1 through G5 add MSP-facing governance status, queues, read-only Mainstage UI, plain-language narratives, operational intelligence, and escalation/delegation semantics. These surfaces explain what needs attention and why, but they do not acknowledge, retry, remediate, deliver alerts, enable SSH, or deploy OpenClaw.

Live SSH guardrails

Phase R17 makes live worker SSH difficult to enable accidentally: global and environment flags, org/environment allowlists, worker/org/environment emergency denies, runbook acknowledgement, and a final implementation gate all keep live SSH disabled until a future phase explicitly opens it.

Non-production SSH harness

Phase R18 adds worker_ssh_live_nonproduction for tightly controlled local/staging validation. Phase R19 proves the first constrained pilot and transcript soak across ordered events, timeout/disconnect handling, emergency disable, host-key change blocking, provider/lease validation, and replay quarantine. Phase R20 drills reconnect-style polling, partial transcript continuity, rollback, and disable/revoke paths. The harness is disabled by default, blocked in production, limited to a tiny read-only command set, and requires worker, scope, host-key, route, grant, lease, provider, and emergency-disable gates to pass before live-style result evidence is accepted.

Worker identity

ExecutionWorker records now carry registration status, hashed worker secret metadata, explicit assignment fields, last-authenticated and heartbeat timestamps, and disabled/revoked gates. These checks guard transcript relay, result relay, and CredentialGrant use, but they do not enable live worker SSH.

Authentication hardening

API sessions now live in Redis with TTL records and hashed session-token keys. Login and bootstrap agent registration use Redis-backed counters with TTL. These controls improve production shape without changing the product flow.
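Both ideas here, hashed session-token keys and counters with TTL, can be shown without a Redis server. The sketch below is illustrative only: the choice of SHA-256, the "session:" key prefix, and the FixedWindowLimiter class are assumptions, and the in-memory dict stands in for a Redis counter that would normally be incremented and given an expiry.

```python
import hashlib


def session_key(token: str) -> str:
    """Derive the storage key from a hash so the raw token is never used as a key."""
    return "session:" + hashlib.sha256(token.encode("utf-8")).hexdigest()


class FixedWindowLimiter:
    """In-memory stand-in for a Redis-backed counter with a TTL."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self._counters = {}  # key -> (count, expires_at)

    def allow(self, key: str, now: float) -> bool:
        count, expires_at = self._counters.get(key, (0, now + self.window))
        if now >= expires_at:
            # The window elapsed, so the counter is treated as expired.
            count, expires_at = 0, now + self.window
        count += 1
        self._counters[key] = (count, expires_at)
        return count <= self.limit
```

A fixed window is the simplest shape for a TTL-backed counter; it allows short bursts across a window boundary, which is a known trade-off of this approach compared with sliding windows.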

Validation guardrails

GitHub Actions and local scripts now split validation into backend, execution-plane, tenancy/scope, security, migration, frontend, report, Docker, and JSON layers. Secret scanning and migration checks are automated, and Alembic autogenerate drift is now clean enough to fail validation on future unclassified drift.

Worker transcript relay

Worker dry-runs can now emit progress events that Anthropy validates, separates into operator, technical, and audit streams, or quarantines before they become trusted narrative.
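
One way to picture the validate, separate, or quarantine step is the sketch below. It assumes each progress event carries a sequence number and a stream label and that an out-of-order or replayed sequence is enough to quarantine an event; ProgressEvent and relay_transcript are hypothetical names, and the real validation rules are likely broader.

```python
from dataclasses import dataclass

STREAMS = ("operator", "technical", "audit")


@dataclass(frozen=True)
class ProgressEvent:
    # Hypothetical event shape for illustration.
    sequence: int
    stream: str
    message: str


def relay_transcript(events):
    """Split valid events into streams; quarantine everything else."""
    accepted = {stream: [] for stream in STREAMS}
    quarantined = []
    expected = 1  # next sequence number that counts as in order
    for event in events:
        valid = (
            event.stream in STREAMS
            and event.sequence == expected
            and bool(event.message.strip())
        )
        if valid:
            accepted[event.stream].append(event)
            expected += 1
        else:
            quarantined.append(event)
    return accepted, quarantined
```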

Support visibility

MSP support visibility is governed, role-controlled, read-oriented, auditable, and narratable. It is not implicit unrestricted access to customer state.

Runtime boundary

Runtimes and providers are governed implementation surfaces. Anthropy keeps the canonical object graph, policy, workflow truth, memory contract, and narrative record.

Operational recovery

Operational evidence includes stale jobs, interrupted workflows, reviewed failures, health status, backups, restore testing, logs, and validation scripts.

Operating Model

How Anthropy Is Structured

Ownership layers

MSP, organization, and user are interpretation layers. The same object can be platform topology, org service health, or personal work depending on viewer context.

Execution plane

The API approves, records, and routes work. Workers execute only when capability, trust, locality, ownership scope, policy, confirmation, redaction, and audit checks allow it.

Worker topology

Workers advertise capabilities, trust levels, locality, ownership scope, and runtime relationships. Eligibility is explicit before a job can route to a worker.
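
Explicit eligibility can be sketched as a function that lists which advertised attributes fail to match a job. The Worker and Job shapes, the TRUST_RANK ordering, and the sample capability name in the usage note are assumptions made for this example; policy, confirmation, redaction, and audit checks from the execution plane are left out.

```python
from dataclasses import dataclass

# Hypothetical trust ordering; the real trust levels are not listed here.
TRUST_RANK = {"low": 0, "standard": 1, "high": 2}


@dataclass(frozen=True)
class Worker:
    worker_id: str
    capabilities: frozenset
    trust_level: str
    locality: str
    ownership_scopes: frozenset


@dataclass(frozen=True)
class Job:
    capability: str
    min_trust: str
    locality: str
    ownership_scope: str


def eligibility_failures(worker, job):
    """Name each check the worker fails; an empty list means eligible."""
    failures = []
    if job.capability not in worker.capabilities:
        failures.append("capability")
    if TRUST_RANK[worker.trust_level] < TRUST_RANK[job.min_trust]:
        failures.append("trust")
    if worker.locality != job.locality:
        failures.append("locality")
    if job.ownership_scope not in worker.ownership_scopes:
        failures.append("ownership_scope")
    return failures


def eligible_workers(workers, job):
    """Keep only workers with no eligibility failures for this job."""
    return [w for w in workers if not eligibility_failures(w, job)]
```

For example, a worker that advertises the right capability but sits in a different locality with a lower trust level would be excluded with the reasons "trust" and "locality" before any routing happens.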

Capability graph

Capabilities express intent. Provider discovery, missing dependency checks, policy evaluation, execution planning, runtime routing, and narrative output sit underneath.

Runtime portability

Anthropy keeps canonical truth portable. Runtime-local and non-portable state must be exported, normalized, hydrated, validated, or explicitly treated as runtime-bound.

Environment topology

Environments are business operating spaces such as Dispatch, Sales, Field Operations, HR, or Support. They are not infrastructure clusters.

Service topology

Services are organization-facing capabilities such as dispatch automation, document search, monitoring, or assistant surfaces. They connect to workers, runtimes, providers, workflows, and assistants through governed bindings.

Service bindings

Bindings make implementation topology explicit: which worker can support a service, which runtime or provider it uses, which capabilities are exposed, which workflows run, and which assistant surfaces present the work.
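
As a rough illustration of what a binding record makes explicit, the sketch below models one binding as a small immutable record. The ServiceBinding name appears in this report, but the fields and helper functions here are assumptions, not the actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceBinding:
    # Hypothetical fields mirroring the questions a binding answers.
    service: str
    worker_id: str
    runtime_or_provider: str
    capabilities: tuple = ()
    workflows: tuple = ()
    assistant_surfaces: tuple = ()


def bindings_for_worker(bindings, worker_id):
    """List every binding a given worker supports."""
    return [b for b in bindings if b.worker_id == worker_id]


def services_exposing(bindings, capability):
    """List services that expose a capability, sorted for stable output."""
    return sorted({b.service for b in bindings if capability in b.capabilities})
```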

Readiness Boundaries

Current Production Boundaries

Environment and service UI

Environment, service, and binding records are in place as topology scaffolding. Full user-facing workflows and Mainstage surfaces are still planned work.

Worker-side execution

Worker-side dispatch, result relay, SSH dry-run, and transcript relay gates are contracted and guarded. The live SSH executor, production identity, authentication, scoped routing, heartbeat, audit, redaction, and a working Vault/KMS-style secret retrieval implementation remain planned work.

Remote access scope

Access tests run approved read-only discovery only. Host mutation, arbitrary remote commands, and broad infrastructure actions are outside the current demo boundary.

Secret handling

Secret-looking fields are redacted in job results, audit messages, transcripts, and technical details shown to operators. Default Compose no longer mounts host private SSH keys, and audit events now carry org scope when inferable; broad production use still needs a managed Vault/KMS-style retrieval implementation.
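
Redaction of secret-looking fields can be sketched as a recursive pass over a result payload. The key pattern and the placeholder text below are assumptions chosen for illustration; the report does not state which field names Anthropy actually treats as secret-looking.

```python
import re

# Hypothetical pattern for keys that look like they hold secrets.
SECRET_KEY_PATTERN = re.compile(
    r"(password|passwd|secret|token|api[_-]?key|private[_-]?key)",
    re.IGNORECASE,
)
REDACTED = "[REDACTED]"


def redact(value):
    """Return a copy with values under secret-looking keys replaced."""
    if isinstance(value, dict):
        return {
            key: REDACTED if SECRET_KEY_PATTERN.search(str(key)) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value
```

The function builds a new structure instead of mutating its input, so the unredacted payload is never changed in place by the display path.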

Provider coverage

External provider execution remains intentionally narrow until capability assignment, provider policy, account handshakes, confirmation behavior, and audit contracts are production-ready.

Production readiness

Current readiness includes contracts, scaffolding, validation, first-pass production gates, a non-production worker SSH pilot/soak, an operational rollback/reconnect readiness drill, and the first governance queue surface. Customer-facing production rollout remains blocked on reviewed production provider operations, production transcript fanout implementation, scoped dispatch, production runbooks, external alerting/escalation integrations, and rollout approvals.

Next Milestones

Migrate the remaining per-screen operator mutations onto the governed door (add a suspend/resume capability invoker; route connection validate/refresh and runtime lifecycle through the door's confirmation flow).
Add the first real external-provider execution with live OAuth and credential acquisition, so a customer can connect a real SaaS account and have governed work performed in it (today only a single Google Drive read exists).
Surface environment and service topology in the operator UI now that the schemas, CRUD, and dependency/blast-radius edge model exist.
Promote the governed worker execution edge from stub/dry-run toward a reviewed live worker-side executor, behind the existing flags and runbooks, after provider operations and exact scope enforcement are ready.
Build live transcript fanout and external alert delivery (email/Slack/webhook) on top of the record-first alerting and event stream.
Expose environment and service topology through reviewed APIs and Mainstage surfaces.
Rotate and purge previously committed development credentials from git history after explicit human approval.
Remove .env.local from git tracking when the team approves replacing the tracked template with an untracked local file or renamed example path.
Move real local secrets into untracked .env files and keep tracked env files template-only.
Use uploaded/saved SSH credentials for local demos; any legacy default-key import requires an explicit development-only compose override.
Continue hardening exact service, environment, and user-private scope persistence before production worker-side execution.
Review ServiceBinding APIs and decide which binding types become user-facing first.
Do not schedule production worker-side SSH until production provider operations, exact scope enforcement, live fanout implementation, external alerting/escalation, production runbooks, and operator workflow validation are ready.
Replace legacy node-agent registration token usage with production per-worker or per-org expiring registration tokens before self-service worker installation.
Review production provider auth operations, production host-key review UI/operations, live transcript fanout implementation, external alerting/escalation integrations, and fallback runbooks before accepting production live worker SSH results.
Do not promote any testbed provider to production; keep the R12 Vault-compatible pilot behind explicit provider configuration and dry-run gates until production operations are reviewed.
Keep Alembic drift enforcement green as future schema changes land.
Implement host registration only after access verification, worker identity, policy, typed confirmation, audit, rollback, and redaction are satisfied.

Known Limitations

This is a sponsor-reviewable alpha you can actually drive, not an operator-ready beta or a production customer-ready product.
No customer can yet connect a real external SaaS account and have governed work performed in it end to end: external provider execution is still a single Google Drive read, and connection OAuth / live credential acquisition is not built.
The governed door executes low-risk record/state mutations only; medium, high, and destructive actions stay refused behind dedicated confirmation, and live-access low-risk actions are deliberately not wired to the door yet.
Live worker-side SSH remains flag-blocked; the worker execution edge is activated only in stub/dry-run.
Agent-executed OpenClaw deployment is real and reversible in dev/control-plane, but there is no PRODUCTION environment and no customer-facing production rollout.
Operational alerting is record-first; live transcript fanout and real-time streaming do not exist, so everything is poll-then-read.
Resident-LM behavior, bounded-intelligence reasoning, and autonomy are not built; the agent-foundation spine is the substrate those later phases will use.
Environment, service, and binding records now have schemas, CRUD, and a dependency edge model, but their operator UI surfaces are still being built.
Tracked env files are templates now, but previously committed development credentials still require human-approved rotation and git-history purge.
Default local Postgres and Redis bindings are loopback-only as of Phase C1, but Redis requirepass or managed-cache authentication remains a production follow-up.
The legacy default SSH key import path is development-only and requires explicit opt-in mounting; default Compose does not mount host private keys.
Tenant scope persistence, Redis-backed auth foundations, CI validation scaffolding, Alembic drift enforcement, secret-provider contracts, provider adapter configuration, synthetic lease testbed validation, Vault-compatible lease pilot validation, and SSH preflight checks are stronger, but customer-facing production rollout still needs production support workflows.
Migration graph/head/upgrade/autogenerate checks are automated and clean as of Phase R8; future schema drift should fail validation unless explicitly classified.
Worker-side access dispatch, worker_stub_execution, worker_ssh_dry_run, worker transcript relay, durable CredentialGrant records, assignment-scope policy, the worker_ssh_live_nonproduction pilot, the R20 rollback drill, and C1 transcript scope/ingest cleanup are contracted and guarded, but production worker SSH remains disabled.
Phases G1 through G9-W1 add MSP-facing governance status, review memory, operational trust validation, runtime observation, introspection registries, setup-flow sessions, deterministic Operator Console steering, TEST alpha readiness, shared operational alpha readiness, STAGING doctrine, mobile Kitsune stabilization, and gateway workflow cognition. External alert providers, incident escalation workflows, mutation APIs, provider operations, production SSH execution, OpenClaw mutation, resident LM behavior, and production operator polish remain future work.
ExecutionWorker identity is hardened in scaffolding, but legacy node-agent registration still uses a bootstrap token model that is not a production self-service installer.
Access/discovery execution jobs are recorded; local development access still uses local_dev_execution_mode, while Phases R19 and R20 add only a disabled-by-default non-production worker SSH pilot and readiness boundary.
Credential storage is suitable for local development, Phase R11 proves synthetic managed-secret lease retrieval in a controlled testbed, and Phase R12 proves a Vault-compatible dry-run lease pilot, but broad production use still needs reviewed provider auth and worker-side SSH operations.
Host-key trust is durable, auditable, and available through MSP-only review APIs as of Phase R15; Phase R16 adds safe preflight summaries; Phase R17 adds live SSH feature-flag/runbook guardrails; Phase R19 proves verified trust and changed/unknown/rejected/disabled blocking in the non-production pilot; Phase R20 proves invalidation during rollback, but production host-key review UI polish and operating runbooks remain future work.
Runtime/provider bindings and memory portability contracts are documented but not yet full runtime behavior.
External provider coverage is intentionally narrow.
Host mutation, arbitrary remote commands, and broad provider execution are not current demo actions.
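
The governed-door limitation listed above (low-risk record/state mutations execute, while medium, high, destructive, and live-access actions are refused) can be sketched as a small decision function. Risk, door_decision, and the returned strings are hypothetical and only mirror the stated boundary.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    DESTRUCTIVE = "destructive"


def door_decision(risk, uses_live_access):
    """Decide whether the door executes an action or refuses it."""
    if risk is not Risk.LOW:
        return "refused: requires dedicated confirmation"
    if uses_live_access:
        return "refused: live-access actions are not wired to the door"
    return "execute"
```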

Complete Build Overview

The Five Build Layers

These layers explain the platform in plain terms. The lower layers make the upper layers safe: Anthropy first establishes rules, trust, evidence, and rollback; then it can add operational surfaces, orchestration intelligence, autonomous behavior, and customer-facing product workflows.

Layer 1: Complete (through R27)

Constitutional Layer

The constitutional layer is the rulebook underneath the platform. It defines who can act, what they can touch, how approval works, how evidence is recorded, and how the system stops safely when something is not allowed.

Execution governance: The control plane decides whether work is allowed before anything executes.
Trust: Workers, users, scopes, providers, and routes must prove they are eligible.
Tenancy: Customer, organization, environment, service, and user boundaries are preserved.
Grants: Temporary permission records say exactly what a worker may use for one job.
Leases: Short-lived secret access proves a credential can be used without exposing it.
Transcript law: Operational evidence is ordered, redacted, scoped, and quarantined when unsafe.
Rollback: The system can disable, revoke, invalidate, and explain recovery paths.
Feature gates: Risky behavior stays disabled until explicit flags and reviews allow it.
Provider abstraction: Anthropy can use different secret or runtime providers without becoming locked to one.
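
The grant idea in the list above, a temporary permission record that says what a worker may use for one job, might be modeled as below. CredentialGrant is named in this report, but the fields and the grant_allows check are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class CredentialGrant:
    # Hypothetical fields; credential_ref is an opaque reference,
    # never the secret value itself.
    worker_id: str
    job_id: str
    credential_ref: str
    expires_at: datetime
    revoked: bool = False


def grant_allows(grant, worker_id, job_id, now):
    """A grant covers exactly one worker and one job until it expires."""
    return (
        not grant.revoked
        and grant.worker_id == worker_id
        and grant.job_id == job_id
        and now < grant.expires_at
    )
```

Passing the current time in as an argument keeps the check deterministic, which makes expiry behavior straightforward to validate.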

Layer 2: Largely built

Operational Layer

The operational layer turns safety rules into day-to-day operating surfaces. It gives humans queues, alerts, review paths, escalation, and plain-language visibility into what needs attention.

Governance surfaces: Pages and APIs where operators see what is blocked, risky, or awaiting review.
Operational queues: Worklists for items such as host-key review, worker readiness, or failed execution.
Alerts: Signals that something important changed and needs attention.
Escalation: Clear routing for decisions that require a higher-trust human or team.
Review systems: Structured approval and rejection flows with audit history.
Operational narratives: Human-readable where, why, status, risk, and next-action explanations.
Execution visibility: A safe view into jobs, transcripts, worker state, and blocked routes.

Layer 3: Partially scaffolded

Orchestration Intelligence

The orchestration intelligence layer lets Anthropy reason about work instead of simply running actions. It maps intent to capabilities, providers, runtimes, services, policies, and remediation plans.

Capability graphs: A map of what the platform can do and what conditions are required.
Provider selection: Choosing the right provider or adapter for a job without hardcoding one path.
Workflow reasoning: Understanding steps, permissions, dependencies, confirmations, and outcomes.
Runtime routing: Sending work to the right runtime or worker only when it is eligible.
Remediation planning: Turning a problem into safe diagnostic and recovery options.
Environment/service orchestration: Managing business environments and services rather than raw machines alone.

Layer 4: Mostly future

Autonomous Operations

The autonomous operations layer is where Anthropy can begin taking controlled action on its own. This only becomes safe after the lower layers define trust, approval, evidence, rollback, and operational review.

Self-healing: Detecting and correcting known failure states without waiting for manual diagnosis.
Autonomous remediation: Carrying out approved recovery plans inside governed boundaries.
Safe retries: Retrying work only when policy, risk, and state make retry appropriate.
Adaptive troubleshooting: Changing diagnostic approach based on observed evidence.
Operational optimization: Improving reliability, routing, cost, and response time over repeated operations.
AI strategic orchestration: Using AI to coordinate larger operational goals across systems and teams.

Layer 5: Early

Productization

The productization layer turns the platform into a guided customer experience. It packages governance, orchestration, and operations into onboarding, setup, assistant experiences, and usable day-to-day workflows.

Onboarding: Helping new MSPs, organizations, users, workers, and environments enter safely.
Billing: Packaging usage, plans, and commercial boundaries around the platform.
Setup flows: Guided paths for configuring environments, services, credentials, and workers.
Guided operations: Operator workflows that explain what to do next and why.
Governance UX: Clear review, approval, risk, and audit experiences for humans.
Environment/service UX: Customer-facing views organized around business spaces and useful services.
Assistant experiences: User-facing AI surfaces that help people work through governed actions.

Why The R Arc Mattered

Governance Before Autonomy

The R-series remediation work built the constitutional layer that later operating layers can inherit. Without it, advanced automation would eventually turn into unsafe execution, hidden activity, runtime lock-in, unclear ownership, and operational confusion.

With it, Anthropy can build operational queues, intelligence, autonomous remediation, and productized assistant experiences on top of shared governance instead of inventing separate rules for every feature.

Durable Planning Outlook

How Long To Complete Anthropy Works From Now?

This section is intentionally stable. Phase numbers and build status will change, but these horizons describe the durable path from the current architecture into a sponsor-reviewable alpha, a real SMB production platform, and the larger AI operating-system vision.

Coherent enough to inspect, not production-ready

Sponsor-Reviewable Alpha

~600 focused hours

This is the next practical destination: a coherent alpha that sponsors and internal operators can inspect end to end, with enough guided flows to understand the operating-system direction without pretending real customer production readiness is finished.

Includes

environments/services
governed workflows
assistants
providers
operational governance surfaces
onboarding
non-production execution
coherent UX pass
AI narratives
useful orchestration

Depends on

aggressive AI-assisted development
Codex/Claude acceleration
reuse of existing substrate
no giant rewrites

Actual customers using it operationally

Real SMB Production Platform v1

~3,000 hours

This is the first customer-operational version: reliable enough to support real SMB use, with onboarding, billing, production execution, support workflows, and enough observability to operate responsibly.

Includes

onboarding
billing
OAuth provider onboarding
operational UX
support tooling
hosted environments
production worker execution
reliability hardening
SaaS flows
observability
operational support

Depends on

polish requirements
integration depth
provider count
operational tooling scope
deployment model

The truly massive vision

AI Operating System Category Platform

~5 years

This is the category-defining platform: Anthropy as an intelligence and operations layer across business systems, providers, runtimes, agents, governance, remediation, and business process orchestration.

Includes

autonomous remediation
orchestration intelligence
provider graph reasoning
runtime routing
AI operations layer
enterprise governance
multi-runtime substrate
large provider ecosystem
agent ecosystems
operational intelligence
business process operating layer

Depends on

enterprise-grade governance
broad ecosystem depth
multi-runtime maturity
large-scale operational learning
sustained product and platform investment