Anthropy Works

Progress and Architecture Status

Anthropy Works is an AI-native MSP operations product and runtime platform. It coordinates managed environments, services, trusted workers, runtime evidence, workflows, policy, memory, and narrated operator activity across distributed business systems.

Development Chronology

Full Build Chronology

The detailed phase log below still carries the build-by-build evidence. This section shows the F foundation phases that had to come before R, the larger product arc now underway, the effort ranges, and the current phase's completion percentage, which should move every time the build status is updated.

These hour ranges are live planning estimates, not promises. The report must always show the clearest current range, even when the range gets worse, and must change whenever scope, evidence, validation, blockers, or production risk changes.

F1

Foundation: Control Plane Genesis

Complete

This foundation phase created the first working Anthropy control plane: app shell, login, organizations, audit history, nodes, agent check-in, Docker inventory, OpenClaw discovery, and safe deployment planning. It had to come first because Anthropy needed a real place to see machines, queue work, require approval, and route operations through an agent before higher-level orchestration could mean anything.

The game changer is that Anthropy started with the minimum operating substrate in place: identity, visibility, jobs, safety, and agent-run execution instead of direct infrastructure commands.

Human hours
250-450
AI agent hours
10-24
Estimate basis
Anchored to Phase 0-13 repo history: 15 commits, 64 files, about 10,000 added lines, covering the first app, auth, nodes, agents, Docker/OpenClaw inventory, jobs, safety, and deployment validation.
F2

Foundation: Workflow and Capability Engine

Complete

This foundation phase added the capability catalog, connection records, workflow definitions, workflow rules, permission controls, and the first governed external action. It had to come next because once Anthropy could see and route work, it needed a reusable way to describe what work is allowed, what it depends on, and when humans must approve it.

The game changer is that Anthropy gained a reusable policy-driven work model, so future integrations can plug into shared governance instead of becoming isolated automation hacks.

Human hours
180-320
AI agent hours
8-18
Estimate basis
Anchored to Phase 14-18.6 repo history: 8 commits, 34 files, about 6,900 added lines, covering capabilities, connections, workflows, policy enforcement, permission UI, external action execution, and regression hardening.
F3

Foundation: Operator Product and Access Layer

Complete

This foundation phase explored the operator product shape and access layer: workflow approvals, reliability recovery, tenancy clarity, browser tests, production-readiness gates, deployment/backup/observability runbooks, command-center UI, user assignments, OpenClaw versioning, and access evidence. It had to come before deeper execution architecture because the platform needed product and access evidence before it could define what safe execution should eventually support.

The game changer is that Anthropy proved ingredients for a future operator experience, but this was still foundation/prototype evidence, not a finished workflow ordinary MSP operators could rely on.

Human hours
600-1,000
AI agent hours
35-80
Estimate basis
Anchored to Phase 19-25.3 repo history: 36 commits, 76 files, about 25,900 added lines, covering approvals, recovery, production readiness, Cloudflare reporting, deploy/backup/observability, tenant segmentation, command-center UX, OpenClaw lifecycle planning, and SSH access workflows.
F4

Foundation: Execution Plane Architecture

Complete

This foundation phase established the deeper operating-system architecture: platform interpretation, architecture canon, execution workers, execution jobs, worker capability and trust topology, environments, services, bindings, access dispatch, worker result relay, scoped secret delivery, transcript relay, and worker assignment scope. It had to come right before R because the platform needed clear execution contracts before the constitutional safety hardening could lock those contracts down.

The game changer is that Anthropy became structurally ready for distributed execution: not by enabling dangerous production work, but by defining the contracts that make production work governable later.

Human hours
300-550
AI agent hours
16-36
Estimate basis
Anchored to Phase 26-29H repo history: 12 commits after the access-layer arc, 47 files, about 9,700 added lines, covering the interpretation contract, architecture canon, execution-worker model, worker topology, environment/service topology, service bindings, access dispatch, dry-run grants, transcript relay, and assignment/fallback policy.
R

Constitutional Law

Complete

This phase built the platform's rulebook: who can act, what they can touch, what must be approved, and what evidence has to be kept. It turned dangerous infrastructure work into governed jobs with audit trails, redaction, worker identity, credential grants, rollback paths, and hard stops before risky action.

The game changer is that Anthropy can now grow without every new feature inventing its own safety rules; the law is shared, enforceable, and already underneath the work.

Human hours
340-560
AI agent hours
14-30
Estimate basis
Anchored to the completed R1-R27 arc: the original R1-R20 hardening (21 commits, ~10,700 added lines, migrations, CI, runbooks) plus a 7-commit R21-R27 audit-remediation pass that closed the doc-vs-code execution-edge gaps an independent reconciliation audit found — governed worker execution edge activated, deploy-path safety invariant restored, fail-safe RBAC, host-key observation producer, dependency/blast-radius edge model, ServiceBinding dispatch, and record-first alerting.
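
As a reader's tally only, the effort ranges quoted for the five completed phases (F1 through R) can be summed endpoint by endpoint. The report itself does not state a combined total, so the dictionary name, the function name, and the resulting figures below are an illustration derived from the per-phase ranges above, not an official number.

```python
# Ranges copied from the F1, F2, F3, F4, and R phase cards above.
# Summing low ends and high ends separately gives outer bounds only.
COMPLETED_PHASES = {
    "F1": {"human": (250, 450), "agent": (10, 24)},
    "F2": {"human": (180, 320), "agent": (8, 18)},
    "F3": {"human": (600, 1000), "agent": (35, 80)},
    "F4": {"human": (300, 550), "agent": (16, 36)},
    "R": {"human": (340, 560), "agent": (14, 30)},
}


def total_range(kind: str) -> tuple:
    """Return (low, high) hours across all completed phases for one kind."""
    low = sum(phase[kind][0] for phase in COMPLETED_PHASES.values())
    high = sum(phase[kind][1] for phase in COMPLETED_PHASES.values())
    return low, high


print(total_range("human"))  # (1670, 2880)
print(total_range("agent"))  # (83, 188)
```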
G

Operational Consciousness

90% done

This phase teaches Anthropy to notice what is happening operationally and explain it in human terms. It adds queues, health signals, review surfaces, blocked-work visibility, emergency-disable awareness, and the first operator view of what needs attention.

The game changer is that the platform is moving from simply having safety controls to being able to show people where risk, attention, and readiness actually live.

Human hours
550-1,000
AI agent hours
60-140
Estimate basis
Updated after the completed G1-G9-W1 governance and console arc and the major work that followed it: the agent-foundation spine (the entitlement keystone, the governed door with idempotency / plan-apply / AIP-151 operations / events, agent tool descriptors, and action preview), situational-awareness read models (attention, entity status, per-customer operator view), a two-plane tenancy and responsibility model, and a world-class operator console rebuilt to a calm Focus + Summon architecture and wired to that spine. The console covers Customers, Members with email-invite onboarding, Connections, Capabilities, MSP management, and System > Configuration, all on a reusable CRUD kit. Agent-executed OpenClaw deployment is real and verified live in dev/control-plane with reversible-only teardown, and the worker execution edge is activated in stub/dry-run. The full backend suite is green at 879 tests. G is held below complete because alert delivery, live transcript fanout, real external-provider execution with OAuth, resident-LM behavior, live worker SSH, and PRODUCTION remain out of scope.

G6 - Governance OS Trust + Operational Object Foundation

  • Governance OS stabilization and durable review memory
  • Deterministic operational trust walkthroughs for ambiguity, safe stops, and escalation clarity
  • Operator cognition doctrine: concrete operational narration over raw governance semantics
  • Initial runtime object model for gateways, nodes, and jobs
  • First read-only access gateway inspection flow
  • Current status: complete through G6D

G7 - Operator Console + System Introspection

  • System introspection registries (backend actions, operational objects, capability lifecycle, setup flows)
  • UI page/panel/action registries (what exists and what is safe)
  • Deterministic Operator Console shell (typed commands → registered actions)
  • Refusal behavior for out-of-scope questions (not a general chatbot)
  • Console-guided steering of existing UI state (navigation + panels)
  • Console can start persisted setup-flow sessions (no provider calls, no SSH, no mutation)
  • Introspection-aware command grammar resolves registered pages, panels, setup flows, fields, and session commands
  • Console-guided setup sessions can list, inspect, resume, update non-secret fields, and advance persisted sessions without owning canonical state
  • TEST internal-alpha environment readiness covers local/LAN access, Redis sessions, Postgres setup-session persistence, multi-user sanity, isolation, restart/recovery discipline, and walkthrough readiness
  • Current status: complete through G7G
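
The deterministic console behavior listed above (typed commands resolved to registered actions, refusal for anything else) can be pictured with a minimal sketch. Everything named here is an assumption for illustration: the registry contents, the route strings, and the function names are hypothetical and are not taken from the Anthropy codebase. Only the access_gateway_setup flow name appears in this report.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ConsoleResult:
    handled: bool
    message: str


# Hypothetical registry: normalized typed command -> handler.
# Handlers only describe UI steering; none calls a provider or opens SSH.
REGISTERED_ACTIONS: Dict[str, Callable[[], str]] = {
    "open governance": lambda: "navigate:/governance",
    "show gateways": lambda: "panel:gateways",
    "start access_gateway_setup": lambda: "setup-session:access_gateway_setup",
}


def run_console_command(text: str) -> ConsoleResult:
    """Resolve a typed command against the registry, or refuse it."""
    key = " ".join(text.lower().split())  # collapse case and whitespace
    handler = REGISTERED_ACTIONS.get(key)
    if handler is None:
        # Anything unregistered is refused rather than guessed at.
        return ConsoleResult(False, "Out of scope: not a registered console action.")
    return ConsoleResult(True, handler())
```

The design point is that the lookup is a plain dictionary hit: there is no model call and no free-form interpretation, so an unknown phrase can only produce a refusal.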

G8 - Shared Operational Alpha + STAGING Readiness

  • Shared operational alpha readiness inventory, isolation tests, readiness UI/runbook, and operator walkthrough
  • Kitsune and Operator Console stabilization for alpha walkthrough use
  • First walkthrough blocker pass: console gateway setup no longer opens the real Connect gateway modal
  • TEST remains local/LAN development and validation
  • STAGING is the hosted trusted-tester environment at staging.anthropy.works
  • STAGING UI now forces an explicit environment label (banner + required badge) so testers cannot confuse it with TEST or future PRODUCTION
  • Mobile Kitsune cockpit is reset and stabilized for phones without breaking desktop panel semantics
  • PRODUCTION does not exist yet
  • Current status: complete through G8-H1

G9 - Workflow Cognition + Gateway Lifecycle

  • Workflow Cognition Architecture shifts Kitsune from panel polish toward object lifecycle clarity
  • Access gateway lifecycle spine: identify, configure, validate, monitor, investigate, control, retire
  • Gateway setup sessions behave as visible setup drafts before profile creation or validation
  • Gateway profile handoff makes the setup-to-object transition more explicit
  • Guided gateway profile editing separates cosmetic edits from validation-sensitive changes
  • Evidence revalidation, validation workspace, operator timeline, and read-only validation evidence make gateway trust easier to inspect
  • Current status: G9-W1 complete

Agent Foundation - The Spine + The Governed Door

  • Entitlement keystone: capability/action registry, can_user_perform, GET /me/entitlements, and a role matrix answer, in one place, what a user may actually do
  • One governed door (POST /actions/{id}/invoke) executes low-risk record/state mutations; medium, high, and destructive actions stay refused behind dedicated confirmation
  • Industry patterns built in: Stripe-style idempotency keys, Terraform-style plan/apply change-sets, Google AIP-151 long-running operations, and an event stream on door mutations
  • Agent tool descriptors (GET /agent/tools) and standardized action preview (GET /actions/{id}/preview) make the same operations legible to a future agent
  • Situational awareness: GET /me/attention, per-entity status, and a per-customer operator view answer three questions: is this OK, what needs me, and what can I do
  • Two-plane tenancy and responsibility: explicit Node.hardware_owner attribution and an org-user functional-health surface
  • Current status: complete and green (879 backend tests)
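
The governed-door behavior described above (low-risk actions execute, higher-risk actions are refused, repeated idempotency keys return the original result, and each mutation emits an event) might look roughly like the following in-memory sketch. The action ids, risk table, class name, and result fields are hypothetical. The real endpoint is POST /actions/{id}/invoke, whose request and response shapes are not given in this report.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical action registry: action id -> risk level.
ACTION_RISK = {"customer.rename": "low", "host.decommission": "destructive"}


@dataclass
class GovernedDoor:
    seen: Dict[Tuple[str, str], dict] = field(default_factory=dict)
    events: List[dict] = field(default_factory=list)

    def invoke(self, action_id: str, idempotency_key: str, params: dict) -> dict:
        risk = ACTION_RISK.get(action_id)
        if risk is None:
            return {"status": "refused", "reason": "unknown action"}
        if risk != "low":
            # Medium, high, and destructive actions never execute here.
            return {"status": "refused",
                    "reason": f"{risk} risk requires dedicated confirmation"}
        cache_key = (action_id, idempotency_key)
        if cache_key in self.seen:
            # A retry with the same key replays the stored result: no new
            # mutation and no second event.
            return self.seen[cache_key]
        result = {"status": "applied",
                  "operation_id": str(uuid.uuid4()),
                  "params": dict(params)}
        self.seen[cache_key] = result
        self.events.append({"action": action_id,
                            "operation_id": result["operation_id"]})
        return result
```

Calling invoke twice with the same action id and key yields one stored result and one event, which is the property that makes client retries safe.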

Execution Edge - Activated and Made Safe (R21-R27)

  • Governed worker execution edge activated end to end in stub/dry-run: promote-to-worker, dispatch with a scoped credential-grant stub, worker job claim, and result relay
  • Deploy-path safety invariant restored: reversible-only teardown (no rm -rf, no -v); control-push defers to the agent once a host has a live agent
  • Agent-executed OpenClaw deployment with a per-instance Cloudflare tunnel and DNS, verified live in dev/control-plane
  • Fail-safe RBAC (first-class viewer default-deny), host-key trust-on-first-use producer, a dependency/blast-radius edge model, ServiceBinding-scoped dispatch, and record-first alerting
  • Live worker SSH remains flag-blocked; no medium+ door execution, no real external-provider execution, no PRODUCTION
  • Current status: complete
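
The reversible-only teardown rule above (no rm -rf, no -v) could be enforced with a small command check like the sketch below. This is an assumption-heavy illustration: the function name is hypothetical, and reading "-v" as a volume-removing flag (as in docker compose down -v) is an interpretation, not something this report states. A real invariant would more likely be an allowlist than the denylist shown here.

```python
import shlex


def is_reversible_teardown(command: str) -> bool:
    """Return False for teardown commands that destroy data irreversibly."""
    tokens = shlex.split(command)
    if not tokens:
        return False  # nothing to run: do not approve an empty command
    # Collect single-dash flag letters, e.g. "-rf" or "-r -f" -> "rf".
    short_flags = "".join(
        t[1:] for t in tokens if t.startswith("-") and not t.startswith("--")
    )
    long_flags = {t for t in tokens if t.startswith("--")}
    recursive = "r" in short_flags.lower() or "--recursive" in long_flags
    force = "f" in short_flags or "--force" in long_flags
    if "rm" in tokens and recursive and force:
        return False  # the "no rm -rf" half of the rule
    if "-v" in tokens or "--volumes" in long_flags:
        return False  # assumed meaning of "no -v": never drop volumes
    return True
```

For example, a plain "docker compose down" passes, while the same command with -v appended, or any rm -rf, is rejected.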

Operator Console - World-Class Surface Wired to the Spine

  • Every screen rebuilt to a calm Focus + Summon architecture (full-width room + summoned detail sheet; the old 3-pane surface retired app-wide)
  • Entitlements decide what is shown; create/edit/deactivate route through the governed door
  • Reusable CRUD kit across Customers/Orgs, Members/Users, Connections, Capabilities, and MSP-overlord management, each with an object-graph Related lens
  • Members onboards operators by email invite / magic link with no operator passwords
  • System > Configuration: Security, Notifications, Branding, Integrations & API, and AI keys / model routing, per-MSP with secrets encrypted at rest
  • Apple Liquid Glass with three appearance modes; universal Apple-Mail list rows and one universal + add affordance
  • Current status: active; remaining per-screen mutations are still being migrated onto the door

G10 - Resident LM Integration Candidate

  • Anthropy Works-only knowledge boundary
  • Retrieval from introspection registries
  • No external/general knowledge answers
  • Tenant/user/org isolation
  • Capability-aware responses
  • UI-native action orchestration through deterministic registries only
  • Candidate future direction only; not active until explicitly approved

G11 - Multi-Agent / Autonomous Ops Candidate

  • Only after deterministic console, setup flows, staging review, and future LM boundaries are stable
  • No direct LM-to-runtime authority
  • All action through registries, workflows, jobs, risk gates
  • Candidate future direction only; no autonomy is enabled
A

Bounded Intelligence

Substrate ready

This phase gives Anthropy useful reasoning inside clear limits: understanding intent, choosing safe next steps, and explaining options without pretending it can do everything. It is where assistants become more than screens and start helping operators think through governed work.

The game changer is intelligence that is boxed in by the constitutional layer, so AI can help plan and decide without becoming an ungoverned executor.

Human hours
400-800
AI agent hours
30-80
Estimate basis
Total range is for the intelligence itself and is unchanged. The enabling substrate is already built and green: a 20-commit agent-foundation spine (entitlements, agent tool descriptors, action preview, the governed door with plan/apply and idempotency, situational-awareness read models), a 269-line deterministic Operator Console, and AI provider-key + model-routing config (the ModelRoute model + the System > Configuration AI page). That substrate de-risks the boxed-in / policy-aware portion but is not the reasoning. Intent interpretation, bounded planning, decision explanations, evaluation loops, and AI safety-regression coverage are not started — there are zero model calls in the codebase today. Roughly a tenth of the phase is de-risked by substrate; the intelligence is 0% built, by design.

Built - the rails an LM will ride (reused from R/G)

  • Entitlements, agent tool descriptors (/agent/tools), and action preview let an agent read exactly what it may do and what an action would change
  • The governed door (plan/apply + idempotency) means any agent-proposed change runs through the same constitutional gate as a human
  • Situational-awareness read models (attention, entity status, operator view) give an agent honest machine truth to reason over
  • The deterministic Operator Console (typed command to registered action, scope refusal) is the precursor surface - same rails, no model
  • AI provider-key storage and model-routing config exist (ModelRoute, primary/fallback), consumed by nothing yet

Not started - the intelligence itself

  • No LLM/model calls anywhere in the codebase; no intent interpretation by a model
  • No bounded planning, no reasoned decision explanations, no option weighing
  • No evaluation loops and no AI-behavior safety-regression coverage

Next steps

  • Wire a resident LM (the G10 candidate) that reads entitlements/tools/preview and proposes plans only - never LM-to-runtime
  • Constrain it to Anthropy registries (no general knowledge) with tenant/user/org isolation
  • Route every proposed action back through the governed door (plan/apply + confirmation) so reasoning can never bypass the law
  • Stand up an evaluation harness and an AI safety-regression suite before any medium-or-higher action is ever proposed
P

Scalable Productization

Getting started

This phase turns the platform into something customers can actually adopt, operate, and pay for. It adds onboarding, setup flows, billing shape, support workflows, polished governance UX, and repeatable day-to-day product paths.

The game changer is that the deep operating-system substrate becomes a usable product instead of a powerful internal architecture project.

Human hours
800-1,600
AI agent hours
80-180
Estimate basis
Total range is unchanged, but the phase is no longer "Not started": a real first product slice is built and green. Done: email-invite / magic-link onboarding (a 149-line mailer + an accept-invite page + 11 tests, no operator passwords), persisted setup-flow sessions and guided schema-first add wizards, a five-page System > Configuration surface (a 701-line tested router + migration, per-MSP, secrets encrypted), a reusable 854-line CRUD kit, the world-class Focus + Summon operator UX, and billing shape (org plan_tier / billing_status fields). The productization-defining bulk remains: an actual billing and payments engine, support tooling, hosted production operations, packaging, reliability hardening, and an end-to-end customer path (blocked on the same provider-OAuth gap as A and E). The first slice covers roughly 15-20% of the phase; the remainder is the heavier part.

Built - the first product slice

  • Email-invite / magic-link onboarding: create-company plus a public accept-invite page, no operator passwords (mailer + 11 tests)
  • Persisted setup-flow sessions and guided schema-first add wizards (the /hosts/new pattern, per-step validation)
  • System > Configuration: five pages (Security & Access, Notifications, Branding, Integrations & API, AI), per-MSP, secrets encrypted at rest (701-line tested router + migration)
  • A reusable CRUD kit (854 lines) and the world-class Focus + Summon operator UX in Apple Liquid Glass
  • Billing shape: org plan_tier / billing_ref / billing_status fields (Stripe-Connect-patterned) - fields only, no engine

Not started - the productization core

  • No billing or payments engine, checkout, metering, or invoicing - only the data fields exist
  • No support tooling or help desk of any kind
  • No hosted production operations, packaging, documentation, or reliability hardening
  • No end-to-end customer path: a customer still cannot connect a real account and have governed work performed (blocked on provider execution + OAuth)

Next steps

  • Stand up a billing engine on the existing fields: plan tiers to metering to a real payment provider
  • Add the first real provider OAuth + execution so onboarding actually leads to work being done (the shared unlock with A and E)
  • Build minimal operator support and audit-trail tooling
  • Hosted operations: backups, monitoring, and runbooks for a real STAGING-to-first-customer path
E

Ecosystem Emergence

Not started

This phase opens Anthropy into a larger ecosystem of providers, runtimes, agents, customer services, and partner workflows. It is where the platform can coordinate across many outside systems while keeping ownership, policy, memory, and evidence portable.

The game changer is that Anthropy becomes a category platform: not one tool for one workflow, but a governed operating layer that other tools and teams can safely build around.

Human hours
2,000-5,000+
AI agent hours
200-600+
Estimate basis
Estimated long-horizon range for provider ecosystems, runtime portability at scale, partner workflows, enterprise governance, multi-agent operations, and platform-category maturity.

Ground Truth

How To Read This Build Honestly

Anthropy Works has been built in the right order for a high-risk AI operations platform: foundation first, execution contracts next, constitutional safety after that, and now operational visibility. That order matters because infrastructure automation cannot jump safely from a demo into customer trust without identity, scope, jobs, grants, audit, transcripts, rollback, and human review.

Environment Doctrine

TEST means local machine or trusted LAN development. STAGING means the hosted online trusted-tester environment at staging.anthropy.works with rollback, backup, access-control, and isolation discipline. PRODUCTION does not exist yet and must be created only in a future dedicated phase.
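
One way to make this doctrine hard to violate in code is to model only the environments that exist, so that any reference to PRODUCTION fails loudly. The sketch below is hypothetical: the enum, the banner strings, and the function name are illustrative and do not come from the Anthropy codebase.

```python
from enum import Enum


class Environment(Enum):
    # Only environments that exist today are modeled; PRODUCTION is
    # deliberately absent because it has not been created.
    TEST = "test"
    STAGING = "staging"


BANNERS = {
    Environment.TEST: "TEST: local machine or trusted LAN development",
    Environment.STAGING: "STAGING: hosted trusted-tester environment (staging.anthropy.works)",
}


def environment_banner(name: str) -> str:
    """Return the explicit label for a known environment, or raise."""
    try:
        env = Environment(name.lower())
    except ValueError:
        raise ValueError(f"Unknown or nonexistent environment: {name!r}") from None
    return BANNERS[env]
```

With this shape, environment_banner("production") raises ValueError instead of quietly falling back to another environment's label.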

Actually Usable Today

There is now a real, drivable operator console over a complete governed backend. An MSP admin can sign in, manage MSPs, customers, and staff (onboarded by email invite with no passwords), see operational health and what needs attention, configure the system, and perform low-risk changes through one audited door. What is still missing for real customer use is connecting real external accounts and having governed work performed in them.

What Is Strong

The platform has unusually strong safety sequencing for its stage. Only low-risk changes flow through one governed, audited door; risky and production execution stay blocked while the system builds the law, evidence, governance narratives, operational intelligence, and escalation semantics needed before autonomy can be trusted.

What Is Not Proven Yet

Live worker SSH, real external-provider execution with live OAuth and credential acquisition, live transcript fanout, external alert delivery, customer onboarding, billing, resident-LM behavior, and a PRODUCTION environment are not done. Agent-executed OpenClaw deployment is real in dev/control-plane but is not a production customer rollout. This report should not be read as a claim of SaaS readiness.

What Must Be Done Well

Leadership and builders must keep scope disciplined, validate every phase exit, update estimates when evidence changes, and resist turning future capability into current claims. Optimism cannot outrun validation.

How Confidence Should Feel

Sponsors should be confident that the architecture is progressing in the right order, but not assume the remaining phases are automatic. The right stance is grounded confidence: real progress, live estimates, and explicit uncertainty where production risk still exists.

Production Target

The destination is a system real customers can be placed on without heroic support: ordinary MSP operators should be able to onboard, understand, approve, recover, and operate it without depending on the builder. That is not the current state.

When It Gets Real

The report must name maturity plainly: evidence, prototype, internal tool, sponsor-reviewable alpha, operator-ready beta, or production customer-ready. It only gets called production customer-ready when ordinary MSP operators can run the workflow for real customers with documented support, recovery, and rollback.

Current Status

AI-Native Orchestration Operating System

Anthropy Works coordinates environments, services, workers, runtimes, providers, workflows, policy, memory, and narrated activity across distributed business systems. The current build pairs an agent-foundation spine — one entitlement keystone that decides what a user or agent may do, and one governed door that executes low-risk mutations with idempotency, plan/apply change-sets, long-running operations, and an event stream — with a world-class operator console rebuilt on top of it. It is no longer read-only: it is a coherent, drivable sponsor-reviewable alpha. It is not yet operator-ready beta or PRODUCTION: live worker SSH stays blocked, medium and higher-risk actions stay behind confirmation, external provider execution is a single demo read with no live OAuth, and PRODUCTION does not exist.

What Is Current

  • The agent-foundation spine answers, in one place, what a user or agent may do: a capability/action registry, can_user_perform, the GET /me/entitlements keystone, and a role matrix.
  • One governed door (POST /actions/{id}/invoke) executes low-risk record/state mutations with Stripe-style idempotency keys, Terraform-style plan/apply change-sets, Google AIP-151 long-running operations, and an event stream; medium, high, and destructive actions stay refused behind dedicated confirmation.
  • Situational-awareness read models answer three operator questions (is this OK, what needs me, and what can I do) through GET /me/attention, per-entity status, and a per-customer operator view that the Customers screen renders directly.
  • A two-plane tenancy and responsibility model attributes each instance to who runs it and exposes an org-user functional-health surface.
  • R21-R27 closed the doc-vs-code execution-edge gaps an independent audit found: the governed worker execution edge is activated in stub/dry-run, the deploy path is agent-executed and reversible-only, MSP-viewer read-only is a first-class default-deny policy, host-key trust is fed by live SSH observations, dependency/blast-radius has a real edge model, ServiceBinding dispatch is live, and alerting is recorded first.
  • Agent-executed OpenClaw deployment with a per-instance Cloudflare tunnel and DNS is real and verified live in dev/control-plane, with reversible-only teardown.
  • The operator console is rebuilt to a world-class standard: every screen is recomposed to a calm Focus + Summon architecture and wired to the spine, on a reusable CRUD kit across Customers/Orgs, Members/Users, Connections, Capabilities, and MSP management, with Apple Liquid Glass.
  • Members onboards operators by email invite / magic link with no operator passwords; System > Configuration ships Security, Notifications, Branding, Integrations & API, and AI keys / model routing, per-MSP with secrets encrypted at rest.
  • The full backend suite is 879 tests green.
  • Architecture Canon defines Anthropy as the durable orchestration layer for identity, policy, object graph, capability graph, workflow truth, narrative history, and portable memory contracts.
  • ExecutionWorker and ExecutionJob records define worker identity, assignment scope, capability support, job status, route narrative, transcript, retry, and redaction metadata.
  • Worker Capability + Trust Topology defines capabilities, trust levels, locality, ownership scope, runtime relationships, and execution eligibility checks.
  • Environment + Service Topology defines environments as business operating spaces and services as the useful capabilities organizations and users understand.
  • Service Binding Contracts define how services attach to workers, runtimes, providers, capabilities, workflows, and assistants.
  • Worker-Side Access Dispatch defines how access_test and host_discovery route through eligible workers with local_dev_execution_mode as a governed bootstrap fallback.
  • Worker Access Executor Stub defines worker_stub_execution and validates worker-style access results before they become transcript, technical detail, access, or discovery evidence.
  • Scoped Secret Delivery defines one-job CredentialGrant metadata and worker_ssh_dry_run, which stops before opening any network connection.
  • Phase R11 adds a controlled test_managed_secret_provider harness that resolves synthetic in-memory material for dev/test lease validation, captures redaction fingerprints, closes the lease, and remains blocked in production.
  • Phase R12 adds a Vault-compatible provider pilot that resolves scoped leases through an explicitly configured endpoint in dry-run/test mode, captures redaction values, closes the lease, and keeps worker-side SSH disabled.
  • Phase R13 adds worker_ssh_preflight so workers can prove host-key trust, target route policy, command allowlist, grant/lease evidence, and live-mode gates before stopping without opening SSH.
  • Phase R14 persists HostKeyTrustRecord evidence, enforces host-key review transitions, detects changed fingerprints, and audits host-key observation, verification, rejection, disablement, and preflight blocking.
  • Phase R15 adds MSP-only host-key trust review APIs with safe status/risk/next-action text, audit summaries, and verify/reject/disable actions.
  • Phase R16 hardens DB-backed worker transcript read models with ordered operator/technical/audit streams, MSP-admin technical/quarantine access, replay/out-of-order/post-terminal quarantine rules, and safe preflight evidence summaries.
  • Phase R17 adds explicit live worker SSH feature flags, environment/scope/worker gates, emergency disable guardrails, safe operator status, and runbook/audit scaffolding while keeping live worker SSH disabled.
  • Phase R18 adds a feature-flagged worker_ssh_live_nonproduction harness for narrow read-only validation after all worker, scope, host-key, route, command, grant, lease, provider, and emergency-disable gates pass; production live SSH remains blocked.
  • Phase R19 completes the first constrained non-production worker SSH pilot shape and transcript transport soak for ordered events, timeout/disconnect behavior, emergency disable, host-key change blocking, provider/lease validation, replay quarantine, and no-secret-leak checks.
  • Phase R20 completes the operational readiness and rollback drill for reconnect-style transcript polling, replay resilience, partial transcript continuity, emergency disable during transcript flow, feature-flag disable, provider disable, worker revoke, host-key invalidation, lease revocation, and operator where/why/next narratives.
  • Phase G1 adds MSP-facing governance status and queues for worker health, host-key review, transcript quarantine, failed/blocked execution, provider/lease failures, rollback attention, emergency-disable visibility, and live SSH gate status.
  • Phase G2 adds a read-only Governance/Mainstage UI so MSP operators can inspect governance queues and alert integration contracts without taking action from that surface.
  • Phase G3 adds plain-language operational narratives grounded in existing governance, job, workflow, audit, and access evidence.
  • Phase G4 adds computed operational intelligence with evidence correlation, provenance, confidence, and uncertainty fields while remaining read-only.
  • Phase G5 adds escalation and delegation semantics that explain why Anthropy stopped, what human decision is needed, what remains unsafe, and what future bounded remediation would require.
  • Phase G6 adds Governance OS stabilization, integration branch CI protection, review memory, operational trust validation, operator cognition doctrine, and the first gateway/node/job operational object inspection foundation.
  • Phase G7B adds deterministic system introspection registries for backend actions, operational objects, capability lifecycle vocabulary, and setup-flow definitions without secret leakage.
  • Phase G7C adds persisted SetupFlowSession records and deterministic setup-flow state machines (starting with access_gateway_setup) without provider calls, SSH execution, or runtime mutation.
  • Phase G7D adds a deterministic Operator Console shell (typed commands mapped to registered actions) that can steer the current UI state and start setup-flow sessions without becoming a chatbot or an execution terminal.
  • Phase G7E adds introspection-aware command grammar so the console resolves registered pages, panels, setup flows, setup fields, and setup-session commands from Anthropy registries instead of growing a pile of hardcoded phrases.
  • Phase G7F completes console-guided setup-session steering: list, inspect, resume, update non-secret fields, and advance persisted setup-flow sessions while the setup-flow API remains the source of truth.
  • Phase G7G documents and validates TEST internal-alpha readiness for the trusted local/LAN stack, Redis sessions, Postgres setup-session persistence, multi-user sanity, isolation, restart/recovery discipline, backup basics, and walkthrough use.
  • Phase G8 completes shared operational alpha readiness and the first walkthrough-blocker fix pass while keeping RBAC, auth semantics, setup-session ownership, provider execution, OpenClaw mutation, and LM behavior unchanged.
  • Phase G8-H1 completes hosted STAGING environment doctrine and visible environment labeling for staging.anthropy.works while PRODUCTION remains explicitly nonexistent.
  • Phase G9-W1 starts gateway workflow cognition: setup drafts, profile handoff, guided editing, evidence revalidation, validation evidence workspace, operator timeline, and real read-only validation evidence for access gateway review.
  • Worker authentication and assignment checks now reject wrong, disabled, revoked, stale, unhealthy, insufficiently trusted, missing-capability, or wrong-scope workers before grant use, transcript relay, or result relay is trusted.
  • Redis-backed sessions replace process-local session memory, and login plus bootstrap agent registration have basic Redis-backed rate limits.
  • Worker Transcript Relay defines validated worker progress events with operator, technical, audit, and quarantine read-model separation.
  • Worker Assignment Scope defines platform, organization, environment, service, and user worker boundaries for access/discovery job eligibility.
  • Production Fallback Policy records explicit local_dev_execution_mode fallback decisions with where, why, next, requested scope, selected worker, and fallback_used fields.
  • Phase R1 hardens production config validation, production cookie security, public status minimization, and tracked env-file guidance.
  • Phase R2 removes the default host private SSH key mount from Docker Compose, keeps local demos on saved/uploaded credential references, hardens SSH ControlPath placement, and improves SSH transcript/result redaction.
  • Phase R3 persists org scope on tenant-sensitive records and moves high-risk list/read paths toward database-level scope filtering.
  • local_dev_execution_mode is gated in production unless explicitly enabled, and blocked routes stop before remote access begins.
  • Assistant semantics are now explicit: assistants are user-facing work surfaces, not workers, runtimes, or providers.
  • Mainstage semantics are now documented separately for MSP topology and routing, org environments and service health, and user workspaces and outcomes.

What This Means

  • Anthropy is the durable orchestration layer across MSP, organization, and user ownership layers.
  • Workers are governed execution participants, not invisible implementation detail.
  • Capabilities are intent-level graph concepts, not single provider buttons.
  • Runtimes and providers can execute or cache state, but they do not own Anthropy identity, policy, workflow truth, memory contract, object graph, or narrative history.
  • Environment and service topology is the next customer-facing model; raw infrastructure remains an MSP/operator concern.
  • Production execution still requires scoped dispatch, policy evaluation, typed confirmation where risky, audit, redaction, production live fanout implementation, production-reviewed provider auth, external alerting/escalation integrations, and final production worker-side SSH runbooks.

Phase Status

Recent Architecture Milestones

The governance and gateway-cognition arc is complete through G9-W1. Since then the constitutional layer was hardened through R27: the governed worker execution edge was activated in stub/dry-run, the deploy-path safety invariant was restored, and fail-safe RBAC, a host-key trust producer, a dependency/blast-radius model, ServiceBinding dispatch, and record-first alerting landed. On top of that, the agent-foundation spine was built (the entitlement keystone and one governed door with idempotency, plan/apply, long-running operations, and events), and the operator console was rebuilt to a world-class standard and wired to that spine. Resident-LM behavior, live worker-side SSH, real external-provider execution with OAuth, and autonomy remain later phases that require explicit approval.

Operator Console

World-Class Operator Surface

Every operator screen is rebuilt to a calm Focus + Summon architecture and wired to the spine: Customers, Members with email-invite onboarding and no operator passwords, Connections, Runtime, Capabilities, Procedures, Fabric, MSP management, and System > Configuration, on a reusable CRUD kit in Apple Liquid Glass. Remaining per-screen mutations are still being migrated onto the governed door.

Active
Agent Foundation 3

Situational Awareness

Read models for what needs attention (/me/attention), per-entity status, and a per-customer operator view, plus a two-plane tenancy and responsibility model, give every screen one honest answer to three questions: "is this OK", "what needs me", and "what can I do".

Complete
Agent Foundation 2

The Governed Door

One door (POST /actions/{id}/invoke) now executes low-risk record/state mutations with Stripe-style idempotency keys, Terraform-style plan/apply change-sets for risky actions, Google AIP-151 long-running operations, and an event stream. Medium, high, and destructive actions are still refused at the door and routed to dedicated confirmation.

Complete
Agent Foundation 1

Entitlement Keystone

A capability/action registry, can_user_perform, the GET /me/entitlements keystone, and a role matrix turn the question of what a user may actually do into a single authoritative answer that the UI and future agents both read.

Complete
Phase R27

Operational Alerting (record-first)

Added record-first outbound operational alerting off the governance queues; external delivery channels remain intentionally narrow.

Complete
Phase R26

ServiceBinding-Scoped Dispatch

Activated the built-but-dormant service-scoped routing branch by creating ServiceBindings and passing service scope through execution dispatch.

Complete
Phase R25

Dependency / Blast-Radius Model

Added a CMDB dependency edge model and impact service so the platform can answer what depends on a given host, provider, or runtime at the data layer instead of faking it in the UI.

Complete
Phase R24

Host-Key Observation Producer

Fed host keys observed during live SSH access tests into the trust recorder (trust-on-first-use), so the host-key review surface is populated from real evidence instead of staying empty.

Complete
Phase R23

Fail-Safe RBAC

Made MSP-viewer read-only a first-class default-deny policy for non-admin MSP users instead of enforcement-by-omission, so a mutating endpoint cannot accidentally grant a viewer write access.

Complete
Phase R22

Deploy-Path Safety Invariant

Resolved the deploy contradiction an independent audit flagged: teardown is reversible-only (docker compose down, no -v, never rm -rf) and control-push refuses once a host has a live agent, restoring the rule that the agent executes deployment.

Complete
Phase R21

Governed Worker Execution Edge

Activated the previously test-only worker execution machinery against real reach in stub/dry-run: promote-to-worker, an MSP worker roster, operator dispatch with a scoped credential-grant stub, worker job claim, and result relay — driven end to end against the live dev API. Live worker SSH stays flag-blocked.

Complete
Phase G9-W1

Gateway Workflow Cognition

Turns access gateway work into a coherent operational lifecycle with setup drafts, profile handoff, guided editing, revalidation, evidence workspace, operator timeline, and read-only validation evidence while preserving no-execution boundaries.

Complete
Phase G8-H1

STAGING Environment Doctrine

Normalizes TEST, hosted STAGING at staging.anthropy.works, and future PRODUCTION doctrine with isolation, rollback, host-collision, credential, and access-control boundaries.

Complete
Phase G8-F1

Shared Alpha Walkthrough Fixes

Fixed the first walkthrough blockers by keeping console-started gateway setup inside deterministic setup-session steering and aligning documented setup-field commands with the grammar.

Complete
Phase G8

Shared Operational Alpha Readiness

Prepared the internal alpha for coherent evaluation with readiness inventory, isolation coverage, readiness UI/runbook, walkthrough guidance, Kitsune stabilization, and preserved MSP/Org/User/RBAC boundaries.

Complete
Phase G7G

TEST Internal Alpha Environment

Documented and validated the trusted local/LAN TEST alpha environment, including restart discipline, Redis sessions, Postgres setup persistence, multi-user sanity, isolation, backup basics, and walkthrough readiness.

Complete
Phase G7F

Console-Guided Setup Sessions

Completed console-guided setup-session steering for list, inspect, resume, non-secret field updates, and deterministic state advancement while setup-flow APIs remain canonical.

Complete
Phase G7E

Introspection-Aware Command Grammar

Replaced one-off phrase growth with registry-backed deterministic command grammar for pages, panels, UI actions, setup flows, setup fields, and setup-session commands.

Complete
Phase G7D

Deterministic Operator Console Shell

Added a bounded Operator Console shell that steers registered UI actions and setup-flow session starts without becoming a chatbot or execution terminal.

Complete
Phase G7C

Deterministic Setup-Flow Sessions

Added persisted SetupFlowSession records and deterministic setup-flow state machines with audited lifecycle events, redaction, and user/org isolation.

Complete
Phase G7B

System Introspection Registries

Added deterministic registries for backend actions, operational objects, capability lifecycle vocabulary, setup flows, pages, panels, and UI actions without exposing secrets.

Complete
Phase G6D

Runtime Observation + Object Inspection

Stabilized the initial gateway, node, and job operational object vocabulary and added the first lightweight read-only access gateway inspection flow.

Complete
Phase G6C

Operational Trust Validation

Added deterministic local/dev walkthrough scenarios and reviewer guidance so humans could evaluate ambiguity handling, safe stops, escalation clarity, and Governance Mainstage coherence.

Complete
Phase G6B

Governance Review Memory

Added durable informational review memory with source fingerprints and stale/historical states while preserving source conditions, risk, escalation, and delegate-back boundaries.

Complete
Phase G6A.1

Integration Branch CI Protection

Protects Governance OS integration discipline with branch/status validation and no added runtime authority.

Complete
Phase G6A

Governance OS Stabilization

Stabilizes Governance OS v1 after independent review while keeping execution, remediation, provider operations, production worker SSH, and OpenClaw deployment disabled.

Complete
Phase G5

Escalation + Delegation Semantics

Adds read-only escalation and delegation semantics that explain why Anthropy stopped, who must decide, what safe options remain, what is unsafe, and what would be required before future bounded remediation could be considered.

Complete
Phase G4

Operational Intelligence Read Model

Adds computed, read-only operational intelligence that correlates governance items with safe evidence, provenance, confidence, and unresolved uncertainty while keeping structured records as source of truth.

Complete
Phase G3

Operational Narrative Layer

Adds plain-language governance narratives grounded in existing queue, job, workflow, audit, and access evidence so operators can understand where Anthropy stopped, why it matters, and what human input is needed.

Complete
Phase G2

Governance Mainstage Read UI

Adds a read-only Governance/Mainstage surface for MSP operators to inspect governance queues, status, and alert integration contracts without enabling acknowledgement, remediation, alert delivery, production SSH, or OpenClaw deployment.

Complete
Phase G1

Operational Governance Surface

Adds MSP-facing governance status and queue APIs for worker health, host-key review, transcript quarantine, failed or blocked execution, provider and lease failures, rollback attention, emergency-disable visibility, live SSH gate status, and audit-only acknowledgement while production worker SSH remains disabled.

Complete
Phase C1

Post-Remediation Critical Cleanup

Closes bounded follow-up findings by binding default local Postgres/Redis ports to loopback, scoping worker transcript read routes by job org/private scope, adding worker-authenticated transcript event ingest through the existing relay validator, and documenting human-approved git-history purge requirements.

Complete
Phase R20

Readiness Review + Rollback Drill

Completes the constrained operational readiness drill for reconnect-style transcript polling, replay resilience, emergency disable during transcript flow, feature-flag disable, provider disable, worker revoke, host-key invalidation, lease revocation, operator narratives, and no-secret-leak checks while production live SSH remains blocked.

Complete
Phase R19

Non-Production SSH Pilot + Transcript Soak

Completes the first constrained non-production worker SSH pilot shape and transcript soak for read-only commands, timeout/disconnect handling, emergency disable, host-key behavior, provider/lease validation, replay quarantine, and no-secret-leak checks while production live SSH remains blocked.

Complete
Phase R18

Non-Production Live SSH Harness

Adds a feature-flagged worker_ssh_live_nonproduction harness for narrow read-only validation after worker, scope, host-key, route, command, grant, lease, provider, and emergency-disable gates pass; production live SSH remains blocked.

Complete
Phase R17

Live SSH Guardrails

Adds explicit live worker SSH feature flags, environment/scope/worker gates, emergency disable guardrails, safe status/read models, and runbook/audit scaffolding while live worker SSH remains disabled by default.

Complete
Phase R16

Transcript + Preflight Read Models

Hardens DB-backed worker transcript read models, MSP-gated operator/technical/quarantine access, replay/out-of-order/post-terminal quarantine rules, and safe worker_ssh_preflight evidence summaries while live worker SSH remains disabled.

Complete
Phase R15

Host-Key Review API

Adds MSP-only host-key trust list/detail read models, safe operator status/risk/next-action text, audit summaries, and verify/reject/disable review actions while live worker SSH remains disabled.

Complete
Phase R14

Host-Key Trust Persistence

Persists host-key trust evidence, enforces review transitions, audits observed/verified/changed/rejected/disabled/preflight-blocked events, and feeds durable trust evidence into worker_ssh_preflight.

Complete
Phase R13

Worker SSH Preflight Boundary

Added worker_ssh_preflight with host-key trust states, target route gating, command allowlist checks, lease evidence checks, and live-mode rejection while keeping worker-side SSH disabled.

Complete
Phase R12

Vault-Compatible Provider Pilot

Added a Vault-compatible managed provider adapter pilot that validates provider config and vault:// references, resolves scoped leases in dry-run/test mode, captures redaction values, closes leases, and keeps worker-side SSH disabled.

Complete
Phase R11

Managed Secret Testbed + Lease Retrieval

Added a dev/test-only managed secret provider harness that resolves synthetic in-memory material, captures redaction values, closes the lease, and proves WorkerSecretLease validation without enabling real provider retrieval or worker-side SSH.

Complete
Phase R10

Managed Provider Adapter + Transcript Transport Prep

Added managed secret provider adapter configuration, provider selection rules, safe not-implemented behavior for real providers, and DB-first transcript transport planning without enabling real provider retrieval or worker-side SSH.

Complete
Phase R9

Secret Provider Contract

Defined a provider-pluralistic SecretProvider contract and WorkerSecretLease shape so worker dry-runs can validate scoped retrieval rules without enabling real worker-side SSH or production secret delivery.

Complete
Phase R8

Alembic Drift Cleanup

Aligned historical timestamp nullability, workflow timestamp indexes, and CredentialGrant grant_id uniqueness/index shape so migration validation can fail on future unclassified drift.

Complete
Phase R7

CI + Deployment Validation

Added GitHub Actions validation scaffolding, phase-oriented validation layers, secret scanning, migration graph/head checks, deployment gates, and explicit reporting for known Alembic drift.

Complete
Phase R6

Redis Sessions + Rate Limits

Moved API sessions into Redis with TTL records and added Redis-backed rate limits for login and bootstrap agent registration without changing execution architecture.

Complete
Phase R5

Worker Authentication + Assignment

Hardened ExecutionWorker identity and assignment with registration status, hashed worker secret metadata, explicit scope fields, active/disabled/revoked/heartbeat checks, and scope-aware grant/result/transcript validation.

Complete
Phase R4

Durable CredentialGrant Persistence

Persisted CredentialGrant as a scoped, temporary, auditable permission record and made worker dry-run result relay validate durable grant status, scope, job, worker, credential reference, and allowed use before accepting evidence.

Complete
Phase R3

Tenancy + Scope Persistence

Persisted org scope on tenant-sensitive infrastructure, access, execution, and audit records; improved DB-level scope filtering; and added cross-org leakage tests without changing execution architecture.

Complete
Phase R2

Local Dev SSH Risk Containment

Removed the default host private SSH key mount, kept local demos on saved/uploaded credentials, hardened SSH control sockets, strengthened redaction, and contained password SSH risk without enabling worker SSH.

Complete
Phase R1

Secret + Production Config Hardening

Hardened production config validation, secure cookie behavior, minimal public status, env-file guidance, and report claims without changing access execution or unblocking deployment.

Complete
Phase 29H

Worker Assignment Scope + Fallback Policy

Added explicit worker assignment scope, job scope, production fallback decisions, and scope validation for transcript events, credential grants, and worker results without enabling real worker-side SSH.

Complete
Phase 29G

Worker Transcript Relay

Added validated worker progress events, operator/technical/audit read-model separation, quarantine handling, dry-run event emission, and grant lifecycle audit scaffolding.

Complete
Phase 29F

Scoped Secret Delivery + SSH Dry Run

Defined one-job CredentialGrant scaffolding and worker_ssh_dry_run, proving grant references, allowlists, redaction, and stop-before-network behavior without enabling production worker-side SSH.

Complete
Phase 29E

Worker Access Executor Stub

Added worker_stub_execution and API result relay validation for worker-style access and discovery results without enabling live worker SSH or raw secret delivery.

Complete
Phase 29D

Worker-Side Access Dispatch Contract

Defined access_test and host_discovery dispatch through eligible workers, with governed local development fallback, redaction, audit, transcript, and blocked-route behavior.

Complete
Phase 29C

Service Binding Contracts

Defined the binding model that attaches services to workers, runtimes, providers, capabilities, workflows, and assistant surfaces.

Complete
Phase 29B

Environment + Service Topology

Defined environments as business operating spaces and services as customer-facing capabilities across MSP, organization, and user layers.

Complete
Phase 29A

Worker Capability + Trust Topology

Defined worker capabilities, trust levels, locality, ownership scope, runtime relationships, and execution eligibility checks.

Complete
Phase 28

Execution Plane

Introduced ExecutionWorker and ExecutionJob records so operational work can be modeled as routed, auditable jobs with worker identity, transcripts, retries, and redaction.

Complete
Phase 27

Architecture Canon

Established Anthropy as the orchestration layer for identity, policy, object graph, capability graph, workflow truth, narrative history, and portable memory contracts.

Complete

Current Evidence

What The Build Supports

The entitlement keystone

GET /me/entitlements is the single authoritative answer to what the signed-in user may actually do, computed from a capability/action registry, can_user_perform, and a role matrix. The operator console reads it to decide what to show, and a future agent reads the same truth, so there is no second, drifting copy of the rules.
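
A minimal sketch of how a default-deny role matrix can back a single entitlements answer. The role names, action ids, and the entitlements_for helper are illustrative assumptions, not the build's actual registry; only can_user_perform and GET /me/entitlements are named in this report.

```python
from enum import Enum


class Role(str, Enum):
    # Hypothetical role names, chosen for illustration only.
    MSP_ADMIN = "msp_admin"
    MSP_VIEWER = "msp_viewer"
    ORG_ADMIN = "org_admin"


# Hypothetical action registry: action id -> metadata.
ACTION_REGISTRY = {
    "customers.read": {"mutating": False},
    "customers.update": {"mutating": True},
    "members.invite": {"mutating": True},
}

# Hypothetical role matrix: anything not listed for a role is denied.
ROLE_MATRIX = {
    Role.MSP_ADMIN: {"customers.read", "customers.update", "members.invite"},
    Role.MSP_VIEWER: {"customers.read"},
    Role.ORG_ADMIN: {"customers.read", "members.invite"},
}


def can_user_perform(role, action_id):
    """Default-deny check: unknown actions and unknown roles are refused."""
    if action_id not in ACTION_REGISTRY:
        return False
    return action_id in ROLE_MATRIX.get(role, set())


def entitlements_for(role):
    """One possible shape for a GET /me/entitlements payload (assumed)."""
    return {action: can_user_perform(role, action) for action in sorted(ACTION_REGISTRY)}
```

Because both the UI and an agent would call the same function, a viewer's missing write access is an explicit False in the payload and not an accident of which buttons are rendered.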

The governed door

Every governed change goes through one door: POST /actions/{id}/invoke. It executes low-risk record and state mutations today with a Stripe-style Idempotency-Key (a retried mutation replays the stored response), Terraform-style plan/apply change-sets for risky actions, Google AIP-151 long-running operations, and an event stream. Medium, high, and destructive actions are refused in code and routed to dedicated confirmation; viewers are denied and audited.
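
The replay and refusal behavior described above can be sketched as follows. This is an in-memory illustration under stated assumptions: the GovernedDoor class, the action ids, the risk labels, and the response fields are hypothetical, and a real implementation would persist responses and also fingerprint the request body.

```python
from dataclasses import dataclass, field

# Hypothetical registry: action id -> risk tier.
ACTIONS = {
    "customer.rename": "low",
    "host.decommission": "destructive",
}


@dataclass
class GovernedDoor:
    # Idempotency key -> stored response (in memory here; assumed durable in practice).
    responses: dict = field(default_factory=dict)
    # Record of mutations that were actually applied.
    applied: list = field(default_factory=list)

    def invoke(self, action_id, payload, idempotency_key):
        # A retried request with the same key replays the stored response.
        if idempotency_key in self.responses:
            return self.responses[idempotency_key]

        risk = ACTIONS.get(action_id)
        if risk is None:
            response = {"status": "refused", "reason": "unknown_action"}
        elif risk != "low":
            # Medium, high, and destructive actions never execute through this path.
            response = {"status": "refused", "reason": "requires_dedicated_confirmation"}
        else:
            self.applied.append((action_id, payload))
            response = {"status": "applied", "action": action_id}

        self.responses[idempotency_key] = response
        return response
```

Storing refusals under the key as well is a design choice in this sketch: a client that retries a refused call gets the same refusal back without the check being re-run.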

Situational awareness

GET /me/attention, per-entity status, and a per-customer operator view answer three questions from machine truth: "is this OK", "what needs me", and "what can I do". A two-plane tenancy and responsibility model attributes each instance to who runs it, so the same object reads correctly as platform topology, org service health, or personal work.

World-class operator console

Canonical local review URL: http://localhost:3000. Every screen is rebuilt to a calm Focus + Summon architecture (a full-width room plus a summoned detail sheet) and wired to the spine: Customers, Members, Connections, Runtime, Capabilities, Procedures, Fabric, MSP management, and System > Configuration, on a reusable CRUD kit in Apple Liquid Glass. Members onboards operators by email invite with no passwords.

Agent-executed deployment

An OpenClaw instance can be deployed by the agent with its own per-instance Cloudflare tunnel and DNS, verified live in dev/control-plane: it comes up healthy at its own hostname, and teardown is reversible-only (no volume deletion) and removes the tunnel and DNS. This is real deploy capability, not a production customer rollout — PRODUCTION does not exist yet.

Spatial product direction

The command workspace direction is organized around roster, inspector, activity, map context, and plain-English status patterns. The report describes architecture status, not a scripted UI walkthrough.

G6D access gateway inspection

Canonical local review URL: http://localhost:3000. Start or restart with cd /Users/shayne/AI/Projects/Anthropy/Works/apps/web && npm run dev. Sign in with the seeded local/dev MSP admin admin@anthropy.works / changeme-dev-only, then open Access.

Exact UI path: open Access, choose Access gateways, then select a gateway profile. The workspace should show what the gateway is, what host routes depend on it, the latest read-only reachability check, uncertainty, and explicit refused/blocked actions.

Known limitation: this is a read-only inspection flow grounded in existing jumpbox/host access records. It is not a generalized inspector framework, replay engine, simulation product, remediation surface, provider operation, alert delivery path, or OpenClaw deployment path.

G7E/G7F operator console steering

Canonical local review URL: http://localhost:3000. Start or restart with cd /Users/shayne/AI/Projects/Anthropy/Works/apps/web && npm run dev. Sign in with the seeded local/dev MSP admin admin@anthropy.works / changeme-dev-only, then open Activity console.

Try deterministic typed commands such as "what can you do", "open access gateways", "close inspector window", "start access gateway setup", "show setup sessions", or "inspect setup session". The console resolves commands through Anthropy registries and routes to registered UI actions and setup-flow APIs only; it is not a chatbot, does not call external services, and does not execute SSH or provider operations.

G9-W1 gateway workflow cognition

Access gateway work now has a clearer lifecycle path: setup draft, profile handoff, guided profile editing, evidence revalidation, validation evidence workspace, operator timeline, and read-only validation evidence.

The review target is whether an operator can understand what gateway is being configured, what changed, what Anthropy checked, what evidence exists, what remains uncertain, and what action stays blocked. This does not enable provider execution, OpenClaw mutation, remediation, resident LM behavior, autonomy, or PRODUCTION.

G6C local review workflow

Canonical local review URL: http://localhost:3000. Start or restart with cd /Users/shayne/AI/Projects/Anthropy/Works/apps/web && npm run dev. If port 3000 is occupied, stop the listener with PID="$(lsof -ti tcp:3000)"; [ -z "$PID" ] || kill $PID. Sign in with the seeded local/dev MSP admin admin@anthropy.works / changeme-dev-only, then open Governance. Fixture command: cd /Users/shayne/AI/Projects/Anthropy/Works && python3 scripts/load-g6c-operational-walkthrough-fixtures.py | python3 -m json.tool.

Exact UI path: open Governance, use Governance mainstage, scroll to G6C walkthrough, then use G6C scenarios. Each scenario should show Operational situation, Evidence available, Expected narrative, Safe-stop reason, Evidence distinctions, and Boundaries held.

Scenario order: Stale credentials after prior successful access; Revoked credentials with escalation ambiguity; Transcript success claim contradicted by current telemetry; Secret-redacted transcript missing causal context; Observe-only imported node that appears to need action; Quarantined transcript requiring blocked delegate-back; Stale governance review invalidated by changed source conditions; Partial evidence with unresolved ambiguity; Remediation path exists but execution authority is unavailable; Tenant-safe evidence boundary where cross-tenant inference must not occur.

Known limitation: G6C is a human walkthrough aid inside Governance Mainstage. It is not a replay engine, simulation product, score, remediation surface, provider operation, alert delivery path, or OpenClaw deployment path.

Access verification model

The model supports redacted SSH credentials, jumpbox and host profiles, read-only discovery, transcripts, and access-verified node candidates. It does not imply installation, host mutation, unrestricted remote command execution, or production trust in local_dev_execution_mode.

Execution routing evidence

The governed worker execution edge is now activated end to end in stub/dry-run: promote-to-worker, operator dispatch with a scoped credential-grant stub, worker job claim, and result relay, driven against the live dev API. Execution jobs record worker identity, route narrative, redacted technical detail, and tenant scope. Live worker-side SSH remains flag-blocked; production dispatch remains future work.

Workflow policy

Workflow policy exists around capability assignment, connection status, connection usage policy, risk rules, and confirmation pauses. Workflow execution must pass those checks before a step is allowed to proceed.

Capability governance

Capability records, packs, permission rules, allowed actions, risk levels, and confirmation requirements are part of the control model. They are governance evidence, not a promise of broad provider execution.

Environment and service model

Environments are business spaces and services are customer-facing capabilities. Service bindings attach workers, runtimes, providers, capabilities, workflows, and assistants underneath as governed implementation topology.

Access dispatch

The API remains the control plane for access. Workers are future execution hands. Local development SSH is visible, governed, read-only, and not presented as production execution.

Worker result relay

worker_stub_execution proves the worker-side access result envelope and API relay validation. Malformed, mismatched, secret-looking, and non-allowlisted results are rejected before becoming access evidence.
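
A hedged sketch of the four rejection classes named above (malformed, mismatched, secret-looking, non-allowlisted). The envelope field names and the secret patterns are assumptions for illustration; access_test and host_discovery are the dispatch types this report mentions.

```python
import re

# Result types taken from the dispatch contract described in this report.
ALLOWED_RESULT_TYPES = {"access_test", "host_discovery"}
# Hypothetical envelope fields.
REQUIRED_FIELDS = {"job_id", "worker_id", "result_type", "summary"}
# Illustrative secret-looking patterns; a real relay would use a broader set.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"(?i)password\s*[=:]\s*\S+"),
]


def validate_result(envelope, expected_job_id, expected_worker_id):
    """Return (accepted, reason) before a result may become access evidence."""
    if not isinstance(envelope, dict) or not REQUIRED_FIELDS.issubset(envelope):
        return False, "malformed"
    if envelope["job_id"] != expected_job_id or envelope["worker_id"] != expected_worker_id:
        return False, "mismatched"
    if envelope["result_type"] not in ALLOWED_RESULT_TYPES:
        return False, "not_allowlisted"
    summary = str(envelope["summary"])
    if any(pattern.search(summary) for pattern in SECRET_PATTERNS):
        return False, "secret_looking"
    return True, "accepted"
```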

Secret delivery dry run

CredentialGrant is now a durable control-plane record scoped to one job, worker, credential reference, allowed use, expiration, and tenant scope. worker_ssh_dry_run verifies the shape, records grant consumption, and stops before any network connection opens.
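
The grant scope described above can be illustrated with a small dry-run check. The field names, reason strings, and result shape are assumptions; the sketch only shows a grant being validated against one job, worker, use, and tenant, consumed once, and never opening a network connection.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CredentialGrant:
    # Hypothetical field names mirroring the scope listed in this report.
    grant_id: str
    job_id: str
    worker_id: str
    credential_ref: str
    allowed_use: str
    org_id: str
    expires_at: datetime
    consumed: bool = False


def check_grant(grant, *, job_id, worker_id, use, org_id, now):
    """Return a block reason, or None when the grant may be consumed."""
    if grant.consumed:
        return "already_consumed"
    if now >= grant.expires_at:
        return "expired"
    if grant.job_id != job_id:
        return "wrong_job"
    if grant.worker_id != worker_id:
        return "wrong_worker"
    if grant.allowed_use != use:
        return "wrong_use"
    if grant.org_id != org_id:
        return "wrong_tenant"
    return None


def worker_ssh_dry_run(grant, **context):
    """Validate the grant, record consumption, and stop before any network use."""
    reason = check_grant(grant, **context)
    if reason is not None:
        return {"status": "blocked", "reason": reason, "network_opened": False}
    grant.consumed = True
    return {"status": "dry_run_ok", "network_opened": False}
```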

Managed provider adapters

Secret provider selection now has safe adapter configuration for local development, Vault, KMS, managed secret services, external brokers, and stubbed references. Disabled, missing, unsupported, production-local, and not-yet-implemented providers fail clearly without releasing raw secrets.

Secret lease testbed

A controlled test_managed_secret_provider can resolve synthetic in-memory material in dev/test only, capture redaction values and fingerprints, close the lease, and prove the result path rejects leaked synthetic material. It is blocked in production and is not customer secret delivery.

Vault-compatible pilot

The Vault-compatible adapter validates safe provider config, accepts vault://mount/path#field references, resolves one scoped lease in dry-run/test mode, captures redaction values, closes the lease, and rejects leaked resolved material. It does not enable worker-side SSH.
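
As an illustration of the vault://mount/path#field reference shape, the sketch below parses and validates a reference with the standard library. The VaultRef type, the parse_vault_ref name, and the specific rejection rules are assumptions; the pilot's actual validation is not shown in this report.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class VaultRef(NamedTuple):
    mount: str
    path: str
    field: str


def parse_vault_ref(ref):
    """Split a vault://mount/path#field reference into its three parts."""
    parts = urlsplit(ref)
    if parts.scheme != "vault":
        raise ValueError("not a vault:// reference")
    mount = parts.netloc
    path = parts.path.strip("/")
    field = parts.fragment
    if not mount or not path or not field:
        raise ValueError("expected vault://mount/path#field")
    # Assumed hardening: no query strings and no parent-directory segments.
    if parts.query or ".." in path.split("/"):
        raise ValueError("unsupported reference form")
    return VaultRef(mount, path, field)
```

A reference such as vault://secret/ssh/prod-jump#private_key (a made-up example) would yield mount "secret", path "ssh/prod-jump", and field "private_key". The reference itself carries no secret material, which is why it can be stored and audited safely.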

SSH preflight boundary

worker_ssh_preflight checks host-key trust, target route policy, command allowlists, grant and lease evidence, and the live-mode gate. It can explain why execution is blocked, but it stops before opening SSH.
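
One way to picture the preflight boundary is a function that collects every failing gate and never opens a connection. The input fields, the allowlist contents, and the reason strings below are hypothetical; only the list of gates comes from this report.

```python
from dataclasses import dataclass

# Hypothetical read-only command allowlist.
ALLOWED_COMMANDS = {"uptime", "hostname", "uname -a"}


@dataclass(frozen=True)
class PreflightInput:
    host_key_state: str      # e.g. "verified", "unknown", "changed"
    route_allowed: bool
    command: str
    grant_valid: bool
    lease_valid: bool
    live_mode_enabled: bool


def worker_ssh_preflight(request):
    """Evaluate every gate and explain all blocks; SSH is never opened here."""
    blocked_by = []
    if request.host_key_state != "verified":
        blocked_by.append(f"host_key_{request.host_key_state}")
    if not request.route_allowed:
        blocked_by.append("route_blocked")
    if request.command not in ALLOWED_COMMANDS:
        blocked_by.append("command_not_allowlisted")
    if not request.grant_valid:
        blocked_by.append("grant_invalid")
    if not request.lease_valid:
        blocked_by.append("lease_invalid")
    if not request.live_mode_enabled:
        blocked_by.append("live_mode_disabled")
    return {"ready": not blocked_by, "blocked_by": blocked_by, "ssh_opened": False}
```

Collecting all reasons, and not stopping at the first failure, is what lets the operator narrative explain why execution is blocked in one pass.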

Host-key trust records

HostKeyTrustRecord persists target fingerprints, review state, changed-fingerprint evidence, audit events, and MSP-only review read models. Verified trust can satisfy preflight; bootstrap, unknown, changed, rejected, and disabled states still block production trust, and live worker SSH remains disabled.
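
The review states above can be sketched as a small state machine. The state names come from this report; the method names, the pending_fingerprint field, and the exact transition rules are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HostKeyTrustRecord:
    target: str
    fingerprint: Optional[str] = None
    pending_fingerprint: Optional[str] = None  # changed-fingerprint evidence
    state: str = "unknown"
    audit: list = field(default_factory=list)

    def observe(self, fingerprint):
        """Record an observed fingerprint; observation alone never grants trust."""
        if self.state in ("rejected", "disabled"):
            self.audit.append("observed_while_blocked")
        elif self.fingerprint is None:
            self.fingerprint = fingerprint
            self.state = "bootstrap"  # trust-on-first-use: recorded, awaiting review
            self.audit.append("observed")
        elif fingerprint != self.fingerprint:
            self.pending_fingerprint = fingerprint
            self.state = "changed"
            self.audit.append("changed")

    def review(self, action):
        """Apply an MSP review action: verify, reject, or disable."""
        if action == "verify":
            if self.state not in ("bootstrap", "changed"):
                raise ValueError(f"cannot verify from state {self.state}")
            if self.pending_fingerprint is not None:
                self.fingerprint = self.pending_fingerprint
                self.pending_fingerprint = None
            self.state = "verified"
        elif action == "reject":
            self.state = "rejected"
        elif action == "disable":
            self.state = "disabled"
        else:
            raise ValueError(f"unknown review action {action}")
        self.audit.append(self.state)

    def satisfies_preflight(self):
        return self.state == "verified"
```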

Transcript transport prep

Worker transcript transport is planned around DB-persisted events as the source of truth, with Redis reserved for future live fanout. R16 adds ordered read models, MSP-gated technical/quarantine access, replay and post-terminal quarantine, and a safe preflight evidence summary without implying production live worker execution.
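
The quarantine rules (replay, out-of-order, post-terminal) can be illustrated with an in-memory stream. The TranscriptStream class, sequence numbering from 1, and the terminal event kinds are assumptions; the report states only that persisted events are the source of truth and that these three cases are quarantined.

```python
from dataclasses import dataclass, field

# Hypothetical event kinds that end a transcript.
TERMINAL_KINDS = {"completed", "failed"}


@dataclass
class TranscriptStream:
    last_seq: int = 0
    terminal: bool = False
    seen: set = field(default_factory=set)
    accepted: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)

    def ingest(self, seq, kind, message):
        """Accept the next in-order event; quarantine anything else with a reason."""
        event = {"seq": seq, "kind": kind, "message": message}
        if self.terminal:
            reason = "post_terminal"
        elif seq in self.seen:
            reason = "replay"
        elif seq != self.last_seq + 1:
            reason = "out_of_order"
        else:
            reason = None

        if reason is not None:
            # Quarantined events are kept for review but never join the trusted narrative.
            self.quarantined.append({**event, "reason": reason})
            return reason

        self.seen.add(seq)
        self.last_seq = seq
        self.terminal = kind in TERMINAL_KINDS
        self.accepted.append(event)
        return "accepted"
```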

Governance surface

G1 through G5 add MSP-facing governance status, queues, read-only Mainstage UI, plain-language narratives, operational intelligence, and escalation/delegation semantics. These surfaces explain what needs attention and why, but they do not acknowledge, retry, remediate, deliver alerts, enable SSH, or deploy OpenClaw.

Live SSH guardrails

Phase R17 makes live worker SSH difficult to enable accidentally: global and environment flags, org/environment allowlists, worker/org/environment emergency denies, runbook acknowledgement, and a final implementation gate all keep live SSH disabled until a future phase explicitly opens it.

Non-production SSH harness

Phase R18 adds worker_ssh_live_nonproduction for tightly controlled local/staging validation. Phase R19 proves the first constrained pilot and transcript soak across ordered events, timeout/disconnect handling, emergency disable, host-key change blocking, provider/lease validation, and replay quarantine. Phase R20 drills reconnect-style polling, partial transcript continuity, rollback, and disable/revoke paths. The harness is disabled by default, blocked in production, limited to a tiny read-only command set, and requires worker, scope, host-key, route, grant, lease, provider, and emergency-disable gates to pass before live-style result evidence is accepted.

Worker identity

ExecutionWorker records now carry registration status, hashed worker secret metadata, explicit assignment fields, last-authenticated and heartbeat timestamps, and disabled/revoked gates. These checks guard transcript relay, result relay, and CredentialGrant use, but they do not enable live worker SSH.

Authentication hardening

API sessions now live in Redis with TTL records and hashed session-token keys. Login and bootstrap agent registration use Redis-backed counters with TTL. These controls improve production shape without changing the product flow.
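
The session pattern above (TTL records keyed by a hash of the token) can be sketched without a running Redis. This is a minimal illustration under stated assumptions: SessionStore is a hypothetical in-memory stand-in for a TTL key-value store, the "session:" key prefix and the use of SHA-256 are assumptions, and none of it is taken from the Anthropy codebase.

```python
import hashlib
import secrets
import time


class SessionStore:
    """In-memory stand-in for a TTL key-value store such as Redis (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._records = {}  # hashed key -> (user_id, expires_at)

    @staticmethod
    def _key(token):
        # Only a hash of the token is used as the key, so a dump of the
        # store does not reveal usable session tokens.
        return "session:" + hashlib.sha256(token.encode("utf-8")).hexdigest()

    def create(self, user_id):
        """Issue a random token and store the session under its hashed key."""
        token = secrets.token_urlsafe(32)
        self._records[self._key(token)] = (user_id, self._clock() + self._ttl)
        return token

    def lookup(self, token):
        """Return the user id for a live session, or None if missing or expired."""
        key = self._key(token)
        record = self._records.get(key)
        if record is None:
            return None
        user_id, expires_at = record
        if self._clock() >= expires_at:
            del self._records[key]  # expired: drop it, as a TTL store would
            return None
        return user_id
```

The clock is injectable so expiry can be exercised in tests without sleeping. A real TTL store would expire keys on its own rather than on lookup.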

Validation guardrails

GitHub Actions and local scripts now split validation into backend, execution-plane, tenancy/scope, security, migration, frontend, report, Docker, and JSON layers. Secret scanning and migration checks are automated, and the Alembic autogenerate baseline is now clean enough that validation can fail on any future unclassified drift.

Worker transcript relay

Worker dry-runs can now emit progress events that Anthropy validates, separates into operator, technical, and audit streams, or quarantines before they become trusted narrative.
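
The validate-then-route step above can be sketched as a small function. This is a hedged illustration only: the event shape (seq, stream, message), the strict no-gap ordering rule, and the route_events name are assumptions made for the example, not Anthropy's actual contract.

```python
from collections import defaultdict

# The three trusted streams named in the text; anything else is quarantined.
ALLOWED_STREAMS = {"operator", "technical", "audit"}


def route_events(events):
    """Split worker events into trusted streams and quarantine invalid ones."""
    streams = defaultdict(list)
    last_seq = 0
    for event in events:
        seq = event.get("seq")
        stream = event.get("stream")
        message = event.get("message")
        valid = (
            isinstance(seq, int)
            and seq == last_seq + 1  # assumed rule: strictly ordered, no gaps or replays
            and stream in ALLOWED_STREAMS
            and isinstance(message, str)
        )
        if valid:
            streams[stream].append(event)
            last_seq = seq
        else:
            # Invalid events never reach a trusted stream and do not
            # advance the expected sequence number.
            streams["quarantine"].append(event)
    return dict(streams)
```

In this sketch a replayed sequence number or an unknown stream name lands in quarantine, while the next correctly numbered event is still accepted.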

Support visibility

MSP support visibility is governed, role-controlled, read-oriented, auditable, and narratable. It is not implicit unrestricted access to customer state.

Runtime boundary

Runtimes and providers are governed implementation surfaces. Anthropy keeps the canonical object graph, policy, workflow truth, memory contract, and narrative record.

Operational recovery

Operational evidence includes stale jobs, interrupted workflows, reviewed failures, health status, backups, restore testing, logs, and validation scripts.

Operating Model

How Anthropy Is Structured

Ownership layers

MSP, organization, and user are interpretation layers. The same object can be platform topology, org service health, or personal work depending on viewer context.

Execution plane

The API approves, records, and routes work. Workers execute only when capability, trust, locality, ownership scope, policy, confirmation, redaction, and audit checks allow it.

Worker topology

Workers advertise capabilities, trust levels, locality, ownership scope, and runtime relationships. Eligibility is explicit before a job can route to a worker.
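
Explicit eligibility can be sketched as a filter over advertised worker attributes. The Worker and Job types, the three-step trust ordering, and the eligible_workers function below are hypothetical and exist only to illustrate the idea. Runtime relationships and the policy, confirmation, redaction, and audit checks are left out for brevity.

```python
from dataclasses import dataclass

TRUST_ORDER = ["low", "medium", "high"]  # assumed ordering, lowest to highest


@dataclass(frozen=True)
class Worker:
    worker_id: str
    capabilities: frozenset
    trust_level: str
    locality: str
    org_scope: frozenset  # organizations this worker may serve


@dataclass(frozen=True)
class Job:
    capability: str
    min_trust: str
    locality: str
    org_id: str


def eligible_workers(job, workers):
    """Return ids of workers that satisfy every advertised-attribute check."""
    required = TRUST_ORDER.index(job.min_trust)
    return [
        w.worker_id
        for w in workers
        if job.capability in w.capabilities
        and TRUST_ORDER.index(w.trust_level) >= required
        and w.locality == job.locality
        and job.org_id in w.org_scope
    ]
```

A job routes only to workers in the returned list. An empty list means the job stays unrouted rather than falling back to a loosely matching worker.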

Capability graph

Capabilities express intent. Provider discovery, missing dependency checks, policy evaluation, execution planning, runtime routing, and narrative output sit underneath.

Runtime portability

Anthropy keeps canonical truth portable. Runtime-local and non-portable state must be exported, normalized, hydrated, validated, or explicitly treated as runtime-bound.

Environment topology

Environments are business operating spaces such as Dispatch, Sales, Field Operations, HR, or Support. They are not infrastructure clusters.

Service topology

Services are organization-facing capabilities such as dispatch automation, document search, monitoring, or assistant surfaces. They connect to workers, runtimes, providers, workflows, and assistants through governed bindings.

Service bindings

Bindings make implementation topology explicit: which worker can support a service, which runtime or provider it uses, which capabilities are exposed, which workflows run, and which assistant surfaces present the work.
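
One way to picture a binding is as a plain record that names each side of the relationship. The ServiceBinding type, its field names, and the helper functions below are assumptions made for illustration and do not reflect Anthropy's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceBinding:
    """Hypothetical binding record; one row per service/worker pairing."""
    service: str
    worker_id: str
    runtime_or_provider: str
    capabilities: tuple
    workflows: tuple
    assistant_surfaces: tuple


def bindings_for_service(bindings, service):
    """Return every binding that supports the named service."""
    return [b for b in bindings if b.service == service]


def exposed_capabilities(bindings, service):
    """Return the sorted, de-duplicated capabilities a service exposes."""
    return sorted(
        {cap for b in bindings_for_service(bindings, service) for cap in b.capabilities}
    )
```

Keeping the binding as data, instead of inferring it from whichever worker happens to answer, is what lets the topology be listed and reviewed.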

Readiness Boundaries

Current Production Boundaries

Environment and service UI

Environment, service, and binding records are in place as topology scaffolding. Full user-facing workflows and Mainstage surfaces are still planned work.

Worker-side execution

Worker-side dispatch, result relay, SSH dry-run, and transcript relay gates are contracted and guarded. The live SSH executor, production identity, authentication, scoped routing, heartbeat, audit, redaction, and a working Vault/KMS-style secret retrieval implementation remain planned work.

Remote access scope

Access tests run approved read-only discovery only. Host mutation, arbitrary remote commands, and broad infrastructure actions are outside the current demo boundary.

Secret handling

Secret-looking fields are redacted in job results, audit messages, transcripts, and technical details shown to operators. Default Compose no longer mounts host private SSH keys, and audit events now carry org scope when it can be inferred. Broad production use still requires a working managed Vault/KMS-style retrieval implementation.
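
Redaction of secret-looking fields can be sketched as a recursive walk over a result payload. The key pattern, the "[REDACTED]" placeholder, and the redact function are assumptions for illustration. A real redactor would likely also inspect values, which this sketch does not.

```python
import re

# Assumed heuristic: a field is "secret-looking" if its name matches this pattern.
SECRET_KEY_PATTERN = re.compile(
    r"(password|passwd|secret|token|api[_-]?key|private[_-]?key)",
    re.IGNORECASE,
)
REDACTED = "[REDACTED]"


def redact(value):
    """Return a copy of value with secret-looking dict fields replaced."""
    if isinstance(value, dict):
        return {
            key: REDACTED if SECRET_KEY_PATTERN.search(str(key)) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value  # scalars pass through unchanged
```

The function returns a new structure and leaves the input untouched, so the original payload can still be handled by whatever component is allowed to see it.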

Provider coverage

External provider execution remains intentionally narrow until capability assignment, provider policy, account handshakes, confirmation behavior, and audit contracts are production-ready.

Production readiness

Current readiness includes contracts, scaffolding, validation, first-pass production gates, a non-production worker SSH pilot/soak, an operational rollback/reconnect readiness drill, and the first governance queue surface. Customer-facing production rollout remains blocked on reviewed production provider operations, production transcript fanout implementation, scoped dispatch, production runbooks, external alerting/escalation integrations, and rollout approvals.

Complete Build Overview

The Five Build Layers

These layers explain the platform in plain terms. The lower layers make the upper layers safe: Anthropy first establishes rules, trust, evidence, and rollback; then it can add operational surfaces, orchestration intelligence, autonomous behavior, and customer-facing product workflows.

Layer 1: Complete (through R27)

Constitutional Layer

The constitutional layer is the rulebook underneath the platform. It defines who can act, what they can touch, how approval works, how evidence is recorded, and how the system stops safely when something is not allowed.

Execution governance
The control plane decides whether work is allowed before anything executes.
Trust
Workers, users, scopes, providers, and routes must prove they are eligible.
Tenancy
Customer, organization, environment, service, and user boundaries are preserved.
Grants
Temporary permission records say exactly what a worker may use for one job.
Leases
Short-lived secret access proves a credential can be used without exposing it.
Transcript law
Operational evidence is ordered, redacted, scoped, and quarantined when unsafe.
Rollback
The system can disable, revoke, invalidate, and explain recovery paths.
Feature gates
Risky behavior stays disabled until explicit flags and reviews allow it.
Provider abstraction
Anthropy can use different secret or runtime providers without becoming locked to one.
Layer 2: Largely built

Operational Layer

The operational layer turns safety rules into day-to-day operating surfaces. It gives humans queues, alerts, review paths, escalation, and plain-language visibility into what needs attention.

Governance surfaces
Pages and APIs where operators see what is blocked, risky, or awaiting review.
Operational queues
Worklists for items such as host-key review, worker readiness, or failed execution.
Alerts
Signals that something important changed and needs attention.
Escalation
Clear routing for decisions that require a higher-trust human or team.
Review systems
Structured approval and rejection flows with audit history.
Operational narratives
Human-readable where, why, status, risk, and next-action explanations.
Execution visibility
A safe view into jobs, transcripts, worker state, and blocked routes.
Layer 3: Partially scaffolded

Orchestration Intelligence

The orchestration intelligence layer lets Anthropy reason about work instead of simply running actions. It maps intent to capabilities, providers, runtimes, services, policies, and remediation plans.

Capability graphs
A map of what the platform can do and what conditions are required.
Provider selection
Choosing the right provider or adapter for a job without hardcoding one path.
Workflow reasoning
Understanding steps, permissions, dependencies, confirmations, and outcomes.
Runtime routing
Sending work to the right runtime or worker only when it is eligible.
Remediation planning
Turning a problem into safe diagnostic and recovery options.
Environment/service orchestration
Managing business environments and services rather than raw machines alone.
Layer 4: Mostly future

Autonomous Operations

The autonomous operations layer is where Anthropy can begin taking controlled action on its own. This only becomes safe after the lower layers define trust, approval, evidence, rollback, and operational review.

Self-healing
Detecting and correcting known failure states without waiting for manual diagnosis.
Autonomous remediation
Carrying out approved recovery plans inside governed boundaries.
Safe retries
Retrying work only when policy, risk, and state make retry appropriate.
Adaptive troubleshooting
Changing diagnostic approach based on observed evidence.
Operational optimization
Improving reliability, routing, cost, and response time over repeated operations.
AI strategic orchestration
Using AI to coordinate larger operational goals across systems and teams.
Layer 5: Early

Productization

The productization layer turns the platform into a guided customer experience. It packages governance, orchestration, and operations into onboarding, setup, assistant experiences, and usable day-to-day workflows.

Onboarding
Helping new MSPs, organizations, users, workers, and environments enter safely.
Billing
Packaging usage, plans, and commercial boundaries around the platform.
Setup flows
Guided paths for configuring environments, services, credentials, and workers.
Guided operations
Operator workflows that explain what to do next and why.
Governance UX
Clear review, approval, risk, and audit experiences for humans.
Environment/service UX
Customer-facing views organized around business spaces and useful services.
Assistant experiences
User-facing AI surfaces that help people work through governed actions.

Why The R Arc Mattered

Governance Before Autonomy

The R-series remediation work built the constitutional layer that later operating layers can inherit. Without it, advanced automation would eventually turn into unsafe execution, hidden activity, runtime lock-in, unclear ownership, and operational confusion.

With it, Anthropy can build operational queues, intelligence, autonomous remediation, and productized assistant experiences on top of shared governance instead of inventing separate rules for every feature.

Durable Planning Outlook

How Long To Complete Anthropy Works From Now?

This section is intentionally stable. Phase numbers and build status will change, but these horizons describe the durable path from the current architecture into a sponsor-reviewable alpha, a real SMB production platform, and the larger AI operating-system vision.

1

Coherent enough to inspect, not production-ready

Sponsor-Reviewable Alpha

~600 focused hours

This is the next practical destination: a coherent alpha that sponsors and internal operators can inspect end to end, with enough guided flows to understand the operating-system direction without pretending real customer production readiness is finished.

Includes

  • environments/services
  • governed workflows
  • assistants
  • providers
  • operational governance surfaces
  • onboarding
  • non-production execution
  • coherent UX pass
  • AI narratives
  • useful orchestration

Depends on

  • aggressive AI-assisted development
  • Codex/Claude acceleration
  • reuse of existing substrate
  • no giant rewrites
2

Actual customers using it operationally

Real SMB Production Platform v1

~3,000 hours

This is the first customer-operational version: reliable enough to support real SMB use, with onboarding, billing, production execution, support workflows, and enough observability to operate responsibly.

Includes

  • onboarding
  • billing
  • OAuth provider onboarding
  • operational UX
  • support tooling
  • hosted environments
  • production worker execution
  • reliability hardening
  • SaaS flows
  • observability
  • operational support

Depends on

  • polish requirements
  • integration depth
  • provider count
  • operational tooling scope
  • deployment model
3

The truly massive vision

AI Operating System Category Platform

~5 years

This is the category-defining platform: Anthropy as an intelligence and operations layer across business systems, providers, runtimes, agents, governance, remediation, and business process orchestration.

Includes

  • autonomous remediation
  • orchestration intelligence
  • provider graph reasoning
  • runtime routing
  • AI operations layer
  • enterprise governance
  • multi-runtime substrate
  • large provider ecosystem
  • agent ecosystems
  • operational intelligence
  • business process operating layer

Depends on

  • enterprise-grade governance
  • broad ecosystem depth
  • multi-runtime maturity
  • large-scale operational learning
  • sustained product and platform investment