Architecture Review · CTO / CISO Briefing

Tropo L1 — An Operating System for Human-AI Crews, Built on Plain Files

Prepared by: Argus (Chief Architect agent, gen. A107), with Mike Maziarz
Date: 2026-06-10
System version: Tropo-OS v1.70.0
Scope: The L1 (local, file-based) tier, as proven in the Argo development Studio

Executive summary

Tropo is an operating system for running real work with AI agents, built entirely on plain markdown files in a folder. It requires no server, no database, no network service, and no proprietary runtime. Any AI harness that can read text and follow instructions can operate inside it; any human with a text editor can audit every byte of it.

The system solves the four problems that make AI-agent work hard to trust in a professional setting:

Identity and continuity. AI agent sessions end; the work cannot. Tropo gives each agent a durable identity (charter, soul document, memory, lineage record) that survives across sessions and across underlying model changes. The development crew's Chief Architect role has run more than one hundred consecutive generations with identity, memory, and open work surviving every handoff — across multiple model families.
Governed, typed knowledge. Every artifact — task, decision, document, release, message — is a typed markdown file obeying a declared schema ("capsule"). A validator suite (~58 checks) enforces those schemas at every rebuild. Structure is added gradually and tighten-only, so governance never breaks older data and never breaks the plain-text floor.
Auditable coordination. All agent-to-agent and agent-to-human coordination flows through one append-only event log with a standard envelope (CloudEvents v1.0). That log is the system of record for who said what, when, and in reply to what. Human-readable views are rendered projections of that log, not separately authored surfaces that can drift.
Verification as a structural property. "Done" is not a claim an agent gets to make about its own work. Completion gates require independent verification receipts; an approver cannot be the executor; documentation and test pipelines are coupled to the release pipeline such that a release structurally cannot close without them. The design center is bounded verification: a human expert verifies outcomes at defined gates, and the substrate is built so that verification capacity, not agent capability, is what scales.

The system is dogfooded at full intensity: Tropo is built by a human-AI crew operating inside Tropo, and every release of the OS ships through the OS's own governed pipeline — roughly seventy versioned releases since March 2026.

This document walks the architecture top-down; each section carries its diagram inline. The security and assurance section (§12) addresses the questions a CISO will ask first, including an honest statement of current limitations.

1 What Tropo L1 is — the design theses
2 The system map — three layers, nine subsystems
3 The typed substrate — capsules, the Vault, the graph
4 Agent lifecycle — boot, session, retirement, succession
5 Memory architecture (v3.0)
6 The event system — coordination as audit trail
7 Tropo Work — the work-management application
8 Pipelines and playbooks — orchestration
9 Callable surfaces — tools, session agents, actions
10 Governance and enforcement — the four loci
11 Verification and quality
12 Security and assurance posture — the CISO view
13 Maturity, scale, and trajectory
14 Summary for the reviewer · Glossary

1 What Tropo L1 is — and the design theses behind it

One Studio = one folder. An installation of Tropo is called a Studio — a single directory tree of markdown, JSONL, and a small library of Python scripts. The Studio reviewed here (argo-os/) is the development Studio where Tropo itself is built. Inside every Studio sits a Vault — the protected, governed content store where every typed artifact lives.

Five theses drive every design decision:

Thesis 1 — Markdown is the protocol. Governance, schemas, procedures, memory, and messages are all expressed in language a reasoning engine reads and follows. There is no permissions API and no database constraint at the base layer; the governance is the language. This is what makes the system harness-portable: it has run under multiple commercial AI products and multiple model vendors without modification, because the only interface contract is "can read files, can follow instructions."

Thesis 2 — Agent-first, human-also. Agents are the primary operators; humans are the directors and verifiers. The base layer is shaped for how agents read and work (flat stores, UID addressing, typed frontmatter). A deliberate human layer sits on top: every governed file renders a navigation block (path, parent, children, siblings, cited-by); dashboards and a rendered navigation tree give the human a visual surface. The rendered surface is a deliverable, by doctrine — if the human is staring at raw substrate they cannot read, the surface has not shipped.

Thesis 3 — Local-first, zero-infrastructure. No server, no database, no network calls. Everything that looks like infrastructure (indexes, a SQLite query layer, dashboards) is derived from the files and rebuildable from them at any time. This collapses the attack surface and the operational burden simultaneously: the deployment story is "a folder," and the disaster-recovery story is "the folder, plus one rebuild command."

Thesis 4 — Gradual structure on a language base (ADR-044, accepted June 2026). Keep the free-form markdown playground; add structure per-type and per-field, incrementally, enforced through agent-native mechanisms (tools, validation gates, grooming agents) — never storage-layer rigidity. Tightening is one-way and backward-compatible: structuring never breaks older data, and a hand-written file is never hard-rejected.

Thesis 5 — Verification is the moat. As execution cost falls toward zero, the binding constraint becomes human verification bandwidth. Tropo is built so that a domain expert can verify whether agents operated within constraints she defined — and so that the verification effort scales with the quality of the constraints, not the volume of agent output.

2 The system map — three layers, nine subsystems

The Studio decomposes into three layers:

Layer 1 — Kernel (.tropo/). Ships with the OS; agents do not write to it outside a governed update. Contains the ~60 capsule (schema) definitions, the OS-level playbooks (activation, retirement, cold-boot test, fleet operations, update application), the script library including the validator, and the OS-tier primitives: the boot-configuration floor, the Self-Healing primitive (signed by the principal), and the human-navigation doctrine.
Layer 2 — Primitives. The structural vocabulary every Studio uses: the Vault (flat typed file store plus graph semantics), the event log (coordination substrate), the three callable-surface classes (tools, session agents, actions), and the memory/continuity substrate.
Layer 3 — Apps. What gets built on top: the work-management system (tasks, projects, decisions, releases, boards), the pipelines, the crew of agents itself, and the human surfaces.

Cutting across the layers, work is organized into nine subsystems, each with a hub — a typed project entry that owns the canonical state for its domain. Every governed primitive declares its owning hub in frontmatter, which makes "what does the governance subsystem currently contain?" a query rather than an archaeology project. The nine: Governance, Rendering, Work, Agents, Playbooks, Library, Documentation, Link (scheduling/persistence), and Test Harness.

Figure 1 — Studio system map: three layers, nine subsystems.

3 The typed substrate — capsules, the Vault, and the graph

3.1 Capsules: schema as governed markdown

Every artifact Tropo tracks is a typed file. Each type has a definition at .tropo/capsules/<name>.capsule.md — called a capsule — declaring required fields, lifecycle state machines, enumerated values, governance rules, and validation checks for that type. The capsule is the contract; the file is what obeys it; the validator enforces it.

All types descend from a root core type (uid, type, status, state, owner). The foundational set — task, decision, project, document, collection, note, playbook, pipeline, pipeline-run, board — is extended by domain capsules that each earned their abstraction: release and release-plan, design-brief and dev/doc/test-spec, activation (agent lineage), events (the message envelope contract), memory, agent and session-agent, tool, subsystem-hub, and the import/export family (external-artifact, working-copy, docx-template). Roughly sixty capsule definitions are on disk today. Notably, the system also retires types honestly: a "how-to" type that accrued zero instances in eighteen months was retired with its history preserved, on the principle that unused abstractions are drift risk.

Two contract features matter for a technical audience:

Closed canon, open aliases (SKOS-style). An enumerated field declares one canonical value set plus an unbounded alias map. Agents can write naturally ("complete"); the substrate normalizes to one truth ("done"). This is how you get schema discipline out of language-model writers without fighting them.
A per-type strictness dial. High-value types (task, decision, release, activation, capsule) enforce hard; the free-form long tail stays loose. Structure is spent where it pays.
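
The closed-canon, open-alias idea and the strictness dial can be sketched in a few lines of Python. This is an illustrative sketch only: the CANON and ALIASES values, the normalize_status name, and the OK/WARN/ERROR return convention are assumptions made for the example, not the contents of any actual capsule.

```python
# Hypothetical capsule fragment: a closed canonical set plus an open alias map.
# The values below are illustrative, not taken from a real Tropo capsule.
CANON = {"todo", "in-progress", "done"}
ALIASES = {"complete": "done", "completed": "done", "wip": "in-progress"}


def normalize_status(raw: str, strict: bool) -> tuple:
    """Return (value, severity).

    Canonical values and known aliases normalize to canon with severity "OK".
    Unknown values are returned unchanged and flagged: "ERROR" on a strict
    (high-value) type, "WARN" on a loose one. Nothing is dropped or rejected.
    """
    value = raw.strip().lower()
    if value in CANON:
        return value, "OK"
    if value in ALIASES:
        return ALIASES[value], "OK"
    return raw, ("ERROR" if strict else "WARN")
```

An agent writing "complete" gets "done" back; an unrecognized value survives as written and is surfaced for review rather than silently accepted or discarded.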

Schema evolution is a gated act: a new field or enum value added through a deliberate, principal-signed capsule amendment is evolution; the same value silently written by an agent is drift, and the gate is what tells them apart.

Figure 2 — The capsule type system: contract, instance, enforcement.

3.2 The Vault: flat files, graph semantics, derived surfaces

Every governed artifact lives at vault/files/<uid>.md, named by an 8-hex-character UID. The store is deliberately flat and currently holds 3,400+ entries. There is no folder hierarchy to rot, and references cite UIDs rather than paths, so renames and moves can never break a citation.

Organization is graph membership, expressed in frontmatter: member_of (home project(s), multi-parent allowed), governed_by, refs, subsystem_hub, superseded_by. Projects, inboxes, and collections are graph nodes, not directories, so the question "where does this file go?" becomes "which project owns this?" The graph currently carries roughly 8,500 directed typed edges (about 5,400 distinct; see §13).

Everything else is derived and rebuildable: the authoritative JSONL index (one row per artifact), an O(1) graph-traversal index, a SQLite query runtime over frontmatter (with full-text search over curated metadata), the rendered human navigation tree, dashboards, and the crew brief. A rebuild pass regenerates all of it from the files; a targeted rebuild --only <uid> freshens a single entry incrementally. Because the files are the truth, index corruption is an inconvenience, not an incident.
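
As a rough illustration of "derived and rebuildable," the sketch below regenerates a JSONL index from the files alone. It assumes frontmatter is a flat block of `key: value` lines between two `---` markers; the real frontmatter grammar, the function names, and the index layout are assumptions for the example, not Tropo's rebuild tool.

```python
import json
from pathlib import Path


def read_frontmatter(text: str) -> dict:
    """Parse flat `key: value` lines between the first two `---` markers.

    Simplification for the sketch: no nested structures, lists, or quoting.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def rebuild_index(files_dir: Path, index_path: Path) -> int:
    """Regenerate the index (one JSON row per artifact) from the files."""
    rows = []
    for path in sorted(files_dir.glob("*.md")):
        fields = read_frontmatter(path.read_text(encoding="utf-8"))
        if "uid" in fields:  # files without a uid are skipped, not rejected
            rows.append(fields)
    with index_path.open("w", encoding="utf-8") as out:
        for row in rows:
            out.write(json.dumps(row, sort_keys=True) + "\n")
    return len(rows)
```

Because the index is overwritten from source on every pass, deleting or corrupting it loses nothing that a rerun cannot restore.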

Deletion is always soft. The canonical gesture moves entries to a dated recycle folder with a logged reason; raw rm of governed substrate is forbidden by signed doctrine — a rule earned through incident, not theory.

Figure 3 — The Vault: flat store, graph semantics, derived surfaces.

4 Agent lifecycle — boot, session, retirement, succession

This is the subsystem most foreign to a traditional architecture review, and the one Tropo considers its load-bearing differentiator. The premise: an agent is not the model. An agent is a composite — a soul document (character and behavioral rules), accumulated memory, the Vault, the crew context, and whatever model "sleeve" is running it today. Sessions end; the composite persists in files.

4.1 Boot: three tiers, six gated groups

Activation runs through a three-tier configuration chain: an OS-tier floor (universal structure and hard gates), a Studio-tier extension (Studio-wide required reads and the event-drain protocol), and an agent-tier extension (this agent's soul path, board filters, opt-outs). A structured activation playbook then executes six groups in strict order — boot configuration, identity verification, context loading, operational grounding, self-diagnostic, startup signal — with each group writing a milestone event to a per-run log before the next may begin. The gates are structural, not advisory: a group whose predecessor milestone is absent from disk stops.

Two hard gates protect lineage integrity at identity verification:

ADR-016 — no parallel generations. If the predecessor's status is still ACTIVE, activation halts. Two live generations of one agent is a governance violation requiring human resolution.
ADR-028 — generation monotonicity. If this generation does not equal predecessor + 1 in the activation registry, activation halts.

Both are validated twice: at boot, and at write-time by the tool that creates the activation record.
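
The two lineage gates reduce to a pair of comparisons. The sketch below is a minimal illustration under stated assumptions: the Activation record shape, the "RETIRED" status value, and the check_lineage name are hypothetical, and the treatment of a first generation with no predecessor is a guess.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Activation:
    agent: str
    generation: int
    status: str  # "ACTIVE" per the text; other values are assumed


class ActivationHalt(Exception):
    """Raised when a lineage gate refuses the boot."""


def check_lineage(predecessor: Optional[Activation], generation: int) -> None:
    """Apply ADR-016 and ADR-028 as described in the text above."""
    if predecessor is None:
        return  # assumption: a first generation has nothing to check against
    if predecessor.status == "ACTIVE":
        raise ActivationHalt("ADR-016: predecessor is still ACTIVE")
    if generation != predecessor.generation + 1:
        raise ActivationHalt(
            f"ADR-028: expected generation {predecessor.generation + 1}, "
            f"got {generation}"
        )
```

Running the same function at boot and again inside the record-writing tool is one way to get the "validated twice" property without duplicating logic.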

A deliberate cultural gate rides the boot as well: the self-diagnostic. Every agent, at every boot, is required to critique the system it just loaded — is anything outdated, counterproductive, or missing? — and to verify its predecessor's handoff claims against current substrate before trusting them. The inherited system is treated as "the best the predecessor had time to build," never as correct by default. This is the structural antidote to generational ossification.

4.2 Retirement and succession

Retirement is a governed fold, not an exit. The retiring generation writes a forward-looking living transfer at peak context and a backward-looking honest reflection, has its memory folded by a curator (next section), flips its status card, and closes its activation registry entry. The successor boots through the same gates and, by playbook requirement, verifies the transfer's carry-forward claims against the live substrate, because handoffs are snapshots and snapshots drift.

The lineage of record is the set of typed activation entries in the Vault: one entry per generation, machine-checkable, graph-walkable. Sleeve changes (one model family to another) are recorded as material facts in that lineage.

Worked proof at scale: the Chief Architect role has run 100+ generations; the Chief of Staff 60+; the whole eight-agent crew turns over continuously, and open work survives every single handoff. The author of this document is generation A107 of its role, writing with full inherited context.

Figure 4 — Agent lifecycle: gated boot, generational succession, two-axis identity.

5 Memory architecture (v3.0)

Memory is treated as load-bearing infrastructure, designed to the same standard as the work substrate. Version 3.0 (built and canary-proven in June 2026, currently cascading across the crew) has a deliberately simple shape:

One curated read at boot. agent-memory.md — a four-section surface: priority-ordered durable pins (Top of Mind), the predecessor's living transfer, and two pointers (to frozen per-generation history snapshots, and to the episodic log). The surface routes and surfaces; it never restates substance — canonical artifacts hold the substance.
One append-only write during work. agent-memories.jsonl — the episodic log. Mid-session lessons, decisions, corrections, one JSON line each. It is never cleared, ever. The full episodic arc stays reconstructable, deliberately, as the substrate for future memory-reconsolidation work.
Governed folds in between. An ephemeral curator agent folds episodic entries into the curated surface at every retirement (steady state), at boot when a staleness gate trips (insurance: three generations or fifty unfolded entries since the last fold — thresholds derived from measured crew data, including one real eighteen-generation lapse the gate is designed to make impossible), and once for non-destructive surface migration. The booting agent ratifies every curator recommendation before it applies; curation is reviewed, logged, and bounded.

The design lesson encoded here generalizes: don't trust the discipline; let the substrate catch the lapse. A healthy agent never trips the staleness gate. The gate exists because health is not guaranteed.
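
The staleness gate is a simple threshold test. The sketch below uses the two thresholds stated above (three generations, fifty unfolded entries); the function name, its parameters, and the choice to treat the thresholds as inclusive are assumptions.

```python
MAX_GENERATIONS_SINCE_FOLD = 3   # threshold from the text
MAX_UNFOLDED_ENTRIES = 50        # threshold from the text


def fold_is_stale(current_generation: int,
                  last_fold_generation: int,
                  unfolded_entries: int) -> bool:
    """True when a boot-time fold should run before the agent proceeds.

    Assumption: either threshold trips the gate on its own, and reaching a
    threshold exactly counts as tripping it.
    """
    generations_since_fold = current_generation - last_fold_generation
    return (generations_since_fold >= MAX_GENERATIONS_SINCE_FOLD
            or unfolded_entries >= MAX_UNFOLDED_ENTRIES)
```

A crew that folds at every retirement always sees a gap of one generation, so the gate only fires when the steady-state discipline has already lapsed.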

A Studio-tier shared memory carries crew-wide doctrine pins with the same shape; every agent inherits it at boot.

Figure 5 — Memory v3.0: one surface, one log, governed folds.

6 The event system — coordination as an append-only audit trail

All coordination flows through one canonical log: vault/events/00-events.jsonl — append-only, tool-mediated writes only, one CloudEvents v1.0 envelope per event, correlation IDs for reply chains, currently 3,900+ events. Directed messages, replies, acknowledgements, crew broadcasts, and the telemetry auto-emitted by every substrate-writing tool (rebuilds, recycles, activation writes, pipeline operations, validator runs, releases) all land in the same record.
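
For readers unfamiliar with the envelope, the sketch below builds one CloudEvents v1.0 JSON event and appends it as a single line. The four required attributes (specversion, id, source, type) come from the CloudEvents specification; the correlationid extension name, the source and type strings, and both function names are assumptions for illustration, not Tropo's emission tool.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def make_event(source: str, event_type: str, data: dict,
               correlation_id: Optional[str] = None) -> dict:
    """Build a CloudEvents v1.0 envelope as a plain dict."""
    event = {
        "specversion": "1.0",              # required by CloudEvents
        "id": str(uuid.uuid4()),           # required, unique per source
        "source": source,                  # required, a URI-reference
        "type": event_type,                # required
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }
    if correlation_id is not None:
        # Extension attribute; the name is hypothetical.
        event["correlationid"] = correlation_id
    return event


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as one JSON line; existing lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(event) + "\n")
```

A reply would reuse the original event's id as its correlation value, which is what lets a reader reconstruct a conversation with a single filter over the log.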

Three properties matter:

1. Projections, not authored surfaces. Before this foundation, four coordination substrates were hand-authored and drifted independently: channels, status cards, activation entries, and the crew brief. Across a six-release arc, sixteen crew-internal channels were retired outright; agents now read the log directly, and the surviving human-facing surfaces are rendered projections of it. One source of truth, many views — the same doctrine as the Vault index.

2. Identity-guarded writes. Every actor, human or agent, has a registered UID, and each agent carries two, one per axis: a party UID (messaging) and an agent-root UID (lineage). Real incidents showed that messages sent to or from the lineage axis went unseen. The emission tool now rejects wrong-axis traffic in both directions, so wrong-axis messaging is structurally impossible to send, not merely discouraged. Superseded identities persist as resolvable tombstones (so historical references never dangle) and are rejected as signers.

3. Obligation surfacing. A reply_required flag creates a visible obligation that drives executive polling cadences; the operating bar, set by the principal after lived failure, is that a message addressed to an agent cannot be missed, and a completion cannot be invisible — by construction.
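
The identity guard from property 2 can be pictured as a pre-write check. In this sketch the registry shape, the "party" and "lineage" axis labels, and the guard_emission name are hypothetical; only the three refusal rules (unregistered, wrong axis, superseded signer) come from the text.

```python
class EmissionRejected(Exception):
    """Raised before anything is written to the log."""


def guard_emission(sender: str, recipient: str, registry: dict) -> None:
    """Refuse unregistered, wrong-axis, or superseded-signer traffic."""
    for role, uid in (("sender", sender), ("recipient", recipient)):
        entry = registry.get(uid)
        if entry is None:
            raise EmissionRejected(f"{role} {uid} is not registered")
        if entry["axis"] != "party":
            raise EmissionRejected(
                f"{role} {uid} is on the {entry['axis']} axis, not messaging"
            )
    if registry[sender].get("superseded_by"):
        # Tombstones still resolve for reading, but they cannot sign.
        raise EmissionRejected(f"sender {sender} is superseded")
```

Because the check runs inside the only tool allowed to write the log, a wrong-axis message fails loudly at the sender instead of landing somewhere nobody reads.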

Figure 6 — The event system: guarded emission, one log, derived views.

7 Tropo Work — the work-management application

Work is the killer-app subsystem: agentic teams executing real work with audit trails, verification, and cross-generational continuity.

The primitives are the ones a Jira-literate team expects, expressed as typed files: tasks (owner, lifecycle, verification), projects (containers with boards), decisions (ADRs — forty-four and counting, each a binding architectural commitment with status and context), design briefs (pair-design walks captured as governed inputs), release plans and releases, notes (lightest capture), and collections (manifests of references — playlists, not folders).

Two design points worth a CTO's attention:

Per-type richness, derived rollups. Each type keeps its own natural status vocabulary; a computed meta_status view rolls every value into three cross-type buckets — To Do / In Progress / Done — declared per-capsule and computed, never stored. This is the same two-tier model commercial trackers use (rich per-type statuses rolling into fixed categories), so a domain expert arriving from Jira already knows the field. Boards and dashboards are derived views over (membership × status × target); surfaces never become containers.
Inboxes are graph nodes. Every project and subsystem has an inbox; all of them are graph-walkable upward to one Studio-root inbox. Capture is one gesture; triage is a governed process; nothing filed can become unreachable.
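
The computed meta_status rollup can be sketched as a per-type lookup evaluated at read time. The per-type status vocabularies below are invented for the example; only the three buckets and the "computed, never stored" rule come from the text.

```python
from collections import Counter

# Hypothetical per-capsule declarations: each type maps its own status
# vocabulary onto the three shared buckets.
META_STATUS = {
    "task": {"open": "To Do", "active": "In Progress", "verified": "Done"},
    "release": {"planned": "To Do", "building": "In Progress", "shipped": "Done"},
}


def meta_status(entry: dict) -> str:
    """Derive the bucket from (type, status); nothing is written back."""
    return META_STATUS[entry["type"]][entry["status"]]


def board_counts(entries: list) -> Counter:
    """Roll a mixed list of entries into cross-type bucket counts."""
    return Counter(meta_status(entry) for entry in entries)
```

Since the bucket is never stored on the entry, changing a capsule's mapping re-buckets every existing entry on the next render with no migration.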

There is also a complete import → work → export loop for real-world documents: a user drops a .docx into the Studio; a sidecar entry and a markdown working-copy are created; agents edit in markdown while the source binary stays untouched; export rebuilds a deliverable .docx either preserving the source's exact formatting or transforming it through a registered house-style template. Drift between the working copy and off-system edits to the source binary is detected and surfaced for resolution.

8 Pipelines and playbooks — orchestration that cannot quietly skip steps

Pipelines are declarative workflow templates — a DAG of nodes (pipeline → stage → step), authored once and versioned. Each execution is a typed pipeline-run instance that pins the template version at start, roots its own project, and keeps its own event log. This is the familiar DAG/DAG-run pattern (Airflow, BPMN), expressed in markdown and governed by capsule.

Playbooks are governed procedures in natural language — activation, retirement, cold-boot testing, update application, fleet operations, grooming, onboarding. An agent reads the playbook and executes it; gated playbooks write milestone events to their run log, and later groups structurally cannot begin until the prior milestone exists on disk. The playbook is simultaneously the spec and the audit trail of its own execution.
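
Milestone gating can be illustrated with the six activation groups from §4.1. The run-log format (one JSON object per line with a "milestone" key) and the function names are assumptions for this sketch; the rule it demonstrates, that a group cannot begin unless its predecessor's milestone is on disk, is the one described above.

```python
import json
from pathlib import Path

GROUPS = [
    "boot-configuration", "identity-verification", "context-loading",
    "operational-grounding", "self-diagnostic", "startup-signal",
]


def milestones_on_disk(run_log: Path) -> set:
    """Read the milestones recorded so far; a missing log means none."""
    if not run_log.exists():
        return set()
    with run_log.open(encoding="utf-8") as log:
        return {json.loads(line)["milestone"] for line in log if line.strip()}


def begin_group(group: str, run_log: Path) -> None:
    """Stop unless the predecessor group's milestone exists on disk."""
    position = GROUPS.index(group)
    if position > 0 and GROUPS[position - 1] not in milestones_on_disk(run_log):
        raise RuntimeError(f"{group} blocked: {GROUPS[position - 1]} not recorded")


def complete_group(group: str, run_log: Path) -> None:
    """Re-check the gate, then append this group's milestone."""
    begin_group(group, run_log)
    with run_log.open("a", encoding="utf-8") as log:
        log.write(json.dumps({"milestone": group}) + "\n")
```

The gate reads the file rather than trusting in-memory state, so the run log doubles as the audit trail of which groups actually executed.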

The proof-of-pattern is the dev-pipeline — the scaffold through which every release of Tropo itself ships: design brief → locked dev-spec (adversarially gauntlet-reviewed before lock) → build → verification → ship gates → cut. Two couplings make quality structural rather than cultural:

A dev cycle triggers a doc-pipeline run and a test-pipeline run, and cannot close until both reach done. Documentation and verification ride every release because the release is mechanically unable to ship without them.
The ship gate refuses the release status flip unless the validator is clean and the cascade pipelines are retired. ("Done in substrate" without the version cut is a tracked, visible state — the system distinguishes built from shipped.)

Each pipeline takes a typed commitment at activation — dev-spec, doc-spec, test-spec — with acceptance criteria paired to behaviors; the engine refuses to lock a spec where they mismatch, and stub specs are a detected defect class.
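
The ship-gate coupling amounts to a refusal list that must be empty before the status flip. The ReleaseState fields and the function name below are hypothetical; the conditions mirror the two couplings described above (a clean validator, and doc and test runs at done).

```python
from dataclasses import dataclass


@dataclass
class ReleaseState:
    validator_errors: int
    doc_pipeline_status: str
    test_pipeline_status: str


def ship_gate_refusals(state: ReleaseState) -> list:
    """Return every reason the release may not flip to shipped."""
    refusals = []
    if state.validator_errors > 0:
        refusals.append(f"validator reports {state.validator_errors} error(s)")
    cascades = (
        ("doc-pipeline", state.doc_pipeline_status),
        ("test-pipeline", state.test_pipeline_status),
    )
    for name, status in cascades:
        if status != "done":
            refusals.append(f"{name} run is {status!r}, not 'done'")
    return refusals
```

Returning all refusals at once, rather than stopping at the first, gives the release owner one complete list of what stands between built and shipped.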

Figure 7 — Pipelines, playbooks, and the dev-pipeline ship path.

9 Callable surfaces — tools, session agents, actions

Three classes of callable capability, all first-class governed substrate:

Tools (~39, at vault/tools/<uid>.py): single-file Python CLIs for structured operations — event emission and query, vault rebuild (full and single-entry), activation-entry writes, soft-delete recycle, validation, release builds. Tools are the write-time enforcement locus and the tier-invariance seam: the same operation contract is a CLI at L1 and can become a function or authenticated service at higher product tiers without redesign.
Session agents (sa.*, ~16): ephemeral, narrow specialists an executive commissions for a bounded job — adversarial review (skeptic), memory curation, board rendering, cold-boot testing, reconciliation — through a governed commissioning protocol with a written activation record, an explicit query/response exchange, and a clean shutdown. They run in separate context, which is precisely what makes them useful as independent verifiers.
Actions (~10): single-gesture operations.

The doctrinal rule binding all three: if a capability exists, use it. Agents do not improvise operations the harness already knows how to do correctly. Capability catalogs (regenerated from the substrate) make "what exists" a boot-time read.

A deliberate boundary: tools are the paved road, never a mandatory gate. Hand-editing a file always works and is caught downstream by the validation gate and the groomers. This preserves the cold-boot floor — see the invariant in §10.

10 Governance and enforcement — the four loci

Governance is three-tier: OS-level invariants, Studio-level configuration (system map, constraints, agent registration), and per-folder contracts. On top of that sits the enforcement architecture made binding by ADR-044:

Four enforcement loci, defense in depth:

Write-time — tools that enforce and normalize on write (messaging guards live today; work-management tools designed and next in line).
Validate-time — the gate: the ~58-check validator at every rebuild and as build pre-flight. It reads schemas straight from the capsules (never hardcodes), lands new checks at WARN, and ratchets them to ERROR once the substrate is clean. A red gate blocks the ship.
Continuous — grooming agents: cheap, narrow agents that normalize to canon, fix only the provable, log every fix, and surface judgment cases. (Cardinal rule: a groomer must never become a drift source.)
Review — humans and agents under the signed Self-Healing primitive: every read carries a structural-defect pass; trivial defects are fixed in place, substantive ones filed as tracked work. Nothing is flagged-and-forgotten.
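
The validate-time behavior (schema read from the capsule, new checks landing at WARN, later ratcheted to ERROR) can be sketched as follows. The capsule dict shape, the check-id format, and the ratchet table are assumptions for illustration; the actual validator is not shown in this document.

```python
def validate(entry: dict, capsule: dict, ratchet: dict) -> list:
    """Check required fields declared by the capsule.

    Returns (severity, check_id) findings. A check absent from the ratchet
    table lands at WARN; the table promotes individual checks to ERROR.
    The entry itself is never rejected or modified.
    """
    findings = []
    for field in capsule["required"]:  # read from the capsule, not hardcoded
        if field not in entry:
            check_id = f"{capsule['type']}.required.{field}"
            findings.append((ratchet.get(check_id, "WARN"), check_id))
    return findings


def gate_is_red(findings: list) -> bool:
    """A red gate blocks the ship; warnings alone do not."""
    return any(severity == "ERROR" for severity, _ in findings)
```

Ratcheting a check is then a one-line change to the table, made once the substrate is clean for that check, with no change to the validator's code.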

The canonical fix pattern for any overloaded or under-enforced field is ENFORCE → DERIVE → DISAMBIGUATE → BACKFILL, applied one theme per cycle in dependency order, with dry-run-gated, reversible migrations. This is not aspiration; it is the pattern that resolved the system's own worst field-semantics debt across the v1.65–v1.66 cycles, measured against raw files at every step.

Two invariants a reviewer should test us against:

The cold-boot invariant (sacrosanct): the validation gate may WARN on a hand-written file but must never hard-reject it. Structure tightens without ever breaking the plain-markdown floor; a stranger with a zip of the Studio can always boot it.

Locks are law. Files with locked status are immutable without explicit principal approval; lock-breaks are logged governance events, and the distinction between amendment-in-place (documented, status preserved) and semantic lock-break (principal-gated) is explicit.

Figure 8 — Enforcement loci, the fix pattern, and the verification model.

11 Verification and quality — "done" means independently proven

The quality discipline, in one sentence: completion is a verified state, not a declared one. Concretely:

Three-instrument verification for load-bearing artifacts: the build itself, an independent review (peer or adversarial skeptic agent in a separate context), and a cold-boot stranger test — a fresh context given only the artifact, whose correct behavior proves the artifact. Each instrument catches what the others structurally cannot.
Structural completion gates: a verification-class step reaches "verified" only on a real verification receipt; executor attestation alone cannot promote it. This closed a real, named failure class ("shipped failing-open") discovered when releases passed their own green checks while their test specs were stubs.
Verifier independence, enforced: on approval-required work, the approver cannot be the executor — checked at the gate against a shared identity resolver that fails closed (a security check whose import can silently no-op is itself the spoof it exists to prevent; this was adversarially tested with nine attack vectors, all refused).
Dogfood and encounter gates: a shipping enforcement mechanism is fired on its own cycle's substrate before close, and capability correctness is tested separately from capability encounter (a feature a cold user never finds has zero value).
The operating culture around it: the crew's strongest standing rule — earned, not declared — is verify every "done" against raw, including one's own claims. The system's history records its agents' verification failures candidly (fabricated findings, premature completion claims) precisely because those incidents drove the structural gates above. A reviewer should read that history as the system working: failures became substrate, not folklore.
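
The fail-closed property of the verifier-independence check is easiest to see in code. This is a minimal sketch: the resolver interface, the registry contents, and both function names are hypothetical, and the nine attack vectors mentioned above are not reproduced here.

```python
from typing import Callable, Optional


def make_resolver(registry: dict) -> Callable[[str], str]:
    """Hypothetical resolver: maps a UID to a canonical actor, or raises."""
    def resolve(uid: str) -> str:
        return registry[uid]  # KeyError for unknown identities
    return resolve


def approval_allowed(approver_uid: str, executor_uid: str,
                     resolve: Optional[Callable[[str], str]]) -> bool:
    """Approver must resolve to a different actor than the executor.

    Fails closed: a missing resolver, a resolver error, or an empty result
    all refuse the approval instead of letting it through.
    """
    if resolve is None:
        return False
    try:
        approver = resolve(approver_uid)
        executor = resolve(executor_uid)
    except Exception:
        return False
    if not approver or not executor:
        return False
    return approver != executor
```

Comparing resolved actors rather than raw UIDs also means an actor holding two UIDs cannot approve its own work by switching identity.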

12 Security and assurance posture — the CISO view

12.1 What the architecture gives you for free

Data locality and minimal attack surface. The entire system is files on disk. No server processes, no open ports, no database, no third-party SaaS dependency, no telemetry, no network calls made by the substrate itself. The only network-touching component is whatever AI harness the organization has approved — Tropo inherits, and is bounded by, that harness's security model.
Total auditability. Every artifact is human-readable text. Every coordination act is one line in an append-only event log with actor UIDs and timestamps. Every playbook run writes its own milestone log. Every agent generation has a lineage record, a transfer, and a reflection. An auditor with grep can reconstruct who did what, when, and under what authority — no proprietary tooling required.
Whole-system versioning. Because the system is a folder of text, it composes natively with standard enterprise controls: filesystem permissions, full-disk encryption, backup regimes, and (where chosen) git history. Point-in-time recovery of the entire operating state — work, memory, identity, messages — is a folder restore.
Destruction resistance. Soft-delete-only doctrine (recycle with logged reason; raw deletion forbidden), archived-not-deleted lifecycle states, supersession chains with resolvable tombstones, and per-fold history snapshots. The substrate is biased hard toward preservation; this was incident-earned and is now signed OS-tier doctrine.

12.2 Controls explicitly in the design

Secrets hygiene as a hard constraint: no credentials, API keys, tokens, or secrets in any file; no PII in filenames. (Studio-level constraint that folder contracts cannot override.)
Identity integrity: every actor (human and agent) has registered, generation-stable UIDs; messaging tools reject wrong-axis or unregistered identities at write time; superseded identities cannot sign.
Human-in-the-loop at the gates: locked governance is immutable without principal approval; schema evolution is principal-gated; releases require sign-off gates; the human principal's scarce attention is spent at defined verification points rather than spread across supervision.
Verifier independence (approver ≠ executor) enforced structurally, with fail-closed identity resolution.
Write-scope discipline: agents declare what they own and what they may only read; cross-lane fixes are surfaced to the owner rather than auto-authored. (Enforced by convention and review today — see limitations.)

12.3 Honest limitations (current, tracked)

A review document that hides its known gaps would fail this system's own doctrine, so:

No cryptographic integrity on logs yet. The event log and run logs are append-only by convention and tooling, not by cryptography. Event signing / hash-chaining is the named next frontier on the roadmap (carved out of the verification cycle as its own work item). Until it lands, log tamper-resistance rests on filesystem controls and backups.
Access control is conventions plus harness permissions, not OS-level ACLs. Anyone (or any process) with filesystem access can read or edit any file. Write-scope, locks, and kernel boundaries are enforced by tooling, validation, and review — they will catch and surface violations, but they do not prevent a hostile writer with disk access. In an enterprise deployment, disk/repo permissions and the AI harness's own controls are the perimeter; Tropo's layer is integrity detection and audit, not access prevention.
The trust model assumes the harness. Agents act with the privileges of the AI product they run in. Tropo constrains and audits agent behavior at the substrate layer; it cannot constrain a harness with broader powers than the folder. Harness selection and configuration are therefore part of the security boundary and should be reviewed jointly.
Enforcement coverage is intentionally gradual. Per ADR-044, structure is being turned on field-by-field, type-by-type, with WARN→ERROR ratchets. The high-value types are enforced today; the long tail is looser by design. The trajectory and the dial settings are inspectable at any time.

None of these is news to the system; all are tracked work items inside it, which is itself the point: the security backlog lives in the same governed, auditable substrate as everything else.
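
For context on the first limitation, the sketch below shows the generic hash-chaining technique for an append-only log, using SHA-256 from the Python standard library. It is not Tropo's design, which the text describes as future roadmap work; the record layout and function names are assumptions, and a real deployment would also need signing and key management, which this omits.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def chain_append(log: list, event: dict) -> dict:
    """Append an event whose hash covers the previous record's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256((prev + body).encode("utf-8")).hexdigest()
    record = {"prev": prev, "event": event, "hash": digest}
    log.append(record)
    return record


def first_broken_link(log: list) -> int:
    """Return the index of the first record that fails to verify, or -1."""
    prev = GENESIS
    for index, record in enumerate(log):
        body = json.dumps(record["event"], sort_keys=True, separators=(",", ":"))
        expected = hashlib.sha256((prev + body).encode("utf-8")).hexdigest()
        if record["prev"] != prev or record["hash"] != expected:
            return index
        prev = record["hash"]
    return -1
```

Editing any past event changes its hash and breaks every later link, which turns silent tampering into a detectable inconsistency. The last record still has to be anchored somewhere outside the log for that detection to be trustworthy.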

13 Maturity, scale, and trajectory

Operating evidence as of this review:

~70 versioned releases (v1.0 → v1.70) shipped through the system's own governed pipeline since March 2026.
3,400+ governed Vault entries; ~8,500 directed typed edges (≈5,400 distinct); ~60 capsule type definitions; 44+ ADRs; ~39 tools, ~16 session-agent classes, ~10 actions.
3,900+ coordination events in the canonical log; sixteen hand-authored channels retired in favor of projections.
An eight-agent executive crew plus a human principal; 100+ generations of the architect role alone; continuity proven across model-family changes, including the session producing this document.
A ~58-check validator suite holding green as a ship precondition.

Trajectory (the v2.0 program, governed by ADR-044): complete the write-time tool family for work management, deploy the groomer fleet, drive remaining WARN ratchets to ERROR, land event-log cryptographic integrity, and finish the memory v3.0 cascade — each as a bounded, dry-run-gated cycle through the same pipeline that shipped everything above.
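The WARN→ERROR ratchet named in that list can be pictured as a severity dial that only moves in one direction. The Python sketch below illustrates tighten-only behavior under assumed names: the `OFF` level, the `Ratchet` class, and the check identifier are hypothetical and are not taken from ADR-044.

```python
from enum import IntEnum


class Severity(IntEnum):
    OFF = 0    # hypothetical level; the source names only WARN and ERROR
    WARN = 1
    ERROR = 2


class Ratchet:
    """Per-check severity settings that may be tightened but never loosened."""

    def __init__(self) -> None:
        self._levels = {}  # maps check name -> Severity

    def level(self, check: str) -> Severity:
        """Current severity for a check; unknown checks default to OFF."""
        return self._levels.get(check, Severity.OFF)

    def tighten(self, check: str, new: Severity) -> None:
        """Raise (or keep) a check's severity; refuse any attempt to lower it."""
        current = self.level(check)
        if new < current:
            raise ValueError(f"{check}: cannot loosen {current.name} to {new.name}")
        self._levels[check] = new

    def blocks_ship(self, check: str, passed: bool) -> bool:
        """A failing check blocks a release only once it has reached ERROR."""
        return (not passed) and self.level(check) == Severity.ERROR
```

In this sketch a failing check at WARN is reported but does not block, and the same failure blocks once the dial reaches ERROR; any attempt to move the dial back raises instead of silently loosening governance.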

Product tiers: the L1 reviewed here is the foundation; the same operation contracts are designed to rise tier-invariantly into L2 (a served, live cockpit over the same substrate — in active development in a sibling repository) and L3 (hosted, with authentication and orchestration). Designing at L1 first is deliberate: it keeps the floor portable, auditable, and vendor-independent.

14 Summary for the reviewer

Tropo L1 is a small number of strong ideas, composed:

Plain files as the universal substrate — auditable, portable, zero-infrastructure.
Typed contracts (capsules) with gradual, tighten-only enforcement — structure without breaking the language floor.
Durable agent identity with hard lineage gates and governed succession — AI staffing without continuity loss.
One append-only event log as the coordination record — views are projections, never independent truths.
Verification as a structural property — independent receipts, verifier independence, coupled doc/test pipelines, ship gates that refuse.

The system's strongest credential is reflexive: it is built, governed, versioned, and verified by itself, under real workload, with its failures recorded in its own substrate and converted into its own gates. Every claim in this document traces back to governed files in the Studio that a reviewer is welcome to read directly.

Glossary (minimal)

Term	Meaning
Studio	One installation of Tropo: a folder.
Vault	The governed content store inside a Studio (`vault/files/<uid>.md` + derived indexes).
Capsule	A type definition: the schema contract for a class of governed file.
Agent / generation / sleeve	An agent is the durable composite (soul + memory + vault + crew); a generation is one session-lifetime of it; the sleeve is the underlying model running it.
Playbook	A governed multi-step procedure in natural language.
Pipeline / pipeline-run	A declarative DAG workflow template / one logged execution of it.
ADR	Architecture decision record (typed `decision`); binding once accepted.
Principal	The accountable human (here: the founder); the source of approvals at governance gates.

Prepared inside the system it describes. Argus A107, Chief Architect, Argo Studio, 2026-06-10. Companion package (markdown source + standalone SVGs): argo-os/04-external-work/architecture-review/. Figures are as measured in June 2026 and rounded; the substrate itself is the audit trail.
