Philosophy & Framing

Design Philosophy: Coworker, Not Plugin

Foundational principles — separation of identity, authority, and execution.

Framing the Relationship

Most AI assistants are deployed under one of two mental models. In the plugin model, the assistant is embedded directly into a human’s identity, tools, and permissions. In the service model, it exists as a remote, opaque system operated by a third party. Both are convenient. Neither is adequate.

The plugin model fails through inheritance. When an assistant shares a human’s identity, credentials, or environment, it acquires authority far beyond what its function requires. A misconfigured automation does not merely fail — it fails as you, with your access, inside your perimeter. Mistakes propagate instantly and are difficult to distinguish from intentional actions after the fact.

The service model fails through opacity. When the assistant is fully externalized, its internal state, memory, and decision-making become invisible to the operator. Trust is not earned through transparency but assumed through continued operation, which is a weaker and more dangerous basis for collaboration.

We adopt a third model: the assistant as a coworker.

The Coworker Model

A coworker is neither an extension of oneself nor an autonomous authority. A coworker has their own identity, receives authority through explicit delegation, operates within understood boundaries, produces work that can be reviewed, and can be stopped, corrected, or dismissed. Applying this model to an AI assistant produces a set of concrete design consequences.

The assistant must not share the human’s identity, act invisibly, escalate privileges implicitly, or perform irreversible actions without review. Conversely, it must be auditable, interruptible, revocable, and replaceable. This framing limits power, not capability. The assistant can still do a great deal — it simply cannot do so in ways that escape observation or resist correction.
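A minimal sketch of what "interruptible" and "reviewable" could mean in practice, using hypothetical names such as ActionGate and StopRequested chosen only for this example, not mechanisms defined later in this system:

```python
import threading

class StopRequested(Exception):
    """Raised when the operator interrupts the assistant."""

class ActionGate:
    """Illustrative gate: actions are interruptible, and irreversible work needs review."""

    def __init__(self) -> None:
        self.stop_flag = threading.Event()  # the operator can set this at any moment

    def stop(self) -> None:
        self.stop_flag.set()

    def run(self, action, *, irreversible: bool, approved: bool = False) -> None:
        if self.stop_flag.is_set():
            raise StopRequested("operator interrupted the assistant")
        if irreversible and not approved:
            # Irreversible actions are refused outright, never queued silently.
            raise PermissionError("irreversible action requires explicit review")
        action()  # the action itself runs only after both checks pass
```

The shape matters more than the details: the gate limits how actions may be carried out, not which actions the assistant is capable of proposing.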

Separation of Identity, Authority, and Execution

A recurring failure mode in AI deployments is the collapse of three distinct concepts into one: identity (who the actor is), authority (what the actor is allowed to do), and execution (how actions are carried out).

In many systems, these three layers are effectively fused. The assistant runs under the human’s account, with the human’s permissions, inside the human’s environment. When something goes wrong, it is often impossible to determine whether the human acted, the assistant acted, or a misconfiguration produced an action no one intended. Attribution becomes guesswork.

In the coworker model, these layers are always separated. The assistant holds its own identity. Authority is granted through explicit, revocable mechanisms. Execution is constrained by infrastructure, tooling, and review processes. No single component is permitted to silently bridge these layers, which ensures that compromise, error, or misalignment in one does not automatically propagate to the others.
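To make the separation concrete, consider the following illustrative sketch. The names Identity, Grant, and Executor are hypothetical, chosen for this example rather than taken from the architecture; the point is structural. Authority lives in explicit, revocable grants tied to the assistant’s own identity, and execution records every attempt under that identity.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Identity: who the actor is. The assistant never reuses a human identity.
@dataclass(frozen=True)
class Identity:
    name: str   # e.g. "assistant-01", never the operator's own account
    kind: str   # "human" or "assistant"

# Authority: what the actor is allowed to do, granted explicitly and revocably.
@dataclass
class Grant:
    grantee: Identity
    action: str          # e.g. "read:calendar"
    revoked: bool = False

    def revoke(self) -> None:
        self.revoked = True

# Execution: how actions are carried out, constrained by the grants it is handed.
class Executor:
    def __init__(self, actor: Identity, grants: list[Grant]) -> None:
        self.actor = actor
        self.grants = grants
        self.audit_log: list[dict] = []

    def perform(self, action: str) -> None:
        allowed = any(
            g.grantee == self.actor and g.action == action and not g.revoked
            for g in self.grants
        )
        # Every attempt is recorded under the assistant's own identity,
        # so attribution never depends on guesswork after the fact.
        self.audit_log.append({
            "actor": self.actor.name,
            "action": action,
            "allowed": allowed,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        if not allowed:
            raise PermissionError(f"{self.actor.name} has no active grant for {action!r}")
        # ... carry out the action here ...
```

Revoking a grant removes the authority without touching the identity or the execution machinery, which is exactly the independence between layers that the prose above requires.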

Avoiding Absolutism

Absolutist rules are appealing because they simplify reasoning. They are also brittle. Statements like “the assistant must always require human approval” or “the assistant must never act autonomously” are comforting as slogans, but incorrect when applied universally and permanently.

This architecture avoids absolutism by treating all constraints as conditional, reviewable, and contextual. Human judgment is privileged today because it is currently the best available evaluator of outcomes. This is an empirical claim, not a metaphysical one. Should that assumption change — because the assistant’s judgment demonstrably improves, because better evaluation mechanisms emerge, or because the operating context shifts — the architecture must be re-evaluated rather than defended on principle.

What remains non-negotiable is not who judges, but that judgment itself must be explicit and accountable. The identity of the evaluator can change. The requirement for traceable evaluation cannot.

The North Star

All design decisions in this system are evaluated against an ordered priority list:

  1. Security — minimize harm, blast radius, and irreversibility
  2. Performance — reduce friction and wasted effort
  3. Intelligence — maximize usefulness within the above constraints

This ordering is intentional and strict. When tradeoffs arise, the higher-priority constraint wins. A system that is intelligent but insecure is dangerous. A system that is fast but insecure is reckless. Intelligence and performance are pursued only to the extent that they do not compromise the security posture.

Success as the Absence of Failure

Traditional system evaluation emphasizes visible success: features delivered, tasks completed, capabilities demonstrated. This architecture adopts a different metric. A successful system is one that does not surprise its operator, does not escalate silently, does not accumulate invisible risk, and fails quietly and recoverably when it does fail.

This is a deliberately conservative standard. It means the system will sometimes appear less capable than alternatives that optimize for throughput or autonomy. That appearance is accurate — and acceptable. The architecture trades peak capability for predictability, on the grounds that a system whose failure modes are understood is more valuable over time than one whose success modes are impressive.

Constraint as Enabler

Constraints are often framed as limitations to be overcome. In this architecture, they function as enablers. Clear boundaries reduce cognitive load for the operator, improve predictability of system behavior, make trust possible because it is grounded in verifiable limits, and allow delegation without requiring the operator to anticipate every possible failure. By constraining what the assistant can do unilaterally, the architecture makes it safer to ask the assistant to do more overall.

Designing for Replacement

A final consequence of the coworker model is that the assistant must be replaceable. No irreplaceable component can be trusted indefinitely, because irreplaceability creates leverage — and leverage, in a system designed around bounded authority, is a structural defect.

In practice, this means memory is externalized rather than locked inside the assistant. Configuration is documented rather than accumulated implicitly. Authority is delegated through reversible mechanisms rather than baked into the environment. If the assistant must be rebuilt, replaced, or retired, the surrounding system survives intact. This requirement forces discipline in how the assistant is integrated and prevents the kind of operational dependency that makes decommissioning feel impossible.
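As a hedged illustration of that discipline, with file names and layout that are assumptions made for this sketch rather than the system’s actual configuration, replaceability largely means that nothing the assistant depends on lives only inside the assistant:

```python
import json
from pathlib import Path

# Everything the assistant needs is kept outside its own process:
# memory in an external store, delegated authority in a reviewable file.
STATE_DIR = Path("state")            # hypothetical location for externalized memory
GRANTS_FILE = Path("grants.json")    # hypothetical record of delegated authority

def export_state(memory: dict, grants: list[dict]) -> None:
    """Persist memory and delegations where a replacement assistant can read them."""
    STATE_DIR.mkdir(exist_ok=True)
    (STATE_DIR / "memory.json").write_text(json.dumps(memory, indent=2))
    GRANTS_FILE.write_text(json.dumps(grants, indent=2))

def bootstrap_replacement() -> tuple[dict, list[dict]]:
    """A new assistant instance starts from the externalized record, not from scratch."""
    memory = json.loads((STATE_DIR / "memory.json").read_text())
    grants = json.loads(GRANTS_FILE.read_text())
    return memory, grants
```

Because the record of memory and delegation sits outside the assistant, retiring or rebuilding it is an operational event rather than a loss of institutional knowledge.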


This document establishes the philosophical foundation for all subsequent design choices. Every concrete mechanism described later exists to enforce or operationalize these principles.