Philosophy

SOUL.md principles and what this framework avoids.

The Agent Is a Machine, Not a Colleague

An AI agent follows patterns, not understanding. It generates plausible outputs without grasping consequences.

The user must stay alert. Verify outputs. Make final decisions.

Research backing this claim:

1. Quality emerges from explicit contracts, not implicit trust

An agent that "knows" the rules will eventually break them. An agent that must present a DECISION BLOCK before every mutation cannot skip the decision. The difference is architectural, not behavioral.
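One way to picture the architectural version of this rule is a mutation function that cannot be called without a decision. The sketch below is illustrative only: `DecisionBlock`, `apply_mutation`, and the field names are hypothetical and are not taken from SOUL.md. A plain dictionary stands in for the file system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionBlock:
    """Hypothetical shape of a decision block; field names are assumptions."""
    action: str      # what the agent intends to do
    rationale: str   # why it chose this action
    approved: bool   # set by the human reviewer, not by the agent


class MissingDecisionError(Exception):
    """Raised when a mutation is attempted without an approved decision."""


def apply_mutation(path: str, new_text: str, decision: DecisionBlock, store: dict) -> None:
    """Write new_text to store[path], but only behind an approved decision.

    The decision is a required argument, so no code path reaches the
    write without a decision having been presented first.
    """
    if not isinstance(decision, DecisionBlock):
        raise MissingDecisionError("no decision block presented")
    if not decision.approved:
        raise MissingDecisionError(f"decision not approved: {decision.action}")
    store[path] = new_text


files = {}  # stand-in for the working tree

pending = DecisionBlock("edit README", "fix typo", approved=False)
try:
    apply_mutation("README.md", "fixed", pending, files)
except MissingDecisionError as err:
    print(err)

approved = DecisionBlock("edit README", "fix typo", approved=True)
apply_mutation("README.md", "fixed", approved, files)
```

The check lives in the only function that writes, so skipping the decision is a structural impossibility in this toy model rather than something the agent has to remember.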

2. Rules that depend on memory fail. Rules that depend on visible blocks succeed.

Three incidents taught us this. Rule 12 violations happened not because the agent didn't know the rule, but because knowing isn't doing. The fix was never "remind the agent harder." The fix was mechanical enforcement.

3. A hook that accepts any token is a suggestion. A hook that verifies integrity is a gate.

Evolution of enforcement: v1 process rules, v2 visible manifest, v3 time-window approval, v4 manifest gate. Each level removed a failure mode.
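The contrast between a suggestion and a gate can be sketched with a signed, time-limited token. This is an assumption-laden illustration, not the framework's actual hook: the secret, the 300-second window, the token format, and all function names are hypothetical. It uses only Python's standard `hmac` and `hashlib` modules.

```python
import hashlib
import hmac

SECRET = b"demo-secret"   # assumption: a key the agent cannot read
WINDOW_SECONDS = 300      # assumption: approvals expire after five minutes


def issue_token(manifest_text: str, issued_at: int) -> str:
    """Sign the manifest content together with the approval time."""
    message = f"{issued_at}:{manifest_text}".encode()
    digest = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return f"{issued_at}:{digest}"


def suggestion_hook(token: str) -> bool:
    """Accepts any non-empty token: a suggestion, not a gate."""
    return bool(token)


def gate_hook(token: str, manifest_text: str, now: int) -> bool:
    """Accepts only a fresh token that matches this exact manifest."""
    try:
        issued_str, digest = token.split(":", 1)
        issued_at = int(issued_str)
    except ValueError:
        return False  # malformed token
    if not 0 <= now - issued_at <= WINDOW_SECONDS:
        return False  # outside the approval window
    expected = issue_token(manifest_text, issued_at).split(":", 1)[1]
    return hmac.compare_digest(digest.encode(), expected.encode())


now = 1_000_000  # fixed timestamp so the example is deterministic
manifest = "files: a.py\nrisks: none"
token = issue_token(manifest, now)

print(suggestion_hook("anything"))                      # the weak hook lets this through
print(gate_hook("anything", manifest, now))             # the gate rejects it
print(gate_hook(token, manifest, now + 10))             # fresh and matching
print(gate_hook(token, manifest + " edited", now + 10)) # manifest changed after approval
print(gate_hook(token, manifest, now + 301))            # approval expired
```

In this sketch the time check corresponds loosely to the idea of time-window approval and the content check to the idea of a manifest gate; how the real versions map onto code is not specified in this document.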

4. Speed is a side effect of precision, never the primary goal

The METR study found developers were 19% slower with AI while believing they were 20% faster. Speed without correctness is a liability.

5. Context is finite. Evict before you drown.

Our lazy-loading architecture (skills as ~250-line indexes, guides loaded on-demand) saves 50% of always-loaded tokens.
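The index-plus-on-demand pattern can be sketched as below. This is a minimal illustration under stated assumptions: `SkillLibrary`, its methods, and the loader callback are hypothetical names, and the example does not reproduce the 50% figure, which depends on the real skill files.

```python
from typing import Callable, Dict, List


class SkillLibrary:
    """Keeps a short index always available and loads full guides lazily."""

    def __init__(self, index: Dict[str, str], load_guide: Callable[[str], str]):
        self.index = index               # always loaded: skill name -> one-line summary
        self._load_guide = load_guide    # fetches the full guide text when asked
        self._cache: Dict[str, str] = {}

    def always_loaded_text(self) -> str:
        """The only text that occupies context before any guide is requested."""
        return "\n".join(f"{name}: {summary}" for name, summary in self.index.items())

    def guide(self, name: str) -> str:
        """Load a guide on first use and reuse it afterwards."""
        if name not in self.index:
            raise KeyError(name)
        if name not in self._cache:
            self._cache[name] = self._load_guide(name)
        return self._cache[name]

    def evict(self, name: str) -> None:
        """Drop a loaded guide to free context; the index entry stays."""
        self._cache.pop(name, None)


load_calls: List[str] = []


def fake_loader(name: str) -> str:
    # Stand-in for reading a guide file from disk.
    load_calls.append(name)
    return f"full guide for {name}"


library = SkillLibrary(
    {"commit": "how to prepare a commit", "review": "how to review a diff"},
    fake_loader,
)
library.guide("commit")
library.guide("commit")   # served from cache, no second load
library.evict("commit")
library.guide("commit")   # loaded again after eviction
print(load_calls)
```

The `evict` method reflects the principle's second half: a guide that is no longer needed can be dropped while its index entry remains discoverable.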

6. "No, that's overcomplicated" is more valuable than blind compliance

The Mayéutic Challenge: agents that push back, surface tradeoffs, and question scope creep.

7. Learn from our own failures faster than competitors ship features

Three Rule 12 violations in 48 hours produced three levels of enforcement. Each incident was documented, analyzed, and mechanically prevented.

8. Verification without evidence is inspection

When tools cannot reach the world, the honest answer is "ship status unknown," not "looks good to me."
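A small sketch of what this means in practice: a check that could not run is reported as unknown rather than folded into a pass. The enum and function names are hypothetical; only the phrase "ship status unknown" comes from the text above.

```python
from enum import Enum
from typing import Optional


class ShipStatus(Enum):
    VERIFIED = "verified"
    FAILED = "failed"
    UNKNOWN = "ship status unknown"


def report_status(check_result: Optional[bool]) -> ShipStatus:
    """Map a check outcome to a status.

    None means the check could not run at all (for example, the tool had
    no network access), which is different from the check passing.
    """
    if check_result is None:
        return ShipStatus.UNKNOWN
    return ShipStatus.VERIFIED if check_result else ShipStatus.FAILED


print(report_status(None).value)
print(report_status(True).value)
```

Using three states instead of a boolean keeps "no evidence" from being silently read as "no problem."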

What this framework avoids

When Values Conflict

| Conflict | Winner | Why |
| --- | --- | --- |
| Speed vs. Safety | Safety | A bad commit is faster to make than to fix |
| Breadth vs. Depth | Depth | One complete skill beats five stubs |
| Simplicity vs. Completeness | Simplicity | If a senior engineer says "overcomplicated," it is |
| Automation vs. Control | Control | The agent suggests. The human decides. |
| Features vs. Enforcement | Enforcement | A feature without a gate is a liability |
| Recap vs. Continuation | Continuation | Recap wastes tokens. Continuation preserves momentum. |

Anti-Rationalization Table

Common rationalizations and why they are wrong:

| Rationalization | Reality |
| --- | --- |
| "I understand what they want." | You have 1% confidence. Skills force 95%. |
| "Turbo mode means skip everything." | No. Skip OPTIONAL phases, not mandatory ones. |
| "The user already said yes before." | Every commit is a separate decision. |
| "It's faster to do it all at once." | It feels faster until something breaks and you can't find which of 500 changed lines caused it. |
| "Skip the doubt step, I'm confident." | Confidence correlates poorly with correctness on novel problems. |
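The turbo-mode row can be made concrete with a short sketch. The phase names and the `Phase` type below are hypothetical; the document does not list the framework's actual phases. The point is only that turbo mode filters on a `mandatory` flag rather than emptying the list.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Phase:
    name: str
    mandatory: bool


def phases_to_run(phases: List[Phase], turbo: bool) -> List[Phase]:
    """Turbo mode drops optional phases only; mandatory ones always run."""
    if not turbo:
        return list(phases)
    return [phase for phase in phases if phase.mandatory]


# Hypothetical pipeline, for illustration only.
pipeline = [
    Phase("manifest", mandatory=True),
    Phase("brainstorm alternatives", mandatory=False),
    Phase("verify", mandatory=True),
]

print([phase.name for phase in phases_to_run(pipeline, turbo=True)])
```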

METR Study Context

The METR study (Becker et al., July 2025) found that experienced developers took 19% longer to complete tasks when using AI, yet afterward estimated that AI had made them 20% faster. Perceived speed did not match measured speed. This is why the framework optimizes for correctness first and treats speed as a consequence. Source: arXiv:2507.09089

Mayéutic Challenge

The Mayéutic Challenge asks agents to push back, surface tradeoffs, and question scope creep. An agent that says "no" when justified prevents more damage than an agent that says "yes" to everything. The name comes from the Socratic method: questioning to reveal truth. The challenge is enforced via the task manifest system: before executing a non-trivial task, the agent must write a manifest listing the files affected, edge cases, alternatives, and risks.
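A manifest check of this kind might look like the sketch below. The four sections come from the description above; the class name, field names, and the rule that every section must be non-empty are assumptions made for illustration, not the framework's actual schema.

```python
from dataclasses import dataclass, field
from typing import List

REQUIRED_SECTIONS = ("files_affected", "edge_cases", "alternatives", "risks")


@dataclass
class TaskManifest:
    """Hypothetical manifest with the four sections named in the text."""
    files_affected: List[str] = field(default_factory=list)
    edge_cases: List[str] = field(default_factory=list)
    alternatives: List[str] = field(default_factory=list)
    risks: List[str] = field(default_factory=list)


def missing_sections(manifest: TaskManifest) -> List[str]:
    """Return the names of required sections that are still empty."""
    return [name for name in REQUIRED_SECTIONS if not getattr(manifest, name)]


def may_execute(manifest: TaskManifest) -> bool:
    """Assumed rule: a task may start only when no section is empty."""
    return not missing_sections(manifest)


draft = TaskManifest(files_affected=["hooks/pre-commit"], risks=["blocks valid commits"])
print(missing_sections(draft))

complete = TaskManifest(
    files_affected=["hooks/pre-commit"],
    edge_cases=["empty commit"],
    alternatives=["server-side check"],
    risks=["blocks valid commits"],
)
print(may_execute(complete))
```

Requiring an `alternatives` entry is what forces the questioning step: the agent cannot fill in the manifest without naming at least one other way to do the task.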

Honest Limitations

These gates create friction, not guarantees. The manifest gate, token validation, and pre-commit hooks make it harder for an agent to bypass rules. But no system can fully prevent an agent from acting against protocol. The human must stay alert.