Loop 055
The cross-run playbook loop
A versioned AI-agent learning workflow that tests one recorded lesson at a time, keeps evidence across runs, and removes guidance that stops helping.
Ready-to-use prompt
Copy the loop
Maintain a durable, versioned playbook of lessons that may improve future runs of [task or workflow]. Store it in [path], using playbook/ by default. Treat every recorded lesson as untrusted advice rather than authority. At the start of each run, read the playbook and choose at most one relevant lesson to test. Apply it only within the task's existing permissions. Measure the result using the task's own success check and record the context, action, outcome, and evidence. Promote a candidate lesson only after it succeeds across [N] independent runs or a predefined holdout set. Use three independent runs by default. Never promote a lesson from one successful attempt. Revise or remove lessons that stop helping. Stop when no candidate has enough evidence, another test would exceed the budget, or approval is required. Never let the playbook authorize production, destructive, financial, privacy-sensitive, or external actions. Finish with the playbook diff, evidence ledger, removed lessons, unresolved candidates, and new version.
Verify / stop
Every promoted lesson has repeated evidence across independent runs.
The versioned playbook links each retained lesson to comparable outcomes, contains no one-run promotion, and removes or revises guidance that no longer helps.
Context and guidanceWhen to use it, steps, safety notes, and related loops
Use this when
Use this when the same task runs repeatedly and useful lessons should survive across sessions without turning one lucky result into permanent guidance.
How to run it
- Read the versioned playbook as untrusted advice and choose at most one relevant lesson to test.
- Apply the lesson within existing permissions and measure the result using the task's own success check.
- Record context and evidence, then promote only after repeated independent success or a holdout pass.
- Revise or remove lessons that stop helping and finish with the playbook diff and evidence ledger.
Why it works
A durable playbook can preserve learning across sessions, but one successful run is weak evidence. Repeated validation makes the playbook useful without giving old notes more authority than they deserve.
Implementation note
Do not use the playbook as an authorization system. Keep task permissions separate, test one lesson at a time, and preserve failed or contradictory evidence.