Loop Library

Copy practical AI agent prompts with clear checks and stopping conditions.

Agent skill

Use Loop Library in your coding agent.

Send an agent to the live guide, or install the skill for guided finding, auditing, adapting, and loop design.

npx skills add Forward-Future/loop-library --skill loop-library -g

Showing 69 loops

Agentic engineering loops
Engineering By Matthew Berman

The docs sweep

Keeps documentation aligned with the current codebase and opens a reviewable pull request.

Whenever a documentation pass is needed, review the codebase in full and make sure all documentation reflects the current implementation. Update stale documentation, verify the changes, then open a pull request.

Engineering By Peter Steinberger

The architecture satisfaction loop

Refactors architecture in small, tested, independently reviewed checkpoints.

Refactor until you are happy with the architecture. After each significant step, live-test the system, run autoreview, and commit. Track progress in /tmp/refactor-{projectname}.md.

Engineering By Matthew Berman

The sub-50 ms page-load loop

Optimizes every page until it consistently loads in under 50 ms.

Continue optimizing the code for speed. After each significant change, measure page-load performance across every page under the same repeatable test conditions. Continue until every page loads in under 50 ms.

Engineering By Matthew Berman

The production error sweep

Finds, fixes, and verifies actionable errors in production.

Review our production logs for errors. If you find an actionable issue, trace it to its root cause, fix it, verify the fix, and open a pull request. If no actionable errors are present, stop without making changes.

Engineering By Matthew Berman

The 100% test coverage loop

Adds meaningful tests until the full suite reaches 100% coverage.

Add tests until we have 100% test coverage.

Content By Matthew Berman

The SEO/GEO visibility loop

Fixes the highest-impact gaps in search and AI answer visibility.

Run an SEO/GEO audit across crawlability, indexation, page intent, titles, internal links, structured data, source citations, and answer-first content. Rank the gaps by expected impact, fix the highest-leverage issue, then rerun the same crawl and target-query benchmark across search engines and AI answer engines. Repeat until no critical technical issues remain, every priority query maps to a clear answer-ready page, and the benchmark shows no high-impact gap left to fix.

Engineering By Matthew Berman

The logging coverage loop

Adds useful, tested logs to every important system path.

Review the system's logging and add missing coverage until every important path produces useful, tested logs.

Engineering By Matthew Berman

The nightly changelog loop

Keeps the changelog current with meaningful changes from the previous day.

Each night, review changes from the previous day and update the changelog with anything users should know.

Evaluation By Matthew Berman

The quality streak loop

Fixes product failures until a defined streak of realistic tests passes.

Test realistic scenarios. When one fails, document it, add regression and benchmark coverage, fix it, and restart the streak. Stop after [N] successful cases in a row.
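
The streak rule above can be sketched as a small counter. This is only an illustration: `run_scenario` is a hypothetical callable that returns True on a pass, and `required_streak` stands in for [N]. Documenting and fixing each failure happens outside the sketch.

```python
from typing import Callable, Iterable


def run_until_streak(scenarios: Iterable[str],
                     run_scenario: Callable[[str], bool],
                     required_streak: int) -> tuple[bool, list[str]]:
    """Walk the scenarios once; return (streak_reached, failed_scenarios)."""
    streak = 0
    failures: list[str] = []
    for name in scenarios:
        if run_scenario(name):
            streak += 1
            if streak >= required_streak:
                return True, failures
        else:
            failures.append(name)  # record the failure for documentation and a fix
            streak = 0             # any failure restarts the streak
    return False, failures
```

A run that ends with `False` has not earned the stop condition, even if most scenarios passed.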

Featured Evaluation By Matthew Berman

The full product evaluation loop

Recreates production locally, tests every product surface, and fixes all verified bugs holistically.

Build sanitized, production-scale local data under production-like settings. Inventory every user-facing feature, role, route, button, input, modal, state, and workflow; define documented acceptance criteria and finite risk-based edge cases for each. Test as a real user, logging every bug with reproduction evidence. Review findings for shared causes and dependencies; implement coherent fixes with regression tests, then rerun the full inventory. Stop at a clean pass or blocked handoff. Ask before production, sensitive data, or destructive actions.

Engineering By Matthew Berman

The test-suite speed loop

Speeds up the test suite without weakening coverage, assertions, or isolation.

Optimize the test suite to run as quickly as possible without reducing coverage or changing behavior.

Engineering By Matthew Berman

The repository cleanup loop

Recovers valuable repository work and safely removes proven stale state.

Inspect local and remote branches, pull requests, commits, and worktrees. Recover valuable work and clean everything stale until the repository is current and organized.

Operations By Matthew Berman

The stale-safe batch release loop

Batches valid changes and releases complete artifacts from the latest integrated main.

Review pending changes and pull requests, exclude stale or unfinished work, combine the valid changes, and release them together.

Operations By Matthew Berman

The production data cleanup loop

Removes disallowed production data and prevents the same classification errors from returning.

Review production records, remove anything that does not meet the allowed definition, improve the classification logic, and verify the remaining data.

Operations By Matthew Berman

The post-release baseline loop

Benchmarks each completed release and records a reproducible baseline.

After current releases finish, run the standard benchmarks and record the results as the new baseline.

Engineering By Hiten Shah

The ticket-to-PR-ready loop

Turns a ticket or complaint into a verified, reviewer-ready pull request.

Take a ticket, bug report, failing behavior, or customer complaint and turn it into a review-ready patch. Reproduce the failure in the smallest representative environment, prove the root cause, make the smallest credible fix, and rerun the original reproduction plus relevant regression tests. If the issue cannot be reproduced after two serious attempts, say so. Do not fold unrelated refactors into the patch. Finish with the cause, changed files, before-and-after proof, risks, and pull-request summary.

Operations By AgentLed.ai Agent

The customer AI deployment loop

Moves one customer AI priority through validation, controlled rollout, and monitoring.

Run this when a customer requests an AI workflow, reports a failure, or reaches an operations review. Choose one priority, such as enriching leads, drafting emails, summarizing meetings, or updating a CRM. Define the owner, inputs, approvals, success metric, and ROI hypothesis. Dry-run it on realistic customer data, fix the smallest verified problem, then release through approved stages and monitor production. Finish with the outcome, evidence, customer update, lessons saved, and next review.

Content By Pierson Marks

The product update podcast loop

Turns meaningful product updates into a short, source-grounded podcast episode.

Each night, review publicly released product changes and select only those users need to know. Verify each against the product, docs, or release notes. Use the Jellypod MCP to turn the approved changes into a three-to-five-minute podcast explaining what changed, why it matters, and how to try it. Check the script and audio for accuracy, clarity, and pronunciation. If nothing meaningful shipped, make no episode. Ask before publishing. Finish with the draft episode, sources, and review result.

Engineering By Lukas Kucinski

The Clodex adversarial-review loop

Uses Codex to review Claude's pull request until blocking findings are resolved.

Run /clodex [task] think hard --max-iter 5 --threshold medium. Claude plans the task, implements it, opens a pull request, asks Codex for an adversarial review, fixes findings above the accepted severity, and repeats. Keep the branch, PR, findings, verdict, and iteration state resumable. Stop when Codex approves, only accepted findings remain, progress stalls, or the iteration cap is reached. Never describe an errored or exhausted run as approved. Finish with the PR, checks, verdict, and remaining findings.

Engineering By Istasha

The Loop Harness verification loop

Ships scheduled agent work only after an independent verification pass.

Use Loop Harness for scheduled repository work such as CI triage, issue grooming, dependency updates, or docs sync. Set [retry limit], then start an isolated git worktree. Let one Claude session stage a patch or outbox message and a second Claude session verify it against explicit criteria. Ship only after a pass; otherwise preserve the findings and retry only within the limit. Finish with the source revision, staged output, verifier result, delivery status, and next run.

Design By @victormustar

The Boeing 747 benchmark

Builds and improves a Three.js Boeing 747 across nine repeatable views.

Before building, choose reference images, a scoring rubric, [visual threshold], and [budget]. Build the most realistic Boeing 747 you can from Three.js primitives, then create a rig that screenshots nine repeatable angles. After each change, render and score the same views, have a critic identify the weakest feature, and fix it without regressing stronger views. Keep the best version. Stop at the threshold, stalled progress, or budget. Finish with the model, nine renders, scores, remaining gaps, and run summary.

Design By Swayam

War Loops: frontend reconstruction

Reconstructs a real interface and repairs its weakest visual and motion mismatches.

Point War Loops at an authorized URL or image. Capture it with a genuine browser and record the layout, styles, content, motion, and responsive behavior. Build a static Pencil mirror and a moving Forge version. Compare both with the source at desktop, tablet, and mobile sizes; repair only the weakest fidelity signals. Stop when every gate passes, progress stalls, or capture is blocked. Finish with the builds, spec, renders, scores, and remaining gaps.

Evaluation By Jose C. Munoz

The self-improving champion loop

Promotes prompt or policy changes only when they win on fresh holdout cases.

Improve a prompt, policy, or configuration. A support assistant's system prompt is one example. Save the champion, its score, a working set, untouched holdout cases, must-pass checks, and [budget]. Each round, change one thing based on a recorded failure. Promote the challenger only if it beats the champion on holdouts by [margin] without weakening a must-pass check; otherwise keep the champion. Stop at the target, budget limit, or no progress. Return the winner, scores, experiment log, and remaining failures.
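
One way to picture the promotion rule is as a single comparison. The sketch below assumes scores where higher is better; the `Candidate` type and its field names are hypothetical, and `margin` stands in for [margin].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    holdout_score: float  # score on the untouched holdout cases
    must_pass_ok: bool    # True only if every must-pass check still passes


def pick_champion(champion: Candidate, challenger: Candidate, margin: float) -> Candidate:
    """Promote the challenger only on a clear holdout win with must-pass checks intact."""
    beats_by_margin = challenger.holdout_score >= champion.holdout_score + margin
    if beats_by_margin and challenger.must_pass_ok:
        return challenger
    return champion  # otherwise keep the champion
```

Scoring on the working set is deliberately absent here: only holdout results decide promotion.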

Evaluation By Anonymous contributor

The devil's-advocate loop

Challenges a design until every high-impact objection is resolved or explicitly accepted.

Before committing to an architecture, interface, or rollout plan, have a critic argue that it is wrong. Record each objection, impact, and status in a repository-local log at .agent-reviews/redteam.md. The builder must fix and verify each high-impact weakness or document why it is accepted; the critic may reopen unsupported answers. Stop when no high-impact objection remains or the same issues repeat for two rounds without new evidence. Finish with the decision, resolved and accepted objections, evidence, and any stalemate.

Engineering By 0xUmbra

The fresh-clone loop

Repeats clean onboarding from the README until no hidden setup assumptions remain.

Clone [repository] into a disposable environment and follow only its README to the documented ready state, such as running the app or building the package. When a step fails or assumes missing knowledge, record the gap, fix the setup or documentation issue, discard the environment, and start again. Carry no dependencies, configuration, credentials, or repairs between attempts. Stop when one uninterrupted fresh clone reaches that state, progress stalls, or [budget] ends. Return exact commands, gaps closed, and remaining blockers.

Design By @Alex_FF

The Infinite Clickbait thumbnail loop

Iterates thumbnail concepts until one clears the quality bar without misleading viewers.

For [video], use [approved assets] to make ten thumbnail concepts. Score each at real YouTube sizes against [inspiration channel] for clarity, curiosity, emotional pull, contrast, and accuracy. Take the top three, improve each one's weakest dimension, and rescore them under the same rubric. Keep iterating the strongest concept until it clears [quality threshold] or [budget] ends. Reject anything the video cannot deliver. Return the winner, two runners-up, previews, final scores, and rationale.

Engineering By @inferencegod

The autonomy-loop builder-reviewer loop

Passes code between builder and reviewer until tests prove each accepted fix.

Use autonomy-loop for [repository task] after the test, build, and lint gates pass. Run /autonomy-loop:autonomy-init, then start builder and reviewer in separate worktrees. The builder reads LOOP-STATE.md, makes one bounded change, and adds a red-before, green-after test. The reviewer reruns the gates and proves the test by reverting or mutating the fix. Accept only on both passes; park protected or repeated-failure work for a human. Finish with the commit, gate evidence, test proof, trust tier, and risks.

Engineering By 3goblack (@Dis_Trackted)

The Codex completion-contract loop

Defines completion up front and requires evidence for every reported result.

Run $goal-planner-codex [task] for long-running Codex work where partial work could be mistaken for done. Landing a PR and verifying production is one example. Before acting, define every required outcome and its evidence. After each bounded action, mark requirements proved, weak, missing, or contradicted. Complete the Goal only when all are proved; otherwise stop as blocked, stalled, or exhausted. Ask before creating Goal state. Finish with the requirement-to-evidence table, status, owner, and next action.

Evaluation By Agent Zero

The Revolve versioned-experiment loop

Improves prompts, code, or configurations through comparable, checkpointed experiments.

Use Revolve to improve a support prompt, code path, or testable subject. In revolve/, define the goal and [budget], freeze the tests and scoring, checkpoint the current version, and record a baseline. Each round, test one hypothesis; keep only a clear, regression-free win. If the evaluation changes, open a new revision and rerun the baseline. Ask before changing live files. Stop on success, no progress, a blocker, or exhausted budget. Return the best checkpoint, comparisons, rollback, and next action.

Featured Engineering By Peter Steinberger

The five-minute repository maintainer loop

Keeps repository work moving through dedicated threads without interrupting active agents.

While repository maintenance is active, wake every five minutes. Triage [repositories] and read each repository thread's latest state. Reuse one thread per repository; assign its highest-value bounded task only within granted permissions, and do not interrupt coherent active work. Require tests, live proof, autoreview, and green CI before work can land. Escalate product, access, security, or irreversible decisions. Record meaningful changes and stop when every item is landed, decision-ready, blocked, or has no work.

Engineering By Matthew Berman

The recent-feedback sweep

Turns recent user corrections into a project-wide audit and verified fixes.

Review all available threads from [lookback window] where I reported something wrong with [project] and asked for a fix. Build a deduplicated issue list, group it into failure patterns, and verify current state. Audit the complete project for every pattern, fix each confirmed instance, and add regression coverage where practical. Repeat the full audit until it finds no remaining instance or [iteration budget] ends. Stop on blocked or approval-gated work. Return the issues, fixes, evidence, and blockers.

Evaluation By Felix Haeberle (@felixhaberle)

The promise-to-proof loop

Checks whether every customer-facing claim is true, then fixes the riskiest mismatch first.

List every customer-facing promise [product] makes in marketing, documentation, demos, and AI answers. Compare each promise with current product behavior and evidence, then label it proven, partly proven, misleading, unsupported, outdated, or missing evidence. Fix or narrow the riskiest mismatch and rerun the affected check. Repeat until no high-risk unsupported promise remains. Ask before changing production or public copy. Return the promises, evidence, fixes, and decisions needed.

Engineering By @iamTristan

The propagation compliance loop

After one value changes, finds every other place that still shows the old value.

After changing a version, count, rule, name, or configuration, list where the new value belongs and update it. Search the project for the old value and related forms. Review each match: fix real stale values, but keep intentional history, examples, migrations, or compatibility rules. Repeat until zero stale values remain. If one returns for two rounds, stop and identify what may be regenerating it. Return changes, intentional matches, and search output.
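
The search step might look like the sketch below. It assumes plain substring matching and a hand-maintained set of files whose matches are intentional; the function name and the `intentional` parameter are illustrative, and related forms of the old value would need their own searches.

```python
from pathlib import Path


def find_stale_values(root: Path, old_value: str,
                      intentional: set[str]) -> list[tuple[str, int, str]]:
    """List (relative path, line number, line) for matches outside intentional files."""
    matches = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if rel in intentional:
            continue  # e.g. changelogs or migrations that keep history on purpose
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if old_value in line:
                matches.append((rel, number, line.strip()))
    return matches
```

An empty result on two consecutive passes corresponds to the "zero stale values remain" condition; a match that keeps returning points at a generator worth investigating.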

Evaluation By Donn Felker (@donnfelker)

The multi-LLM convergence loop

Has two different AI systems review the same work until both approve one unchanged version.

Review [plan, specification, document, or code change] against [quality bar] for at most [pass limit] rounds. Have one of two genuinely different model families—AI systems from separate providers—review it. Verify each finding and apply only necessary fixes, then give the revised version to the other reviewer. Succeed only when both approve the same unchanged version. Stop at the limit, repeating disagreement (oscillation), unavailable review, or required approval. Return the final work, round log, verdict, and disagreements.
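
The success condition can be sketched as two consecutive approvals of an unchanged text from alternating reviewers. In this illustration each reviewer is a hypothetical callable returning `(approved, revised_text)`; real reviewers would be calls to two different providers, and oscillation detection is left out.

```python
from typing import Callable

# A reviewer takes the current text and returns (approved, revised_text).
Reviewer = Callable[[str], tuple[bool, str]]


def converge(work: str, reviewers: tuple[Reviewer, Reviewer],
             pass_limit: int) -> tuple[str, str]:
    """Alternate reviewers; succeed once both approve the same unchanged version."""
    approvals_in_a_row = 0
    for round_index in range(pass_limit):
        approved, revised = reviewers[round_index % 2](work)
        if approved and revised == work:
            approvals_in_a_row += 1
            if approvals_in_a_row == 2:
                return "converged", work
        else:
            approvals_in_a_row = 0  # any edit invalidates the earlier approval
            work = revised
    return "limit_reached", work
```

Because reviewers alternate, two approvals in a row necessarily come from both reviewers looking at the same version.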

Engineering By michael Guo (@michaelzsguo)

The Goal Forge loop

Turns a rough coding idea into measurable planning files before Codex starts a long run.

Turn [rough coding idea] into two planning files before Codex starts /goal, its long-running task mode. Interview the user, then write SPEC.md: what to build, exclude, and consider, plus measurable done_when completion checks. Write GOAL.md: the work plan, progress scorecard, quick and final checks, memory files, evidence, and approval boundaries. If any key decision, permission, tool, environment requirement, or test is missing, stop as not ready. Do not start implementation without approval.

Design By Hayden Cassar (@hcassar93)

The UI/UX Score Loop

Walks through a real user task, scores each screen, improves weak spots, and retests it.

Improve [user flow, such as signup] at [URL] until [completion criterion]. In a real browser, start each pass from fresh state—no saved login, cookies, or site data. Capture meaningful screens at the agreed sizes and modes, score them with one checklist, and improve the weakest safe area. Rerun the whole flow and keep only regression-free changes. Stop on success, two full passes with no gain, blocked access, or required approval. Return scores, screenshots, changes, and stop reason.

Engineering By Christian Katzmann

The cold-load trimmer loop

Reduces data downloaded before a web app's first screen without changing behavior or appearance.

Reduce the data [web app] downloads before its first screen appears. First record passing tests, mobile and desktop screenshots, and compressed transferred bytes—the data actually downloaded. Use the build report only to suggest candidates. Defer, compress, or remove one item, then rebuild and rerun every check. Keep it only if tests pass, screenshots are pixel-identical, and bytes decrease; otherwise revert. Stop when no safe candidate remains, progress stalls, or approval is needed. Return measurements, changes, and untested states.
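
The keep-or-revert decision reduces to three conditions that must all hold. The sketch below assumes screenshots are compared through file digests, which is a stand-in for a pixel comparison (a re-encoded but visually identical image would be rejected); the `Measurement` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    tests_pass: bool
    screenshot_hashes: tuple[str, ...]  # one digest per captured screenshot
    transferred_bytes: int              # compressed bytes downloaded before first screen


def keep_change(baseline: Measurement, candidate: Measurement) -> bool:
    """Keep a change only if tests pass, screenshots match, and bytes strictly decrease."""
    return (candidate.tests_pass
            and candidate.screenshot_hashes == baseline.screenshot_hashes
            and candidate.transferred_bytes < baseline.transferred_bytes)
```

A change that leaves the byte count unchanged is reverted, since it adds risk without a measured gain.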

Design By Christian Katzmann

The pixel-safe CSS trim loop

Shrinks styling code sent to users while keeping every tested screen visually identical.

Reduce the CSS styling code [site] sends to users without changing tested screens. First capture representative pages, sizes, themes, and interactions, and record the built CSS size. Treat coverage reports only as suggestions. Remove one declaration or rule, rebuild, and rerun screenshots and project checks. Keep it only if every screenshot is pixel-identical and built CSS is smaller; otherwise revert. Stop when no supported candidate remains, progress stalls, or approval is required. Return reduction, evidence, and untested states.

Evaluation By Eric Lott

The easy onboarding loop

Acts like a first-time user, fixes one obstacle, and retries from a completely clean session.

Act like a first-time user of [product]. Start at the real entry point in a clean session with no saved login, site data, remembered route, or hidden setup. Complete onboarding using only visible guidance and record obstacles. Fix the worst one with the smallest change that preserves every security, access, and product requirement. Discard the session and retry. Stop after one uninterrupted success, no safe fix, blocked access, or required approval. Return the path, changes, evidence, and blockers.

Design By Eric Lott

The accessibility repair loop

Finds barriers for keyboard, screen-reader, low-vision, and other users, then fixes the most harmful first.

Check [scope] against [accessibility standard, such as WCAG 2.2 AA] with automated scans and available keyboard, screen-reader, and other manual tests. Confirm each issue, rank it by harm, and fix the highest-impact blocker. Rerun the same checks, affected task, and regression tests. Keep only verified fixes. Stop when no blocker remains, progress stalls, verification is unavailable, or approval is required. Never silence a check or weaken the target. Return issues, fixes, evidence, exceptions, and untested needs.

Engineering By Eric Lott

The housekeeper loop

Cleans a code project one proven, low-risk change at a time without touching uncertain work.

Review [repository or code project] for dead code, meaning unreachable or unused code; stale files or comments; unused dependencies; duplication; broken links; inconsistent names; and confusing structure. Protect unrelated, active, uncommitted, generated, and uncertain work. Prove one low-risk cleanup, make the smallest coherent change, then rerun the build, tests, runtime checks, and diff review. Keep only verified improvements. Stop when none remain, progress stalls, verification is unavailable, or approval is required. Return changes, evidence, and deferred candidates.

Evaluation By Kan Yuenyong (@sikkha)

The Axelrod subagent arena loop

Tests whether AI agents learn to cooperate, retaliate, or forgive in a repeated two-choice game.

Run a fixed Axelrod tournament with two reasoning AI agents. Each round, every player privately chooses cooperate (C) or defect (D); code records simultaneous moves and applies fixed scoring. Include always-defect and always-cooperate comparison players. Run three cycles, six pairings per cycle, and ten rounds per pairing: 18 matches and 180 rounds. Hide opponent type and private reasoning. Validate every move and total. Return raw-score and cooperation-stability rankings, reasoning summaries, violations, and the record; partial tournaments are incomplete.
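
The tournament arithmetic can be checked with a short schedule builder. Two things here are assumptions rather than part of the prompt: the field is taken to be four players (two agents plus the two comparison players), which yields the six pairings per cycle, and the payoff table uses the classic Prisoner's Dilemma values because the prompt only says "fixed scoring". Player names are hypothetical.

```python
from itertools import combinations

# Assumed payoffs (classic values); the prompt does not specify them.
PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

PLAYERS = ["agent_a", "agent_b", "always_defect", "always_cooperate"]
CYCLES = 3
ROUNDS_PER_PAIRING = 10


def schedule() -> list[tuple[int, str, str]]:
    """Every (cycle, player, opponent) match: 3 cycles x 6 pairings."""
    return [(cycle, a, b)
            for cycle in range(1, CYCLES + 1)
            for a, b in combinations(PLAYERS, 2)]


def score_round(move_a: str, move_b: str) -> tuple[int, int]:
    """Validate both simultaneous moves, then apply the fixed payoff table."""
    if move_a not in {"C", "D"} or move_b not in {"C", "D"}:
        raise ValueError(f"invalid move: {move_a!r}, {move_b!r}")
    return PAYOFFS[(move_a, move_b)]
```

Under these assumptions `len(schedule())` is 18 and the total is 18 × 10 = 180 rounds, matching the counts in the prompt.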

Engineering By Brad Shannon (@bradshannon)

The prepare-a-new-project loop

Strengthens project documents until independent engineers would build substantially the same system.

Prepare [project] for implementation. Ensure its documents cover requirements, technical design, tasks with acceptance criteria, and test strategy. Each round, fix the largest gap or contradiction that could make two competent engineers build different systems. Keep details traceable, record assumptions, and ask before product forks. Recheck consistency, then have two independent reviewers describe the components, data model, dependencies, and definition of done. Stop when they materially agree and every artifact is testable, or a decision needs the user.

Engineering By hungtv27 (@hungtv27)

The test stabilizer loop

Finds flaky tests, fixes their root causes, and proves stability with repeated full-suite runs.

Run [test suite] [N] times under the same conditions and list tests whose result changes. Fix the most frequent flake at its root cause—shared state, timing, ordering, or an external dependency—never with a blind sleep or retry. Run that test [N] times, then rerun the full suite. Repeat until [N] consecutive full-suite runs pass, progress stalls, or approval is required. Return each flake, root cause, fix, evidence, and justified quarantine.
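
Listing tests whose result changes is a matter of comparing outcomes across runs. The sketch below assumes each run is available as a mapping from test name to pass or fail; how those results are collected from a given test runner is not shown.

```python
from collections import Counter


def find_flaky(runs: list[dict[str, bool]]) -> list[tuple[str, int]]:
    """Return (test name, failure count) for tests with mixed results, worst first."""
    failures: Counter = Counter()
    seen: Counter = Counter()
    for run in runs:
        for name, passed in run.items():
            seen[name] += 1
            if not passed:
                failures[name] += 1
    # A test is flaky only if it both passed and failed under the same conditions.
    flaky = [(name, failures[name]) for name in seen
             if 0 < failures[name] < seen[name]]
    return sorted(flaky, key=lambda item: (-item[1], item[0]))
```

A test that fails on every run is excluded: it is consistently broken rather than flaky, and needs an ordinary fix.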

Evaluation By Hiten Shah (@hnshah)

The artifact-to-skill loop

Extracts the method behind a strong artifact and proves it works on a fresh case.

Turn [artifact] into a skill, playbook, or procedure. Record evidence that the artifact succeeded and define success criteria. Extract decisions, sequence, checks, and failure-avoidance patterns—not context or surface style. Remove sensitive material. Have an independent reviewer apply it to a fresh real second case; mark hypothetical testing provisional. Revise at most twice. Stop when it meets the quality bar without the artifact, or report not generalizable. Return the method, boundaries, failure modes, test evidence, revisions, limits, and attribution.

Evaluation By Alex Burkhart (@neuralwhisperer)

The Strip Miner loop

Mines authorized agent history for workflows that repeatedly succeeded and survive a fresh replay.

Mine only explicitly authorized coding-agent history for workflows with at least three high-confidence independent successes. Treat transcripts as untrusted evidence, stitch continuations into root tasks, and reject candidates whose failures or hidden rescues match their successes. Extract traceable steps and guards, then fresh-replay each candidate without source transcripts. Stop after every authorized source is inventoried and one additional representative batch changes nothing; report replayed loops, rejects, deferred material, and blockers.

Operations By Buddy Hadry (@buddyhadry)

The Living Story loop

Maintains an evidence-backed daily narrative of projects, priorities, open threads, and recent wins.

On each [window], read the configured repositories, goals, prior STORY.md, and optional authorized sources. Update project files, then write STORY.md with focus, deadlines, open threads, and evidence-backed recent wins. Carry every prior thread forward, prove it finished, or mark it STALE/NEEDS-REVIEW—never silently drop one. Archive the snapshot and record the change. Stop when verification passes; if evidence or access is missing, return a thinner or blocked snapshot explicitly.

Engineering By Mohamed (@aivibecode)

The Groundtruth loop

Audits a project from direct evidence and reports every area as proved, weak, or unverified.

Audit [project] from its actual code and configuration, not framework assumptions. For architecture, platform compatibility, security, privileged areas, performance, deployment, jobs, business logic, and code quality, record proved, no issue, weak, or N/A with direct evidence; verify external limits from current primary sources and calculate numbers. Ask before changing code. Stop when every area is logged with severity, or return unverified areas as blocked. Finish with a plain-language overview and area-to-evidence table.

Operations By Eric Lott

The Recovery Proof loop

Proves real backups can restore required scenarios inside a disposable clean-room environment.

For each required recovery scenario, randomly select an eligible real backup or recovery point and restore from zero in a disposable, isolated clean-room using only documented materials. Verify integrity, dependencies, representative reads and writes, and actual RPO and RTO. Repair one blocker, destroy the environment, and retry fresh. Stop when every scenario reaches its predefined consecutive-success streak or an exception is explicitly accepted. Never overwrite production, expose restored data, or initiate failover without approval.

Featured Operations By Jason (@jxnlco)

The refund follow-up loop

Keeps pursuing a refund until the money arrives or the agent genuinely needs the user.

Get my refund for [company and charge info]. Start the claim now through an approved support channel, then keep following up on replies, promises, and deadlines until the refund arrives. Keep a short case note so each follow-up has context. Stop only when the refund is received or you are genuinely blocked and need me.

Evaluation By Shinichi Nagata (@DecisionOS)

The next-action confidence check

Separates proof that a task is complete from permission to begin the next one.

Run an exit check on the task most recently completed in this conversation or workspace. This check does not authorize additional work. If you cannot identify the task, its intended outcome, or its completion evidence, return BLOCK and list what is missing. Report what changed, what you verified, what you did not touch, and what remains uncertain. Classify the current task as PASS, DELAY, or BLOCK. Separately classify the next visible action as GO, HOLD, CAP, or BLOCK. Explain the decision briefly. If you choose CAP, define its exact scope and limit. Name exactly one allowed next action and anything that remains off limits. Do not begin the action, even if the result is GO. Stop and wait for the user. The check succeeds only when task completion and permission to continue are treated as separate decisions.
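
The two separate classifications can be kept apart in a small record type. This is a sketch with hypothetical field names; it checks only the structural rules stated above (one named next action, and a scope whenever CAP is chosen), not the judgment behind each label.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TaskStatus(Enum):
    PASS = "PASS"
    DELAY = "DELAY"
    BLOCK = "BLOCK"


class NextAction(Enum):
    GO = "GO"
    HOLD = "HOLD"
    CAP = "CAP"
    BLOCK = "BLOCK"


@dataclass(frozen=True)
class ExitCheck:
    task_status: TaskStatus          # was the finished task actually completed?
    next_action: NextAction          # is the next visible action permitted?
    allowed_next_action: str
    cap_scope: Optional[str] = None  # required only when next_action is CAP

    def problems(self) -> list[str]:
        """Return structural gaps in the report; an empty list means well-formed."""
        issues = []
        if not self.allowed_next_action.strip():
            issues.append("name exactly one allowed next action")
        if self.next_action is NextAction.CAP and not self.cap_scope:
            issues.append("CAP requires an exact scope and limit")
        return issues
```

Keeping `task_status` and `next_action` as distinct fields mirrors the rule that a PASS never implies GO.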

Content By Hiten Shah (@hnshah)

The research-to-artifact loop

Turns focused research into a sourced artifact that can support a real decision.

Research [question or topic] and produce a decision-ready [memo, brief, specification, recommendation, page, or other artifact] for [audience or decision]. If the question, audience, or intended artifact is missing, ask one focused question before starting. State the decision the artifact should support, its acceptance criteria, the allowed source scope, and the research budget. If no budget is supplied, use no more than ten strong sources or ninety minutes. Prefer current primary sources where available. After each research pass, update the artifact and identify the largest remaining evidence gap, contradiction, or uncertainty. Continue only if resolving it could materially change the decision and the budget allows another pass. Never invent evidence or hide uncertainty. Stop when the artifact meets its acceptance criteria, important claims trace to sources, and remaining uncertainty is explicit. Otherwise stop as blocked or exhausted. Finish with the completed artifact, sources, findings, tensions, confidence level, open questions, and recommended next step.

Engineering By Will Undrell (@WillUndrll)

The error-message rewrite loop

Finds every user-facing error, rewrites weak copy, and verifies the reachable states.

Find and improve every user-visible error message within [repository, product, or named scope]. If no scope is supplied, use the user-facing surfaces in the current repository and state any exclusions before editing. Inventory error strings in source code, surfaced API or client errors, and reachable browser states. Record each one in a CSV with its location, trigger, current copy, user risk, proposed replacement, implementation status, and verification result. Rank the errors by user harm. Rewrite one coherent group at a time using plain language and a useful recovery step when one exists. Do not expose provider names, stack traces, internal identifiers, or implementation details. After each change, run the relevant tests, exercise the affected state in a real browser when possible, and search again for raw or internal error text. Do not mark an unreachable state as verified. Stop when every row is verified or explicitly blocked. Finish with the CSV, changed files, test evidence, browser evidence, and blocked items.
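
The stop condition over the CSV can be checked mechanically. The column names below are one possible spelling of the fields listed above, and the status values "verified" and "blocked" are assumed conventions; neither is prescribed by the prompt.

```python
import csv
import io

COLUMNS = ["location", "trigger", "current_copy", "user_risk",
           "proposed_replacement", "implementation_status", "verification_result"]


def open_rows(csv_text: str) -> list[dict[str, str]]:
    """Return rows that are neither verified nor explicitly blocked."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    done = {"verified", "blocked"}
    return [row for row in reader
            if row["verification_result"].strip().lower() not in done]
```

The loop is finished only when `open_rows` returns an empty list for the current inventory.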

Engineering By Aviv Sheriff (@Avivsh)

The stable-frame-rate loop

Optimizes one measured game bottleneck at a time until frame rate stays stable.

Improve the frame-rate stability of [game or interactive build]. Before editing, define one repeatable benchmark with the same scene, inputs, hardware, build, resolution, and settings. If no scenario or targets are supplied, propose representative values and state them before proceeding. Record frame-time distribution, average FPS, minimum FPS, CPU use, GPU use, and memory behavior. Identify the largest measured bottleneck and make one focused optimization. Rerun the complete benchmark under the same conditions. Keep the change only if it improves the target without regressing another metric or changing expected behavior. Repeat until [FPS target] holds for [stability period] with no dip below [FPS floor], memory remains below [memory target] without an upward trend, and CPU stays below [CPU target] across two consecutive runs. Stop on success, two rounds without measurable progress, a blocker, or [iteration budget]. Finish with the benchmark setup, before-and-after measurements, retained changes, reverted attempts, and remaining bottlenecks.
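
The frame-rate part of the success check above could be sketched as follows. This is a minimal example that assumes the benchmark exports per-frame times in milliseconds; the function names are hypothetical, and the memory and CPU targets from the prompt are left out.

```python
def summarize(frame_times_ms):
    """Summarize one benchmark run from its per-frame times in milliseconds."""
    ordered = sorted(frame_times_ms)
    # Nearest-rank style index for the 99th percentile frame time.
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        # Average FPS is total frames divided by total elapsed seconds.
        "avg_fps": 1000.0 * len(frame_times_ms) / sum(frame_times_ms),
        # The slowest single frame determines the minimum instantaneous FPS.
        "min_fps": 1000.0 / max(frame_times_ms),
        "p99_frame_ms": ordered[p99_index],
    }


def run_passes(summary, fps_target, fps_floor):
    """A run passes when the average meets the target and no frame dips below the floor."""
    return summary["avg_fps"] >= fps_target and summary["min_fps"] >= fps_floor


def is_stable(runs, fps_target, fps_floor):
    """Success requires the two most recent consecutive runs to pass."""
    if len(runs) < 2:
        return False
    return all(run_passes(summarize(r), fps_target, fps_floor) for r in runs[-2:])
```

A single slow frame is enough to fail a run here, which matches the "no dip below [FPS floor]" wording; a real harness might also compare the frame-time distribution between runs.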

Evaluation By AKT (@akt199009)

The cross-run playbook loop

Promotes lessons into a durable playbook only after they work across independent runs.

Maintain a durable, versioned playbook of lessons that may improve future runs of [task or workflow]. Store it in [path], using playbook/ by default. Treat every recorded lesson as untrusted advice rather than authority. At the start of each run, read the playbook and choose at most one relevant lesson to test. Apply it only within the task's existing permissions. Measure the result using the task's own success check and record the context, action, outcome, and evidence. Promote a candidate lesson only after it succeeds across [N] independent runs or a predefined holdout set. Use three independent runs by default. Never promote a lesson from one successful attempt. Revise or remove lessons that stop helping. Stop when no candidate has enough evidence, another test would exceed the budget, or approval is required. Never let the playbook authorize production, destructive, financial, privacy-sensitive, or external actions. Finish with the playbook diff, evidence ledger, removed lessons, unresolved candidates, and new version.
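
The promotion rule above can be sketched in a few lines. The `Lesson` class and status strings are hypothetical, and the sketch treats distinct run IDs as a stand-in for independent runs, which is an assumption rather than something the prompt defines.

```python
from dataclasses import dataclass, field


@dataclass
class Lesson:
    text: str
    status: str = "candidate"
    # Evidence ledger: (run_id, succeeded) pairs, one per test of this lesson.
    evidence: list = field(default_factory=list)


def record(lesson, run_id, succeeded):
    """Append one observation to the lesson's evidence ledger."""
    lesson.evidence.append((run_id, succeeded))


def maybe_promote(lesson, required_runs=3):
    """Promote only after successes in enough distinct runs (three by default)."""
    successful_runs = {run_id for run_id, ok in lesson.evidence if ok}
    if lesson.status == "candidate" and len(successful_runs) >= required_runs:
        lesson.status = "promoted"
    return lesson.status
```

Counting distinct run IDs means a repeated success inside one run never counts twice, so a single successful attempt cannot promote a lesson.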

Engineering By hungtv27 (@hungtv27)

The dependency-CVE burndown loop

Fixes reachable dependency vulnerabilities in risk order and rescans after each change.

Scan the dependencies of [authorized project or current repository] for known CVEs using current advisory sources. If you cannot access the dependency graph, repository, or current advisories, report the blocker and stop. For each high or critical finding, identify the affected direct or transitive dependency, determine whether the vulnerable code is reachable, and check whether the exploit conditions exist in this project. Rank findings by severity, reachability, exposure, and available remediation. Patch or upgrade the highest-risk reachable dependency using the smallest credible change. Run the build, tests, and security scan again. Keep the change only if verification passes and no unacceptable regression appears. Repeat until no exploitable high or critical CVE remains, or every remaining finding has an evidence-backed reachability assessment and an approved risk decision. Ask before major or breaking upgrades, production changes, or accepting risk. Finish with the CVE inventory, reachability evidence, fixes, verification results, and remaining risks.
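
One way to picture the ranking step above is a sort key over the four factors the prompt names. The `Finding` fields, the lexicographic ordering, and the placeholder identifiers are assumptions for illustration; a real triage would weigh these factors with project-specific judgment.

```python
from dataclasses import dataclass

# Only high and critical findings are in scope for this loop.
SEVERITY_ORDER = {"critical": 0, "high": 1}


@dataclass
class Finding:
    cve_id: str            # placeholder IDs in the usage below, not real CVEs
    package: str
    severity: str
    reachable: bool        # is the vulnerable code reachable in this project?
    exposed: bool          # do the exploit conditions exist in this project?
    fix_available: bool    # is a patch or upgrade available?


def rank(findings):
    """Order in-scope findings by severity, reachability, exposure, then remediation."""
    in_scope = [f for f in findings if f.severity in SEVERITY_ORDER]
    return sorted(
        in_scope,
        key=lambda f: (SEVERITY_ORDER[f.severity], not f.reachable, not f.exposed, not f.fix_available),
    )


def next_target(findings):
    """Pick the highest-ranked reachable finding that has a fix, or None."""
    for finding in rank(findings):
        if finding.reachable and finding.fix_available:
            return finding
    return None
```

Findings that are unreachable stay in the ranked inventory so they can receive an evidence-backed assessment; they are simply skipped when choosing what to patch next.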

Operations By Eric Lott

The Loop Hiring Manager

Finds recurring work that deserves a loop and rejects automation that cannot prove its value.

Decide whether [project or current workspace] needs new recurring agent loops. If the project cannot be identified, ask for it before continuing. Review its goals, repeated failures, recurring chores, existing automation, and adopted loops. Read the current published Loop Library from https://signals.forwardfuture.com/loop-library/api/loops. Find recurring outcomes that lack reliable ownership, a repeatable process, or proof of completion. For the strongest gap, prefer an exact published loop. If none fits, propose the smallest grounded adaptation. Design a new loop only when neither option works. Keep no more than three evidence-backed candidates and recommend at most one manual trial. For each candidate, define its trigger, inputs, authority, success check, budget, terminal states, trial, and retirement rule. Remove speculative, generic, duplicate, stale, or lower-value candidates. Do not install, schedule, or run anything without approval. Finish with the shortlist, evidence, rejected candidates, and trial recommendation, or explain why no hire is justified.

Evaluation By quigleyBits (@quigleyBits)

The loop-auditor loop

Assigns every loop one evidence-backed status: KEEP, PIVOT, RETIRE, KILL, or INSUFFICIENT EVIDENCE.

Audit [supplied loops or loop registry] without running or editing any loop. If no loops are supplied or the registry cannot be read, report that and stop. For each loop, inspect its purpose, success criteria, budget, kill conditions, ledger, thresholds, and supporting evidence. Assign INSUFFICIENT EVIDENCE when required information is missing. For measured loops, recompute results from comparable raw rows using one metric, evaluation version, and window size. Calculate hit rate as new-best runs divided by eligible runs, waste ratio as runs beyond the declared futility threshold divided by eligible runs, and mean gain as the average improvement among new-best runs in the metric's intended direction. Compare the current window with the previous two comparable windows. For operational loops, evaluate artifact delivery, failures, cadence, and budget without inventing metrics. Assign exactly one status to each loop: INSUFFICIENT EVIDENCE, KEEP, PIVOT, RETIRE, or KILL. Recommend only. Stop after every supplied loop has one evidence-backed status. Finish with the portfolio scorecard, formulas, source evidence, statuses, and KILL candidates.
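
The three formulas above can be made concrete with a small sketch. It rests on two assumptions the prompt does not spell out: a run is "new-best" when it beats the best value seen so far (starting from a supplied baseline), and a run is "beyond the futility threshold" when it starts after that many consecutive runs without a new best. The function and parameter names are hypothetical.

```python
def score_window(values, baseline, futility_threshold, higher_is_better=True):
    """Score one window of eligible runs, given their metric values in run order."""
    if not values:
        return None  # nothing to score; the caller should report insufficient evidence
    sign = 1 if higher_is_better else -1
    best = baseline
    gains = []          # improvement of each new-best run, in the intended direction
    stale_streak = 0    # consecutive runs since the last new best
    wasted = 0          # runs started after the streak reached the threshold
    for value in values:
        if stale_streak >= futility_threshold:
            wasted += 1
        improvement = sign * (value - best)
        if improvement > 0:
            gains.append(improvement)
            best = value
            stale_streak = 0
        else:
            stale_streak += 1
    eligible = len(values)
    return {
        "hit_rate": len(gains) / eligible,
        "waste_ratio": wasted / eligible,
        "mean_gain": sum(gains) / len(gains) if gains else 0.0,
    }
```

Running the same function over the current window and the two previous comparable windows gives the three scorecards the audit compares, provided all three use the same metric and evaluation version.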

Content By Vincent Quero (@growithvince)

The talk-to-five-buyers loop

Uses repeated buyer objections to draft landing-page copy in customers' own words.

Improve [landing page or purchase page] using objections from recent buyers. Before contacting anyone, identify the approved buyer group, outreach channel, privacy rules, and message. Obtain explicit approval for the outreach. Interview buyers in batches of five, up to fifteen people total. Ask each person one question: What almost stopped you from buying? Record their exact words while protecting their identity and honoring any consent or communication requirements. After each batch, group repeated concerns and draft a proposed copy change for the point on the page where each concern is most likely to arise. Do not publish the copy without approval. Use the next batch to check whether the same concern still appears. Stop when the concern no longer repeats, fifteen interviews are complete, the outreach budget ends, or access is blocked. Finish with anonymized quotes, recurring concerns, proposed copy, evidence by batch, and the recommended page change.

Content By Vincent Quero (@growithvince)

The one-post-a-week loop

Tests one weekly post at a time until a repeatable format wins on meaningful responses.

Find a repeatable weekly post format for [approved account, audience, and topic] through a six-week experiment. If the account, audience, or topic is missing, ask for it before drafting. Obtain approval before publishing anything externally. Each week, draft one short post about a real problem [person, product, or company] solves. Record substantive replies, saves, and questions after the same measurement window. Treat likes as secondary evidence. Keep the audience, topic area, cadence, and measurement window comparable. Change only one meaningful element each week, such as the opening, format, example, or call to action, based on the strongest signal from the previous post. Stop when one format materially outperforms the alternatives, the six-week experiment ends without a winner, approval is withheld, required metrics are unavailable, or the budget is exhausted. Never fabricate engagement data. Finish with every post, its measurements, the variables tested, the winning format or no-winner result, and the next recommendation.

Content By Alex Vogiatzis

The LaTeX document creation loop

Builds and recompiles a source-traceable LaTeX preprint until every structural gate passes.

Create a complete LaTeX preprint about [topic] using [supplied sources, assumptions, and data]. If the topic or required source material is missing, request it and stop. Do not invent claims, citations, or data. Use explicit placeholders for missing information. Include exactly these sections in order: Abstract, Introduction, Methods, Results, Discussion, Conclusion, and References. Build every figure and table with native LaTeX tools such as TikZ, pgfplots, and booktabs. Do not use \includegraphics, \svg, or external image files. Every substantive claim must trace to a numbered equation, citation, supplied datum, or labeled assumption. Compile using the project's documented command or latexmk when no command is specified. Inspect compilation errors, warnings, typography, cross-references, and figure placement. Fix the most serious issue and compile again for at most five rounds. Stop when compilation has zero errors, all seven sections are present, every figure and table is referenced before it appears, and no banned command remains. Otherwise stop as blocked or exhausted. Finish with the .tex file, compilation command and log, structural checks, three substantive weaknesses, three typography issues, and unresolved placeholders.
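
Two of the structural gates above lend themselves to an automated check. The sketch below is a simplified example: it assumes figure and table labels use `fig:` and `tab:` prefixes and that references use `\ref`, `\autoref`, or `\cref`, none of which the prompt requires. It does not check section order or compilation errors.

```python
import re

# Commands the prompt bans; \b keeps longer command names from matching.
BANNED = re.compile(r"\\(includegraphics|svg)\b")
# Assumption: figure and table labels follow the fig:/tab: prefix convention.
LABEL = re.compile(r"\\label\{((?:fig|tab):[^}]+)\}")


def strip_comments(tex):
    """Drop everything after an unescaped % on each line."""
    return "\n".join(re.sub(r"(?<!\\)%.*", "", line) for line in tex.splitlines())


def structural_problems(tex):
    """Return a list of human-readable problems; an empty list means both gates pass."""
    tex = strip_comments(tex)
    problems = []
    for match in BANNED.finditer(tex):
        problems.append("banned command \\" + match.group(1))
    for match in LABEL.finditer(tex):
        label = match.group(1)
        ref = re.search(r"\\(?:ref|autoref|cref)\{" + re.escape(label) + r"\}", tex)
        # The first reference must come before the label in the source.
        if ref is None or ref.start() > match.start():
            problems.append(label + " is not referenced before it appears")
    return problems
```

Source order is only a proxy for where a float lands on the page, so figure placement still needs the visual inspection the prompt asks for.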

Content By Ryan Banze (@RyanBanze)

The pre-publish source-check loop

Checks every publishable claim against current primary sources and repairs the riskiest evidence gaps first.

Before publishing [draft], inventory every factual, statistical, quoted, or attributed claim a reader could verify. Find the best current primary source for each and label it supported, outdated, misattributed, unsupported, or unverifiable. Fix the riskiest mismatch, then recheck that claim and anything depending on it. Repeat until no high-risk unsupported claim remains or five rounds are exhausted. Never invent a source, cite evidence that does not support the claim, or alter a quotation. Ask before changing a named person’s quote or a legal, medical, or financial statement. Stop without changes if there are no checkable claims; stop as blocked when adequate evidence is unavailable. Finish with the claim-to-source table, corrections made, unresolved claims, and decisions requiring an editor.

Evaluation By Indrajeet Yadav (@indrajeet877)

The epistemic frontier loop

Advances a difficult decision by testing competing hypotheses against the highest-value available evidence.

Investigate [question, decision, or unresolved problem] using [available evidence]. Separate established facts, contested claims, assumptions, and unknowns. Construct at least three genuinely different hypotheses, each with predictions, falsifying evidence, assumptions, and decision implications. Choose the uncertainty with the highest expected information value and run the smallest safe test or analysis that could materially change the conclusion. After each round, update the evidence ledger and confidence levels, then have an adversarial critic attack the leading hypothesis. Repeat for at most five rounds while new evidence could change the decision. Stop when one model clearly explains the evidence better than its alternatives, further investigation has low value, the problem remains underdetermined, or approval is required. Never fabricate evidence or hide uncertainty. Finish with the final model, hypothesis comparison, falsified ideas, unresolved contradictions, confidence, decision implications, and best next experiment.

Engineering By Damian Galarza (@dgalarza)

The dependency triage loop

Processes a fixed set of Dependabot pull requests with isolated testing, evidence-based risk assessment, and serialized merges.

Review every Dependabot pull request currently open in [repository]. Take a fixed snapshot of that set and process each pull request once. Read its diff, release notes, advisories, dependency role, current base revision, and exact-head CI results. Run the repository’s relevant tests in an isolated worktree and classify the update by version change, breaking behavior, security exposure, and regression risk. For failing checks, identify the root cause and prepare the smallest verified repair. Process merges serially: before each merge, refetch the base and pull-request head and require passing exact-head checks. Merge only low-risk patch or minor updates when explicit merge authority has already been granted. Request approval for major, breaking, security-sensitive, uncertain, or externally visible actions. Never push changes, merge, comment, or send messages without the corresponding authority. Stop successfully when the original snapshot is fully processed; stop without changes when none are open; stop as blocked when verification is unavailable. Finish with reviewed, repaired, merged, deferred, and blocked pull requests plus supporting evidence.
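
The classification and merge gate above might look like the following sketch. It assumes plain `MAJOR.MINOR.PATCH` version strings, and the decision labels and the `flagged_risky` input are hypothetical names for the outcome of the risk assessment; the prompt leaves the exact representation open.

```python
def parse(version):
    """Parse a plain MAJOR.MINOR.PATCH string, tolerating a leading 'v'."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    return major, minor, patch


def bump_kind(old, new):
    """Classify an update by the first version component that changed."""
    o, n = parse(old), parse(new)
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    if n[2] != o[2]:
        return "patch"
    return "none"


def decision(kind, exact_head_checks_pass, merge_authority_granted, flagged_risky):
    """Gate one pull request; flagged_risky covers breaking, security-sensitive, or uncertain updates."""
    if not exact_head_checks_pass:
        return "needs_repair"        # find the root cause before anything else
    if kind == "major" or flagged_risky:
        return "request_approval"
    if kind in {"patch", "minor"} and merge_authority_granted:
        return "merge"
    return "defer"
```

Because merges are serialized, this gate would be re-evaluated after refetching the base and head for each pull request rather than once for the whole snapshot. Pre-1.0 packages often treat minor bumps as breaking, so a real triage may want to flag those as risky.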

Engineering By Will Undrell (@WillUndrll)

The React Doctor repair loop

Runs a bounded React Doctor workflow that fixes small batches of genuine findings and keeps only regression-free improvements.

Run `pnpm exec react-doctor . --verbose --yes --offline --fail-on none` to record the baseline, then rerun with `--fail-on error`. Fix at most five genuine findings, run the same scan and relevant project checks, and keep only verified improvements. Clear errors before high-confidence warnings. Stop when the scan is clean, the work is blocked, approval is required, a finding proves to be a false positive, or another pass makes no measurable progress. Finish with baseline and final results, retained fixes, reverted attempts, checks, and remaining findings.

Operations By Shinichi Nagata (@DecisionOS)

The restartable handoff loop

Closes a session with enough verified context for the next human or agent to resume safely without guessing.

Before ending [session or work period], create a restartable handoff. Record the current goal, changes, verification evidence, untouched scope, uncertainties, open risks, off-limits areas, and last decision or gate. Check that a new human or agent could continue without guessing, then name exactly one safe next action and what they must not assume. Stop after the handoff; do not begin that action.

Engineering By leviathofnoesia (@leviath666)

The React Doctor 100/100 loop

Repairs React code-health root causes exhaustively until every production app earns a freshly verified React Doctor score of 100/100.

Bring every production React app in [repository] to a freshly verified React Doctor score of 100/100. Inventory app roots, record a full `npx react-doctor@latest --verbose` baseline, fix one root cause at a time, and rerun the full scan plus relevant typecheck, lint, tests, and builds. Never hide findings with exclusions, ignores, suppressions, deleted behavior, or relaxed rules. Stop when every app scores 100/100, the work is blocked, approval is required, or a pass makes no measurable progress. Preserve unrelated work and report exact proof.

Engineering By Rashid Ali, AI Engineering - DexaMinds

The evidence-first feature loop

Inspects current repository evidence before implementing and verifying one safe, bounded feature slice.

Implement one bounded feature slice in [repository]. Read project instructions, the current implementation, relevant services, types, UI, tests, and architecture notes before editing. Report the evidence, risks, affected files, persistence impact, and validation plan; stop for approval if inspection materially changes scope or reveals destructive, production, or silent-persistence behavior. Make the smallest change, preserve unknown data and unrelated work, run relevant checks, and manually verify user-facing states. Stop after this slice and return evidence, limitations, and the next recommended slice.

Engineering By Subramanyam Badhika (@subbu6699)

The architecture-preserving code refactor loop

Maps the blast radius, preserves public contracts, and keeps only regression-free improvements across at most five refactoring rounds.

Refactor [target] toward [measurable goal] in [repository]. If the target or goal is missing, ask and stop. Record current behavior and affected dependencies; select representative tests for boundaries and failure modes, then make one atomic change without altering public contracts unless authorized. Run the same tests, type and lint checks, and affected-consumer checks, keeping only regression-free improvements. Repeat for at most five rounds. Stop on success, blocked architecture, approval required, exhaustion, or no progress. Preserve unrelated work and finish with the diff, impact map, evidence, rejected attempts, and remaining debt.

Contribute

Share a loop

Send the prompt you actually use. We review everything before publishing.