Ralph Mode - Autonomous Development Loops
Ralph Mode implements the Ralph Wiggum technique adapted for OpenClaw: autonomous task completion through continuous iteration with backpressure gates, completion criteria, and structured planning.
When to Use
Use Ralph Mode when:
- You are building features that require multiple iterations and refinement
- You are working on complex projects with acceptance criteria to validate
- You need automated testing, linting, or typecheck gates
- You want to track progress across many iterations systematically
- You prefer autonomous loops over manual turn-by-turn guidance
Core Principles
Three-Phase Workflow
Phase 1: Requirements Definition
- Document specs in `specs/` (one file per topic of concern)
- Define acceptance criteria (observable, verifiable outcomes)
- Create implementation plan with prioritized tasks
Phase 2: Planning
- Gap analysis: compare specs against existing code
- Generate `IMPLEMENTATION_PLAN.md` with prioritized tasks
- No implementation during this phase
Phase 3: Building (Iterative)
- Pick one task from plan per iteration
- Implement, validate, update plan, commit
- Continue until all tasks complete or criteria met
Backpressure Gates
Reject incomplete work automatically through validation:
Programmatic Gates (Always use these):
- Tests: `[test command]` - Must pass before committing
- Typecheck: `[typecheck command]` - Catch type errors early
- Lint: `[lint command]` - Enforce code quality
- Build: `[build command]` - Verify integration
Subjective Gates (Use for UX, design, quality):
- LLM-as-judge reviews for tone, aesthetics, usability
- Binary pass/fail - converges through iteration
- Only add after programmatic gates work reliably
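A minimal sketch of a gate runner a sub-agent could invoke before committing, assuming npm scripts named test, typecheck, lint, and build (substitute your project's commands):

```ts
import { execSync } from "node:child_process";

// Run each programmatic gate in order; any non-zero exit rejects the work.
const gates = ["npm run test", "npm run typecheck", "npm run lint", "npm run build"];

for (const cmd of gates) {
  try {
    execSync(cmd, { stdio: "inherit" });
  } catch {
    console.error(`Backpressure gate failed: ${cmd} - do not commit.`);
    process.exit(1);
  }
}
console.log("All gates passed - safe to commit.");
```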
Context Efficiency
- One task per iteration = fresh context each time
- Spawn sub-agents for exploration, not main context
- Lean prompts keep context in the smart zone (~40-60% utilization)
- Plans are disposable - regenerating is cheaper than salvaging
File Structure
Create this structure for each Ralph Mode project:
project-root/
├── IMPLEMENTATION_PLAN.md   # Shared state, updated each iteration
├── AGENTS.md                # Build/test/lint commands (~60 lines)
├── specs/                   # Requirements (one file per topic)
│   ├── topic-a.md
│   └── topic-b.md
├── src/                     # Application code
└── src/lib/                 # Shared utilities
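A hedged scaffolding sketch for bootstrapping this layout (file names match the structure above; existing files are left untouched):

```ts
import { mkdirSync, writeFileSync, existsSync } from "node:fs";

// Create the Ralph Mode skeleton idempotently.
for (const dir of ["specs", "src/lib"]) {
  mkdirSync(dir, { recursive: true });
}
for (const file of ["IMPLEMENTATION_PLAN.md", "AGENTS.md"]) {
  if (!existsSync(file)) {
    writeFileSync(file, `# ${file.replace(".md", "")}\n`);
  }
}
```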
IMPLEMENTATION_PLAN.md
Priority task list - single source of truth. Format:
# Implementation Plan
## In Progress
- [ ] Task name (iteration N)
- Notes: discoveries, bugs, blockers
## Completed
- [x] Task name (iteration N)
## Backlog
- [ ] Future task
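Because the plan is plain markdown checkboxes, the next task can be selected programmatically. A minimal sketch, assuming the format above:

```ts
import { readFileSync } from "node:fs";

// Return the first unchecked task ("- [ ] ...") from the plan, or null when done.
function pickNextTask(planPath = "IMPLEMENTATION_PLAN.md"): string | null {
  const line = readFileSync(planPath, "utf8")
    .split("\n")
    .find((l) => l.trim().startsWith("- [ ]"));
  return line ? line.trim().slice("- [ ]".length).trim() : null;
}

console.log(pickNextTask() ?? "No uncompleted tasks - the loop can stop.");
```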
Topic Scope Test
Can you describe the topic in one sentence without "and"?
- โ "User authentication with JWT and session management"
- โ "Auth, profiles, and billing" โ 3 topics
AGENTS.md - Operational Guide
Succinct guide for running the project. Keep under 60 lines:
# Project Operations
## Build Commands
npm run dev # Development server
npm run build # Production build
## Validation
npm run test # All tests
npm run lint # ESLint
npm run typecheck # TypeScript
npm run e2e # E2E tests
## Operational Notes
- Tests must pass before committing
- Typecheck failures block commits
- Use existing utilities from src/lib over ad-hoc copies
Hats (Personas)
Specialized roles for different tasks:
Hat: Architect (@architect)
- High-level design, data modeling, API contracts
- Focus: patterns, scalability, maintainability
Hat: Implementer (@implementer)
- Write code, implement features, fix bugs
- Focus: correctness, performance, test coverage
Hat: Tester (@tester)
- Test authoring, validation, edge cases
- Focus: coverage, reliability, reproducibility
Hat: Reviewer (@reviewer)
- Code reviews, PR feedback, quality assessment
- Focus: style, readability, adherence to specs
Usage:
"Spawn a sub-agent with @architect hat to design the data model"
Loop Mechanics
Outer Loop (You coordinate)
Your job as main agent: engineer setup, observe, course-correct.
- Don't allocate work to main context - Spawn sub-agents
- Let Ralph Ralph - LLM will self-identify, self-correct
- Use protection - Sandbox is your security boundary
- Plan is disposable - Regenerate when wrong/stale
- Move outside the loop - Sit and watch, don't micromanage
Inner Loop (Sub-agent executes)
Each sub-agent iteration:
- Study - Read plan, specs, relevant code
- Select - Pick most important uncompleted task
- Implement - Write code, one task only
- Validate - Run tests, lint, typecheck (backpressure)
- Update - Mark task done, note discoveries, commit
- Exit - Next iteration starts fresh
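Schematically, one iteration might look like the sketch below (implementTask, runGates, and updatePlanAndProgress are hypothetical placeholders for project-specific logic, not OpenClaw APIs; pickNextTask is the helper sketched earlier):

```ts
import { execSync } from "node:child_process";

// Hypothetical placeholders standing in for project-specific logic.
declare function pickNextTask(planPath?: string): string | null; // sketched earlier
declare function implementTask(task: string): Promise<void>;
declare function runGates(commands: string[]): boolean;
declare function updatePlanAndProgress(task: string, ok: boolean): void;

async function ralphIteration(): Promise<void> {
  const task = pickNextTask();                           // Study + Select: next open plan task
  if (!task) return;                                     // Plan exhausted: stop looping

  await implementTask(task);                             // Implement: one task only
  const ok = runGates(["npm run test", "npm run lint", "npm run typecheck"]); // Validate
  updatePlanAndProgress(task, ok);                       // Update: mark done or note blocker
  if (ok) execSync(`git commit -am "ralph: ${task}"`);   // Commit only when gates pass
  // Exit: the next iteration starts with fresh context.
}
```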
Stopping Conditions
Loop ends when:
- ✅ All IMPLEMENTATION_PLAN.md tasks completed
- ✅ All acceptance criteria met
- ✅ Tests passing, no blocking issues
- ⚠️ Max iterations reached (configure limit)
- 🛑 Manual stop (Ctrl+C)
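A hedged sketch of a stop check the outer loop could run between iterations (the 50-iteration cap is an illustrative default, not a recommendation):

```ts
import { readFileSync } from "node:fs";

// Stop when the plan has no unchecked tasks left or the iteration cap is hit.
function shouldStop(iteration: number, maxIterations = 50): boolean {
  const plan = readFileSync("IMPLEMENTATION_PLAN.md", "utf8");
  const hasOpenTasks = plan.split("\n").some((l) => l.trim().startsWith("- [ ]"));
  return !hasOpenTasks || iteration >= maxIterations;
}
```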
Completion Criteria
Define success upfront - avoid "seems done" ambiguity.
Programmatic (Measurable)
- All tests pass: `[test_command]` returns 0
- Typecheck passes: No TypeScript errors
- Build succeeds: Production bundle created
- Coverage threshold: e.g., 80%+
Subjective (LLM-as-Judge)
For quality criteria that resist automation:
## Completion Check - UX Quality
Criteria: Navigation is intuitive, primary actions are discoverable
Test: User can complete core flow without confusion
## Completion Check - Design Quality
Criteria: Visual hierarchy is clear, brand consistency maintained
Test: Layout follows established patterns
Run LLM-as-judge sub-agent for binary pass/fail.
Technology-Specific Patterns
Next.js Full Stack
specs/
├── authentication.md
├── database.md
└── api-routes.md
src/
├── app/          # App Router
├── components/   # React components
├── lib/          # Utilities (db, auth, helpers)
└── types/        # TypeScript types
AGENTS.md:
Build: npm run dev
Test: npm run test
Typecheck: npx tsc --noEmit
Lint: npm run lint
Python (Scripts/Notebooks/FastAPI)
specs/
├── data-pipeline.md
├── model-training.md
└── api-endpoints.md
src/
├── pipeline.py
├── models/
├── api/
└── tests/
AGENTS.md:
Build: python -m src.main
Test: pytest
Typecheck: mypy src/
Lint: ruff check src/
GPU Workloads
specs/
├── model-architecture.md
├── training-data.md
└── inference-pipeline.md
src/
├── models/
├── training/
├── inference/
└── utils/
AGENTS.md:
Train: python train.py
Test: pytest tests/
Lint: ruff check src/
GPU Check: nvidia-smi
Quick Start Command
Start a Ralph Mode session:
"Start Ralph Mode for my project at ~/projects/my-app. I want to implement user authentication with JWT.
I will:
- Create IMPLEMENTATION_PLAN.md with prioritized tasks
- Spawn sub-agents for iterative implementation
- Apply backpressure gates (test, lint, typecheck)
- Track progress and announce completion
Operational Learnings
When Ralph patterns emerge, update AGENTS.md:
## Discovered Patterns
- When adding API routes, also add to OpenAPI spec
- Use existing db utilities from src/lib/db over direct calls
- Test files must be co-located with implementation
Escape Hatches
When trajectory goes wrong:
- Ctrl+C - Stop loop immediately
- Regenerate plan - "Discard IMPLEMENTATION_PLAN.md and re-plan"
- Reset - "Git reset to last known good state"
- Scope down - Create smaller scoped plan for specific work
Advanced: LLM-as-Judge Fixture
For subjective criteria (tone, aesthetics, UX):
Create src/lib/llm-review.ts:
export interface ReviewResult {
  pass: boolean;
  feedback?: string;
}

// Judge an artifact against subjective criteria; return a binary pass/fail verdict.
export async function createReview(config: {
  criteria: string;
  artifact: string; // text or screenshot path
}): Promise<ReviewResult> {
  // Prompt an LLM judge with the criteria and the artifact, then parse its
  // verdict into { pass, feedback }. Wire this to whatever model the project uses.
  throw new Error('createReview: not yet implemented');
}
Sub-agents discover and use this pattern for binary pass/fail checks.
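A minimal usage sketch, assuming the createReview fixture above is exported from src/lib/llm-review.ts and an artifact (here an illustrative screenshot path) was produced earlier in the iteration:

```ts
import { createReview } from "./llm-review";

// Subjective backpressure gate: fail the iteration unless the judge passes.
const review = await createReview({
  criteria: "Navigation is intuitive; primary actions are discoverable",
  artifact: "screenshots/homepage.png", // illustrative path
});

if (!review.pass) {
  throw new Error(`LLM review failed: ${review.feedback ?? "no feedback"}`);
}
```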
Critical Operational Requirements
Based on empirical usage, enforce these practices to avoid silent failures:
1. Mandatory Progress Logging
Ralph MUST write to PROGRESS.md after EVERY iteration. This is non-negotiable.
Create PROGRESS.md in project root at start:
# Ralph: [Task Name]
## Iteration [N] - [Timestamp]
### Status
- [ ] In Progress | [ ] Blocked | [ ] Complete
### What Was Done
- [Item 1]
- [Item 2]
### Blockers
- None | [Description]
### Next Step
[Specific next task from IMPLEMENTATION_PLAN.md]
### Files Changed
- `path/to/file.ts` - [brief description]
Why: External observers (parent agents, crons, humans) can tail one file instead of scanning directories or inferring state from session logs.
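A minimal logging helper a sub-agent could call at the end of each iteration (the function name and arguments are illustrative, not part of OpenClaw):

```ts
import { appendFileSync } from "node:fs";

// Append one iteration entry to PROGRESS.md in the format above.
function logIteration(iteration: number, done: string[], next: string): void {
  const entry = [
    `\n## Iteration ${iteration} - ${new Date().toISOString()}`,
    "### Status",
    "- [x] In Progress | [ ] Blocked | [ ] Complete",
    "### What Was Done",
    ...done.map((item) => `- ${item}`),
    "### Blockers",
    "- None",
    "### Next Step",
    next,
  ].join("\n");
  appendFileSync("PROGRESS.md", entry + "\n");
}
```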
2. Session Isolation & Cleanup
Before spawning a new Ralph session:
- Check for existing Ralph sub-agents via `sessions_list`
- Kill or verify completion of previous sessions
- Do NOT spawn overlapping Ralph sessions on same codebase
Anti-pattern: Spawning Ralph v2 while v1 is still running = file conflicts, race conditions, lost work.
3. Explicit Path Verification
Never assume directory structure. At start of each iteration:
import fs from 'node:fs';

// Verify current working directory
const cwd = process.cwd();
console.log(`Working in: ${cwd}`);

// Verify expected paths exist before reading or writing anything
if (!fs.existsSync('./src/app')) {
  console.error('Expected ./src/app, found:', fs.readdirSync('.'));
  // Adapt or fail explicitly
}
Why: Ralph may be spawned from different contexts with different working directories.
4. Completion Signal Protocol
When done, Ralph MUST:
- Write final `PROGRESS.md` with "## Status: COMPLETE"
- List all created/modified files
- Exit cleanly (no hanging processes)
Example completion PROGRESS.md:
# Ralph: Influencer Detail Page
## Status: COMPLETE ✅
**Finished:** [ISO timestamp]
### Final Verification
- [x] TypeScript: Pass
- [x] Tests: Pass
- [x] Build: Pass
### Files Created
- `src/app/feature/page.tsx`
- `src/app/api/feature/route.ts`
### Testing Instructions
1. Run: `npm run dev`
2. Visit: `http://localhost:3000/feature`
3. Verify: [specific checks]
5. Error Handling Requirements
If Ralph encounters unrecoverable errors:
- Log to PROGRESS.md with "## Status: BLOCKED"
- Describe blocker in detail
- List attempted solutions
- Exit cleanly (don't hang)
Do not silently fail. A Ralph that stops iterating with no progress log is indistinguishable from one still working.
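A matching sketch for the blocked case (same caveats as the logging helper above):

```ts
import { appendFileSync } from "node:fs";

// Record an unrecoverable blocker before exiting - never fail silently.
function logBlocked(reason: string, attempted: string[]): void {
  const entry = [
    "\n## Status: BLOCKED",
    "### Blocker",
    reason,
    "### Attempted Solutions",
    ...attempted.map((item) => `- ${item}`),
  ].join("\n");
  appendFileSync("PROGRESS.md", entry + "\n");
}
```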
6. Iteration Time Limits
Set explicit iteration timeouts:
## Operational Parameters
- Max iteration time: 10 minutes
- Total session timeout: 60 minutes
- If iteration exceeds limit: Log blocker, exit
Why: Prevents infinite loops on stuck tasks, allows parent agent to intervene.
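A hedged sketch of enforcing the per-iteration limit in code, using the 10-minute value above (how the limit is actually enforced depends on how the loop is driven):

```ts
// Reject iteration work that exceeds the time limit so the caller can
// log a blocker to PROGRESS.md and exit cleanly.
const MAX_ITERATION_MS = 10 * 60 * 1000;

function withIterationTimeout<T>(work: Promise<T>): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error("Iteration exceeded time limit - log blocker and exit")),
      MAX_ITERATION_MS,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}
```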
Memory Updates
After each Ralph Mode session, document:
## [Date] Ralph Mode Session
**Project:** [project-name]
**Duration:** [iterations]
**Outcome:** success / partial / blocked
**Learnings:**
- What worked well
- What needs adjustment
- Patterns to add to AGENTS.md
Appendix: Hall of Failures
Common anti-patterns observed:
| Anti-Pattern | Consequence | Prevention |
|---|---|---|
| No progress logging | Parent agent cannot determine status | Mandatory PROGRESS.md |
| Silent failure | Work lost, time wasted | Explicit error logging |
| Overlapping sessions | File conflicts, corrupt state | Check/cleanup before spawn |
| Path assumptions | Wrong directory, wrong files | Explicit verification |
| No completion signal | Parent waits indefinitely | Clear COMPLETE status |
| Infinite iteration | Resource waste, no progress | Time limits + blockers |
| Complex initial prompts | Sub-agent never starts (empty session logs) | SIMPLIFY instructions |
NEW: Session Initialization Best Practices (2025-02-07)
Problem: Sub-agents spawn but don't execute
Evidence: Empty session logs (2 bytes), no tool calls, 0 tokens used
Root Causes
- Instructions too complex - Overwhelms isolated session initialization
- No clear execution trigger - Agent doesn't know to start
- Branching logic - "If X do Y, if Z do W" confuses task selection
- Multiple files mentioned - Can't decide which to start with
Fix: SIMPLIFIED Ralph Task Template
## Task: [ONE specific thing]
**File:** exact/path/to/file.ts
**What:** Exact description of change
**Validate:** Exact command to run
**Then:** Update PROGRESS.md and exit
## Rules
1. Do NOT look at other files
2. Do NOT "check first"
3. Make the change, validate, exit
BEFORE (Bad - causes stalls):
Fix all TypeScript errors across these files:
- lib/db.ts has 2 errors
- lib/proposal-service.ts has 5 errors
- route.ts has errors
Check which ones to fix first, then...
AFTER (Good - executes):
Fix lib/db.ts line 27:
Change: PoolClient to pg.PoolClient
Validate: npm run typecheck
Exit immediately after
CRITICAL: Single File Rule
Each Ralph iteration gets ONE file. Not "all errors", not "check then decide". ONE file, ONE change, validate, exit.
CRITICAL: Update PROGRESS.md
MANDATORY: After EVERY iteration, update PROGRESS.md with:
## Iteration [N] - [Timestamp]
### Status: Complete ✅ | Blocked ⚠️ | Failed ❌
### What Was Done
- [Specific changes made]
### Validation
- [Test/lint/typecheck results]
### Next Step
- [What should happen next]
Why this matters: Cron job reads PROGRESS.md for status updates. If not updated, status appears stale/repetitive.
Debugging Ralph Stalls
If Ralph stalls:
- Check session logs (should show tool calls within 60s)
- If empty after spawn → instructions too complex
- Reduce: ONE file, ONE line number, ONE change
- Shorter timeout forces smaller tasks (300s not 600s)
Fixing Stale Status Reports
If cron reports same status repeatedly:
- Check PROGRESS.md was updated by sub-agent
- If not updated → sub-agent skipped documentation step
- Update skill: Add "MANDATORY PROGRESS.md update" to prompt
- Manual fix: Update PROGRESS.md to reflect actual state
Summary
Ralph works when: single-file focus + explicit change + validate + exit.
Ralph stalls when: complex decisions + multiple files + conditional logic.