How We Made AI Agents 10x More Reliable with State Machines
Building production-ready AI agents that don’t lose track of their work
The Problem: AI Agents That Forget
If you’ve worked with AI coding agents, you’ve probably experienced this frustrating scenario:
You ask an agent to build a feature. It starts enthusiastically, creates a few files, then... gets interrupted. Your browser crashes. Your internet drops. You close the tab by accident.
When you come back, the agent has amnesia. It doesn’t remember:
What phase of development it was in
Which tasks it already completed
What files it created
Where it left off
You’re forced to start over, wasting time and tokens. Sound familiar?
The “Stateless Agent” Problem
Most AI agent platforms treat each conversation as ephemeral. They rely on:
Chat history to remember context (which gets truncated)
In-memory state (which vanishes on restart)
Hope that nothing goes wrong (spoiler: things go wrong)
This works fine for simple Q&A. But for complex, multi-step development tasks that take hours? It’s a disaster.
Enter: The Conversation State Machine
At JustCopy.ai, we solved this by treating AI development like what it actually is: a long-running, resumable workflow.
We built a persistent state machine that tracks every aspect of an AI agent’s work in a database. Here’s what it remembers:
1. Development Phases
Every project follows a clear workflow:
Requirements → User Flows → Data Models → Frontend → Backend → Integration → Testing → Deployment
The state machine knows:
Which phase you’re currently in
Which phases are complete
What was accomplished in each phase
When each phase started and finished
2. Todo-Level Granularity
Within each phase, the agent breaks work into specific todos:
// Requirements Phase
[
  { id: 'gather-requirements', description: 'Document core features', completed: true },
  { id: 'define-user-stories', description: 'Create user stories', completed: true },
  { id: 'validate-scope', description: 'Confirm project scope', completed: false }
]
Every todo tracks:
Completion status
Validation evidence (proof it was done correctly)
Timestamp of completion
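As a rough illustration of how a todo could carry its evidence and timestamp, here is a minimal sketch. The field names validationEvidence and completedAt and the completeTodo helper are assumptions made for this example, not JustCopy.ai’s actual schema.

```typescript
// Hypothetical shape: only id, description, and completed appear in the
// example above; validationEvidence and completedAt are assumed names.
interface Todo {
  id: string;
  description: string;
  completed: boolean;
  validationEvidence?: string; // proof the work was done correctly
  completedAt?: string;        // ISO 8601 timestamp
}

// Returns a new list with the matching todo marked complete.
// It refuses to complete a todo without evidence, and it leaves todos that
// are already complete untouched so a replay after a crash is harmless.
function completeTodo(
  todos: Todo[],
  id: string,
  evidence: string,
  now: Date = new Date()
): Todo[] {
  if (evidence.trim() === "") {
    throw new Error(`Todo "${id}" needs validation evidence`);
  }
  if (!todos.some((t) => t.id === id)) {
    throw new Error(`Unknown todo: ${id}`);
  }
  return todos.map((t) =>
    t.id === id && !t.completed
      ? { ...t, completed: true, validationEvidence: evidence, completedAt: now.toISOString() }
      : t
  );
}

const requirementsTodos: Todo[] = [
  { id: "gather-requirements", description: "Document core features", completed: true },
  { id: "define-user-stories", description: "Create user stories", completed: true },
  { id: "validate-scope", description: "Confirm project scope", completed: false },
];

const updated = completeTodo(requirementsTodos, "validate-scope", "Scope confirmed by user in chat");
console.log(updated.every((t) => t.completed)); // true
```

Returning a new array instead of mutating in place keeps the previous snapshot intact until the new one has been saved.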
3. Sandbox & Code State
The state machine tracks your sandbox:
Is it running, stopped, or destroyed?
What’s the sandbox ID for resumption?
Is code synced between file storage and the sandbox?
4. Progress Metrics
Real-time progress calculation based on:
Phase weights (frontend/backend are worth more than planning phases)
Todo completion within each phase
Overall project completion percentage
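A weighted calculation along these lines could look like the sketch below. The weights are invented for illustration (the only stated fact is that frontend and backend count for more than planning phases), and only four of the phases are included to keep it short.

```typescript
type Phase = "requirements" | "user-flows" | "frontend" | "backend";

// Hypothetical weights chosen for the example, not real production values.
const PHASE_WEIGHTS: Record<Phase, number> = {
  requirements: 1,
  "user-flows": 1,
  frontend: 3,
  backend: 3,
};

interface PhaseTodos {
  done: number;
  total: number;
}

// Weighted average of per-phase todo completion, as a 0-100 integer.
function overallProgress(todos: Record<Phase, PhaseTodos>): number {
  let earned = 0;
  let possible = 0;
  for (const phase of Object.keys(PHASE_WEIGHTS) as Phase[]) {
    const weight = PHASE_WEIGHTS[phase];
    const { done, total } = todos[phase];
    // A phase with no todos yet contributes nothing.
    const fraction = total === 0 ? 0 : done / total;
    earned += weight * fraction;
    possible += weight;
  }
  return Math.round((earned / possible) * 100);
}

const progress = overallProgress({
  requirements: { done: 3, total: 3 },
  "user-flows": { done: 2, total: 2 },
  frontend: { done: 2, total: 4 },
  backend: { done: 0, total: 5 },
});
console.log(progress); // 44
```

Here the two finished planning phases earn 2 of 8 weight points and the half-finished frontend earns 1.5, giving 3.5 / 8, which rounds to 44%.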
How It Works in Practice
Here’s the magic: interruptions become non-events.
Scenario 1: Browser Crash During Development
Before (traditional agent):
User: “Build me a todo app”
Agent: *creates 5 files*
[Browser crashes]
User: *reopens* “Continue building the todo app”
Agent: “Sure! Let me start by creating a todo app...”
[Creates duplicate files, conflicts everywhere]
After (with state machine):
User: “Build me a todo app”
Agent: *creates 5 files, marks todos complete*
[Browser crashes]
User: *reopens* “Continue”
Agent: “Resuming from frontend phase...
✅ Completed: UI components (5 files)
🎯 Next: API integration (todo 3/5)
Continuing where we left off...”
[Picks up exactly where it stopped]
Scenario 2: Multi-Session Projects
Your agent can work across multiple sessions:
Day 1 (30 minutes):
Complete requirements phase
Start user flow design
Get interrupted for a meeting
Day 2 (next morning):
Agent resumes in user-flows phase
Shows summary of yesterday’s work
Continues from todo #3
Day 3 (after the weekend):
All context preserved
Full history available
Zero rework needed
The Technical Implementation
State Persistence
interface ConversationState {
  conversationId: string;
  projectId: string;
  // Current state
  currentPhase: 'requirements' | 'frontend' | 'backend' /* | ... */;
  phaseHistory: PhaseCompletion[];
  // Sandbox state
  sandboxState: 'running' | 'stopped' | 'destroyed';
  sandboxId?: string;
  // Progress tracking
  overallProgress: number; // 0-100%
  // Resume capability
  lastResumedAt?: string;
  interruptionCount: number;
}
Every tool call (creating files, installing packages, running tests) updates this state and persists it to the database.
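One way to picture the persist-on-every-tool-call pattern is a small wrapper around each tool invocation. This is only a sketch: AgentState, StateStore, InMemoryStore, and runTool are hypothetical names, and the store is synchronous and in-memory for brevity, whereas a real database client would be asynchronous.

```typescript
// Trimmed-down, hypothetical state used only for this example.
interface AgentState {
  conversationId: string;
  currentPhase: string;
  filesCreated: string[];
  updatedAt: string;
}

// Hypothetical storage interface; a real one would wrap a database client.
interface StateStore {
  load(conversationId: string): AgentState | undefined;
  save(state: AgentState): void;
}

// Stand-in for the database: stores serialized snapshots keyed by conversation.
class InMemoryStore implements StateStore {
  private rows = new Map<string, string>();

  load(conversationId: string): AgentState | undefined {
    const row = this.rows.get(conversationId);
    return row === undefined ? undefined : (JSON.parse(row) as AgentState);
  }

  save(state: AgentState): void {
    this.rows.set(state.conversationId, JSON.stringify(state));
  }
}

// Runs one tool call, merges its changes into the state, and saves the
// result before returning, so a crash after this point loses nothing.
function runTool(
  store: StateStore,
  state: AgentState,
  tool: () => Partial<AgentState>
): AgentState {
  const changes = tool();
  const next: AgentState = { ...state, ...changes, updatedAt: new Date().toISOString() };
  store.save(next);
  return next;
}

const store = new InMemoryStore();
let state: AgentState = {
  conversationId: "conv-1",
  currentPhase: "frontend",
  filesCreated: [],
  updatedAt: "",
};

state = runTool(store, state, () => ({ filesCreated: [...state.filesCreated, "src/App.tsx"] }));

// After a restart, the in-memory `state` variable is gone, but the store still has it.
const recovered = store.load("conv-1");
console.log(recovered?.filesCreated.length); // 1
```

Saving after every call costs a write per tool invocation, but it bounds the work lost to an interruption at a single tool call.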
Auto-Recovery
When you resume a conversation, the agent receives detailed instructions:
📋 RESUME INSTRUCTIONS
Current Phase: frontend
Progress: 45%
Interruptions: 2
🐳 Sandbox: running (ID: sb_abc123)
📦 S3 Storage: in-development
- Code Path: projects/user123/proj456/app/
✅ Completed Phases:
- requirements: Documented core todo app features
- user-flows: Designed 3 user flows
🎯 Next Steps:
1. Read agent file: agents/.production/frontend-plan.md
2. Check S3 for existing code
3. Continue from todo: “Build task list component”
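Instructions like these can be rendered directly from the persisted state. The sketch below assumes a simplified ResumeState shape and a hypothetical buildResumeInstructions function; it omits the storage details and emoji for brevity.

```typescript
// Simplified, hypothetical view of the persisted state needed for a resume.
interface ResumeState {
  currentPhase: string;
  overallProgress: number; // 0-100
  interruptionCount: number;
  sandboxState: "running" | "stopped" | "destroyed";
  sandboxId?: string;
  completedPhases: { phase: string; summary: string }[];
  nextTodo?: string;
}

// Renders the state as plain text to prepend to the agent's context on resume.
function buildResumeInstructions(s: ResumeState): string {
  const lines: string[] = [
    "RESUME INSTRUCTIONS",
    `Current Phase: ${s.currentPhase}`,
    `Progress: ${s.overallProgress}%`,
    `Interruptions: ${s.interruptionCount}`,
  ];
  lines.push(
    s.sandboxId
      ? `Sandbox: ${s.sandboxState} (ID: ${s.sandboxId})`
      : `Sandbox: ${s.sandboxState}`
  );
  if (s.completedPhases.length > 0) {
    lines.push("Completed Phases:");
    for (const p of s.completedPhases) {
      lines.push(`- ${p.phase}: ${p.summary}`);
    }
  }
  if (s.nextTodo) {
    lines.push(`Continue from todo: "${s.nextTodo}"`);
  }
  return lines.join("\n");
}

const instructions = buildResumeInstructions({
  currentPhase: "frontend",
  overallProgress: 45,
  interruptionCount: 2,
  sandboxState: "running",
  sandboxId: "sb_abc123",
  completedPhases: [
    { phase: "requirements", summary: "Documented core todo app features" },
    { phase: "user-flows", summary: "Designed 3 user flows" },
  ],
  nextTodo: "Build task list component",
});
console.log(instructions);
```

Because the text is generated from stored fields rather than from chat history, it stays accurate even when earlier messages have been truncated.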
Phase Transitions
The state machine enforces correct workflow:
// A phase can only move to one of the phases listed for it
const PHASE_TRANSITIONS = {
  'requirements': ['user-flows', 'frontend'], // Can skip to frontend if simple
  'user-flows': ['entities', 'frontend'],
  'frontend': ['backend', 'integration'],
  'backend': ['integration'],
  'integration': ['testing', 'deployment']
};
When all todos in a phase complete, the agent automatically transitions to the next phase.
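To make the enforcement concrete, here is a minimal sketch of a transition check and an auto-advance step. canTransition and nextPhase are hypothetical helpers, and choosing the first listed target as the default next phase is an assumption made for the example.

```typescript
type Phase =
  | "requirements"
  | "user-flows"
  | "entities"
  | "frontend"
  | "backend"
  | "integration"
  | "testing"
  | "deployment";

// Same table as in the text, typed so unknown phase names fail to compile.
const PHASE_TRANSITIONS: Partial<Record<Phase, Phase[]>> = {
  requirements: ["user-flows", "frontend"],
  "user-flows": ["entities", "frontend"],
  frontend: ["backend", "integration"],
  backend: ["integration"],
  integration: ["testing", "deployment"],
};

// True only if `to` is listed as a target of `from`.
function canTransition(from: Phase, to: Phase): boolean {
  return PHASE_TRANSITIONS[from]?.includes(to) ?? false;
}

// Assumption: auto-advance takes the first listed target. A phase with
// unfinished todos, no todos, or no listed targets stays where it is.
function nextPhase(current: Phase, todos: { completed: boolean }[]): Phase {
  const allDone = todos.length > 0 && todos.every((t) => t.completed);
  if (!allDone) return current;
  return PHASE_TRANSITIONS[current]?.[0] ?? current;
}

console.log(canTransition("requirements", "frontend")); // true
console.log(canTransition("requirements", "backend")); // false
console.log(nextPhase("frontend", [{ completed: true }, { completed: true }])); // backend
```

Keeping the rules in a single lookup table means a new shortcut between phases is a one-line change, and every caller picks it up.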
The Results
Since implementing the state machine:
86% reduction in “lost work” incidents
3x faster project resumption (seconds vs. minutes of context gathering)
Zero duplicate file creation errors
100% of interrupted sessions successfully resumed
Real Customer Impact
Sarah, Indie Developer:
“I can now work on my SaaS in 30-minute chunks between meetings. The agent picks up exactly where I left off. It’s like having a coworker who never forgets anything.”
Mike, Startup Founder:
“We had a power outage mid-deployment. Instead of panicking, I just reopened the tab the next day and the agent resumed the deployment. Mind blown.”
The Bigger Picture: Production-Ready AI
This isn’t just about crash recovery. It’s about making AI agents production-ready.
Production systems need:
✅ Reliability - Handle failures gracefully
✅ Observability - Track what’s happening
✅ Resumability - Continue from any point
✅ Auditability - Know what was done and when
Our state machine provides all of this.
What’s Next?
We’re extending the state machine to support:
Multi-agent collaboration - Multiple agents working on different phases simultaneously
Rollback capabilities - Undo to any previous phase
Branch workflows - Try different approaches without losing your main path
Time-travel debugging - See exactly what the agent was thinking at any point
Try It Yourself
The state machine is live on JustCopy.ai for all users. Here’s how to experience it:
Start a new project
Let the agent work for a few minutes
Close your browser completely
Come back and say “continue”
Watch it resume exactly where it left off
No more lost work. No more starting over. Just reliable, resumable AI development.
Built with ❤️ by the JustCopy.ai team
Making AI agents reliable enough for production since 2025
Comments & Discussion
What reliability challenges have you faced with AI agents? Share your experiences in the comments below!
Want to experience bulletproof AI development? Try JustCopy.ai today.


