Overview
Checkpointing lets you persist workflow state over time so you can:
- Recover from failures
- Resume long-running agents
- Support human-in-the-loop workflows
- Build time-travel debugging tools
Each checkpoint records a run_id, a step_name, a timestamp, and a serializable snapshot of state.
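To make the four fields concrete, here is a minimal sketch of what such a record could look like. The class name SketchCheckpoint, the float Unix timestamp, and the JSON encoding are all assumptions made for illustration; they are not Coevolved's actual types.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class SketchCheckpoint:
    """Hypothetical record shape, not Coevolved's actual class."""

    run_id: str
    step_name: str
    state: Dict[str, Any]
    # Assumption: timestamps are Unix seconds stored as a float.
    timestamp: float = field(default_factory=time.time)

    def to_json(self) -> str:
        # json.dumps raises TypeError when state holds something that is
        # not JSON-serializable, so bad snapshots fail at save time.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "SketchCheckpoint":
        return cls(**json.loads(raw))


record = SketchCheckpoint(
    run_id="run-1",
    step_name="plan",
    state={"messages": ["hi"]},
    timestamp=1700000000.0,
)
restored = SketchCheckpoint.from_json(record.to_json())
```

Round-tripping through JSON is one simple way to check that a state snapshot really is serializable before relying on it for recovery.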
Checkpoint stores
Checkpoint storage is abstracted by the CheckpointStore protocol. Coevolved includes an in-memory store for development:
- MemoryCheckpointStore: convenient, but not durable
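The idea of a store protocol with an in-memory implementation can be sketched as follows. The method names save, latest, and history, and the classes SketchStore and SketchMemoryStore, are hypothetical; the real CheckpointStore protocol may define a different interface.

```python
import copy
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Protocol


@dataclass
class SketchCheckpoint:
    """Hypothetical record with the four documented fields."""

    run_id: str
    step_name: str
    timestamp: float
    state: Dict[str, Any]


class SketchStore(Protocol):
    """Assumed interface; not the actual CheckpointStore protocol."""

    def save(self, checkpoint: SketchCheckpoint) -> None: ...
    def latest(self, run_id: str) -> Optional[SketchCheckpoint]: ...
    def history(self, run_id: str) -> List[SketchCheckpoint]: ...


class SketchMemoryStore:
    """Keeps checkpoints in a dict, so everything is lost on process exit."""

    def __init__(self) -> None:
        self._by_run: Dict[str, List[SketchCheckpoint]] = {}

    def save(self, checkpoint: SketchCheckpoint) -> None:
        # Deep-copy so later mutation of the live state cannot alter history.
        self._by_run.setdefault(checkpoint.run_id, []).append(
            copy.deepcopy(checkpoint)
        )

    def latest(self, run_id: str) -> Optional[SketchCheckpoint]:
        items = self._by_run.get(run_id)
        return copy.deepcopy(items[-1]) if items else None

    def history(self, run_id: str) -> List[SketchCheckpoint]:
        return copy.deepcopy(self._by_run.get(run_id, []))


store: SketchStore = SketchMemoryStore()
live_state = {"count": 1}
store.save(SketchCheckpoint("run-1", "first", 1.0, live_state))
live_state["count"] = 2  # mutate after saving
store.save(SketchCheckpoint("run-1", "second", 2.0, live_state))
```

Because the protocol is structural, any class with matching methods satisfies it, which is what allows a durable backend to replace the in-memory one without changing calling code.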
Checkpoint policies
CheckpointPolicy controls when to checkpoint:
- Before a step/iteration
- After a step/iteration
- On error
- On interrupt
Loops such as agent_loop accept a checkpoint store and a policy, and apply the policy around each iteration.
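The four trigger points can be illustrated with a small loop that consults a policy around each step. This is a sketch under stated assumptions: SketchPolicy, SketchInterrupt, run_loop, and the record callback are hypothetical names, and the real CheckpointPolicy and agent_loop may be shaped differently.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Step = Callable[[State], State]


@dataclass(frozen=True)
class SketchPolicy:
    """Hypothetical flags mirroring the four trigger points."""

    before_step: bool = False
    after_step: bool = True
    on_error: bool = True
    on_interrupt: bool = True


class SketchInterrupt(Exception):
    """Hypothetical signal that a step is pausing for human input."""


def run_loop(
    state: State,
    steps: List[Tuple[str, Step]],
    policy: SketchPolicy,
    record: Callable[[str, State], None],
) -> State:
    for name, step in steps:
        if policy.before_step:
            record(f"before:{name}", state)
        try:
            state = step(state)
        except SketchInterrupt:
            if policy.on_interrupt:
                record(f"interrupt:{name}", state)
            raise
        except Exception:
            # Save the last good state, then let the error propagate.
            if policy.on_error:
                record(f"error:{name}", state)
            raise
        if policy.after_step:
            record(f"after:{name}", state)
    return state


events: List[Tuple[str, State]] = []


def record(label: str, state: State) -> None:
    events.append((label, dict(state)))


def increment(state: State) -> State:
    return {**state, "count": state["count"] + 1}


def fail(state: State) -> State:
    raise ValueError("boom")


try:
    run_loop({"count": 0}, [("inc", increment), ("bad", fail)], SketchPolicy(), record)
except ValueError:
    pass

labels = [label for label, _ in events]
```

With the default flags in this sketch, the successful step produces an "after" checkpoint and the failing step produces an "error" checkpoint that holds the state as it was before the failure.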
Resume workflows
Checkpointing is most useful when paired with:
- A stable run_id you can use to query history
- A resume mechanism (for interrupts or retries)
Coevolved’s checkpointing primitives are intentionally storage-agnostic. You decide how to persist, index, and load state.
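As one illustration of deciding how to persist and load state yourself, the sketch below pairs a stable run_id with a simple resume mechanism backed by a JSON Lines file. Everything here (SketchJsonlStore, run_or_resume, the file layout) is a hypothetical example, not part of Coevolved. It also assumes step names are unique within a run and that state is JSON-serializable.

```python
import json
import os
import tempfile
import time
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]
Step = Callable[[State], State]


class SketchJsonlStore:
    """Hypothetical durable store: one JSON object per line, append-only."""

    def __init__(self, path: str) -> None:
        self.path = path

    def save(self, run_id: str, step_name: str, state: State) -> None:
        row = {
            "run_id": run_id,
            "step_name": step_name,
            "timestamp": time.time(),
            "state": state,
        }
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(row) + "\n")

    def latest(self, run_id: str) -> Optional[Dict[str, Any]]:
        if not os.path.exists(self.path):
            return None
        found = None
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                row = json.loads(line)
                if row["run_id"] == run_id:
                    found = row  # later lines win, so this ends as the newest
        return found


def run_or_resume(
    run_id: str,
    steps: List[Tuple[str, Step]],
    store: SketchJsonlStore,
    initial_state: State,
) -> State:
    last = store.latest(run_id)
    state = last["state"] if last else initial_state
    names = [name for name, _ in steps]
    # Skip every step up to and including the last one that was checkpointed.
    start = names.index(last["step_name"]) + 1 if last else 0
    for name, step in steps[start:]:
        state = step(state)
        store.save(run_id, name, state)
    return state


calls: List[str] = []


def add_one(state: State) -> State:
    calls.append("add_one")
    return {"value": state["value"] + 1}


def flaky_double(state: State) -> State:
    calls.append("double")
    if calls.count("double") == 1:
        raise RuntimeError("transient failure")
    return {"value": state["value"] * 2}


steps = [("add_one", add_one), ("double", flaky_double)]
store = SketchJsonlStore(os.path.join(tempfile.mkdtemp(), "checkpoints.jsonl"))

try:
    run_or_resume("run-42", steps, store, {"value": 1})
except RuntimeError:
    pass  # the first attempt fails after add_one has been checkpointed

final = run_or_resume("run-42", steps, store, {"value": 1})
```

The second call reuses the same run_id, finds the checkpoint written after add_one, and retries only the failed step. A production store would likely index by run_id rather than scan the whole file, which is the kind of decision the storage-agnostic design leaves to you.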