Replay a Run
Stub
This How-to is a stub. JARVIS emits durable artifacts, but a first-class “replay API” is not yet implemented.
Goal
You will replay a run with the same inputs and pinned versions, then compare its outputs against the original to debug regressions.
When to use this
- You want deterministic regression testing across stack versions.
- You need to debug “why did it do that?” with stable inputs.
Prerequisites
- A run with durable artifacts (candidate sets, decisions, outputs)
- A way to re-run with pinned versions (recommended: JARVIS_Releasepinning); a pin-manifest sketch follows this list.
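JARVIS does not prescribe how pins are recorded, so the sketch below simply captures the intended stack/node/model pins as a JSON manifest next to the run's artifacts. The per-run directory layout, the file name pins.json, the write_pin_manifest helper, and the example version values are assumptions for illustration, not a documented format.

```python
import json
from pathlib import Path


def write_pin_manifest(run_dir: str, pins: dict) -> Path:
    """Store the stack/node/model pins used for a run as a JSON manifest.

    The manifest layout is hypothetical; adapt the fields to whatever your
    release-pinning setup (e.g. JARVIS_Releasepinning) actually records.
    """
    base = Path(run_dir)
    base.mkdir(parents=True, exist_ok=True)
    path = base / "pins.json"
    path.write_text(json.dumps(pins, indent=2, sort_keys=True))
    return path


if __name__ == "__main__":
    write_pin_manifest(
        "runs/2024-05-01_1234",            # hypothetical run directory
        {
            "stack_version": "1.4.2",      # example values, not real releases
            "node_versions": {"planner": "0.9.0", "executor": "0.9.1"},
            "model_profile": "default@2024-04",
        },
    )
```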
Steps
- Identify the run to replay (run_id).
- Capture the exact inputs and the pinned stack/node versions.
- Start a new run with the same inputs and pins.
- Compare artifacts and outputs, as sketched below.
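Because there is no first-class replay API yet, a replay is just a new run fed with the recorded inputs and pins. The sketch below shows one way to wire that up; load_run_record, the inputs.json/pins.json file names, and the start_run placeholder are all assumptions you would replace with however you actually launch runs.

```python
import json
from pathlib import Path


def load_run_record(run_dir: str) -> tuple[dict, dict]:
    """Load the captured inputs and pins for an existing run.

    Assumes inputs.json and pins.json exist in the run directory; both
    file names are illustrative, not a JARVIS convention.
    """
    base = Path(run_dir)
    inputs = json.loads((base / "inputs.json").read_text())
    pins = json.loads((base / "pins.json").read_text())
    return inputs, pins


def start_run(inputs: dict, pins: dict) -> str:
    """Placeholder for however you actually launch a run.

    Replace the body with your own entry point (CLI call, API request, ...).
    Here it only echoes what would be submitted and returns a fake run_id.
    """
    print("would start run with pins:", pins)
    return "replay-of-" + inputs.get("run_id", "unknown")


if __name__ == "__main__":
    original = "runs/2024-05-01_1234"      # hypothetical original run directory
    inputs, pins = load_run_record(original)
    replay_id = start_run(inputs, pins)
    print("replay run started:", replay_id)
```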
Verify
- The replay run completes and produces comparable artifacts (see the comparison sketch below).
- Differences are explainable (version change, policy change, model/profile change).
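One simple way to check comparability is to hash every artifact file in both run directories and list what is missing or differs. The sketch assumes artifacts are stored as files under a per-run directory; adjust it to your actual artifact storage.

```python
import hashlib
from pathlib import Path


def artifact_digests(run_dir: str) -> dict[str, str]:
    """Hash every artifact file under a run directory (layout is assumed)."""
    base = Path(run_dir)
    return {
        str(p.relative_to(base)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(base.rglob("*"))
        if p.is_file()
    }


def diff_runs(original_dir: str, replay_dir: str) -> None:
    """Report artifacts that are missing or differ between two runs."""
    a, b = artifact_digests(original_dir), artifact_digests(replay_dir)
    for name in sorted(set(a) | set(b)):
        if name not in b:
            print(f"missing in replay: {name}")
        elif name not in a:
            print(f"only in replay:    {name}")
        elif a[name] != b[name]:
            print(f"differs:           {name}")


if __name__ == "__main__":
    diff_runs("runs/2024-05-01_1234", "runs/2024-05-02_5678")  # hypothetical dirs
```

Differences that remain after accounting for known changes (version, policy, model/profile) are the ones worth investigating.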
Troubleshooting
- Non-determinism → pin model/profile and reduce LLM variability.
- Missing artifacts → ensure artifact emission is enabled and stored durably.
- External dependencies changed → record external inputs as artifacts or mock them; a recording sketch follows this list.
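For the external-dependency case, one option is to wrap each outbound call so the original run records the raw response as an artifact and the replay reads that artifact back instead of calling out again. For LLM variability, pin the model/profile and set deterministic generation parameters (for example temperature 0 and a fixed seed) where your provider supports them. The fetch_external helper and the external/ artifact layout below are illustrative assumptions, not part of JARVIS.

```python
import json
from pathlib import Path
from typing import Callable


def fetch_external(name: str, run_dir: str, fetch: Callable[[], dict],
                   replay: bool = False) -> dict:
    """Record an external input as an artifact, or read it back on replay.

    'fetch' is whatever call reaches the external dependency; on the original
    run its result is written to <run_dir>/external/<name>.json, and on replay
    the saved copy is returned instead of calling out again.
    """
    path = Path(run_dir) / "external" / f"{name}.json"
    if replay:
        return json.loads(path.read_text())
    data = fetch()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2, sort_keys=True))
    return data


if __name__ == "__main__":
    # Original run records the external response; the replay reuses it.
    live = fetch_external("exchange_rates", "runs/2024-05-01_1234",
                          fetch=lambda: {"USD_EUR": 0.92})
    replayed = fetch_external("exchange_rates", "runs/2024-05-01_1234",
                              fetch=lambda: {"USD_EUR": 0.95}, replay=True)
    assert live == replayed
```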
Cleanup / Rollback
- Optional: delete replay runs according to retention policy (one possible cleanup sketch follows).
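If your retention policy allows it, replay runs can be cleaned up mechanically. The sketch assumes replay runs live under a common runs/ root and are named with a replay- prefix; both are illustrative conventions, not JARVIS defaults.

```python
import shutil
import time
from pathlib import Path


def delete_old_replays(runs_root: str, max_age_days: int = 30) -> None:
    """Remove replay run directories older than the retention window."""
    cutoff = time.time() - max_age_days * 86400
    for run_dir in Path(runs_root).glob("replay-*"):
        if run_dir.is_dir() and run_dir.stat().st_mtime < cutoff:
            shutil.rmtree(run_dir)
            print("deleted", run_dir)


if __name__ == "__main__":
    delete_old_replays("runs", max_age_days=30)  # hypothetical retention window
```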
Next steps
- How-to: Diff two runs
- Concept: Artifacts and replay