Hardsim preserves a different set of artifacts depending on whether a job succeeds or fails.

Successful Jobs

A successful simulation or Isaac Lab rollout typically includes:
  • rollout.zarr
  • render.mp4 when video was requested
  • artifact_manifest.json
  • runner.log
  • any task-authored logs such as user_job.log
runner.log is the structured runner summary. On successful jobs it should look like:
status=ok
stage=completed
message=isaac_lab rollout task completed
duration_s=39.416
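Because runner.log is plain key=value lines, a small parser is enough to consume it programmatically. A minimal sketch, assuming one key=value pair per line as in the sample above (real logs may carry extra lines, which this skips):

```python
def parse_runner_log(text: str) -> dict:
    """Parse key=value lines (as shown above) into a dict of strings.
    Lines without '=' are ignored; only the first '=' splits the pair."""
    summary = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            summary[key.strip()] = value.strip()
    return summary

example = """status=ok
stage=completed
message=isaac_lab rollout task completed
duration_s=39.416"""

print(parse_runner_log(example)["status"])  # -> ok
```

Splitting on the first `=` keeps messages that themselves contain `=` intact.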

Failed Jobs

Failed jobs are not expected to produce success artifacts such as rollout.zarr or render.mp4. Instead, Hardsim preserves diagnostic artifacts so you can debug the root cause:
  • runner.log
  • diagnostics.json
  • user_job.log
  • command.stdout.log
  • command.stderr.log
  • asset_staging.log when asset staging happened
The API/UI should surface the original failure first, and only mention missing success artifacts as secondary context.
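One way to follow that ordering is to report the error from diagnostics.json before noting any missing artifacts. A sketch, assuming diagnostics.json carries a top-level "error" field (the field name is illustrative, not a confirmed schema):

```python
import json
from pathlib import Path

def summarize_failure(artifact_dir: str) -> str:
    """Report the original failure first, then missing success
    artifacts as secondary context. Field names are illustrative."""
    root = Path(artifact_dir)
    lines = []
    diag = root / "diagnostics.json"
    if diag.exists():
        data = json.loads(diag.read_text())
        lines.append(f"error: {data.get('error')}")
    for required in ("rollout.zarr", "render.mp4"):
        if not (root / required).exists():
            lines.append(f"note: missing artifact {required}")
    return "\n".join(lines)
```

The ordering matters: the real failure leads, and missing-artifact notes trail as context.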

What Each File Is For

runner.log

Structured machine-readable runner status. Typical fields:
  • status
  • stage
  • message
  • error_code
  • error_category
  • retryable
  • duration_s
  • traceback
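These fields can drive automated retry decisions. A sketch, assuming the fields have been read into a plain dict of strings and that retryable serializes as "true"/"false" (both assumptions; check your actual logs):

```python
def should_retry(summary: dict) -> bool:
    """Decide whether to resubmit based on runner.log fields.
    Assumes string values; adjust to your actual encoding."""
    return summary.get("status") != "ok" and summary.get("retryable") == "true"

print(should_retry({"status": "error", "retryable": "true"}))   # True
print(should_retry({"status": "ok"}))                           # False
```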

diagnostics.json

Structured summary generated by the worker when a job fails. Use this when you need one compact object with:
  • top-level error
  • stage
  • error code/category
  • runner summary
  • tails from user/stdout/stderr logs
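When triaging many failures, it helps to flatten that compact object for display. A sketch, assuming field names like error, stage, error_code, error_category, and a tails mapping (illustrative names, not a confirmed schema):

```python
import json

def summarize_diagnostics(path: str) -> list[str]:
    """Flatten diagnostics.json into printable lines. The field names
    used here are assumptions; check the actual payload on your jobs."""
    with open(path) as f:
        diag = json.load(f)
    lines = [f"{key}: {diag[key]}"
             for key in ("error", "stage", "error_code", "error_category")
             if key in diag]
    for name, tail in diag.get("tails", {}).items():
        lines.append(f"--- tail of {name} ---")
        lines.append(tail)
    return lines
```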

user_job.log

Task-authored log output. This is the best place to inspect:
  • task telemetry
  • controller state transitions
  • target selection
  • task-specific failure reasons

command.stderr.log

Container/runtime stderr output. Use this for:
  • Isaac startup failures
  • missing dependency errors
  • plugin or shutdown crashes
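A first pass over stderr for those failure classes can be as simple as keyword matching. A sketch with an illustrative, non-exhaustive keyword list:

```python
def scan_stderr(text: str) -> list[str]:
    """Flag stderr lines that often indicate startup, dependency, or
    crash failures. The keyword list is illustrative, not exhaustive."""
    keywords = ("ModuleNotFoundError", "ImportError",
                "Segmentation fault", "CUDA", "plugin", "Fatal")
    return [line for line in text.splitlines()
            if any(k in line for k in keywords)]
```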

Typical Failure Pattern

When a workload fails before writing success artifacts, you may see an error shaped like:
simulation task reported failure before required artifacts were written

task.error=A3 tabletop pick canary failed: RuntimeError('task success criteria unmet: cube was never grasp-attached to the robot')

missing_artifact=rollout.zarr

This means:
  1. the real failure happened first
  2. rollout.zarr was never written because the task failed
  3. you should debug the task failure, not the missing artifact symptom
For any failed robotics workload:
  1. Read runner.log
  2. Read user_job.log
  3. Read command.stderr.log
  4. Watch render.mp4 if a partial video was still produced
  5. Fix the workload logic or task configuration
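The reading order above can be scripted. A minimal sketch that tails each diagnostic log in sequence, using the file names from the artifact lists on this page (which files exist depends on the failure mode):

```python
from pathlib import Path

def tail_diagnostics(artifact_dir: str, n: int = 20) -> str:
    """Concatenate the tail of each diagnostic log in the suggested
    reading order, skipping files that were not produced."""
    chunks = []
    for name in ("runner.log", "user_job.log", "command.stderr.log"):
        path = Path(artifact_dir) / name
        if path.exists():
            tail = "\n".join(path.read_text().splitlines()[-n:])
            chunks.append(f"=== {name} (last {n} lines) ===\n{tail}")
    return "\n".join(chunks)
```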

Downloading Artifacts

With raise_on_error=False, wait() returns a result even when the job failed, so you can still download the diagnostic artifacts described above:

result = client.wait(job.job_id, poll_interval_s=2.0, timeout_s=1800.0, raise_on_error=False)
paths = client.download(job.job_id, "./outputs")
print(result["status"], paths)