Hardsim preserves a different set of artifacts depending on whether a job succeeds or fails.

Successful Jobs

A successful simulation or Isaac Lab rollout typically includes:
  • rollout.zarr
  • render.mp4 when video was requested
  • artifact_manifest.json
  • runner.log
  • any task-authored logs such as user_job.log
runner.log is the structured runner summary. On successful jobs it should look like:
status=ok
stage=completed
message=isaac_lab rollout task completed
duration_s=39.416
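Because runner.log is plain key=value lines, a small parser is enough to consume it programmatically. A minimal sketch, assuming one key=value pair per line as in the sample above (real logs may carry extra lines, which this skips):

```python
def parse_runner_log(text: str) -> dict:
    """Parse key=value lines (as shown above) into a dict of strings.
    Lines without '=' are ignored; only the first '=' splits the pair."""
    summary = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            summary[key.strip()] = value.strip()
    return summary

example = """status=ok
stage=completed
message=isaac_lab rollout task completed
duration_s=39.416"""

print(parse_runner_log(example)["status"])  # -> ok
```

Splitting on the first `=` keeps messages that themselves contain `=` intact.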

Failed Jobs

Failed jobs are not expected to produce success artifacts such as rollout.zarr or render.mp4. Instead, Hardsim preserves diagnostic artifacts so you can debug the root cause:
  • runner.log
  • diagnostics.json
  • user_job.log
  • command.stdout.log
  • command.stderr.log
  • asset_staging.log when asset staging happened
The API/UI should surface the original failure first, and only mention missing success artifacts as secondary context.
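One way to follow that ordering is to report the error from diagnostics.json before noting any missing artifacts. A sketch, assuming diagnostics.json carries a top-level "error" field (the field name is illustrative, not a confirmed schema):

```python
import json
from pathlib import Path

def summarize_failure(artifact_dir: str) -> str:
    """Report the original failure first, then missing success
    artifacts as secondary context. Field names are illustrative."""
    root = Path(artifact_dir)
    lines = []
    diag = root / "diagnostics.json"
    if diag.exists():
        data = json.loads(diag.read_text())
        lines.append(f"error: {data.get('error')}")
    for required in ("rollout.zarr", "render.mp4"):
        if not (root / required).exists():
            lines.append(f"note: missing artifact {required}")
    return "\n".join(lines)
```

The ordering matters: the real failure leads, and missing-artifact notes trail as context.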

What Each File Is For

runner.log

Structured machine-readable runner status. Typical fields:
  • status
  • stage
  • message
  • error_code
  • error_category
  • retryable
  • duration_s
  • traceback
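These fields can drive automated retry decisions. A sketch, assuming the fields have been read into a plain dict of strings and that retryable serializes as "true"/"false" (both assumptions; check your actual logs):

```python
def should_retry(summary: dict) -> bool:
    """Decide whether to resubmit based on runner.log fields.
    Assumes string values; adjust to your actual encoding."""
    return summary.get("status") != "ok" and summary.get("retryable") == "true"

print(should_retry({"status": "error", "retryable": "true"}))   # True
print(should_retry({"status": "ok"}))                           # False
```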

diagnostics.json

Structured summary generated by the worker when a job fails. Use this when you need one compact object with:
  • top-level error
  • stage
  • error code/category
  • runner summary
  • tails from user/stdout/stderr logs
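When triaging many failures, it helps to flatten that compact object for display. A sketch, assuming field names like error, stage, error_code, error_category, and a tails mapping (illustrative names, not a confirmed schema):

```python
import json

def summarize_diagnostics(path: str) -> list[str]:
    """Flatten diagnostics.json into printable lines. The field names
    used here are assumptions; check the actual payload on your jobs."""
    with open(path) as f:
        diag = json.load(f)
    lines = [f"{key}: {diag[key]}"
             for key in ("error", "stage", "error_code", "error_category")
             if key in diag]
    for name, tail in diag.get("tails", {}).items():
        lines.append(f"--- tail of {name} ---")
        lines.append(tail)
    return lines
```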

user_job.log

Task-authored log output. This is the best place to inspect:
  • task telemetry
  • controller state transitions
  • target selection
  • task-specific failure reasons

command.stderr.log

Container/runtime stderr output. Use this for:
  • Isaac startup failures
  • missing dependency errors
  • plugin or shutdown crashes
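A first pass over stderr for those failure classes can be as simple as keyword matching. A sketch with an illustrative, non-exhaustive keyword list:

```python
def scan_stderr(text: str) -> list[str]:
    """Flag stderr lines that often indicate startup, dependency, or
    crash failures. The keyword list is illustrative, not exhaustive."""
    keywords = ("ModuleNotFoundError", "ImportError",
                "Segmentation fault", "CUDA", "plugin", "Fatal")
    return [line for line in text.splitlines()
            if any(k in line for k in keywords)]
```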

Typical Failure Pattern

When a workload fails before writing success artifacts, you may see an error shaped like:
simulation task reported failure before required artifacts were written

task.error=A3 tabletop pick canary failed: RuntimeError('task success criteria unmet: cube was never grasp-attached to the robot')

missing_artifact=rollout.zarr

This means:
  1. the real failure happened first
  2. rollout.zarr was never written because the task failed
  3. you should debug the task failure, not the missing artifact symptom
For any failed robotics workload:
  1. Read runner.log
  2. Read user_job.log
  3. Read command.stderr.log
  4. Watch render.mp4 if a partial video was still produced
  5. Fix the workload logic or task configuration
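The reading order above can be scripted. A minimal sketch that tails each diagnostic log in sequence, using the file names from the artifact lists on this page (which files exist depends on the failure mode):

```python
from pathlib import Path

def tail_diagnostics(artifact_dir: str, n: int = 20) -> str:
    """Concatenate the tail of each diagnostic log in the suggested
    reading order, skipping files that were not produced."""
    chunks = []
    for name in ("runner.log", "user_job.log", "command.stderr.log"):
        path = Path(artifact_dir) / name
        if path.exists():
            tail = "\n".join(path.read_text().splitlines()[-n:])
            chunks.append(f"=== {name} (last {n} lines) ===\n{tail}")
    return "\n".join(chunks)
```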

Downloading Artifacts

With raise_on_error=False, wait() returns a result even when the job failed, so you can still download the diagnostic artifacts described above:

result = client.wait(job.job_id, poll_interval_s=2.0, timeout_s=1800.0, raise_on_error=False)
paths = client.download(job.job_id, "./outputs")
print(result["status"], paths)