test(e2e): rename registry entry to euroc_machine_hall with real SHA256

The prior registry entry was speculative: ``euroc_mh01`` pointed at an
old ``robotics.ethz.ch`` URL that no longer responds (TCP timeout).
The dataset moved to ETH Research Collection (DOI 10.3929/ethz-b-000690084)
as a single 12.6 GB ``machine_hall.zip`` bundle containing MH_01…MH_05.
There is no stable direct download URL (DSpace gates downloads behind a
UI), so:

- Renamed entry: ``euroc_mh01`` → ``euroc_machine_hall`` (matches the
  actual artifact).
- SHA256 set to the real bundle hash 5ed7d07…
- URL left empty (same pattern as ``vpair_sample``); the CLI now
  exits 3 and prints fetch instructions for empty-URL entries instead
  of crashing on ``urllib.request.urlretrieve("")``.
- Adapter ``DatasetNotAvailableError`` message and conftest skip-reason
  updated to tell engineers how to fetch/unpack manually.
- ``test_registry_has_euroc_machine_hall`` pin test replaces the old
  pin; asserts real hash (not the ``"0"*64`` placeholder).
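
A minimal sketch of the empty-URL behaviour described above, not the code
from this commit. The ``DatasetSpec`` field names come from the diff; the
helper names ``sha256_of`` and ``resolve_download``, the constant
``EXIT_MANUAL_DOWNLOAD``, and the message wording are hypothetical.

```python
import hashlib
import sys
from dataclasses import dataclass
from pathlib import Path

# Exit code 3 is the value named in the commit message; the constant name
# is an assumption made for this sketch.
EXIT_MANUAL_DOWNLOAD = 3


@dataclass(frozen=True)
class DatasetSpec:
    # Field names follow the diff; their order here is assumed.
    url: str
    sha256: str
    target_subdir: str
    unpack: bool = True


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a multi-gigabyte bundle never sits in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resolve_download(name: str, spec: DatasetSpec) -> int:
    """Return a process exit code; 0 means an automatic download may proceed."""
    if not spec.url:
        # Guard before any urllib call: an empty URL gets instructions,
        # not a crash.
        print(
            f"{name}: no direct download URL is registered. Fetch the file "
            f"manually, place it under datasets/{spec.target_subdir}/, and "
            f"check that its SHA256 is {spec.sha256}.",
            file=sys.stderr,
        )
        return EXIT_MANUAL_DOWNLOAD
    return 0
```

A CLI entry point could then call ``sys.exit(resolve_download(name, spec))``
before attempting any network access, and compare ``sha256_of(path)`` with
``spec.sha256`` once the bundle has been placed by hand.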

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Yuzviak
2026-04-17 17:48:01 +03:00
committed by Maksym Yuzviak
parent f2f278bc09
commit b57187e1b8
5 changed files with 51 additions and 26 deletions
+3 -1
@@ -33,7 +33,9 @@ class EuRoCAdapter(DatasetAdapter):
raise DatasetNotAvailableError(
f"EuRoC sequence not found at {self._root} "
f"(expected {self._root}/mav0/). "
-"Run `python scripts/download_dataset.py euroc_mh01` first."
+"Fetch the Machine Hall bundle from ETH Research Collection "
+"(DOI 10.3929/ethz-b-000690084), then unpack the inner "
+"MH_0N_easy.zip of interest into this directory."
)
@property
+16 -19
@@ -16,26 +16,23 @@ class DatasetSpec:
unpack: bool = True
-# NOTE ON SHA256 VALUES BELOW:
-# The EuRoC MH_01 hash is a placeholder of 64 zeros. Before the first real
-# download the engineer MUST:
-# 1. Manually fetch the zip:
-# curl -L <url from spec> -o /tmp/MH_01_easy.zip
-# 2. Compute its hash:
-# sha256sum /tmp/MH_01_easy.zip
-# 3. Replace the placeholder here with the real hex string.
-# 4. Commit the updated hash alongside the first real test run.
-# Leaving a length-valid (64-char) placeholder keeps the registry well-formed
-# and lets the download function refuse to keep any file that doesn't match,
-# so the placeholder state fails loudly rather than silently accepting.
+# REGISTRY NOTES:
+# Entries with url="" are manual-download-only. The ETH Research Collection
+# (DOI 10.3929/ethz-b-000690084) gates downloads behind a DSpace UI that
+# doesn't expose a stable direct URL, so the bundle has to be fetched by
+# hand and placed beside the project. SHA256 remains authoritative: the
+# download helper refuses anything that doesn't match.
DATASET_REGISTRY: dict[str, DatasetSpec] = {
-"euroc_mh01": DatasetSpec(
-url=(
-"http://robotics.ethz.ch/~asl-datasets/ijrr_euroc_mav_dataset"
-"/machine_hall/MH_01_easy/MH_01_easy.zip"
-),
-sha256="0" * 64, # placeholder — see note above
-target_subdir="euroc/MH_01",
+"euroc_machine_hall": DatasetSpec(
+# 12.6 GB bundle containing MH_01 … MH_05 (each as inner .zip + .bag).
+# Download from ETH Research Collection:
+# https://doi.org/10.3929/ethz-b-000690084
+# After fetching, unpack the inner MH_0N_easy.zip of interest into
+# datasets/euroc/MH_0N/ so the adapter finds mav0/.
+url="", # manual download — see comment above
+sha256="5ed7d07903f8d19b6c8808e2ae8a0872b281f6e34ef5497023b8ac58c3de0f6f",
+target_subdir="euroc",
+unpack=False, # the bundle itself is not unpacked end-to-end; see README
),
"vpair_sample": DatasetSpec(
url="", # manual download only — see Zenodo link on