mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-07 02:03:55 -05:00
LinkML schema at vault/schema/question_schema.yaml is the sole schema
source of truth. Pydantic models in vault_cli.models are currently
hand-authored to match it; full LinkML codegen will be wired in during
Phase 2, together with the drift check in CI.
Core modules:
  vault_cli/models.py: Pydantic question model (closed enums,
      content-format per field, schema_version=1 gate).
  vault_cli/hashing.py: canonical content_hash over whitelisted fields;
      release_hash Merkle with __policy__ and __canon_version__ leaves
      (Chip N-H5).
  vault_cli/yaml_io.py: hardened SafeLoader: 256KB cap, depth 10 cap,
      aliases rejected, timeout (H-7).
  vault_cli/paths.py: path-as-classification parser with lowercase +
      enum enforcement (H-9).
  vault_cli/loader.py: walks vault/questions/, returns loaded + errors
      (never raises; errors are aggregated for reporting).
  vault_cli/validator.py: tiered invariant engine; fast + structural
      tiers implemented per ARCHITECTURE.md §5.
  vault_cli/compiler.py: YAML → SQLite with release_metadata rows
      (release_id, release_hash, policy_version, schema_version,
      published_count).
  vault_cli/policy.py: single filter predicate; no consumer
      re-implements it (H-21).
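As an illustration of the "canonical content_hash over whitelisted fields" idea, here is a minimal sketch. It is not the code in vault_cli/hashing.py: the field whitelist, the JSON canonicalization, and the choice of SHA-256 are all assumptions made for the example, and the release_hash Merkle step is not covered.

```python
import hashlib
import json
from typing import Any, Mapping

# Hypothetical whitelist; the real field list is defined by the project.
HASHED_FIELDS = ("id", "title", "body", "answer")


def content_hash(question: Mapping[str, Any]) -> str:
    """Hash only whitelisted fields, using a key-order-independent encoding."""
    subset = {k: question[k] for k in HASHED_FIELDS if k in question}
    # sort_keys plus fixed separators gives one byte string per logical value.
    canonical = json.dumps(
        subset, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Same whitelisted content, different key order and different non-hashed field.
a = {"id": "q1", "title": "T", "body": "B", "status": "draft"}
b = {"status": "published", "body": "B", "title": "T", "id": "q1"}
assert content_hash(a) == content_hash(b)
```

Because `status` is outside the assumed whitelist, moving a question from draft to published leaves its hash unchanged in this sketch; whether the real whitelist behaves that way is not stated in the source.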
release-policy.yaml v1: status=published. Dropped require_validated
following the 9199/8053 resolution: validation is implicit in the
maintainer-approval → status=published transition, not a separate flag.
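The v1 policy described above reduces to one predicate on `status`. A rough sketch of what a single shared filter could look like follows; the names `POLICY_V1`, `is_releasable`, and `select_release` are hypothetical, and the parsed shape of release-policy.yaml is assumed rather than taken from the repository.

```python
from typing import Any, Iterable, List, Mapping

# Assumed in-memory form of release-policy.yaml v1 (hypothetical keys).
POLICY_V1: Mapping[str, Any] = {"version": 1, "status": "published"}


def is_releasable(
    question: Mapping[str, Any], policy: Mapping[str, Any] = POLICY_V1
) -> bool:
    """The one place where the release rule is evaluated."""
    return question.get("status") == policy["status"]


def select_release(
    questions: Iterable[Mapping[str, Any]],
    policy: Mapping[str, Any] = POLICY_V1,
) -> List[Mapping[str, Any]]:
    """Consumers call this instead of re-implementing the rule."""
    return [q for q in questions if is_releasable(q, policy)]


sample = [
    {"id": "q1", "status": "published"},
    {"id": "q2", "status": "draft"},
    {"id": "q3"},  # missing status is treated as not releasable
]
released_ids = [q["id"] for q in select_release(sample)]
assert released_ids == ["q1"]
```

Keeping the rule in one function means a later policy version changes one predicate rather than every consumer.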
Tests (19 pass): key-order hash invariance (Soumith M-NEW-4), policy
filter correctness (H-21 runtime check), YAML hardening (H-7).
32 lines
883 B
Python
"""Tests for the hardened YAML loader (REVIEWS.md H-7)."""

from __future__ import annotations

import pytest

from vault_cli.yaml_io import MAX_BYTES, VaultYamlError, load_bytes


def test_simple_mapping_loads() -> None:
    assert load_bytes(b"a: 1\nb: 2\n") == {"a": 1, "b": 2}


def test_size_cap() -> None:
    big = b"x: " + b"a" * (MAX_BYTES + 1)
    with pytest.raises(VaultYamlError, match="exceeds max"):
        load_bytes(big)


def test_depth_cap() -> None:
    # 12 nested mappings (> MAX_DEPTH=10)
    doc = "".join(f"{' ' * i}a:\n" for i in range(12)) + " " * 24 + "b: 1\n"
    with pytest.raises(VaultYamlError, match="nesting depth"):
        load_bytes(doc.encode())


def test_rejects_aliases() -> None:
    # Classic billion-laughs-lite pattern
    doc = b"a: &A\n x: 1\nb: *A\n"
    with pytest.raises(VaultYamlError, match="aliases"):
        load_bytes(doc)
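To make the size and depth checks exercised by these tests more concrete, here is a standard-library-only sketch of the two guards. It is not the contents of vault_cli/yaml_io.py. The names mirror the imports above, the limits come from the commit message (256KB, depth 10), and the error messages are chosen to match the `match=` patterns in the tests. How depth is counted is an assumption. This version walks an already-parsed object, so unlike a check made during parsing it does not protect the parser itself, and it does not cover alias rejection or the timeout.

```python
from typing import Any

MAX_BYTES = 256 * 1024  # 256KB cap, per the commit message
MAX_DEPTH = 10          # depth cap, per the commit message


class VaultYamlError(ValueError):
    """Raised when a document violates a hardening limit."""


def check_size(raw: bytes) -> None:
    """Reject oversized input before any parsing is attempted."""
    if len(raw) > MAX_BYTES:
        raise VaultYamlError(f"input of {len(raw)} bytes exceeds max {MAX_BYTES}")


def check_depth(node: Any, depth: int = 0) -> None:
    """Reject structures whose containers nest deeper than MAX_DEPTH."""
    if isinstance(node, (dict, list)):
        depth += 1  # only containers add a nesting level (assumed convention)
        if depth > MAX_DEPTH:
            raise VaultYamlError(f"nesting depth exceeds {MAX_DEPTH}")
        children = node.values() if isinstance(node, dict) else node
        for child in children:
            check_depth(child, depth)


def nested(levels: int) -> Any:
    """Build {'a': {'a': ... 1}} with the given number of mapping levels."""
    doc: Any = 1
    for _ in range(levels):
        doc = {"a": doc}
    return doc


check_size(b"a: 1\n")    # within the cap, no error
check_depth(nested(10))  # exactly at the limit, no error
```

Under this counting, `check_depth(nested(12))` raises `VaultYamlError`, which corresponds to the 12-level document built in `test_depth_cap`.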