7 Commits

Vijay Janapa Reddi
7700726de2 chore(staffml): release polish — drop hash pin, skeletons, error reporting
Three small polish items flagged in the pre-release audit:

1. DROP release_hash pin
   The regression guard in staffml-validate-vault.yml compared vault.db's
   computed release_hash against a pinned value in
   interviews/vault/corpus-equivalence-hash.txt. That pin was load-bearing
   when corpus.json was the source of truth (guarded drift between
   committed-JSON and computed-from-YAMLs hash), but post-v1.0 the YAMLs
   ARE the source of truth and the hash is deterministic from them.
   The pin became a circular check that would bounce every YAML-touching
   PR unless the contributor remembered to manually bump the hash.
   Removed the pin comparison; the step now just runs vault build as a
   reproducibility smoke test. Real integrity still comes from vault
   check --strict + codegen drift earlier in the same workflow.
   Deleted interviews/vault/corpus-equivalence-hash.txt.

2. Hydration SKELETON for scenario
   Summary bundle ships scenario: "" and details with empty strings;
   useFullQuestion fetches the real content from the worker (~100-300ms
   warm, <5s cold). Before this commit the practice + plans pages showed
   a visibly empty region for that hydration window, then popped the
   scenario in — a text-FOUC.
   Added ScenarioSkeleton component (three pulsing bars of approximate
   paragraph height, aria-busy) and rendered it when current.scenario is
   empty on both practice and plans. Layout no longer jumps when real
   text arrives.

3. CLIENT-SIDE ERROR REPORTER
   Silent production regressions (like the getQuestionFullDetail shape
   mismatch in PR #1440) were only discoverable when a user said
   'getting an error'. Added a lightweight error reporter that hooks
   window.error + unhandledrejection, scrubs email patterns, rate-limits
   to 20 unique reports per tab, and pipes into the existing analytics
   worker as 'client_error' events. No new vendor dependency — reuses
   analytics-worker KV storage.
   Worker allowlist extended: adds the 'client_error' event type,
   raises the per-event cap to 8 KiB to fit stack traces, and adds
   message/stack/url/userAgent to the allowed-fields list.
   Installed from Providers.tsx at app mount.
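   The reporter's core behaviors (email scrubbing, the 20-unique-reports
   cap, the two window hooks) could be sketched roughly as below. All
   identifiers (scrubEmails, shouldReport, installErrorReporter) are
   illustrative, not the actual implementation:

   ```typescript
   // Hedged sketch of the client-side error reporter described above.
   const MAX_REPORTS_PER_TAB = 20;
   const seen = new Set<string>();

   export function scrubEmails(text: string): string {
     // Replace anything that looks like an email address.
     return text.replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, "[email]");
   }

   export function shouldReport(message: string): boolean {
     // Rate-limit: at most 20 *unique* reports per tab.
     if (seen.has(message) || seen.size >= MAX_REPORTS_PER_TAB) return false;
     seen.add(message);
     return true;
   }

   export function installErrorReporter(endpoint: string): void {
     const send = (message: string, stack?: string) => {
       if (!shouldReport(message)) return;
       const body = JSON.stringify({
         type: "client_error",
         message: scrubEmails(message),
         stack: stack ? scrubEmails(stack) : undefined,
         url: location.href,
         userAgent: navigator.userAgent,
       });
       // sendBeacon survives navigation, unlike plain fetch.
       navigator.sendBeacon(endpoint, body);
     };
     window.addEventListener("error", (e) => send(e.message, e.error?.stack));
     window.addEventListener("unhandledrejection", (e) =>
       send(String(e.reason), e.reason?.stack)
     );
   }
   ```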

Build verified green.
2026-04-22 12:17:12 -04:00
Vijay Janapa Reddi
3160a1cee5 feat(analytics): secure StaffML analytics worker and add IRT fields
* Fixed race condition in KV storage causing data loss under concurrent POSTs
* Secured GET /summary endpoint with ADMIN_SECRET auth header
* Added userLevel, industryRole, and yearsExperience to telemetry schema for Item Response Theory (IRT) validation
* Re-balanced vendor representation in paper examples (added AMD MI300X and Intel Gaudi 3)
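The commit does not show the race-condition fix, but a common pattern for lost writes under concurrent POSTs to Cloudflare KV (which has no atomic read-modify-write) is to stop appending to a shared key and instead write each batch under its own unique key. A sketch under that assumption, with an illustrative makeEventKey helper:

```typescript
// Hypothetical sketch: one unique KV key per POSTed batch sidesteps
// the read-modify-write race on a single shared list key.
export function makeEventKey(sessionId: string, nowMs: number): string {
  const nonce = Math.random().toString(36).slice(2, 10);
  return `events:${sessionId}:${nowMs}:${nonce}`;
}

// In the worker's fetch handler (sketch, not the actual code):
// await env.EVENTS_KV.put(makeEventKey(sessionId, Date.now()),
//                         JSON.stringify(batch),
//                         { expirationTtl: 90 * 86400 });
// A summary endpoint can then list keys by the "events:" prefix
// and aggregate on read.
```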
2026-04-08 18:43:02 -04:00
Vijay Janapa Reddi
5955e4a9e2 feat(staffml): complete feedback pipeline with tests and CI
Fix the feedback data round-trip end-to-end:
- QuestionFeedback: dedup guard, aria-pressed, hydrate previous
  feedback on mount, wire Report/Suggest to analytics events
- analytics.ts: computeSummary() aggregates thumbs and difficulty
  with last-write-wins dedup per question+session
- dashboard: new thumbs ratio and difficulty distribution panels
- gauntlet: add QuestionFeedback to per-question review
- progress.ts: include analytics in export/import
- worker.js: server-side summary aggregates feedback with dedup
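The last-write-wins dedup per question+session described for computeSummary() could be sketched as below; the event shape and field names are illustrative assumptions:

```typescript
// Hedged sketch of computeSummary(): keep only the latest feedback
// event per (question, session) pair, then aggregate thumbs.
interface FeedbackEvent {
  questionId: string;
  sessionId: string;
  thumbs: "up" | "down";
  ts: number; // epoch millis
}

export function computeSummary(events: FeedbackEvent[]) {
  // Last-write-wins dedup keyed on question+session.
  const latest = new Map<string, FeedbackEvent>();
  for (const e of events) {
    const key = `${e.questionId}:${e.sessionId}`;
    const prev = latest.get(key);
    if (!prev || e.ts >= prev.ts) latest.set(key, e);
  }
  // Aggregate thumbs per question over the deduped events.
  const summary = new Map<string, { up: number; down: number }>();
  for (const e of latest.values()) {
    const s = summary.get(e.questionId) ?? { up: 0, down: 0 };
    s[e.thumbs] += 1;
    summary.set(e.questionId, s);
  }
  return summary;
}
```

The same dedup rule applied client-side and in the worker keeps the dashboard and the server-side summary in agreement.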

Add Vitest test infrastructure (34 journey tests across 2 files)
and embed type-check + test steps in both CI deploy workflows
so tests gate every build before deployment.
2026-04-05 13:19:02 -04:00
Vijay Janapa Reddi
a14d46f223 feat(staffml): comprehensive analytics — close all 13 tracking gaps
Systematic audit found 13 gaps in analytics coverage. Now tracking:

Session signals:
- session_start with isReturning flag + screenWidth
- search_query with result counts (debounced 1s)

Content quality signals:
- questionId in question_scored (enables per-question IRT)
- hadUserAnswer flag on answer_revealed (reveal-without-typing rate)
- hadUserAnswer on answer_response_time
- star_gate_shown / star_gate_verified (gate drop-off measurement)

Feature usage:
- gauntlet_completed with pct score (was defined but never wired)
- search tracking (what users look for = gold for content gaps)

Worker updated with new event types + allowed fields.
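The 1 s debounce on search_query tracking is a standard trailing-edge debounce; a generic sketch (the actual helper and its name are not shown in the commit):

```typescript
// Hypothetical trailing-edge debounce: only the last call in a burst
// fires, after waitMs of quiet — so a search_query event is emitted
// once per pause in typing, not per keystroke.
export function debounce<T extends unknown[]>(
  fn: (...args: T) => void,
  waitMs: number
): (...args: T) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}
```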
2026-04-02 15:15:11 -04:00
Vijay Janapa Reddi
ae278e9b92 feat(staffml): deploy Cloudflare analytics worker + wire CI pipelines
- Deploy analytics worker to mlsysbook.ai/api/staffml-analytics
- KV namespace: bf81298013404118beab61f55afe1d7d
- Add NEXT_PUBLIC_ANALYTICS_URL to both CI workflows
- Events batched client-side every 30s, flushed on page unload
- Worker validates events, strips PII, stores with 90-day TTL
- CORS restricted to mlsysbook.ai, harvard-edge.github.io, localhost
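The 30 s batch / flush-on-unload behavior could be sketched as below; queue, track, and installFlusher are illustrative names, not the shipped code:

```typescript
// Hedged sketch of client-side event batching: events accumulate in a
// queue, flushed every 30 s and once more on page unload.
type AnalyticsEvent = { type: string; ts: number; [k: string]: unknown };

const queue: AnalyticsEvent[] = [];

export function track(event: AnalyticsEvent): void {
  queue.push(event);
}

export function drainQueue(): AnalyticsEvent[] {
  // Remove and return everything currently queued.
  return queue.splice(0, queue.length);
}

export function installFlusher(endpoint: string): void {
  const flush = () => {
    const batch = drainQueue();
    if (batch.length === 0) return;
    // sendBeacon is used because it completes even during unload.
    navigator.sendBeacon(endpoint, JSON.stringify(batch));
  };
  setInterval(flush, 30_000);
  window.addEventListener("pagehide", flush);
}
```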
2026-04-02 15:15:10 -04:00
Vijay Janapa Reddi
3340f5e977 feat(staffml): add response time + napkin grade tracking for IRT calibration
Analytics now captures:
- answer_response_time: seconds spent before revealing, with napkin grade
- question_thumbs: binary quality signal (up/down)
- question_difficulty_feedback: perceived difficulty vs assigned level
- question_contributed: in-app contribution tracking

These signals enable empirical difficulty calibration (IRT) when
aggregated across users. Response time is a more objective difficulty
proxy than self-assessed scores.
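The response-time measurement reduces to timestamping question display and computing elapsed seconds at reveal; a minimal sketch with an illustrative helper and an assumed event payload shape:

```typescript
// Hypothetical helper for answer_response_time: seconds elapsed
// between showing the question and revealing the answer.
export function responseTimeSeconds(shownAtMs: number, revealedAtMs: number): number {
  return Math.round((revealedAtMs - shownAtMs) / 1000);
}

// Assumed payload shape (field names illustrative):
// { type: "answer_response_time",
//   seconds: responseTimeSeconds(shownAt, Date.now()),
//   napkinGrade: "B" }
```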
2026-04-02 07:20:00 -04:00
Vijay Janapa Reddi
098f872821 feat(staffml): 8,891 Qs + backward design + math verification + A100 fix
Corpus: 8,891 published (87.8% validated). Backward design methodology.
A100 constants fixed (FP16: 156→312 TFLOPS). Math verification done.
New figures: backward design chain, applicability matrix. Bibliography
updated (Wiggins, Messick). Verification script added.
2026-04-01 23:53:38 -04:00