mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-05-06 01:28:35 -05:00
Track *.epub, *.pdf, *.mp3/wav/m4a/mp4/mov/webm, *.wasm via Git LFS so future additions stop compounding the clone-size problem. Existing history is NOT migrated by this change — see issues #1393 and #1175 for the planned Phase 2 (`git lfs migrate import`, force-push, team re-clone) which requires VJ approval and coordinated rollout. Mixed-size patterns (*.png, *.jpg, *.gif) are deliberately omitted: the repo has thousands of small icon PNGs alongside 1-5 MB cover art / kit photos and a blanket pattern would LFS-track the small ones too. Leaving those for VJ to scope by path. Relates to #1393, #1175.
84 lines
4.4 KiB
Plaintext
# =============================================================================
# .gitattributes: going-forward Git LFS tracking and text/binary handling
# =============================================================================
#
# IMPORTANT: this file affects ONLY future `git add` operations. Existing
# blobs in history are NOT migrated by these patterns. A separate, coordinated
# `git lfs migrate import` (Phase 2) is required to actually relocate the
# ~2 GB of binaries already in `.git`. See PR #(this PR) and issues #1393,
# #1175 for the migration plan.
#
# `.gitignore` takes precedence over LFS tracking: an ignored file is never
# staged, so no LFS pattern applies to it. A file that `.gitignore` explicitly
# re-includes (e.g., the callout-icon PDFs exempted from the global *.pdf
# ignore rule) is staged normally and is LFS-tracked from then on.
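#
# A quick way to confirm which attributes a path resolves to (a sketch; the
# path below is a hypothetical example, not a file known to exist here):
#   git check-attr filter diff merge -- docs/example.pdf
# With the rules in this file, all three should report `lfs` for a *.pdf path.
# Files already stored as LFS objects can be listed with:
#   git lfs ls-files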

# -----------------------------------------------------------------------------
# Distribution / publish artefacts (large, infrequently changing, binary)
# -----------------------------------------------------------------------------
# EPUB: zero currently tracked in HEAD; ~952 MB across 15 historical versions
# in `assets/downloads/Machine-Learning-Systems.epub`. Mark for LFS so any
# future re-add does not bloat .git.
*.epub filter=lfs diff=lfs merge=lfs -text

# PDF: covers TinyTorch-Guide.pdf, 00_tinytorch.pdf, distribution PDFs.
# Note: `.gitignore` excludes most PDFs by default but explicitly allows
# callout-icon PDFs, mlsysim docs, paper figures, etc. Those exempted PDFs
# WILL be LFS-tracked under this pattern when newly added. That is the
# intended behaviour: small icon PDFs are still small as LFS pointers, and
# they are infrequently changed.
*.pdf filter=lfs diff=lfs merge=lfs -text

# -----------------------------------------------------------------------------
# Audio / video (always binary, never delta-compresses well)
# -----------------------------------------------------------------------------
# Two MP3 podcasts are currently tracked (~16 MB combined). Multiple sites
# (book quarto, socratiQ, kits) may add more in future.
*.mp3 filter=lfs diff=lfs merge=lfs -text
*.wav filter=lfs diff=lfs merge=lfs -text
*.m4a filter=lfs diff=lfs merge=lfs -text
*.mp4 filter=lfs diff=lfs merge=lfs -text
*.mov filter=lfs diff=lfs merge=lfs -text
*.webm filter=lfs diff=lfs merge=lfs -text

# -----------------------------------------------------------------------------
# Bundled JS / WASM artefacts (if ever tracked)
# -----------------------------------------------------------------------------
# These are typically build outputs and SHOULD be ignored via .gitignore
# rather than tracked (see `book/quarto/tools/scripts/socratiQ/bundle.js`,
# the historical `scripts/ai_menu/dist/bundle.js`, and the Next.js
# `staffml/_next/static/chunks/*.js` blobs). However, if a bundle ever does
# need to be tracked (e.g., a vendored externally-published artefact),
# treat it as binary so we don't burn diff cycles.
*.wasm filter=lfs diff=lfs merge=lfs -text

# -----------------------------------------------------------------------------
# NOT added to LFS (uncertainty / mixed-size patterns); defer to VJ
# -----------------------------------------------------------------------------
# *.png: repo mixes 1-5 MB cover art / kit photos with thousands of small
#   icon PNGs. A blanket pattern would LFS-track the small ones too.
#   Recommend either path-scoped patterns
#   (e.g. `book/quarto/assets/images/covers/**/*.png filter=lfs ...`)
#   or consolidating big PNGs into a single canonical location first.
# *.jpg / *.jpeg / *.gif: same mixed-size issue. The single biggest GIF is
#   `book/quarto/contents/vol1/introduction/images/gif/_alphafold.gif`
#   at 3 MB; most others are small.
# *.json: `corpus.json`, `corpus-summary.json`, `search.json` are big but
#   they are build artefacts and already in `.gitignore`. JSON in
#   general should NOT be LFS-tracked (it's text and diffs well).
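#
# Sketch only, pending VJ's scoping decision: a path-scoped rule would take
# roughly this shape. It is left commented out; the directory is the
# illustrative one from the note above and the attribute set is assumed to
# match the other LFS rules in this file.
#   book/quarto/assets/images/covers/**/*.png filter=lfs diff=lfs merge=lfs -text
# The total size held in history per file type can be checked beforehand with:
#   git lfs migrate info --everything --include="*.png,*.jpg,*.jpeg,*.gif"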
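
# -----------------------------------------------------------------------------
# Text-handling normalization
# -----------------------------------------------------------------------------
# Tell git to auto-normalize line endings on text files. Caveat: for each
# attribute the last matching line wins, so this catch-all, placed after the
# binary patterns above, resets their `-text` to `text=auto`
# (`git check-attr text` reports `auto` for them). Move this rule above the
# LFS patterns if `-text` must take effect.
* text=auto eol=lf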

# Shell scripts and Makefiles must keep LF on Windows checkouts.
*.sh text eol=lf
Makefile text eol=lf

# Keep CRLF line endings for Windows-native batch files.
*.bat text eol=crlf
*.cmd text eol=crlf
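#
# Note (a sketch, assuming Git 2.16 or newer): the line-ending rules above are
# applied to a file the next time it is staged. To re-apply them to files that
# are already tracked, something like the following should work:
#   git add --renormalize .
#   git status    # review the resulting changes before committing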