mirror of https://github.com/MLSysBook/TinyTorch.git synced 2026-04-30 01:59:20 -05:00

Files

Vijay Janapa Reddi 6191a039f6 Replace \paragraph{} with \noindent\textbf{} throughout paper

Converted all paragraph headings to bold text format for consistent
styling throughout the document. This improves visual consistency and
follows the requested formatting guidelines.

Changes:
- Paper Organization (introduction)
- Build/Use/Reflect cycle descriptions
- Why Milestones Matter
- The Six Historical Milestones
- Experiencing Performance Reality
- All future work subsection headings (Roofline Models, ASTRA-sim,
  Energy and Power Profiling, The Three-Tier Systems Pedagogy)

Table 3 remains correctly positioned in Systems Integration section
where performance trade-offs are discussed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-18 17:16:37 -05:00

34 KiB

Raw Blame History

TinyTorch Literature Review Assessment

Dr. James Patterson - Research Literature Expert

Date: 2025-11-18 Paper Analyzed: /Users/VJ/GitHub/TinyTorch/paper/paper.tex References: /Users/VJ/GitHub/TinyTorch/paper/references.bib

Executive Summary

The TinyTorch paper demonstrates solid core coverage of educational ML frameworks and learning theory, but has critical gaps in three areas that could hurt in peer review:

CORRUPTED BIBLIOGRAPHY ENTRIES - Two citations have completely wrong metadata (bruner1960, perkins1992)
Missing Recent Work - No 2023-2024 ML systems education research cited
Weak Industry Evidence - Workforce gap claims rely solely on non-academic sources

Overall Assessment: 7/10 citation strategy. Strong pedagogical grounding, fair competitive positioning, but needs 5-7 strategic additions and 2 critical fixes before submission.

1.1 Educational ML Frameworks - STRONG ✅

What's Cited:

micrograd (Karpathy 2022) - autograd minimalism
MiniTorch (Rush 2020) - Cornell Tech curriculum
tinygrad (Hotz 2023) - inspectable production system
d2l.ai (Zhang 2021) - comprehensive algorithmic foundations
fast.ai (Howard 2020) - top-down layered API approach

Assessment:

All major direct competitors cited ✅
Comparisons are fair and strategic - acknowledge strengths while positioning TinyTorch's unique systems-first angle
micrograd comparison is accurate (scalar-only, no systems focus)
MiniTorch comparison is respectful and differentiating (math rigor vs. systems-first)
tinygrad comparison correctly identifies scaffolding gap

What's MISSING:

JAX educational materials - Only cited as code (bradbury2018jax) but not discussed in related work despite JAX being increasingly used for educational purposes (functional paradigm, explainability)
nnfs.io (Neural Networks from Scratch) - Harrison Kinsley's book/course, similar bottom-up approach but focuses more on algorithms than systems
PyTorch tutorials evolution - The official PyTorch tutorials have become quite pedagogical, worth brief mention

Recommendation:

Add brief mention of JAX's functional approach in Related Work:

"JAX~\citep{bradbury2018jax} offers an alternative functional paradigm through composable transformations, teaching automatic differentiation via jax.grad and vectorization through vmap. While pedagogically valuable for understanding functional programming applied to ML, JAX assumes framework usage rather than framework construction, positioning it complementary to but distinct from TinyTorch's build-from-scratch approach."

1.2 University Courses - GOOD BUT INCOMPLETE ⚠️

What's Cited:

Stanford CS231n (Johnson 2016) - CNNs course with NumPy assignments
CMU DL Systems (Chen 2022) - production ML systems course
Harvard TinyML (Banbury 2021) - embedded deployment focus

Assessment:

CS231n citation is appropriate - isolated exercises vs. cumulative framework
CMU DL Systems positioned correctly as "advanced follow-on" to TinyTorch
TinyML comparison is EXCELLENT - clear differentiation (edge deployment vs. framework internals)

What's MISSING:

Berkeley CS182/282A (Deep Learning) - Widely adopted course, some from-scratch assignments
Full Stack Deep Learning (FSDL) - Production ML course, natural comparison point
MIT 6.S191 - Introductory deep learning, large MOOC audience
Other Cornell courses beyond MiniTorch

Verdict: Adequate coverage of major courses. CS231n and CMU DL Systems are the most important, both cited. Berkeley/MIT would strengthen "landscape coverage" but not critical.

2. Learning Theory Grounding

2.1 Pedagogical Citations - STRONG BUT UNBALANCED ✅⚠️

What's Cited:

Constructionism - Papert 1980 ✅
Cognitive Apprenticeship - Collins 1989 ✅ (cited 3x, most frequent learning theory)
Cognitive Load - Sweller 1988 ✅
Productive Failure - Kapur 2008 ✅ (cited 3x)
Threshold Concepts - Meyer 2003 ✅
Situated Learning - Lave & Wenger 1991 ✅

Assessment: This is exceptionally strong grounding in learning theory. The six theories cited cover:

Knowledge construction (Papert, Collins)
Cognitive constraints (Sweller, Kapur)
Conceptual transformation (Meyer)
Contextual learning (Lave & Wenger)

Citation frequencies reveal emphasis:

Cognitive Apprenticeship (3x) - heavily emphasized
Productive Failure (3x) - heavily emphasized
Others (1x each) - mentioned but less central

CRITICAL PROBLEMS:

CORRUPTED bruner1960process entry:
```
@article{bruner1960process,
  author = {Frolli, A and Cerciello, F...}, # WRONG AUTHORS
  title = {Narrative Approach and Mentalization.}, # WRONG TITLE
  year = {1960}, # But doi is from 2023!
```
This appears to be searching for Bruner's scaffolding work but got wrong paper. FIX IMMEDIATELY - should cite:
- Bruner, J. S. (1960). "The Process of Education" - classic scaffolding work
- OR Wood, Bruner, Ross (1976). "The role of tutoring in problem solving" - original scaffolding paper

CORRUPTED perkins1992transfer entry:

@article{perkins1992transfer,
  author = {Burstein, R and Henry, NJ...}, # WRONG - this is about infant mortality
  title = {Mapping 123 million neonatal, infant and child deaths...}, # COMPLETELY WRONG
  journal = {Nature},
  year = {1992}, # But doi says 2019!

This is cited ONCE but appears to be looking for transfer of learning work. Should be:

Perkins, D. N., & Salomon, G. (1992). "Transfer of learning" - International encyclopedia of education

What's MISSING:

Bloom's Taxonomy - Paper mentions "systems analysis" and "evaluation" levels but doesn't cite Bloom
- You have thompson2008bloom in bib but NEVER CITED in paper!
- This is about CS assessment using Bloom's - should use it when discussing NBGrader assessment
Scaffolding - You discuss scaffolding 5+ times but only cite it via Cognitive Apprenticeship
- Missing direct scaffolding citations: Wood, Bruner, Ross (1976) or Vygotsky (1978) ZPD
- You HAVE vygotsky1978mind in bib but NEVER CITE IT!
Active Learning - You discuss peer instruction, engagement, but no citations
- Missing: Freeman et al. (2014) PNAS meta-analysis on active learning effectiveness
- Missing: Mazur (1997) Peer Instruction
Chunking/Progressive Disclosure - You claim this as innovation but don't cite HCI/progressive disclosure literature
- Missing: Nielsen (1993) progressive disclosure in UI design
- Missing: Mayer (2009) multimedia learning principles

Recommendations:

CRITICAL FIXES (must do before submission):

Fix bruner1960process - find correct Bruner scaffolding citation
Fix perkins1992transfer - find correct transfer learning citation
Cite thompson2008bloom when discussing NBGrader assessment levels
Cite vygotsky1978mind when discussing scaffolding/ZPD

STRATEGIC ADDITIONS (strengthen theoretical foundation): 5. Add Freeman et al. (2014) active learning meta-analysis to justify hands-on approach 6. Add progressive disclosure HCI literature (Nielsen or Mayer) to ground the progressive disclosure pattern claim

2.2 CS Education Research - ADEQUATE ⚠️

What's Cited:

NBGrader (Blank 2019) ✅
Learner-Centered Design (Guzdial 2015) ✅
Peer Instruction (Porter 2013) ✅
Auto-grading review (Ihantola 2010) ✅
Teaching OOP (Kölling 2001) ✅
CS Ed Research (Fincher 2004) ✅

Assessment:

Good breadth of CS education foundations
NBGrader citation is essential - correctly used
Guzdial citation adds learner-centered credibility
Porter peer instruction supports engagement claims

What's MISSING:

Computing education research methodology - If you're making pedagogical claims, should cite:
- Robins et al. (2003) "Learning and Teaching Programming: A Review and Discussion"
- Bennedsen & Caspersen (2007) on failure rates in CS1
Notional machines - Your "mental models of framework internals" aligns with notional machine concept:
- Sorva (2013) "Notional Machines and Introductory Programming Education"
SIGCSE/ICER recent work on ML education - Missing 2023-2024 papers from top CS Ed venues

Recommendation:

Consider adding notional machine citation when discussing "mental models"
Conduct targeted search for 2023-2024 SIGCSE/ICER papers on ML education

3. Systems Education Context

3.1 Systems Thinking - WEAK ❌

What's Cited:

TVM compiler (Chen 2018) ✅
PyTorch autograd (Paszke 2017) ✅
Roofline model (Williams 2009) ✅ (only in future work!)
ASTRA-sim (Chakkaravarthy 2023) ✅ (only in future work!)

Assessment: This is the weakest area of the related work. You claim "systems-first" as a major contribution but barely ground it in systems education literature.

CRITICAL GAPS:

No systems thinking pedagogy citations
- You use "systems thinking" 20+ times but cite NO systems thinking education research
- Missing: Meadows (2008) "Thinking in Systems" - foundational
- Missing: Senge (1990) "The Fifth Discipline" - organizational learning
- Missing: Richmond (1993) "Systems thinking: critical thinking skills for the 1990s and beyond"
No compiler course pedagogogy
- You compare to "compiler course model" but cite NO compiler education papers
- Missing: Aho et al. (2006) "Compilers: Principles, Techniques, and Tools" (Dragon Book)
- Missing: Appel (1998) "Modern Compiler Implementation" series
- Should cite actual compiler courses that use incremental construction
No operating systems pedagogy
- TinyTorch's "build the whole system" mirrors OS courses (xv6, Pintos, Nachos)
- Missing: Arpaci-Dusseau (2018) "Operating Systems: Three Easy Pieces"
- Missing: Ousterhout (1999) "Teaching Operating Systems Using Log-Based Kernels"
No software engineering education
- Package organization, module dependencies, integration - all SE concepts
- Missing: any SE education citations

What's PARTIALLY There:

ML Systems papers - You cite:
- MLSys book (Reddi 2024) ✅ - good!
- FlashAttention (Dao 2022) ✅ - technical paper, not educational
- Horovod, DeepSpeed - technical papers, not educational

Recommendation - ADD IMMEDIATELY:

Systems Thinking Foundation:

@book{meadows2008thinking,
  author = {Meadows, Donella H.},
  title = {Thinking in Systems: A Primer},
  year = {2008},
  publisher = {Chelsea Green Publishing}
}

Compiler Course Model:

@book{aho2006compilers,
  author = {Aho, Alfred V. and Lam, Monica S. and Sethi, Ravi and Ullman, Jeffrey D.},
  title = {Compilers: Principles, Techniques, and Tools},
  edition = {2nd},
  year = {2006},
  publisher = {Addison-Wesley}
}

OS Course Model:

@book{arpaci2018operating,
  author = {Arpaci-Dusseau, Remzi H. and Arpaci-Dusseau, Andrea C.},
  title = {Operating Systems: Three Easy Pieces},
  year = {2018},
  publisher = {Arpaci-Dusseau Books}
}

Then add paragraph in Related Work:

"TinyTorch's incremental system construction draws pedagogical inspiration from compiler courses~\citep{aho2006compilers} and operating systems courses~\citep{arpaci2018operating}, where students build complete systems (compilers from lexer to code generator, OS kernels from process management to file systems) to develop systems thinking~\citep{meadows2008thinking} through component integration. This ``build the whole stack'' approach has proven effective for teaching complex systems concepts in CS education."

3.2 ML Systems Workforce - PROBLEMATIC ⚠️❌

What's Cited:

Robert Half (2024) - industry report ⚠️
Keller Executive Search (2025) - industry report ⚠️

Assessment: The workforce gap motivates the entire paper, but you rely on non-peer-reviewed industry sources. This is risky for academic venues.

Problems:

These are recruiting firms, not research organizations
No academic backing for "3:1 demand-supply ratio"
No academic backing for "only 150,000 skilled practitioners"
"78% year-over-year growth" might be inflated by industry hype

What's MISSING:

Academic sources on ML workforce:

ACM/IEEE computing workforce reports
Bureau of Labor Statistics data on ML engineer demand
Academic papers on AI skills gap:
- Ransbotham et al. (2020) MIT Sloan on AI adoption barriers
- Brynjolfsson & Mitchell (2017) on AI workforce requirements
Industry-academic collaboration papers
Survey papers from CACM, IEEE Computer, or similar

Recommendation - CRITICAL for SIGCSE/Educational Venues:

Either:

Downplay workforce claims - Make it secondary motivation, not primary
Add academic backing - Find peer-reviewed sources on AI skills gap
Reframe as "tacit knowledge problem" - This is your strongest argument and doesn't need industry stats

Suggested Academic Citations:

@article{ransbotham2020expanding,
  author = {Ransbotham, Sam and Kiron, David and Gerbert, Philipp and Reeves, Martin},
  title = {Expanding AI's Impact with Organizational Learning},
  journal = {MIT Sloan Management Review},
  year = {2020}
}

@article{brynjolfsson2017machine,
  author = {Brynjolfsson, Erik and Mitchell, Tom},
  title = {What can machine learning do? Workforce implications},
  journal = {Science},
  volume = {358},
  number = {6370},
  pages = {1530--1534},
  year = {2017}
}

4. Citation Balance

4.1 Citation Distribution Analysis

Total unique citations: ~40 entries in references.bib Actually cited in paper: ~35-38 (some in bib never cited)

By Category:

Educational Frameworks: 6 citations (micrograd, MiniTorch, tinygrad, d2l.ai, fast.ai, CS231n)
Learning Theory: 8 citations (Papert, Collins, Sweller, Kapur, Meyer, Lave, Bruner*, Perkins*)
CS Education: 6 citations (NBGrader, Guzdial, Porter, Ihantola, Kölling, Fincher)
ML Systems: 4 citations (Reddi MLSys book, Chen DL Systems, Williams Roofline, ASTRA-sim)
Technical ML: 10 citations (PyTorch, TensorFlow, autograd papers, optimization papers, FlashAttention, etc.)
Workforce/Industry: 2 citations (Robert Half, Keller)
Other: 4 citations (MLPerf, CIFAR, historical ML papers)

Assessment:

Over-cited Areas:

Technical ML papers - You cite lots of algorithm papers (Vaswani attention, Kingma Adam, etc.) but these are for milestone validation, not related work. This is fine but heavy.

Under-cited Areas:

ML Systems Education - Only 2 real educational systems citations (Reddi book, Chen course)
Systems Thinking Pedagogy - Zero citations despite 20+ mentions
Recent Work (2023-2024) - Only 2-3 recent citations, rest are 2020 or older

Citation Age Distribution:

2024-2025: 2 citations (Reddi MLSys book, Keller workforce)
2022-2023: 4 citations (micrograd, tinygrad, Chen DL Systems, ASTRA-sim)
2018-2021: 10 citations (NBGrader, fast.ai, d2l.ai, JAX, etc.)
2000-2017: 15 citations (learning theory classics, foundational ML)
Pre-2000: 5 citations (Papert, Sweller, Bruner, etc.)

Verdict:

Good balance between classic theory (pre-2000) and recent work (2020+)
TOO FEW 2023-2024 citations for a 2025 paper - looks outdated
Learning theory foundations are appropriately older (Papert 1980, Sweller 1988)

4.2 Uncited Entries in references.bib

Entries in bib but NEVER cited in paper:

thompson2008bloom - Bloom's taxonomy for CS assessment ❗ SHOULD CITE
vygotsky1978mind - Mind in Society, scaffolding/ZPD ❗ SHOULD CITE
bradbury2018jax - JAX framework - mentioned in text but not formally cited
Several technical papers (Baydin autograd survey, Chen gradient checkpointing, etc.)

Recommendation:

Remove uncited technical papers from bib (clutter)
ADD citations for thompson2008bloom and vygotsky1978mind in appropriate sections

5. Competitive Positioning

5.1 Framework Comparison Table (Table 1) - EXCELLENT ✅

Assessment: This table is pedagogically brilliant and strategically sound:

Strengths:

Separates educational vs. production frameworks clearly
Five comparison dimensions are well-chosen: Purpose, Scope, Systems Focus, Target Outcome
TinyTorch row is bolded - appropriate emphasis
Fair to competitors - doesn't strawman

Strategic Positioning:

micrograd: acknowledged strength (understand backprop) while showing limitation (no systems)
MiniTorch: respectful ("build from first principles") while differentiating (embedded systems from M01)
tinygrad: acknowledges sophistication ("understand compilation") while showing gap (scaffolding)
PyTorch/TensorFlow: correctly positioned as "what comes after" TinyTorch

Minor Suggestions:

Consider adding "Assessment" column to highlight NBGrader infrastructure advantage
Consider adding "Hardware Requirements" to emphasize CPU-only accessibility

5.2 Detailed Framework Comparisons (Prose) - GOOD ✅

micrograd comparison (line 379):

"pedagogical clarity comes from intentional minimalism... necessarily omits systems concerns"

Verdict: ✅ Fair and strategic. Acknowledges strength (clarity) while identifying gap (no systems).

MiniTorch comparison (line 381):

"core curriculum emphasizes mathematical rigor... TinyTorch differs through systems-first emphasis"

Verdict: ✅ Respectful differentiation. Doesn't claim MiniTorch is bad, just different focus.

tinygrad comparison (line 383):

"pedagogically valuable through its inspectable design, tinygrad assumes significant background"

Verdict: ✅ Acknowledges value while identifying accessibility gap.

d2l.ai comparison (line 387):

"excels at algorithmic understanding through framework usage"

Verdict: ✅ Generous acknowledgment of widespread adoption and algorithmic strength.

fast.ai comparison (line 387):

"distinctive top-down pedagogy starts with practical applications"

Verdict: ✅ Accurately describes complementary approach without criticism.

Overall Competitive Positioning: STRONG. Fair, strategic, differentiated without being aggressive.

6. "What Are Reviewers Likely to Ask 'Why Didn't You Cite X?'"

6.1 High-Risk Omissions

SIGCSE/ICER Reviewers Will Ask About:

"Why no recent SIGCSE/ICER papers on ML education?"
- Risk: HIGH for SIGCSE submission
- You cite nothing from 2023-2024 SIGCSE/ICER
- Need: Search SIGCSE 2023, 2024, ICER 2023, 2024 for ML education papers
"Why no systems thinking education citations despite 'systems-first' claim?"
- Risk: HIGH for any venue
- You use "systems thinking" 20+ times but cite zero systems thinking pedagogy
- Need: Meadows (2008) or similar foundational systems thinking work
"Why no Bloom's taxonomy when discussing assessment?"
- Risk: MEDIUM for education venues
- You have thompson2008bloom in bib but don't cite it
- Need: Cite when discussing NBGrader assessment levels
"Why no scaffolding citations when you discuss scaffolding 5+ times?"
- Risk: MEDIUM for education venues
- You have vygotsky1978mind in bib but don't cite it
- Need: Cite Vygotsky or Wood/Bruner/Ross when discussing scaffolding

MLSys/Technical Reviewers Will Ask About:

"Why no compiler course citations when you compare to 'compiler course model'?"
- Risk: MEDIUM for MLSys
- You claim to follow compiler course pedagogy but cite no compiler education
- Need: Dragon Book or equivalent compiler course reference
"Why cite industry reports (Robert Half, Keller) for workforce claims?"
- Risk: MEDIUM for academic venues
- These aren't peer-reviewed sources
- Need: Academic sources on AI skills gap OR downplay workforce motivation

General Academic Reviewers Will Ask About:

"Why no recent work? Most citations are 2020 or older."
- Risk: LOW-MEDIUM for 2025 submission
- Only 2-3 citations from 2023-2024
- Need: More recent ML education / ML systems education work

6.2 Medium-Risk Omissions

"Why no active learning meta-analysis when you claim hands-on effectiveness?"
- Freeman et al. (2014) PNAS is the canonical citation for active learning superiority
- Risk: LOW-MEDIUM
"Why no progressive disclosure HCI literature when you claim it as innovation?"
- Nielsen or Mayer on progressive disclosure in UI/learning
- Risk: LOW-MEDIUM
"Why no notional machines when discussing mental models?"
- Sorva (2013) on notional machines in programming education
- Risk: LOW

6.3 Low-Risk Omissions (Nice to Have)

Berkeley CS182, MIT 6.S191, Full Stack Deep Learning courses
nnfs.io (Neural Networks from Scratch book/course)
More JAX educational materials discussion
Software engineering education (package design, modularity)
Educational data mining / learning analytics for assessing NBGrader

7. Strategic Recommendations

7.1 MUST FIX Before Submission (Critical Issues)

Priority 1: Fix Corrupted Bib Entries ⚠️⚠️⚠️

Fix bruner1960process - Currently has wrong paper (Narrative/Mentalization)
- Find: Bruner "The Process of Education" (1960) OR Wood, Bruner, Ross (1976) scaffolding paper
Fix perkins1992transfer - Currently has infant mortality paper
- Find: Perkins & Salomon (1992) "Transfer of learning"

Priority 2: Cite What You Already Have 3. Cite thompson2008bloom when discussing NBGrader assessment (Section 4.4) 4. Cite vygotsky1978mind when discussing scaffolding/ZPD (Section 3.2.2 Related Work)

Priority 3: Add Systems Thinking Foundation 5. Add Meadows (2008) "Thinking in Systems" - cite when using "systems thinking" 6. Add compiler course reference (Dragon Book or equivalent) - cite when comparing to "compiler course model"

7.2 SHOULD ADD for Stronger Positioning (High Value)

Priority 4: Recent ML Education Work 7. Search SIGCSE 2023, 2024 proceedings for ML education papers 8. Search ICER 2023, 2024 proceedings for ML/programming education papers 9. Search MLSys 2023, 2024 for any education-track papers

Priority 5: Strengthen Workforce Motivation 10. Replace or supplement Robert Half/Keller with academic sources: - Brynjolfsson & Mitchell (2017) Science paper on ML workforce - Ransbotham et al. (2020) MIT Sloan on AI adoption barriers - OR downplay workforce statistics, emphasize "tacit knowledge problem"

Priority 6: Active Learning Foundation 11. Add Freeman et al. (2014) active learning meta-analysis to justify hands-on approach

7.3 COULD ADD for Completeness (Medium Value)

Priority 7: Progressive Disclosure Grounding 12. Add Nielsen (1993) or Mayer (2009) on progressive disclosure in learning/UI

Priority 8: Notional Machines 13. Add Sorva (2013) on notional machines when discussing "mental models of internals"

Priority 9: OS Course Pedagogy 14. Add OS education reference (OSTEP or similar) to strengthen "build the whole system" comparison

7.4 REMOVE or DOWNPLAY (Clutter Reduction)

Priority 10: Clean Up Bib 15. Remove uncited technical papers from references.bib (e.g., Baydin autograd survey if not cited) 16. Remove or integrate technical milestone papers (Rosenblatt, Rumelhart, LeCun) - currently just used for historical validation, not related work

8. Citation Quality Checklist

Going through your current citations:

Educational Frameworks

✅ micrograd (Karpathy 2022) - GitHub repo, appropriate
✅ MiniTorch (Rush 2020) - Cornell Tech, appropriate
✅ tinygrad (Hotz 2023) - GitHub repo, appropriate
✅ d2l.ai (Zhang 2021) - arXiv + Cambridge Press, peer-reviewed equivalent
✅ fast.ai (Howard 2020) - Published in Information journal, peer-reviewed
✅ CS231n (Johnson 2016) - Course webpage, appropriate for course citation

Learning Theory

✅ Papert (1980) - Classic, appropriate
✅ Collins (1989) - Routledge chapter, peer-reviewed
✅ Sweller (1988) - Cognitive Science journal, peer-reviewed
✅ Kapur (2008) - Cognition and Instruction, peer-reviewed
✅ Meyer (2003) - Book chapter, peer-reviewed
✅ Lave & Wenger (1991) - Cambridge University Press, peer-reviewed
❌ Bruner (1960) - CORRUPTED ENTRY - fix immediately
❌ Perkins (1992) - CORRUPTED ENTRY - fix immediately
⚠️ Vygotsky (1978) - IN BIB BUT NEVER CITED - should cite

CS Education

✅ NBGrader (Blank 2019) - JOSE, peer-reviewed
✅ Guzdial (2015) - Morgan & Claypool, peer-reviewed
✅ Porter (2013) - ACM SIGCSE, peer-reviewed
✅ Ihantola (2010) - Koli Calling, peer-reviewed
✅ Kölling (2001) - ITiCSE, peer-reviewed
✅ Fincher (2004) - Taylor & Francis, peer-reviewed
⚠️ Thompson (2008) - IN BIB BUT NEVER CITED - should cite

Workforce/Industry

⚠️ Robert Half (2024) - Industry press release, not peer-reviewed
⚠️ Keller (2025) - Industry report, not peer-reviewed

ML Systems

✅ Reddi MLSys book (2024) - IEEE conference, peer-reviewed
✅ Chen DL Systems (2022) - CMU course, appropriate
✅ Williams Roofline (2009) - CACM, peer-reviewed
✅ ASTRA-sim (2020, 2023) - IEEE ISPASS, peer-reviewed

Technical ML

✅ PyTorch autograd (Paszke 2017) - NIPS workshop, appropriate
✅ TVM (Chen 2018) - OSDI, peer-reviewed
✅ FlashAttention (Dao 2022) - NeurIPS, peer-reviewed
✅ Vaswani attention (2017) - NeurIPS, peer-reviewed (note: doi shows 2025 reprint, original was 2017)
✅ Adam (Kingma 2014) - arXiv preprint widely cited, appropriate
✅ Horovod (Sergeev 2018) - arXiv preprint, appropriate
✅ DeepSpeed (Rasley 2020) - ACM KDD, peer-reviewed
✅ Historical ML (Rosenblatt 1958, Rumelhart 1986, LeCun 1998) - Classic papers, appropriate

Quality Issues:

2 corrupted entries (bruner1960, perkins1992) - CRITICAL FIX
2 industry reports (Robert Half, Keller) - risky for academic venues
2 uncited entries (thompson2008bloom, vygotsky1978mind) - should cite or remove

Overall Citation Quality: 7/10

Strong peer-reviewed foundation for learning theory and CS education
Good mix of classic and recent work
Technical citations are appropriate
MAJOR issues: 2 corrupted entries, industry sources for key motivation

9. Comparison Table: TinyTorch vs. What Reviewers Expect

Aspect	What You Have	What Reviewers Expect	Gap
Educational Frameworks	6 major frameworks cited (micrograd, MiniTorch, tinygrad, d2l.ai, fast.ai, CS231n)	All major educational frameworks	✅ GOOD
Learning Theory	8 theories cited (constructionism, cognitive apprenticeship, cognitive load, productive failure, threshold concepts, situated learning, + 2 corrupted)	Foundational learning theories with 3-5 key frameworks	✅ STRONG (once corrupted entries fixed)
Systems Thinking	Zero pedagogy citations despite 20+ mentions	At least 1-2 systems thinking education sources	❌ CRITICAL GAP
Compiler Course Model	Mentioned but not cited	Citation to compiler education when claiming to follow model	⚠️ GAP
Recent Work (2023-2024)	2-3 recent citations	5-10 recent citations showing awareness of field	⚠️ GAP
Workforce Motivation	2 industry reports	Academic sources or downplayed motivation	⚠️ RISKY
Scaffolding	Discussed but not directly cited	Vygotsky or Wood/Bruner/Ross when discussing scaffolding	⚠️ GAP
Assessment	NBGrader cited but no Bloom's taxonomy	Bloom's when discussing assessment levels	⚠️ MINOR GAP
Active Learning	Hands-on approach claimed but not grounded	Freeman et al. meta-analysis or similar	⚠️ MINOR GAP
Competitive Positioning	Fair, strategic, differentiated	Fair comparison acknowledging strengths	✅ EXCELLENT

10. Final Recommendations

10.1 Must-Do Before Submission (1-2 hours)

Fix corrupted bruner1960process entry - search for correct Bruner or Wood/Bruner/Ross scaffolding paper
Fix corrupted perkins1992transfer entry - search for correct Perkins & Salomon transfer of learning paper
Add systems thinking citation - Meadows (2008) "Thinking in Systems"
Add compiler course citation - Aho et al. (2006) "Compilers" (Dragon Book)
Cite vygotsky1978mind when discussing scaffolding/ZPD
Cite thompson2008bloom when discussing NBGrader assessment

10.2 Should-Do for Strong Submission (4-6 hours)

Search SIGCSE 2023-2024 for recent ML education papers - add 2-3 if found
Replace/supplement workforce citations with academic sources:
- Brynjolfsson & Mitchell (2017) Science
- Ransbotham et al. (2020) MIT Sloan
Add Freeman et al. (2014) active learning meta-analysis
Add OS course pedagogy (Arpaci-Dusseau OSTEP) to strengthen systems course comparison

10.3 Could-Do for Comprehensive Submission (2-4 hours)

Add progressive disclosure HCI literature (Nielsen or Mayer)
Add notional machines (Sorva 2013) for mental models
Add JAX discussion in related work (functional paradigm)
Search MLSys 2023-2024 for any education-focused papers

10.4 Reframing Recommendations

Current Introduction Opening:

"Machine learning deployment faces a critical workforce bottleneck: demand for ML systems engineers outstrips supply by over 3:1..."

Suggested Reframe (de-emphasize industry stats):

"Machine learning systems engineering requires tacit knowledge that resists formal instruction: understanding why Adam requires 2× optimizer state memory, when attention's O(N²) scaling becomes prohibitive, how to navigate accuracy-latency-memory tradeoffs in production. These engineering judgment calls depend on mental models of framework internals, traditionally acquired through years of debugging PyTorch or TensorFlow rather than formal courses~\citep{reddi2024mlsysbook}. While workforce demand for such skills has grown substantially~\citep{brynjolfsson2017machine}, current ML education..."

Why: Leads with the "tacit knowledge problem" (your strongest argument), supported by academic citation, with workforce as secondary motivation.

11. Success Metrics

After implementing recommendations, your citation strategy should score:

Criterion	Current	Target	Priority
Related Work Coverage	7/10	9/10	HIGH
Learning Theory Grounding	6/10 (corrupted entries)	9/10	CRITICAL
Systems Education Context	4/10	8/10	CRITICAL
Citation Balance	7/10	8/10	MEDIUM
Competitive Positioning	9/10	9/10	✅ MAINTAIN
Citation Quality	7/10 (corrupted entries)	9/10	CRITICAL
Recency	6/10	8/10	MEDIUM
Academic Rigor	7/10 (industry sources)	9/10	HIGH

Overall: 6.5/10 → 8.5/10 with recommended changes

12. Reviewer Likelihood of "Missing Citation" Comments

By Venue:

SIGCSE/ICER Submission:

HIGH RISK (>70% chance):
- "Why no recent SIGCSE/ICER work on ML education?"
- "Why no Bloom's taxonomy when discussing assessment?"
- "Why no scaffolding citations (Vygotsky, Wood/Bruner/Ross)?"
MEDIUM RISK (30-70%):
- "Why no systems thinking pedagogy despite systems-first claim?"
- "Why no active learning citations?"
- "Fix corrupted Bruner/Perkins entries"

MLSys/Systems Submission:

HIGH RISK (>70%):
- "Why no compiler course citations when comparing to compiler pedagogy?"
- "Why no systems thinking education citations?"
MEDIUM RISK (30-70%):
- "Why cite industry reports for workforce claims?"
- "Why no OS course pedagogy for 'build the whole system' claim?"

General CS Education Venues (e.g., TOCE, CSE):

HIGH RISK:
- Systems thinking pedagogy gap
- Recent work gap (2023-2024)
- Corrupted bib entries
MEDIUM RISK:
- Scaffolding citations
- Active learning foundations
- Progressive disclosure HCI literature

13. Citation Search Queries for Missing Work

To fill gaps, search these:

Google Scholar Queries:

Recent ML Education:
- "machine learning education" 2023 2024 SIGCSE
- "teaching machine learning" 2023 2024 ICER
- "ML systems education" 2023 2024
- "deep learning pedagogy" 2023 2024
Systems Thinking Education:
- "systems thinking" education pedagogy
- "teaching systems thinking" computer science
- Meadows "Thinking in Systems"
Compiler Course Pedagogy:
- "compiler course" pedagogy education
- "teaching compilers" incremental construction
- Aho "Compilers: Principles, Techniques, and Tools"
Scaffolding:
- Wood Bruner Ross 1976 scaffolding
- Vygotsky "zone of proximal development" education
AI Workforce (Academic):
- Brynjolfsson Mitchell 2017 machine learning workforce
- "AI skills gap" academic research
- "machine learning talent" shortage study
Active Learning:
- Freeman 2014 "active learning increases student performance"
Progressive Disclosure:
- Nielsen progressive disclosure
- Mayer multimedia learning progressive complexity

ACM Digital Library Queries:

Search SIGCSE 2023, 2024 proceedings: "machine learning" OR "deep learning" OR "neural networks"
Search ICER 2023, 2024 proceedings: "machine learning" OR "ML" OR "framework"

arXiv Queries:

cat:cs.CY "machine learning education" 2023
cat:cs.LG "teaching" OR "education" OR "pedagogy" 2023 2024

Conclusion

The TinyTorch paper has solid core citations for educational frameworks and learning theory, excellent competitive positioning, but critical gaps in systems thinking pedagogy, recent work, and two corrupted bib entries.

Priority actions:

Fix corrupted entries (bruner1960, perkins1992) - blocking issue
Add systems thinking (Meadows 2008) - supports main claim
Add compiler course reference (Aho 2006) - supports pedagogical model
Cite existing entries (vygotsky1978mind, thompson2008bloom)
Search for 2023-2024 ML education work - shows awareness of field
Replace/supplement workforce citations with academic sources

With these changes, the citation strategy moves from 6.5/10 to 8.5/10, significantly reducing reviewer pushback risk.

Files Referenced:

/Users/VJ/GitHub/TinyTorch/paper/paper.tex (1035 lines, ~40,000 words)
/Users/VJ/GitHub/TinyTorch/paper/references.bib (528 lines, 40 entries)

Assessment Completed: 2025-11-18 Reviewer: Dr. James Patterson (Literature Review Specialist)

34 KiB Raw Blame History Unescape Escape

TinyTorch Literature Review Assessment

Executive Summary

1. Related Work Coverage

1.1 Educational ML Frameworks - STRONG ✅

1.2 University Courses - GOOD BUT INCOMPLETE ⚠️

2. Learning Theory Grounding

2.1 Pedagogical Citations - STRONG BUT UNBALANCED ✅⚠️

2.2 CS Education Research - ADEQUATE ⚠️

3. Systems Education Context

3.1 Systems Thinking - WEAK ❌

3.2 ML Systems Workforce - PROBLEMATIC ⚠️❌

4. Citation Balance

4.1 Citation Distribution Analysis

4.2 Uncited Entries in references.bib

5. Competitive Positioning

5.1 Framework Comparison Table (Table 1) - EXCELLENT ✅

5.2 Detailed Framework Comparisons (Prose) - GOOD ✅

6. "What Are Reviewers Likely to Ask 'Why Didn't You Cite X?'"

6.1 High-Risk Omissions

6.2 Medium-Risk Omissions

6.3 Low-Risk Omissions (Nice to Have)

7. Strategic Recommendations

7.1 MUST FIX Before Submission (Critical Issues)

7.2 SHOULD ADD for Stronger Positioning (High Value)

7.3 COULD ADD for Completeness (Medium Value)

7.4 REMOVE or DOWNPLAY (Clutter Reduction)

8. Citation Quality Checklist

Educational Frameworks

Learning Theory

CS Education

Workforce/Industry

ML Systems

Technical ML

9. Comparison Table: TinyTorch vs. What Reviewers Expect

10. Final Recommendations

10.1 Must-Do Before Submission (1-2 hours)

10.2 Should-Do for Strong Submission (4-6 hours)

10.3 Could-Do for Comprehensive Submission (2-4 hours)

10.4 Reframing Recommendations

11. Success Metrics

12. Reviewer Likelihood of "Missing Citation" Comments

By Venue:

13. Citation Search Queries for Missing Work

Google Scholar Queries:

ACM Digital Library Queries:

arXiv Queries:

Conclusion

34 KiB

Raw Blame History