Converted all paragraph headings to bold text format for consistent styling throughout the document. This improves visual consistency and follows the requested formatting guidelines. Changes: - Paper Organization (introduction) - Build/Use/Reflect cycle descriptions - Why Milestones Matter - The Six Historical Milestones - Experiencing Performance Reality - All future work subsection headings (Roofline Models, ASTRA-sim, Energy and Power Profiling, The Three-Tier Systems Pedagogy) Table 3 remains correctly positioned in Systems Integration section where performance trade-offs are discussed. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
34 KiB
TinyTorch Literature Review Assessment
Dr. James Patterson - Research Literature Expert
Date: 2025-11-18
Paper Analyzed: /Users/VJ/GitHub/TinyTorch/paper/paper.tex
References: /Users/VJ/GitHub/TinyTorch/paper/references.bib
Executive Summary
The TinyTorch paper demonstrates solid core coverage of educational ML frameworks and learning theory, but has critical gaps in three areas that could hurt in peer review:
- CORRUPTED BIBLIOGRAPHY ENTRIES - Two citations have completely wrong metadata (bruner1960, perkins1992)
- Missing Recent Work - No 2023-2024 ML systems education research cited
- Weak Industry Evidence - Workforce gap claims rely solely on non-academic sources
Overall Assessment: 7/10 citation strategy. Strong pedagogical grounding, fair competitive positioning, but needs 5-7 strategic additions and 2 critical fixes before submission.
1. Related Work Coverage
1.1 Educational ML Frameworks - STRONG ✅
What's Cited:
- micrograd (Karpathy 2022) - autograd minimalism
- MiniTorch (Rush 2020) - Cornell Tech curriculum
- tinygrad (Hotz 2023) - inspectable production system
- d2l.ai (Zhang 2021) - comprehensive algorithmic foundations
- fast.ai (Howard 2020) - top-down layered API approach
Assessment:
- All major direct competitors cited ✅
- Comparisons are fair and strategic - acknowledge strengths while positioning TinyTorch's unique systems-first angle
- micrograd comparison is accurate (scalar-only, no systems focus)
- MiniTorch comparison is respectful and differentiating (math rigor vs. systems-first)
- tinygrad comparison correctly identifies scaffolding gap
What's MISSING:
- JAX educational materials - Only cited as code (bradbury2018jax) but not discussed in related work despite JAX being increasingly used for educational purposes (functional paradigm, explainability)
- nnfs.io (Neural Networks from Scratch) - Harrison Kinsley's book/course, similar bottom-up approach but focuses more on algorithms than systems
- PyTorch tutorials evolution - The official PyTorch tutorials have become quite pedagogical, worth brief mention
Recommendation:
- Add brief mention of JAX's functional approach in Related Work:
"JAX~\citep{bradbury2018jax} offers an alternative functional paradigm through composable transformations, teaching automatic differentiation via
jax.gradand vectorization throughvmap. While pedagogically valuable for understanding functional programming applied to ML, JAX assumes framework usage rather than framework construction, positioning it complementary to but distinct from TinyTorch's build-from-scratch approach."
1.2 University Courses - GOOD BUT INCOMPLETE ⚠️
What's Cited:
- Stanford CS231n (Johnson 2016) - CNNs course with NumPy assignments
- CMU DL Systems (Chen 2022) - production ML systems course
- Harvard TinyML (Banbury 2021) - embedded deployment focus
Assessment:
- CS231n citation is appropriate - isolated exercises vs. cumulative framework
- CMU DL Systems positioned correctly as "advanced follow-on" to TinyTorch
- TinyML comparison is EXCELLENT - clear differentiation (edge deployment vs. framework internals)
What's MISSING:
- Berkeley CS182/282A (Deep Learning) - Widely adopted course, some from-scratch assignments
- Full Stack Deep Learning (FSDL) - Production ML course, natural comparison point
- MIT 6.S191 - Introductory deep learning, large MOOC audience
- Other Cornell courses beyond MiniTorch
Verdict: Adequate coverage of major courses. CS231n and CMU DL Systems are the most important, both cited. Berkeley/MIT would strengthen "landscape coverage" but not critical.
2. Learning Theory Grounding
2.1 Pedagogical Citations - STRONG BUT UNBALANCED ✅⚠️
What's Cited:
- Constructionism - Papert 1980 ✅
- Cognitive Apprenticeship - Collins 1989 ✅ (cited 3x, most frequent learning theory)
- Cognitive Load - Sweller 1988 ✅
- Productive Failure - Kapur 2008 ✅ (cited 3x)
- Threshold Concepts - Meyer 2003 ✅
- Situated Learning - Lave & Wenger 1991 ✅
Assessment: This is exceptionally strong grounding in learning theory. The six theories cited cover:
- Knowledge construction (Papert, Collins)
- Cognitive constraints (Sweller, Kapur)
- Conceptual transformation (Meyer)
- Contextual learning (Lave & Wenger)
Citation frequencies reveal emphasis:
- Cognitive Apprenticeship (3x) - heavily emphasized
- Productive Failure (3x) - heavily emphasized
- Others (1x each) - mentioned but less central
CRITICAL PROBLEMS:
-
CORRUPTED bruner1960process entry:
@article{bruner1960process, author = {Frolli, A and Cerciello, F...}, # WRONG AUTHORS title = {Narrative Approach and Mentalization.}, # WRONG TITLE year = {1960}, # But doi is from 2023!This appears to be searching for Bruner's scaffolding work but got wrong paper. FIX IMMEDIATELY - should cite:
- Bruner, J. S. (1960). "The Process of Education" - classic scaffolding work
- OR Wood, Bruner, Ross (1976). "The role of tutoring in problem solving" - original scaffolding paper
-
CORRUPTED perkins1992transfer entry:
@article{perkins1992transfer, author = {Burstein, R and Henry, NJ...}, # WRONG - this is about infant mortality title = {Mapping 123 million neonatal, infant and child deaths...}, # COMPLETELY WRONG journal = {Nature}, year = {1992}, # But doi says 2019!This is cited ONCE but appears to be looking for transfer of learning work. Should be:
- Perkins, D. N., & Salomon, G. (1992). "Transfer of learning" - International encyclopedia of education
What's MISSING:
-
Bloom's Taxonomy - Paper mentions "systems analysis" and "evaluation" levels but doesn't cite Bloom
- You have
thompson2008bloomin bib but NEVER CITED in paper! - This is about CS assessment using Bloom's - should use it when discussing NBGrader assessment
- You have
-
Scaffolding - You discuss scaffolding 5+ times but only cite it via Cognitive Apprenticeship
- Missing direct scaffolding citations: Wood, Bruner, Ross (1976) or Vygotsky (1978) ZPD
- You HAVE vygotsky1978mind in bib but NEVER CITE IT!
-
Active Learning - You discuss peer instruction, engagement, but no citations
- Missing: Freeman et al. (2014) PNAS meta-analysis on active learning effectiveness
- Missing: Mazur (1997) Peer Instruction
-
Chunking/Progressive Disclosure - You claim this as innovation but don't cite HCI/progressive disclosure literature
- Missing: Nielsen (1993) progressive disclosure in UI design
- Missing: Mayer (2009) multimedia learning principles
Recommendations:
CRITICAL FIXES (must do before submission):
- Fix bruner1960process - find correct Bruner scaffolding citation
- Fix perkins1992transfer - find correct transfer learning citation
- Cite thompson2008bloom when discussing NBGrader assessment levels
- Cite vygotsky1978mind when discussing scaffolding/ZPD
STRATEGIC ADDITIONS (strengthen theoretical foundation): 5. Add Freeman et al. (2014) active learning meta-analysis to justify hands-on approach 6. Add progressive disclosure HCI literature (Nielsen or Mayer) to ground the progressive disclosure pattern claim
2.2 CS Education Research - ADEQUATE ⚠️
What's Cited:
- NBGrader (Blank 2019) ✅
- Learner-Centered Design (Guzdial 2015) ✅
- Peer Instruction (Porter 2013) ✅
- Auto-grading review (Ihantola 2010) ✅
- Teaching OOP (Kölling 2001) ✅
- CS Ed Research (Fincher 2004) ✅
Assessment:
- Good breadth of CS education foundations
- NBGrader citation is essential - correctly used
- Guzdial citation adds learner-centered credibility
- Porter peer instruction supports engagement claims
What's MISSING:
- Computing education research methodology - If you're making pedagogical claims, should cite:
- Robins et al. (2003) "Learning and Teaching Programming: A Review and Discussion"
- Bennedsen & Caspersen (2007) on failure rates in CS1
- Notional machines - Your "mental models of framework internals" aligns with notional machine concept:
- Sorva (2013) "Notional Machines and Introductory Programming Education"
- SIGCSE/ICER recent work on ML education - Missing 2023-2024 papers from top CS Ed venues
Recommendation:
- Consider adding notional machine citation when discussing "mental models"
- Conduct targeted search for 2023-2024 SIGCSE/ICER papers on ML education
3. Systems Education Context
3.1 Systems Thinking - WEAK ❌
What's Cited:
- TVM compiler (Chen 2018) ✅
- PyTorch autograd (Paszke 2017) ✅
- Roofline model (Williams 2009) ✅ (only in future work!)
- ASTRA-sim (Chakkaravarthy 2023) ✅ (only in future work!)
Assessment: This is the weakest area of the related work. You claim "systems-first" as a major contribution but barely ground it in systems education literature.
CRITICAL GAPS:
-
No systems thinking pedagogy citations
- You use "systems thinking" 20+ times but cite NO systems thinking education research
- Missing: Meadows (2008) "Thinking in Systems" - foundational
- Missing: Senge (1990) "The Fifth Discipline" - organizational learning
- Missing: Richmond (1993) "Systems thinking: critical thinking skills for the 1990s and beyond"
-
No compiler course pedagogogy
- You compare to "compiler course model" but cite NO compiler education papers
- Missing: Aho et al. (2006) "Compilers: Principles, Techniques, and Tools" (Dragon Book)
- Missing: Appel (1998) "Modern Compiler Implementation" series
- Should cite actual compiler courses that use incremental construction
-
No operating systems pedagogy
- TinyTorch's "build the whole system" mirrors OS courses (xv6, Pintos, Nachos)
- Missing: Arpaci-Dusseau (2018) "Operating Systems: Three Easy Pieces"
- Missing: Ousterhout (1999) "Teaching Operating Systems Using Log-Based Kernels"
-
No software engineering education
- Package organization, module dependencies, integration - all SE concepts
- Missing: any SE education citations
What's PARTIALLY There:
- ML Systems papers - You cite:
- MLSys book (Reddi 2024) ✅ - good!
- FlashAttention (Dao 2022) ✅ - technical paper, not educational
- Horovod, DeepSpeed - technical papers, not educational
Recommendation - ADD IMMEDIATELY:
Systems Thinking Foundation:
@book{meadows2008thinking,
author = {Meadows, Donella H.},
title = {Thinking in Systems: A Primer},
year = {2008},
publisher = {Chelsea Green Publishing}
}
Compiler Course Model:
@book{aho2006compilers,
author = {Aho, Alfred V. and Lam, Monica S. and Sethi, Ravi and Ullman, Jeffrey D.},
title = {Compilers: Principles, Techniques, and Tools},
edition = {2nd},
year = {2006},
publisher = {Addison-Wesley}
}
OS Course Model:
@book{arpaci2018operating,
author = {Arpaci-Dusseau, Remzi H. and Arpaci-Dusseau, Andrea C.},
title = {Operating Systems: Three Easy Pieces},
year = {2018},
publisher = {Arpaci-Dusseau Books}
}
Then add paragraph in Related Work:
"TinyTorch's incremental system construction draws pedagogical inspiration from compiler courses~\citep{aho2006compilers} and operating systems courses~\citep{arpaci2018operating}, where students build complete systems (compilers from lexer to code generator, OS kernels from process management to file systems) to develop systems thinking~\citep{meadows2008thinking} through component integration. This ``build the whole stack'' approach has proven effective for teaching complex systems concepts in CS education."
3.2 ML Systems Workforce - PROBLEMATIC ⚠️❌
What's Cited:
- Robert Half (2024) - industry report ⚠️
- Keller Executive Search (2025) - industry report ⚠️
Assessment: The workforce gap motivates the entire paper, but you rely on non-peer-reviewed industry sources. This is risky for academic venues.
Problems:
- These are recruiting firms, not research organizations
- No academic backing for "3:1 demand-supply ratio"
- No academic backing for "only 150,000 skilled practitioners"
- "78% year-over-year growth" might be inflated by industry hype
What's MISSING:
Academic sources on ML workforce:
- ACM/IEEE computing workforce reports
- Bureau of Labor Statistics data on ML engineer demand
- Academic papers on AI skills gap:
- Ransbotham et al. (2020) MIT Sloan on AI adoption barriers
- Brynjolfsson & Mitchell (2017) on AI workforce requirements
- Industry-academic collaboration papers
- Survey papers from CACM, IEEE Computer, or similar
Recommendation - CRITICAL for SIGCSE/Educational Venues:
Either:
- Downplay workforce claims - Make it secondary motivation, not primary
- Add academic backing - Find peer-reviewed sources on AI skills gap
- Reframe as "tacit knowledge problem" - This is your strongest argument and doesn't need industry stats
Suggested Academic Citations:
@article{ransbotham2020expanding,
author = {Ransbotham, Sam and Kiron, David and Gerbert, Philipp and Reeves, Martin},
title = {Expanding AI's Impact with Organizational Learning},
journal = {MIT Sloan Management Review},
year = {2020}
}
@article{brynjolfsson2017machine,
author = {Brynjolfsson, Erik and Mitchell, Tom},
title = {What can machine learning do? Workforce implications},
journal = {Science},
volume = {358},
number = {6370},
pages = {1530--1534},
year = {2017}
}
4. Citation Balance
4.1 Citation Distribution Analysis
Total unique citations: ~40 entries in references.bib Actually cited in paper: ~35-38 (some in bib never cited)
By Category:
- Educational Frameworks: 6 citations (micrograd, MiniTorch, tinygrad, d2l.ai, fast.ai, CS231n)
- Learning Theory: 8 citations (Papert, Collins, Sweller, Kapur, Meyer, Lave, Bruner*, Perkins*)
- CS Education: 6 citations (NBGrader, Guzdial, Porter, Ihantola, Kölling, Fincher)
- ML Systems: 4 citations (Reddi MLSys book, Chen DL Systems, Williams Roofline, ASTRA-sim)
- Technical ML: 10 citations (PyTorch, TensorFlow, autograd papers, optimization papers, FlashAttention, etc.)
- Workforce/Industry: 2 citations (Robert Half, Keller)
- Other: 4 citations (MLPerf, CIFAR, historical ML papers)
Assessment:
Over-cited Areas:
- Technical ML papers - You cite lots of algorithm papers (Vaswani attention, Kingma Adam, etc.) but these are for milestone validation, not related work. This is fine but heavy.
Under-cited Areas:
- ML Systems Education - Only 2 real educational systems citations (Reddi book, Chen course)
- Systems Thinking Pedagogy - Zero citations despite 20+ mentions
- Recent Work (2023-2024) - Only 2-3 recent citations, rest are 2020 or older
Citation Age Distribution:
- 2024-2025: 2 citations (Reddi MLSys book, Keller workforce)
- 2022-2023: 4 citations (micrograd, tinygrad, Chen DL Systems, ASTRA-sim)
- 2018-2021: 10 citations (NBGrader, fast.ai, d2l.ai, JAX, etc.)
- 2000-2017: 15 citations (learning theory classics, foundational ML)
- Pre-2000: 5 citations (Papert, Sweller, Bruner, etc.)
Verdict:
- Good balance between classic theory (pre-2000) and recent work (2020+)
- TOO FEW 2023-2024 citations for a 2025 paper - looks outdated
- Learning theory foundations are appropriately older (Papert 1980, Sweller 1988)
4.2 Uncited Entries in references.bib
Entries in bib but NEVER cited in paper:
thompson2008bloom- Bloom's taxonomy for CS assessment ❗ SHOULD CITEvygotsky1978mind- Mind in Society, scaffolding/ZPD ❗ SHOULD CITEbradbury2018jax- JAX framework - mentioned in text but not formally cited- Several technical papers (Baydin autograd survey, Chen gradient checkpointing, etc.)
Recommendation:
- Remove uncited technical papers from bib (clutter)
- ADD citations for thompson2008bloom and vygotsky1978mind in appropriate sections
5. Competitive Positioning
5.1 Framework Comparison Table (Table 1) - EXCELLENT ✅
Assessment: This table is pedagogically brilliant and strategically sound:
Strengths:
- Separates educational vs. production frameworks clearly
- Five comparison dimensions are well-chosen: Purpose, Scope, Systems Focus, Target Outcome
- TinyTorch row is bolded - appropriate emphasis
- Fair to competitors - doesn't strawman
Strategic Positioning:
- micrograd: acknowledged strength (understand backprop) while showing limitation (no systems)
- MiniTorch: respectful ("build from first principles") while differentiating (embedded systems from M01)
- tinygrad: acknowledges sophistication ("understand compilation") while showing gap (scaffolding)
- PyTorch/TensorFlow: correctly positioned as "what comes after" TinyTorch
Minor Suggestions:
- Consider adding "Assessment" column to highlight NBGrader infrastructure advantage
- Consider adding "Hardware Requirements" to emphasize CPU-only accessibility
5.2 Detailed Framework Comparisons (Prose) - GOOD ✅
micrograd comparison (line 379):
"pedagogical clarity comes from intentional minimalism... necessarily omits systems concerns"
Verdict: ✅ Fair and strategic. Acknowledges strength (clarity) while identifying gap (no systems).
MiniTorch comparison (line 381):
"core curriculum emphasizes mathematical rigor... TinyTorch differs through systems-first emphasis"
Verdict: ✅ Respectful differentiation. Doesn't claim MiniTorch is bad, just different focus.
tinygrad comparison (line 383):
"pedagogically valuable through its inspectable design, tinygrad assumes significant background"
Verdict: ✅ Acknowledges value while identifying accessibility gap.
d2l.ai comparison (line 387):
"excels at algorithmic understanding through framework usage"
Verdict: ✅ Generous acknowledgment of widespread adoption and algorithmic strength.
fast.ai comparison (line 387):
"distinctive top-down pedagogy starts with practical applications"
Verdict: ✅ Accurately describes complementary approach without criticism.
Overall Competitive Positioning: STRONG. Fair, strategic, differentiated without being aggressive.
6. "What Are Reviewers Likely to Ask 'Why Didn't You Cite X?'"
6.1 High-Risk Omissions
SIGCSE/ICER Reviewers Will Ask About:
-
"Why no recent SIGCSE/ICER papers on ML education?"
- Risk: HIGH for SIGCSE submission
- You cite nothing from 2023-2024 SIGCSE/ICER
- Need: Search SIGCSE 2023, 2024, ICER 2023, 2024 for ML education papers
-
"Why no systems thinking education citations despite 'systems-first' claim?"
- Risk: HIGH for any venue
- You use "systems thinking" 20+ times but cite zero systems thinking pedagogy
- Need: Meadows (2008) or similar foundational systems thinking work
-
"Why no Bloom's taxonomy when discussing assessment?"
- Risk: MEDIUM for education venues
- You have thompson2008bloom in bib but don't cite it
- Need: Cite when discussing NBGrader assessment levels
-
"Why no scaffolding citations when you discuss scaffolding 5+ times?"
- Risk: MEDIUM for education venues
- You have vygotsky1978mind in bib but don't cite it
- Need: Cite Vygotsky or Wood/Bruner/Ross when discussing scaffolding
MLSys/Technical Reviewers Will Ask About:
-
"Why no compiler course citations when you compare to 'compiler course model'?"
- Risk: MEDIUM for MLSys
- You claim to follow compiler course pedagogy but cite no compiler education
- Need: Dragon Book or equivalent compiler course reference
-
"Why cite industry reports (Robert Half, Keller) for workforce claims?"
- Risk: MEDIUM for academic venues
- These aren't peer-reviewed sources
- Need: Academic sources on AI skills gap OR downplay workforce motivation
General Academic Reviewers Will Ask About:
- "Why no recent work? Most citations are 2020 or older."
- Risk: LOW-MEDIUM for 2025 submission
- Only 2-3 citations from 2023-2024
- Need: More recent ML education / ML systems education work
6.2 Medium-Risk Omissions
-
"Why no active learning meta-analysis when you claim hands-on effectiveness?"
- Freeman et al. (2014) PNAS is the canonical citation for active learning superiority
- Risk: LOW-MEDIUM
-
"Why no progressive disclosure HCI literature when you claim it as innovation?"
- Nielsen or Mayer on progressive disclosure in UI/learning
- Risk: LOW-MEDIUM
-
"Why no notional machines when discussing mental models?"
- Sorva (2013) on notional machines in programming education
- Risk: LOW
6.3 Low-Risk Omissions (Nice to Have)
- Berkeley CS182, MIT 6.S191, Full Stack Deep Learning courses
- nnfs.io (Neural Networks from Scratch book/course)
- More JAX educational materials discussion
- Software engineering education (package design, modularity)
- Educational data mining / learning analytics for assessing NBGrader
7. Strategic Recommendations
7.1 MUST FIX Before Submission (Critical Issues)
Priority 1: Fix Corrupted Bib Entries ⚠️⚠️⚠️
- Fix
bruner1960process- Currently has wrong paper (Narrative/Mentalization)- Find: Bruner "The Process of Education" (1960) OR Wood, Bruner, Ross (1976) scaffolding paper
- Fix
perkins1992transfer- Currently has infant mortality paper- Find: Perkins & Salomon (1992) "Transfer of learning"
Priority 2: Cite What You Already Have
3. Cite thompson2008bloom when discussing NBGrader assessment (Section 4.4)
4. Cite vygotsky1978mind when discussing scaffolding/ZPD (Section 3.2.2 Related Work)
Priority 3: Add Systems Thinking Foundation 5. Add Meadows (2008) "Thinking in Systems" - cite when using "systems thinking" 6. Add compiler course reference (Dragon Book or equivalent) - cite when comparing to "compiler course model"
7.2 SHOULD ADD for Stronger Positioning (High Value)
Priority 4: Recent ML Education Work 7. Search SIGCSE 2023, 2024 proceedings for ML education papers 8. Search ICER 2023, 2024 proceedings for ML/programming education papers 9. Search MLSys 2023, 2024 for any education-track papers
Priority 5: Strengthen Workforce Motivation 10. Replace or supplement Robert Half/Keller with academic sources: - Brynjolfsson & Mitchell (2017) Science paper on ML workforce - Ransbotham et al. (2020) MIT Sloan on AI adoption barriers - OR downplay workforce statistics, emphasize "tacit knowledge problem"
Priority 6: Active Learning Foundation 11. Add Freeman et al. (2014) active learning meta-analysis to justify hands-on approach
7.3 COULD ADD for Completeness (Medium Value)
Priority 7: Progressive Disclosure Grounding 12. Add Nielsen (1993) or Mayer (2009) on progressive disclosure in learning/UI
Priority 8: Notional Machines 13. Add Sorva (2013) on notional machines when discussing "mental models of internals"
Priority 9: OS Course Pedagogy 14. Add OS education reference (OSTEP or similar) to strengthen "build the whole system" comparison
7.4 REMOVE or DOWNPLAY (Clutter Reduction)
Priority 10: Clean Up Bib 15. Remove uncited technical papers from references.bib (e.g., Baydin autograd survey if not cited) 16. Remove or integrate technical milestone papers (Rosenblatt, Rumelhart, LeCun) - currently just used for historical validation, not related work
8. Citation Quality Checklist
Going through your current citations:
Educational Frameworks
- ✅ micrograd (Karpathy 2022) - GitHub repo, appropriate
- ✅ MiniTorch (Rush 2020) - Cornell Tech, appropriate
- ✅ tinygrad (Hotz 2023) - GitHub repo, appropriate
- ✅ d2l.ai (Zhang 2021) - arXiv + Cambridge Press, peer-reviewed equivalent
- ✅ fast.ai (Howard 2020) - Published in Information journal, peer-reviewed
- ✅ CS231n (Johnson 2016) - Course webpage, appropriate for course citation
Learning Theory
- ✅ Papert (1980) - Classic, appropriate
- ✅ Collins (1989) - Routledge chapter, peer-reviewed
- ✅ Sweller (1988) - Cognitive Science journal, peer-reviewed
- ✅ Kapur (2008) - Cognition and Instruction, peer-reviewed
- ✅ Meyer (2003) - Book chapter, peer-reviewed
- ✅ Lave & Wenger (1991) - Cambridge University Press, peer-reviewed
- ❌ Bruner (1960) - CORRUPTED ENTRY - fix immediately
- ❌ Perkins (1992) - CORRUPTED ENTRY - fix immediately
- ⚠️ Vygotsky (1978) - IN BIB BUT NEVER CITED - should cite
CS Education
- ✅ NBGrader (Blank 2019) - JOSE, peer-reviewed
- ✅ Guzdial (2015) - Morgan & Claypool, peer-reviewed
- ✅ Porter (2013) - ACM SIGCSE, peer-reviewed
- ✅ Ihantola (2010) - Koli Calling, peer-reviewed
- ✅ Kölling (2001) - ITiCSE, peer-reviewed
- ✅ Fincher (2004) - Taylor & Francis, peer-reviewed
- ⚠️ Thompson (2008) - IN BIB BUT NEVER CITED - should cite
Workforce/Industry
- ⚠️ Robert Half (2024) - Industry press release, not peer-reviewed
- ⚠️ Keller (2025) - Industry report, not peer-reviewed
ML Systems
- ✅ Reddi MLSys book (2024) - IEEE conference, peer-reviewed
- ✅ Chen DL Systems (2022) - CMU course, appropriate
- ✅ Williams Roofline (2009) - CACM, peer-reviewed
- ✅ ASTRA-sim (2020, 2023) - IEEE ISPASS, peer-reviewed
Technical ML
- ✅ PyTorch autograd (Paszke 2017) - NIPS workshop, appropriate
- ✅ TVM (Chen 2018) - OSDI, peer-reviewed
- ✅ FlashAttention (Dao 2022) - NeurIPS, peer-reviewed
- ✅ Vaswani attention (2017) - NeurIPS, peer-reviewed (note: doi shows 2025 reprint, original was 2017)
- ✅ Adam (Kingma 2014) - arXiv preprint widely cited, appropriate
- ✅ Horovod (Sergeev 2018) - arXiv preprint, appropriate
- ✅ DeepSpeed (Rasley 2020) - ACM KDD, peer-reviewed
- ✅ Historical ML (Rosenblatt 1958, Rumelhart 1986, LeCun 1998) - Classic papers, appropriate
Quality Issues:
- 2 corrupted entries (bruner1960, perkins1992) - CRITICAL FIX
- 2 industry reports (Robert Half, Keller) - risky for academic venues
- 2 uncited entries (thompson2008bloom, vygotsky1978mind) - should cite or remove
Overall Citation Quality: 7/10
- Strong peer-reviewed foundation for learning theory and CS education
- Good mix of classic and recent work
- Technical citations are appropriate
- MAJOR issues: 2 corrupted entries, industry sources for key motivation
9. Comparison Table: TinyTorch vs. What Reviewers Expect
| Aspect | What You Have | What Reviewers Expect | Gap |
|---|---|---|---|
| Educational Frameworks | 6 major frameworks cited (micrograd, MiniTorch, tinygrad, d2l.ai, fast.ai, CS231n) | All major educational frameworks | ✅ GOOD |
| Learning Theory | 8 theories cited (constructionism, cognitive apprenticeship, cognitive load, productive failure, threshold concepts, situated learning, + 2 corrupted) | Foundational learning theories with 3-5 key frameworks | ✅ STRONG (once corrupted entries fixed) |
| Systems Thinking | Zero pedagogy citations despite 20+ mentions | At least 1-2 systems thinking education sources | ❌ CRITICAL GAP |
| Compiler Course Model | Mentioned but not cited | Citation to compiler education when claiming to follow model | ⚠️ GAP |
| Recent Work (2023-2024) | 2-3 recent citations | 5-10 recent citations showing awareness of field | ⚠️ GAP |
| Workforce Motivation | 2 industry reports | Academic sources or downplayed motivation | ⚠️ RISKY |
| Scaffolding | Discussed but not directly cited | Vygotsky or Wood/Bruner/Ross when discussing scaffolding | ⚠️ GAP |
| Assessment | NBGrader cited but no Bloom's taxonomy | Bloom's when discussing assessment levels | ⚠️ MINOR GAP |
| Active Learning | Hands-on approach claimed but not grounded | Freeman et al. meta-analysis or similar | ⚠️ MINOR GAP |
| Competitive Positioning | Fair, strategic, differentiated | Fair comparison acknowledging strengths | ✅ EXCELLENT |
10. Final Recommendations
10.1 Must-Do Before Submission (1-2 hours)
- Fix corrupted bruner1960process entry - search for correct Bruner or Wood/Bruner/Ross scaffolding paper
- Fix corrupted perkins1992transfer entry - search for correct Perkins & Salomon transfer of learning paper
- Add systems thinking citation - Meadows (2008) "Thinking in Systems"
- Add compiler course citation - Aho et al. (2006) "Compilers" (Dragon Book)
- Cite vygotsky1978mind when discussing scaffolding/ZPD
- Cite thompson2008bloom when discussing NBGrader assessment
10.2 Should-Do for Strong Submission (4-6 hours)
- Search SIGCSE 2023-2024 for recent ML education papers - add 2-3 if found
- Replace/supplement workforce citations with academic sources:
- Brynjolfsson & Mitchell (2017) Science
- Ransbotham et al. (2020) MIT Sloan
- Add Freeman et al. (2014) active learning meta-analysis
- Add OS course pedagogy (Arpaci-Dusseau OSTEP) to strengthen systems course comparison
10.3 Could-Do for Comprehensive Submission (2-4 hours)
- Add progressive disclosure HCI literature (Nielsen or Mayer)
- Add notional machines (Sorva 2013) for mental models
- Add JAX discussion in related work (functional paradigm)
- Search MLSys 2023-2024 for any education-focused papers
10.4 Reframing Recommendations
Current Introduction Opening:
"Machine learning deployment faces a critical workforce bottleneck: demand for ML systems engineers outstrips supply by over 3:1..."
Suggested Reframe (de-emphasize industry stats):
"Machine learning systems engineering requires tacit knowledge that resists formal instruction: understanding why Adam requires 2× optimizer state memory, when attention's O(N²) scaling becomes prohibitive, how to navigate accuracy-latency-memory tradeoffs in production. These engineering judgment calls depend on mental models of framework internals, traditionally acquired through years of debugging PyTorch or TensorFlow rather than formal courses~\citep{reddi2024mlsysbook}. While workforce demand for such skills has grown substantially~\citep{brynjolfsson2017machine}, current ML education..."
Why: Leads with the "tacit knowledge problem" (your strongest argument), supported by academic citation, with workforce as secondary motivation.
11. Success Metrics
After implementing recommendations, your citation strategy should score:
| Criterion | Current | Target | Priority |
|---|---|---|---|
| Related Work Coverage | 7/10 | 9/10 | HIGH |
| Learning Theory Grounding | 6/10 (corrupted entries) | 9/10 | CRITICAL |
| Systems Education Context | 4/10 | 8/10 | CRITICAL |
| Citation Balance | 7/10 | 8/10 | MEDIUM |
| Competitive Positioning | 9/10 | 9/10 | ✅ MAINTAIN |
| Citation Quality | 7/10 (corrupted entries) | 9/10 | CRITICAL |
| Recency | 6/10 | 8/10 | MEDIUM |
| Academic Rigor | 7/10 (industry sources) | 9/10 | HIGH |
Overall: 6.5/10 → 8.5/10 with recommended changes
12. Reviewer Likelihood of "Missing Citation" Comments
By Venue:
SIGCSE/ICER Submission:
- HIGH RISK (>70% chance):
- "Why no recent SIGCSE/ICER work on ML education?"
- "Why no Bloom's taxonomy when discussing assessment?"
- "Why no scaffolding citations (Vygotsky, Wood/Bruner/Ross)?"
- MEDIUM RISK (30-70%):
- "Why no systems thinking pedagogy despite systems-first claim?"
- "Why no active learning citations?"
- "Fix corrupted Bruner/Perkins entries"
MLSys/Systems Submission:
- HIGH RISK (>70%):
- "Why no compiler course citations when comparing to compiler pedagogy?"
- "Why no systems thinking education citations?"
- MEDIUM RISK (30-70%):
- "Why cite industry reports for workforce claims?"
- "Why no OS course pedagogy for 'build the whole system' claim?"
General CS Education Venues (e.g., TOCE, CSE):
- HIGH RISK:
- Systems thinking pedagogy gap
- Recent work gap (2023-2024)
- Corrupted bib entries
- MEDIUM RISK:
- Scaffolding citations
- Active learning foundations
- Progressive disclosure HCI literature
13. Citation Search Queries for Missing Work
To fill gaps, search these:
Google Scholar Queries:
-
Recent ML Education:
"machine learning education" 2023 2024 SIGCSE"teaching machine learning" 2023 2024 ICER"ML systems education" 2023 2024"deep learning pedagogy" 2023 2024
-
Systems Thinking Education:
"systems thinking" education pedagogy"teaching systems thinking" computer scienceMeadows "Thinking in Systems"
-
Compiler Course Pedagogy:
"compiler course" pedagogy education"teaching compilers" incremental constructionAho "Compilers: Principles, Techniques, and Tools"
-
Scaffolding:
Wood Bruner Ross 1976 scaffoldingVygotsky "zone of proximal development" education
-
AI Workforce (Academic):
Brynjolfsson Mitchell 2017 machine learning workforce"AI skills gap" academic research"machine learning talent" shortage study
-
Active Learning:
Freeman 2014 "active learning increases student performance"
-
Progressive Disclosure:
Nielsen progressive disclosureMayer multimedia learning progressive complexity
ACM Digital Library Queries:
- Search SIGCSE 2023, 2024 proceedings:
"machine learning" OR "deep learning" OR "neural networks" - Search ICER 2023, 2024 proceedings:
"machine learning" OR "ML" OR "framework"
arXiv Queries:
cat:cs.CY "machine learning education" 2023cat:cs.LG "teaching" OR "education" OR "pedagogy" 2023 2024
Conclusion
The TinyTorch paper has solid core citations for educational frameworks and learning theory, excellent competitive positioning, but critical gaps in systems thinking pedagogy, recent work, and two corrupted bib entries.
Priority actions:
- Fix corrupted entries (bruner1960, perkins1992) - blocking issue
- Add systems thinking (Meadows 2008) - supports main claim
- Add compiler course reference (Aho 2006) - supports pedagogical model
- Cite existing entries (vygotsky1978mind, thompson2008bloom)
- Search for 2023-2024 ML education work - shows awareness of field
- Replace/supplement workforce citations with academic sources
With these changes, the citation strategy moves from 6.5/10 to 8.5/10, significantly reducing reviewer pushback risk.
Files Referenced:
/Users/VJ/GitHub/TinyTorch/paper/paper.tex(1035 lines, ~40,000 words)/Users/VJ/GitHub/TinyTorch/paper/references.bib(528 lines, 40 entries)
Assessment Completed: 2025-11-18 Reviewer: Dr. James Patterson (Literature Review Specialist)