---
format:
  html:
    title: "Introduction to Machine Learning Systems"
    date: today
    date-format: long
    doi: "v0.5.1"
    doi-title: "Version"
    author:
      name: Vijay Janapa Reddi
      email: vj@eecs.harvard.edu
      url: https://vijay.seas.harvard.edu
      affiliation: Harvard University
---

::: {.content-visible unless-format="html:js"}

# Author's Note {.unnumbered}

::: {style="font-style: italic;"}

The world is rushing to build AI systems. It is not yet engineering them. Closing that gap is what we mean by AI engineering. Who designs the training infrastructure? Who builds serving systems that scale? Who optimizes models to run on a phone or a sensor? Who architects the accelerators those models execute on? That work is AI engineering: the discipline of building efficient, reliable, safe, and robust intelligent systems that operate in the real world, not just models in isolation. And it is not yet recognized as a discipline.

By most industry estimates, the vast majority of AI projects never reach production or fail to deliver the value they promised. The failures are not exotic: a model degrades silently because no one built monitoring for distribution shift; a system that worked in the lab fails at the edge because latency requirements were never communicated to the team selecting architectures; a deployment pipeline breaks because the team retrained a model without versioning the data and cannot reproduce last week's results. These are not research problems. They are engineering problems, and they recur because the field lacks the shared principles, vocabulary, and training that a discipline provides.

Part of the gap is fragmentation. The knowledge to build ML systems exists, but it is scattered across silos. One course teaches compilers. Another covers hardware architecture. A third focuses on data pipelines. A fourth introduces the mathematics of optimization. Each is valuable on its own, but none of them teaches the *system*---and the result is engineers who understand individual components but not how they fit together. Nobody learns how a computer works by studying the processor in one course, the memory hierarchy in another, and the interconnect in a third, without ever building a complete machine. A computer is an integrated machine; understanding it means understanding where bottlenecks form between components, where trade-offs hide, and where one design decision constrains every other. An ML system is no different. Data pipelines, model architectures, software frameworks, and accelerator hardware constrain each other and fail together. The engineer who understands only one piece will build systems that break at the seams.

This book provides that missing integration. Not a survey of tools, not a collection of recipes, but the enduring principles that govern ML systems regardless of which framework or accelerator is fashionable this year. We do not teach operating systems by starting with distributed operating systems or embedded operating systems; we teach the fundamentals---scheduling, memory management, concurrency---and then specialization follows. That philosophy guides this book. Whether you are building a model that runs on a microcontroller or training one across a thousand GPUs, the principles are the same---memory bandwidth limits data movement, arithmetic intensity governs compute utilization, and the interplay between data, models, and hardware determines what is practical to build. The scale changes; the principles do not.

The existence of such principles is precisely what makes a discipline possible. Software engineering and computer engineering each emerged when practitioners recognized that building reliable systems required its own body of knowledge, distinct from the science that produced the underlying ideas. AI engineering is at that same inflection point. The best textbooks I encountered as a student, in both traditions, did not teach me subjects; they taught me how to think. They took vast, messy landscapes and revealed the structure underneath. That is what I have tried to do here.

--- Vijay Janapa Reddi

:::

:::

::: {.content-visible when-format="html:js"}

# Welcome {.unnumbered}

```{=html}
<div class="abstract-section">
  <div class="abstract-content">
    <p>Machine learning has evolved from a research discipline into an engineering practice. Building systems that learn from data requires more than understanding algorithms—it demands expertise spanning data pipelines, model development, optimization for deployment constraints, and operational practices. This book introduces AI engineering: the discipline of building ML systems that work in the real world. The treatment covers four areas: foundations (system characteristics, development workflows), building (deep learning mathematics, architectures, framework internals), optimization (compression, hardware acceleration, benchmarking), and deployment (serving infrastructure, operations, responsible engineering). The emphasis throughout is on engineering trade-offs and quantitative analysis.</p>
  </div>

  <a href="assets/downloads/Machine-Learning-Systems-Vol1.pdf" target="_blank" class="book-card-link" title="Download PDF">
    <div class="book-card">
      <img src="../../assets/images/covers/cover-hardcover-book.png" alt="Machine Learning Systems Book Cover" class="book-image" />
      <p class="book-title">Introduction to Machine Learning Systems</p>
      <p class="book-subtitle">Publisher: The MIT Press (2026)</p>
      <p style="font-size: 0.8em; color: #6c757d; margin-top: 6px; margin-bottom: 0;">📖 Click here to download PDF</p>
    </div>
  </a>
</div>
```

## What You Will Learn {.unnumbered}

The book progresses through four stages:

- **Part I: Foundations** — Build your conceptual foundation with mental models that underpin all effective systems work.
- **Part II: Build** — Engineer complete workflows from data pipelines through training infrastructure.
- **Part III: Optimize** — Transform theoretical understanding into systems that run efficiently in resource-constrained environments.
- **Part IV: Deploy** — Navigate serving, operations, and responsible engineering practices.

## Prerequisites {.unnumbered}

This book assumes:

- **Programming proficiency** in Python with familiarity in NumPy
- **Mathematics foundations** in linear algebra, calculus, and probability at the undergraduate level
- **Prior ML experience** is helpful but not required; @sec-neural-computation provides essential background

## Support Our Mission {.unnumbered}

```{=html}
<div class="support-mission">
  <p><strong>2026 Goal:</strong> Help 100,000 students learn ML Systems. Sponsors like the <a href="https://edgeaifoundation.org/" target="_blank" rel="noopener noreferrer">EDGE AI Foundation</a> match every star with funding that supports learning.</p>

  <div class="support-actions">
    <span class="star-count" id="star-count">Loading...</span>
    <a href="https://github.com/harvard-edge/cs249r_book" target="_blank" rel="noopener" class="github-star-btn">⭐ Star on GitHub</a>
  </div>

  <p class="support-note">
    <a href="https://opencollective.com/mlsysbook" target="_blank" rel="noopener">Support us on Open Collective →</a>
  </p>
</div>
```

```{=html}
<script>
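async function fetchGitHubStars() {
  const starElement = document.getElementById('star-count');
  // Nothing to update if the placeholder is missing from the page.
  if (!starElement) return;

  try {
    const response = await fetch('https://api.github.com/repos/harvard-edge/cs249r_book');
    // fetch() resolves even on HTTP errors (e.g., rate limiting), so check the status explicitly.
    if (!response.ok) {
      throw new Error('GitHub API responded with status ' + response.status);
    }
    const data = await response.json();
    const starCount = data.stargazers_count;
    const formattedCount = starCount.toLocaleString();
    starElement.textContent = formattedCount;
  } catch (error) {
    console.error('Failed to fetch GitHub stars:', error);
    // Replace the "Loading..." placeholder so the page does not appear stuck.
    starElement.textContent = 'Star count unavailable';
  } finally {
    starElement.style.opacity = '1';
  }
}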

document.addEventListener('DOMContentLoaded', fetchGitHubStars);
</script>
```

## Continue Your Journey {.unnumbered}

```{=html}
<div style="background: linear-gradient(135deg, #f8f9fa 0%, #e9ecef 100%); padding: 1.5rem; border-radius: 8px; margin: 1rem 0;">
</div>
```

## Listen to the AI Podcast {.unnumbered}

```{=html}
<div class="podcast-section">
  <p>
    This short podcast, created with Google's NotebookLM and inspired by insights from our <a href="https://web.eng.fiu.edu/gaquan/Papers/ESWEEK24Papers/CPS-Proceedings/pdfs/CODES-ISSS/563900a043/563900a043.pdf" target="_blank" rel="noopener">IEEE education viewpoint paper</a>, offers an accessible overview of the book's key ideas and themes.
  </p>
  <audio controls="controls">
    <source src="../../assets/media/notebooklm_podcast_mlsysbookai.mp3" type="audio/mpeg" />
    Your browser does not support the audio element.
  </audio>
</div>
```

## Want to Help Out? {.unnumbered}

This is a collaborative project, and your input matters. If you'd like to contribute, check out our [contribution guidelines](https://github.com/harvard-edge/cs249r_book/blob/dev/docs/contribute.md). Feedback, corrections, and new ideas are welcome. Simply file a GitHub [issue](https://github.com/harvard-edge/cs249r_book/issues).

:::