TinyTorch/modules/15_acceleration/module.yaml
Vijay Janapa Reddi 45a9cef548 Major reorganization: Remove setup module, renumber all modules, add tito setup command and numeric shortcuts
- Removed 01_setup module (archived to archive/setup_module)
- Renumbered all modules: tensor is now 01, activations is 02, etc.
- Added tito setup command for environment setup and package installation
- Added numeric shortcuts: tito 01, tito 02, etc. for quick module access
- Fixed view command to find dev files correctly
- Updated module dependencies and references
- Improved user experience: immediate ML learning instead of boring setup
2025-09-28 07:02:08 -04:00

assessment:
- Understand why naive loops have poor cache performance
- Implement cache-friendly blocked matrix multiplication that shows 10-50x speedups over naive loops
- Recognize why NumPy provides 100x+ speedups over hand-written loop implementations
- Build a backend system that automatically chooses the optimal implementation
- 'Apply the ''free speedup'' principle: use better tools, don''t write faster code'
description: 'Master the easiest optimization: using better backends! Learn why naive
loops are slow, how cache-friendly blocking helps, and why NumPy provides 100x+
speedups.'
difficulty: Advanced
estimated_time: 3-4 hours
exports:
- matmul_naive
- matmul_blocked
- matmul_numpy
- OptimizedBackend
- matmul
- set_backend
learning_objectives:
- Understand CPU cache hierarchy and memory access performance bottlenecks
- Implement cache-friendly blocked matrix multiplication algorithms
- Build vectorized operations with optimized memory access patterns
- Design transparent backend systems for automatic optimization selection
- Measure and quantify real performance improvements scientifically
- Apply systems thinking to optimization decisions in ML workflows
name: acceleration
prerequisites:
- 'Module 1: Tensor operations and NumPy fundamentals'
- 'Module 3: Linear layers and matrix multiplication'
- Understanding of basic algorithmic complexity (O notation)
tags:
- performance
- optimization
- systems
- hardware
- acceleration
- cache
- vectorization
- backends
title: Hardware Acceleration - The Simplest Optimization