
🧠 Perceptron (1957) - Rosenblatt

What This Demonstrates

The first trainable neural network in history! This example uses YOUR TinyTorch implementations to recreate Rosenblatt's pioneering perceptron.

Prerequisites

Complete these TinyTorch modules first:

  • Module 02 (Tensor) - Data structures with gradients
  • Module 03 (Activations) - Sigmoid activation
  • Module 04 (Layers) - Linear layer

🚀 Quick Start

# Run the perceptron training
python rosenblatt_perceptron.py

# Test architecture only
python rosenblatt_perceptron.py --test-only

# Custom epochs
python rosenblatt_perceptron.py --epochs 200

📊 Dataset Information

Synthetic Linearly Separable Data

  • Generated: 1,000 points in 2D space
  • Classes: Binary (0 or 1)
  • Property: Linearly separable by design
  • No Download Required: Data generated on-the-fly

Why Synthetic Data?

The perceptron can only solve linearly separable problems. We generate data that's guaranteed to be separable to demonstrate the algorithm works when its assumptions are met.
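One way such data can be generated is to label points by which side of a line they fall on, then shift the two classes apart so a separating margin is guaranteed. This is a hypothetical sketch; the actual generator in rosenblatt_perceptron.py may differ.

```python
import numpy as np

# Hypothetical sketch of generating linearly separable 2D data;
# the real script's generator may use different parameters.
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(1000, 2))     # 1,000 points in 2D
y = (X[:, 0] + X[:, 1] > 0).astype(int)        # label by side of the line x1 + x2 = 0
X += 0.3 * (2 * y[:, None] - 1)                # shift classes apart to guarantee a margin
```

Because every class-1 point is pushed up-right and every class-0 point down-left, the two classes are separable by construction.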

🏗️ Architecture

Input (x1, x2) → Linear (2→1) → Sigmoid → Binary Output

Simple but revolutionary - this proved machines could learn!
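The forward pass above can be written out in plain NumPy; the real example routes the same math through YOUR TinyTorch Tensor, Linear, and Sigmoid modules instead.

```python
import numpy as np

# Plain-NumPy sketch of the forward pass: Linear (2 -> 1) -> Sigmoid.
def forward(x, w, b):
    z = x @ w + b                        # Linear layer: weighted sum plus bias
    return 1.0 / (1.0 + np.exp(-z))      # Sigmoid squashes z into (0, 1)

x = np.array([0.5, -0.2])                # one 2D input point
w = np.array([1.0, 1.0])                 # 2 weights
b = 0.0                                  # 1 bias (3 parameters total)
prob = forward(x, w, b)                  # probability of class 1
pred = int(prob > 0.5)                   # threshold for the binary output
```

The example weights here are illustrative; training finds the values that actually separate the data.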

📈 Expected Results

  • Training Time: ~30 seconds
  • Accuracy: 95%+ (problem is linearly separable)
  • Parameters: Just 3 (2 weights + 1 bias)

💡 Historical Significance

  • 1957: Rosenblatt demonstrates the first trainable neural network
  • Innovation: Weights that adjust based on errors
  • Limitation: Can't solve XOR (see xor_1969 example)
  • Legacy: Foundation for all modern neural networks
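The innovation of error-driven weight adjustment is easy to sketch. This is the classic 1957-style update rule (hard threshold, no gradients), shown for illustration; the training script itself uses gradient-based updates through YOUR TinyTorch layers.

```python
import numpy as np

# Rosenblatt's error-driven rule: when a prediction is wrong,
# nudge the weights and bias toward the correct answer.
def perceptron_step(w, b, x, y, lr=0.1):
    y_hat = int(np.dot(w, x) + b > 0)    # hard-threshold prediction
    error = y - y_hat                    # -1, 0, or +1
    return w + lr * error * x, b + lr * error

# Tiny separable dataset: class depends on the sign of x1 + x2.
data = [(np.array([1.0, 1.0]), 1), (np.array([-1.0, -1.0]), 0),
        (np.array([0.5, 1.0]), 1), (np.array([-1.0, -0.5]), 0)]
w, b = np.zeros(2), 0.0
for _ in range(20):                      # a few passes converge on this data
    for x, y in data:
        w, b = perceptron_step(w, b, x, y)
```

On separable data like this, the perceptron convergence theorem guarantees the loop reaches zero errors in finitely many updates.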

🔧 Command Line Options

  • --test-only: Test architecture without training
  • --epochs N: Number of training epochs (default: 100)

📚 What You Learn

  • How the first neural network worked
  • Why gradients enable learning
  • How YOUR Linear layer performs the same math as the 1957 original
  • Limitations that led to multi-layer networks
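Why gradients enable learning can be seen in miniature: with a sigmoid output and binary cross-entropy loss, stepping opposite the gradient drives the loss down on every iteration. A minimal sketch, assuming a small hand-picked batch rather than the example's full dataset:

```python
import numpy as np

# Gradient descent on sigmoid + binary cross-entropy for a tiny batch.
X = np.array([[1.0, 1.0], [-1.0, -1.0], [0.5, 1.0], [-1.0, -0.5]])
y = np.array([1.0, 0.0, 1.0, 0.0])
w, b = np.zeros(2), 0.0

def bce_loss(w, b):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

loss_before = bce_loss(w, b)
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    grad_z = p - y                       # dL/dz for sigmoid + cross-entropy
    w -= 0.5 * (X.T @ grad_z) / len(y)   # step opposite the gradient
    b -= 0.5 * grad_z.mean()
loss_after = bce_loss(w, b)              # lower than loss_before
```

TinyTorch's autograd computes the same gradient automatically via backward() instead of the hand-derived `p - y` used here.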