Merge transformer-training into dev

Complete Milestone 05 - 2017 Transformer implementation

Major Features:
- TinyTalks interactive dashboard with rich CLI
- Complete gradient flow fixes (13 tests passing)
- Multiple training examples (5-min, 10-min, levels 1-2)
- Milestone celebration card (perceptron style)
- Comprehensive documentation

Gradient Flow Fixes:
- Fixed reshape, matmul (3D), embedding, sqrt, mean, sub, div, GELU
- All transformer components now fully differentiable
- Hybrid attention approach for educational clarity + gradients
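
Illustration (not taken from this commit): the fixes themselves are not shown in this message. As a rough sketch of what "fully differentiable" means for one of the listed ops, the snippet below compares a hand-written GELU backward pass against a central-difference estimate. The names `gelu`, `gelu_grad`, and `numerical_grad` are hypothetical, and the tanh approximation of GELU is an assumption; the repository may use a different form.

```python
import numpy as np

# Hypothetical helpers for illustration only; not the repository's API.
SQRT_2_OVER_PI = np.sqrt(2.0 / np.pi)
COEFF = 0.044715  # constant from the tanh approximation of GELU (assumed form)

def gelu(x):
    """Forward pass: tanh approximation of GELU."""
    return 0.5 * x * (1.0 + np.tanh(SQRT_2_OVER_PI * (x + COEFF * x**3)))

def gelu_grad(x):
    """Analytic derivative of the tanh approximation above."""
    u = SQRT_2_OVER_PI * (x + COEFF * x**3)
    t = np.tanh(u)
    du_dx = SQRT_2_OVER_PI * (1.0 + 3.0 * COEFF * x**2)
    return 0.5 * (1.0 + t) + 0.5 * x * (1.0 - t**2) * du_dx

def numerical_grad(f, x, eps=1e-5):
    """Central-difference estimate; valid elementwise because f is elementwise."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

x = np.linspace(-3.0, 3.0, 13)
max_err = np.max(np.abs(gelu_grad(x) - numerical_grad(gelu, x)))
print(f"max |analytic - numerical| = {max_err:.2e}")
```

A small maximum error suggests the backward formula matches the forward pass; the same style of check could in principle be applied to the other listed ops.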

Training Results:
- 10-min training: 96.6% loss improvement, 62.5% accuracy
- 5-min training: 97.8% loss improvement, 66.7% accuracy
- Working chatbot with coherent responses

Files Added:
- tinytalks_dashboard.py (main demo)
- tinytalks_chatbot.py, tinytalks_dataset.py
- level1_memorization.py, level2_patterns.py
- Comprehensive docs and test suites

Ready for student use
Vijay Janapa Reddi
2025-10-30 17:48:11 -04:00
36 changed files with 7365 additions and 2240 deletions


@@ -3,17 +3,23 @@
{
"cell_type": "code",
"execution_count": null,
"id": "bbeed6a9",
"id": "c20728c2",
"metadata": {},
"outputs": [],
"source": [
"#| default_exp text.tokenization\n",
"#| export"
"#| export\n",
"\n",
"import numpy as np\n",
"from typing import List, Dict, Tuple, Optional, Set\n",
"import json\n",
"import re\n",
"from collections import defaultdict, Counter"
]
},
{
"cell_type": "markdown",
"id": "ab628a0c",
"id": "b005926e",
"metadata": {
"cell_marker": "\"\"\""
},
@@ -45,7 +51,7 @@
},
{
"cell_type": "markdown",
"id": "542171ad",
"id": "d5b93d34",
"metadata": {
"cell_marker": "\"\"\""
},
@@ -70,11 +76,10 @@
{
"cell_type": "code",
"execution_count": null,
"id": "6fe4fe02",
"id": "c89f5e86",
"metadata": {},
"outputs": [],
"source": [
"#| export\n",
"import numpy as np\n",
"from typing import List, Dict, Tuple, Optional, Set\n",
"import json\n",
@@ -87,7 +92,7 @@
},
{
"cell_type": "markdown",
"id": "ba7349a9",
"id": "c139104c",
"metadata": {
"cell_marker": "\"\"\""
},
@@ -144,7 +149,7 @@
},
{
"cell_type": "markdown",
"id": "c39ef970",
"id": "2446a382",
"metadata": {
"cell_marker": "\"\"\""
},
@@ -256,7 +261,7 @@
},
{
"cell_type": "markdown",
"id": "13b74a9d",
"id": "7b6f7e01",
"metadata": {
"cell_marker": "\"\"\""
},
@@ -268,7 +273,7 @@
},
{
"cell_type": "markdown",
"id": "e8613976",
"id": "6da9d664",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
@@ -290,7 +295,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "bb58a938",
"id": "07703775",
"metadata": {
"lines_to_next_cell": 1,
"nbgrader": {
@@ -353,7 +358,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "ddded2c2",
"id": "66f5edec",
"metadata": {
"nbgrader": {
"grade": true,
@@ -391,7 +396,7 @@
},
{
"cell_type": "markdown",
"id": "5f2f6599",
"id": "472f18d8",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
@@ -433,7 +438,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "bdba5211",
"id": "8413441a",
"metadata": {
"lines_to_next_cell": 1,
"nbgrader": {
@@ -571,7 +576,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "037f2a1b",
"id": "5268f9a8",
"metadata": {
"nbgrader": {
"grade": true,
@@ -622,7 +627,7 @@
},
{
"cell_type": "markdown",
"id": "6ba4ae7f",
"id": "389f7a3a",
"metadata": {
"cell_marker": "\"\"\""
},
@@ -638,7 +643,7 @@
},
{
"cell_type": "markdown",
"id": "1e93979f",
"id": "246bba99",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
@@ -729,7 +734,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "89452d55",
"id": "0190c2fc",
"metadata": {
"lines_to_next_cell": 1,
"nbgrader": {
@@ -1016,7 +1021,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "2ceb9e28",
"id": "3f7bd31f",
"metadata": {
"nbgrader": {
"grade": true,
@@ -1071,7 +1076,7 @@
},
{
"cell_type": "markdown",
"id": "8e51f1a4",
"id": "3baf97cf",
"metadata": {
"cell_marker": "\"\"\""
},
@@ -1102,7 +1107,7 @@
},
{
"cell_type": "markdown",
"id": "6d384f02",
"id": "0b06184b",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
@@ -1124,7 +1129,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "20ebcfe2",
"id": "8899f6cd",
"metadata": {
"lines_to_next_cell": 1,
"nbgrader": {
@@ -1236,7 +1241,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "3abc8dcd",
"id": "d4a23373",
"metadata": {
"nbgrader": {
"grade": true,
@@ -1281,7 +1286,7 @@
},
{
"cell_type": "markdown",
"id": "f8b901eb",
"id": "2771ad8d",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
@@ -1295,7 +1300,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "df2ae12e",
"id": "58050b9b",
"metadata": {
"nbgrader": {
"grade": false,
@@ -1346,7 +1351,7 @@
},
{
"cell_type": "markdown",
"id": "f23d4b98",
"id": "11fc9711",
"metadata": {
"cell_marker": "\"\"\""
},
@@ -1442,7 +1447,7 @@
},
{
"cell_type": "markdown",
"id": "a7c5816a",
"id": "a403fac4",
"metadata": {
"cell_marker": "\"\"\"",
"lines_to_next_cell": 1
@@ -1456,7 +1461,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "2f3cfd32",
"id": "4e0168d9",
"metadata": {
"nbgrader": {
"grade": true,
@@ -1548,7 +1553,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "9d68a974",
"id": "2761d570",
"metadata": {},
"outputs": [],
"source": [
@@ -1560,7 +1565,7 @@
},
{
"cell_type": "markdown",
"id": "b7885211",
"id": "92d46fdb",
"metadata": {
"cell_marker": "\"\"\""
},
@@ -1592,7 +1597,7 @@
},
{
"cell_type": "markdown",
"id": "1c62fd5c",
"id": "0bb8fde5",
"metadata": {
"cell_marker": "\"\"\""
},


@@ -15,6 +15,12 @@
#| default_exp text.tokenization
#| export
import numpy as np
from typing import List, Dict, Tuple, Optional, Set
import json
import re
from collections import defaultdict, Counter
# %% [markdown]
"""
# Module 10: Tokenization - Converting Text to Numbers