{ "cells": [ { "cell_type": "markdown", "id": "a84f5309", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "# Setup - TinyTorch System Configuration\n", "\n", "Welcome to TinyTorch! This setup module configures your personal TinyTorch installation and teaches you the NBGrader workflow.\n", "\n", "## Learning Goals\n", "- Configure your personal TinyTorch installation with custom information\n", "- Learn to query system information using Python modules\n", "- Master the NBGrader workflow: implement โ†’ test โ†’ export\n", "- Create functions that become part of your tinytorch package\n", "- Understand solution blocks, hidden tests, and automated grading\n", "\n", "## The Big Picture: Why Configuration Matters in ML Systems\n", "Configuration is the foundation of any production ML system. In this module, you'll learn:\n", "\n", "### 1. **System Awareness**\n", "Real ML systems need to understand their environment:\n", "- **Hardware constraints**: Memory, CPU cores, GPU availability\n", "- **Software dependencies**: Python version, library compatibility\n", "- **Platform differences**: Linux servers, macOS development, Windows deployment\n", "\n", "### 2. **Reproducibility**\n", "Configuration enables reproducible ML:\n", "- **Environment documentation**: Exactly what system was used\n", "- **Dependency management**: Precise versions and requirements\n", "- **Debugging support**: System info helps troubleshoot issues\n", "\n", "### 3. **Professional Development**\n", "Proper configuration shows engineering maturity:\n", "- **Attribution**: Your work is properly credited\n", "- **Collaboration**: Others can understand and extend your setup\n", "- **Maintenance**: Systems can be updated and maintained\n", "\n", "### 4. 
**ML Systems Context**\n", "This connects to broader ML engineering:\n", "- **Model deployment**: Different environments need different configs\n", "- **Monitoring**: System metrics help track performance\n", "- **Scaling**: Understanding hardware helps optimize training\n", "\n", "Let's build the foundation of your ML systems engineering skills!" ] }, { "cell_type": "code", "execution_count": null, "id": "b608e2e6", "metadata": { "nbgrader": { "grade": false, "grade_id": "setup-imports", "locked": false, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "#| default_exp core.setup\n", "\n", "#| export\n", "import sys\n", "import platform\n", "import psutil\n", "import os\n", "from typing import Dict, Any" ] }, { "cell_type": "code", "execution_count": null, "id": "427aefa2", "metadata": { "nbgrader": { "grade": false, "grade_id": "setup-imports", "locked": false, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "print(\"๐Ÿ”ฅ TinyTorch Setup Module\")\n", "print(f\"Python version: {sys.version_info.major}.{sys.version_info.minor}\")\n", "print(f\"Platform: {platform.system()}\")\n", "print(\"Ready to configure your TinyTorch installation!\")" ] }, { "cell_type": "markdown", "id": "946074ef", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "## ๐Ÿ—๏ธ The Architecture of ML Systems Configuration\n", "\n", "### Configuration Layers in Production ML\n", "Real ML systems have multiple configuration layers:\n", "\n", "```\n", "โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”\n", "โ”‚ Application Config โ”‚ โ† Your personal info\n", "โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค\n", "โ”‚ System Environment โ”‚ โ† Hardware specs\n", "โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค\n", 
"โ”‚ Runtime Configuration โ”‚ โ† Python, libraries\n", "โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค\n", "โ”‚ Infrastructure Config โ”‚ โ† Cloud, containers\n", "โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜\n", "```\n", "\n", "### Why Each Layer Matters\n", "- **Application**: Identifies who built what and when\n", "- **System**: Determines performance characteristics and limitations\n", "- **Runtime**: Affects compatibility and feature availability\n", "- **Infrastructure**: Enables scaling and deployment strategies\n", "\n", "### Connection to Real ML Frameworks\n", "Every major ML framework has configuration:\n", "- **PyTorch**: `torch.cuda.is_available()`, `torch.get_num_threads()`\n", "- **TensorFlow**: `tf.config.list_physical_devices()`, `tf.sysconfig.get_build_info()`\n", "- **Hugging Face**: Model cards with system requirements and performance metrics\n", "- **MLflow**: Experiment tracking with system context and reproducibility\n", "\n", "### TinyTorch's Approach\n", "We'll build configuration that's:\n", "- **Educational**: Teaches system awareness\n", "- **Practical**: Actually useful for debugging\n", "- **Professional**: Follows industry standards\n", "- **Extensible**: Ready for future ML systems features" ] }, { "cell_type": "markdown", "id": "b2bb27d7", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "## Step 1: What is System Configuration?\n", "\n", "### Definition\n", "**System configuration** is the process of setting up your development environment with personalized information and system diagnostics. 
In TinyTorch, this means:\n", "\n", "- **Personal Information**: Your name, email, institution for identification\n", "- **System Information**: Hardware specs, Python version, platform details\n", "- **Customization**: Making your TinyTorch installation uniquely yours\n", "\n", "### Why Configuration Matters in ML Systems\n", "Proper system configuration is crucial because:\n", "\n", "#### 1. **Reproducibility** \n", "Your setup can be documented and shared:\n", "```python\n", "# Someone else can recreate your environment\n", "config = {\n", " 'developer': 'Your Name',\n", " 'python_version': '3.9.7',\n", " 'platform': 'Darwin',\n", " 'memory_gb': 16.0\n", "}\n", "```\n", "\n", "#### 2. **Debugging**\n", "System info helps troubleshoot ML performance issues:\n", "- **Memory errors**: \"Do I have enough RAM for this model?\"\n", "- **Performance issues**: \"How many CPU cores can I use?\"\n", "- **Compatibility problems**: \"What Python version am I running?\"\n", "\n", "#### 3. **Professional Development**\n", "Shows proper engineering practices:\n", "- **Attribution**: Your work is properly credited\n", "- **Collaboration**: Others can contact you about your code\n", "- **Documentation**: System context is preserved\n", "\n", "#### 4. **ML Systems Integration**\n", "Connects to broader ML engineering:\n", "- **Model cards**: Document system requirements\n", "- **Experiment tracking**: Record hardware context\n", "- **Deployment**: Match development to production environments\n", "\n", "### Real-World Examples\n", "- **Google Colab**: Shows GPU type, RAM, disk space\n", "- **Kaggle**: Displays system specs for reproducibility\n", "- **MLflow**: Tracks system context with experiments\n", "- **Docker**: Containerizes entire system configuration\n", "\n", "Let's start configuring your TinyTorch system!" 
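] }, { "cell_type": "markdown", "id": "a1b2c3d0", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "### A Quick Sketch: Querying Your Environment\n", "\n", "As a minimal, hedged sketch of the ideas above (for exploration only, not part of the graded exercises; the helper name `print_env_snapshot` is invented for this example):\n", "\n", "```python\n", "import sys\n", "import platform\n", "\n", "def print_env_snapshot():\n", "    # Record the runtime so others can reproduce your environment\n", "    version = f\"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}\"\n", "    print(f\"Python:   {version}\")\n", "    print(f\"Platform: {platform.system()} ({platform.machine()})\")\n", "\n", "print_env_snapshot()\n", "```\n", "\n", "Running this in any Python session gives a short, copy-pasteable record of the interpreter and OS."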
] }, { "cell_type": "markdown", "id": "26b13500", "metadata": { "cell_marker": "\"\"\"", "lines_to_next_cell": 1 }, "source": [ "## Step 2: Personal Information Configuration\n", "\n", "### The Concept: Identity in ML Systems\n", "Your **personal information** identifies you as the developer and configures your TinyTorch installation. This isn't just administrative - it's foundational to professional ML development.\n", "\n", "### Why Personal Info Matters in ML Engineering\n", "\n", "#### 1. **Attribution and Accountability**\n", "- **Model ownership**: Who built this model?\n", "- **Responsibility**: Who should be contacted about issues?\n", "- **Credit**: Proper recognition for your work\n", "\n", "#### 2. **Collaboration and Communication**\n", "- **Team coordination**: Multiple developers on ML projects\n", "- **Knowledge sharing**: Others can learn from your work\n", "- **Bug reports**: Contact info for issues and improvements\n", "\n", "#### 3. **Professional Standards**\n", "- **Industry practice**: All professional software has attribution\n", "- **Open source**: Proper credit in shared code\n", "- **Academic integrity**: Clear authorship in research\n", "\n", "#### 4. 
**System Customization**\n", "- **Personalized experience**: Your TinyTorch installation\n", "- **Unique identification**: Distinguish your work from others\n", "- **Development tracking**: Link code to developer\n", "\n", "### Real-World Parallels\n", "- **Git commits**: Author name and email in every commit\n", "- **Docker images**: Maintainer information in container metadata\n", "- **Python packages**: Author info in `setup.py` and `pyproject.toml`\n", "- **Model cards**: Creator information for ML models\n", "\n", "### Best Practices for Personal Configuration\n", "- **Use real information**: Not placeholders or fake data\n", "- **Professional email**: Accessible and appropriate\n", "- **Descriptive system name**: Unique and meaningful\n", "- **Consistent formatting**: Follow established conventions\n", "\n", "Now let's implement your personal configuration!" ] }, { "cell_type": "code", "execution_count": null, "id": "ae4d2930", "metadata": { "lines_to_next_cell": 1, "nbgrader": { "grade": false, "grade_id": "personal-info", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "#| export\n", "def personal_info() -> Dict[str, str]:\n", " \"\"\"\n", " Return personal information for this TinyTorch installation.\n", " \n", " This function configures your personal TinyTorch installation with your identity.\n", " It's the foundation of proper ML engineering practices - every system needs\n", " to know who built it and how to contact them.\n", " \n", " TODO: Implement personal information configuration.\n", " \n", " STEP-BY-STEP IMPLEMENTATION:\n", " 1. Create a dictionary with your personal details\n", " 2. Include all required keys: developer, email, institution, system_name, version\n", " 3. Use your actual information (not placeholder text)\n", " 4. Make system_name unique and descriptive\n", " 5. 
Keep version as '1.0.0' for now\n", " \n", " EXAMPLE OUTPUT:\n", " {\n", " 'developer': 'Vijay Janapa Reddi',\n", " 'email': 'vj@eecs.harvard.edu', \n", " 'institution': 'Harvard University',\n", " 'system_name': 'VJ-TinyTorch-Dev',\n", " 'version': '1.0.0'\n", " }\n", " \n", " IMPLEMENTATION HINTS:\n", " - Replace the example with your real information\n", " - Use a descriptive system_name (e.g., 'YourName-TinyTorch-Dev')\n", " - Keep email format valid (contains @ and domain)\n", " - Make sure all values are strings\n", " - Consider how this info will be used in debugging and collaboration\n", " \n", " LEARNING CONNECTIONS:\n", " - This is like the 'author' field in Git commits\n", " - Similar to maintainer info in Docker images\n", " - Parallels author info in Python packages\n", " - Foundation for professional ML development\n", " \"\"\"\n", " ### BEGIN SOLUTION\n", " return {\n", " 'developer': 'Vijay Janapa Reddi',\n", " 'email': 'vj@eecs.harvard.edu',\n", " 'institution': 'Harvard University',\n", " 'system_name': 'VJ-TinyTorch-Dev',\n", " 'version': '1.0.0'\n", " }\n", " ### END SOLUTION" ] }, { "cell_type": "markdown", "id": "3e8b5d05", "metadata": { "cell_marker": "\"\"\"", "lines_to_next_cell": 1 }, "source": [ "## Step 3: System Information Queries\n", "\n", "### The Concept: Hardware-Aware ML Systems\n", "**System information** provides details about your hardware and software environment. This is crucial for ML development because machine learning is fundamentally about computation, and computation depends on hardware.\n", "\n", "### Why System Information Matters in ML Engineering\n", "\n", "#### 1. **Performance Optimization**\n", "- **CPU cores**: Determines parallelization strategies\n", "- **Memory**: Limits batch size and model size\n", "- **Architecture**: Affects numerical precision and optimization\n", "\n", "#### 2. 
**Compatibility and Debugging**\n", "- **Python version**: Determines available features and libraries\n", "- **Platform**: Affects file paths, process management, and system calls\n", "- **Architecture**: Influences numerical behavior and optimization\n", "\n", "#### 3. **Resource Planning**\n", "- **Training time estimation**: More cores = faster training\n", "- **Memory requirements**: Avoid out-of-memory errors\n", "- **Deployment matching**: Development should match production\n", "\n", "#### 4. **Reproducibility**\n", "- **Environment documentation**: Exact system specifications\n", "- **Performance comparison**: Same code, different hardware\n", "- **Bug reproduction**: System-specific issues\n", "\n", "### The Python System Query Toolkit\n", "You'll learn to use these essential Python modules:\n", "\n", "#### `sys.version_info` - Python Version\n", "```python\n", "version_info = sys.version_info\n", "python_version = f\"{version_info.major}.{version_info.minor}.{version_info.micro}\"\n", "# Example: \"3.9.7\"\n", "```\n", "\n", "#### `platform.system()` - Operating System\n", "```python\n", "platform_name = platform.system()\n", "# Examples: \"Darwin\" (macOS), \"Linux\", \"Windows\"\n", "```\n", "\n", "#### `platform.machine()` - CPU Architecture\n", "```python\n", "architecture = platform.machine()\n", "# Examples: \"x86_64\", \"arm64\", \"aarch64\"\n", "```\n", "\n", "#### `psutil.cpu_count()` - CPU Cores\n", "```python\n", "cpu_count = psutil.cpu_count()\n", "# Example: 8 (cores available for parallel processing)\n", "```\n", "\n", "#### `psutil.virtual_memory().total` - Total RAM\n", "```python\n", "memory_bytes = psutil.virtual_memory().total\n", "memory_gb = round(memory_bytes / (1024**3), 1)\n", "# Example: 16.0 GB\n", "```\n", "\n", "### Real-World Applications\n", "- **PyTorch**: `torch.get_num_threads()` uses CPU count\n", "- **TensorFlow**: `tf.config.list_physical_devices()` queries hardware\n", "- **Scikit-learn**: `n_jobs=-1` uses all 
available cores\n", "- **Dask**: Automatically configures workers based on CPU count\n", "\n", "### ML Systems Performance Considerations\n", "- **Memory-bound operations**: Matrix multiplication, large model loading\n", "- **CPU-bound operations**: Data preprocessing, feature engineering\n", "- **I/O-bound operations**: Data loading, model saving\n", "- **Platform-specific optimizations**: SIMD instructions, memory management\n", "\n", "Now let's implement system information queries!" ] }, { "cell_type": "code", "execution_count": null, "id": "f1607388", "metadata": { "lines_to_next_cell": 1, "nbgrader": { "grade": false, "grade_id": "system-info", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "#| export\n", "def system_info() -> Dict[str, Any]:\n", " \"\"\"\n", " Query and return system information for this TinyTorch installation.\n", " \n", " This function gathers crucial hardware and software information that affects\n", " ML performance, compatibility, and debugging. It's the foundation of \n", " hardware-aware ML systems.\n", " \n", " TODO: Implement system information queries.\n", " \n", " STEP-BY-STEP IMPLEMENTATION:\n", " 1. Get Python version using sys.version_info\n", " 2. Get platform using platform.system()\n", " 3. Get architecture using platform.machine()\n", " 4. Get CPU count using psutil.cpu_count()\n", " 5. Get memory using psutil.virtual_memory().total\n", " 6. Convert memory from bytes to GB (divide by 1024^3)\n", " 7. 
Return all information in a dictionary\n", " \n", " EXAMPLE OUTPUT:\n", " {\n", " 'python_version': '3.9.7',\n", " 'platform': 'Darwin', \n", " 'architecture': 'arm64',\n", " 'cpu_count': 8,\n", " 'memory_gb': 16.0\n", " }\n", " \n", " IMPLEMENTATION HINTS:\n", " - Use f-string formatting for Python version: f\"{major}.{minor}.{micro}\"\n", " - Memory conversion: bytes / (1024^3) = GB\n", " - Round memory to 1 decimal place for readability\n", " - Make sure data types are correct (strings for text, int for cpu_count, float for memory_gb)\n", " \n", " LEARNING CONNECTIONS:\n", " - This is like `torch.cuda.is_available()` in PyTorch\n", " - Similar to system info in MLflow experiment tracking\n", " - Parallels hardware detection in TensorFlow\n", " - Foundation for performance optimization in ML systems\n", " \n", " PERFORMANCE IMPLICATIONS:\n", " - cpu_count affects parallel processing capabilities\n", " - memory_gb determines maximum model and batch sizes\n", " - platform affects file system and process management\n", " - architecture influences numerical precision and optimization\n", " \"\"\"\n", " ### BEGIN SOLUTION\n", " # Get Python version\n", " version_info = sys.version_info\n", " python_version = f\"{version_info.major}.{version_info.minor}.{version_info.micro}\"\n", " \n", " # Get platform information\n", " platform_name = platform.system()\n", " architecture = platform.machine()\n", " \n", " # Get CPU information\n", " cpu_count = psutil.cpu_count()\n", " \n", " # Get memory information (convert bytes to GB)\n", " memory_bytes = psutil.virtual_memory().total\n", " memory_gb = round(memory_bytes / (1024**3), 1)\n", " \n", " return {\n", " 'python_version': python_version,\n", " 'platform': platform_name,\n", " 'architecture': architecture,\n", " 'cpu_count': cpu_count,\n", " 'memory_gb': memory_gb\n", " }\n", " ### END SOLUTION" ] }, { "cell_type": "markdown", "id": "3671c633", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "## ๐Ÿงช Testing Your 
Configuration Functions\n", "\n", "### The Importance of Testing in ML Systems\n", "Before we test your implementation, let's understand why testing is crucial in ML systems:\n", "\n", "#### 1. **Reliability**\n", "- **Function correctness**: Does your code do what it's supposed to?\n", "- **Edge case handling**: What happens with unexpected inputs?\n", "- **Error detection**: Catch bugs before they cause problems\n", "\n", "#### 2. **Reproducibility**\n", "- **Consistent behavior**: Same inputs always produce same outputs\n", "- **Environment validation**: Ensure setup works across different systems\n", "- **Regression prevention**: New changes don't break existing functionality\n", "\n", "#### 3. **Professional Development**\n", "- **Code quality**: Well-tested code is maintainable code\n", "- **Collaboration**: Others can trust and extend your work\n", "- **Documentation**: Tests serve as executable documentation\n", "\n", "#### 4. **ML-Specific Concerns**\n", "- **Data validation**: Ensure data types and shapes are correct\n", "- **Performance verification**: Check that optimizations work\n", "- **System compatibility**: Verify cross-platform behavior\n", "\n", "### Testing Strategy\n", "We'll use comprehensive testing that checks:\n", "- **Return types**: Are outputs the correct data types?\n", "- **Required fields**: Are all expected keys present?\n", "- **Data validation**: Are values reasonable and properly formatted?\n", "- **System accuracy**: Do queries match actual system state?\n", "\n", "Now let's test your configuration functions!" 
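] }, { "cell_type": "markdown", "id": "a1b2c3d1", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "### The Assert Pattern in Miniature\n", "\n", "The tests below all use Python's `assert` statement: check a condition, and fail with a message if it does not hold. Here is the same pattern on a toy function (`add` is a made-up example, not part of the tinytorch package):\n", "\n", "```python\n", "def add(a, b):\n", "    return a + b\n", "\n", "result = add(2, 3)\n", "assert isinstance(result, int), \"add should return an int\"\n", "assert result == 5, \"add should sum its arguments\"\n", "print(\"Toy tests passed!\")\n", "```\n", "\n", "If an assertion fails, Python raises `AssertionError` with the given message, which is exactly how the graded cells report problems."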
] }, { "cell_type": "markdown", "id": "fa14788c", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "### ๐Ÿงช Test Your Configuration Functions\n", "\n", "Once you implement both functions above, run this cell to test them:" ] }, { "cell_type": "code", "execution_count": null, "id": "6c0c8c52", "metadata": { "nbgrader": { "grade": true, "grade_id": "test-personal-info", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Test personal information configuration\n", "print(\"๐Ÿ”ฌ Unit Test: Personal Information...\")\n", "\n", "# Test personal_info function\n", "personal = personal_info()\n", "\n", "# Test return type\n", "assert isinstance(personal, dict), \"personal_info should return a dictionary\"\n", "\n", "# Test required keys\n", "required_keys = ['developer', 'email', 'institution', 'system_name', 'version']\n", "for key in required_keys:\n", " assert key in personal, f\"Dictionary should have '{key}' key\"\n", "\n", "# Test non-empty values\n", "for key, value in personal.items():\n", " assert isinstance(value, str), f\"Value for '{key}' should be a string\"\n", " assert len(value) > 0, f\"Value for '{key}' cannot be empty\"\n", "\n", "# Test email format\n", "assert '@' in personal['email'], \"Email should contain @ symbol\"\n", "assert '.' 
in personal['email'], \"Email should contain domain\"\n", "\n", "# Test version format\n", "assert personal['version'] == '1.0.0', \"Version should be '1.0.0'\"\n", "\n", "# Test system name (should be unique/personalized)\n", "assert len(personal['system_name']) > 5, \"System name should be descriptive\"\n", "\n", "print(\"โœ… Personal info function tests passed!\")\n", "print(f\"โœ… TinyTorch configured for: {personal['developer']}\")\n", "print(f\"โœ… System: {personal['system_name']}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "7b30693d", "metadata": { "nbgrader": { "grade": true, "grade_id": "test-system-info", "locked": true, "points": 25, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Test system information queries\n", "print(\"๐Ÿ”ฌ Unit Test: System Information...\")\n", "\n", "# Test system_info function\n", "sys_info = system_info()\n", "\n", "# Test return type\n", "assert isinstance(sys_info, dict), \"system_info should return a dictionary\"\n", "\n", "# Test required keys\n", "required_keys = ['python_version', 'platform', 'architecture', 'cpu_count', 'memory_gb']\n", "for key in required_keys:\n", " assert key in sys_info, f\"Dictionary should have '{key}' key\"\n", "\n", "# Test data types\n", "assert isinstance(sys_info['python_version'], str), \"python_version should be string\"\n", "assert isinstance(sys_info['platform'], str), \"platform should be string\"\n", "assert isinstance(sys_info['architecture'], str), \"architecture should be string\"\n", "assert isinstance(sys_info['cpu_count'], int), \"cpu_count should be integer\"\n", "assert isinstance(sys_info['memory_gb'], (int, float)), \"memory_gb should be number\"\n", "\n", "# Test reasonable values\n", "assert sys_info['cpu_count'] > 0, \"CPU count should be positive\"\n", "assert sys_info['memory_gb'] > 0, \"Memory should be positive\"\n", "assert len(sys_info['python_version']) > 0, \"Python version should not be empty\"\n", "\n", 
"# Test that values are actually queried (not hardcoded)\n", "actual_version = f\"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}\"\n", "assert sys_info['python_version'] == actual_version, \"Python version should match actual system\"\n", "\n", "print(\"โœ… System info function tests passed!\")\n", "print(f\"โœ… Python: {sys_info['python_version']} on {sys_info['platform']}\")\n", "print(f\"โœ… Memory: {sys_info['memory_gb']} GB, CPUs: {sys_info['cpu_count']}\")" ] }, { "cell_type": "markdown", "id": "c44390b2", "metadata": { "cell_marker": "\"\"\"", "lines_to_next_cell": 1 }, "source": [ "### ๐Ÿงช Inline Test Functions\n", "\n", "These test functions provide immediate feedback when developing your solutions:" ] }, { "cell_type": "code", "execution_count": null, "id": "404c5605", "metadata": { "lines_to_next_cell": 1 }, "outputs": [], "source": [ "def test_personal_info():\n", " \"\"\"Test personal_info function implementation.\"\"\"\n", " print(\"๐Ÿ”ฌ Unit Test: Personal Information...\")\n", " \n", " # Test personal_info function\n", " personal = personal_info()\n", " \n", " # Test return type\n", " assert isinstance(personal, dict), \"personal_info should return a dictionary\"\n", " \n", " # Test required keys\n", " required_keys = ['developer', 'email', 'institution', 'system_name', 'version']\n", " for key in required_keys:\n", " assert key in personal, f\"Dictionary should have '{key}' key\"\n", " \n", " # Test non-empty values\n", " for key, value in personal.items():\n", " assert isinstance(value, str), f\"Value for '{key}' should be a string\"\n", " assert len(value) > 0, f\"Value for '{key}' cannot be empty\"\n", " \n", " # Test email format\n", " assert '@' in personal['email'], \"Email should contain @ symbol\"\n", " assert '.' 
in personal['email'], \"Email should contain domain\"\n", " \n", " # Test version format\n", " assert personal['version'] == '1.0.0', \"Version should be '1.0.0'\"\n", " \n", " # Test system name (should be unique/personalized)\n", " assert len(personal['system_name']) > 5, \"System name should be descriptive\"\n", " \n", " print(\"โœ… Personal info function tests passed!\")\n", " print(f\"โœ… TinyTorch configured for: {personal['developer']}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "5ab7c64b", "metadata": { "lines_to_next_cell": 1 }, "outputs": [], "source": [ "def test_system_info():\n", " \"\"\"Test system_info function implementation.\"\"\"\n", " print(\"๐Ÿ”ฌ Unit Test: System Information...\")\n", " \n", " # Test system_info function\n", " sys_info = system_info()\n", " \n", " # Test return type\n", " assert isinstance(sys_info, dict), \"system_info should return a dictionary\"\n", " \n", " # Test required keys\n", " required_keys = ['python_version', 'platform', 'architecture', 'cpu_count', 'memory_gb']\n", " for key in required_keys:\n", " assert key in sys_info, f\"Dictionary should have '{key}' key\"\n", " \n", " # Test data types\n", " assert isinstance(sys_info['python_version'], str), \"python_version should be string\"\n", " assert isinstance(sys_info['platform'], str), \"platform should be string\"\n", " assert isinstance(sys_info['architecture'], str), \"architecture should be string\"\n", " assert isinstance(sys_info['cpu_count'], int), \"cpu_count should be integer\"\n", " assert isinstance(sys_info['memory_gb'], (int, float)), \"memory_gb should be number\"\n", " \n", " # Test reasonable values\n", " assert sys_info['cpu_count'] > 0, \"CPU count should be positive\"\n", " assert sys_info['memory_gb'] > 0, \"Memory should be positive\"\n", " assert len(sys_info['python_version']) > 0, \"Python version should not be empty\"\n", " \n", " # Test that values are actually queried (not hardcoded)\n", " actual_version = 
f\"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}\"\n", " assert sys_info['python_version'] == actual_version, \"Python version should match actual system\"\n", " \n", " print(\"โœ… System info function tests passed!\")\n", " print(f\"โœ… Python: {sys_info['python_version']} on {sys_info['platform']}\")" ] }, { "cell_type": "markdown", "id": "54d58db1", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "## ๐ŸŽฏ Professional ML Engineering Skills\n", "\n", "You've successfully configured your TinyTorch installation and learned the foundations of ML systems engineering:\n", "\n", "### What You've Accomplished\n", "โœ… **Personal Configuration**: Set up your identity and custom system name \n", "โœ… **System Queries**: Learned to gather hardware and software information \n", "โœ… **NBGrader Workflow**: Mastered solution blocks and automated testing \n", "โœ… **Code Export**: Created functions that become part of your tinytorch package \n", "โœ… **Professional Setup**: Established proper development practices \n", "\n", "### Key Concepts You've Learned\n", "\n", "#### 1. **System Awareness**\n", "- **Hardware constraints**: Understanding CPU, memory, and architecture limitations\n", "- **Software dependencies**: Python version and platform compatibility\n", "- **Performance implications**: How system specs affect ML workloads\n", "\n", "#### 2. **Configuration Management**\n", "- **Personal identification**: Professional attribution and contact information\n", "- **Environment documentation**: Reproducible system specifications\n", "- **Professional standards**: Industry-standard development practices\n", "\n", "#### 3. **ML Systems Foundations**\n", "- **Reproducibility**: System context for experiment tracking\n", "- **Debugging**: Hardware info for performance troubleshooting\n", "- **Collaboration**: Proper attribution and contact information\n", "\n", "#### 4. 
**Development Workflow**\n", "- **NBGrader integration**: Automated testing and grading\n", "- **Code export**: Functions become part of production package\n", "- **Testing practices**: Comprehensive validation of functionality\n", "\n", "### Connections to Real ML Systems\n", "\n", "This module connects to broader ML engineering practices:\n", "\n", "#### **Industry Parallels**\n", "- **Docker containers**: System configuration and reproducibility\n", "- **MLflow tracking**: Experiment context and system metadata\n", "- **Model cards**: Documentation of system requirements and performance\n", "- **CI/CD pipelines**: Automated testing and environment validation\n", "\n", "#### **Production Considerations**\n", "- **Deployment matching**: Development environment should match production\n", "- **Resource planning**: Understanding hardware constraints for scaling\n", "- **Monitoring**: System metrics for performance optimization\n", "- **Debugging**: System context for troubleshooting issues\n", "\n", "### Next Steps in Your ML Systems Journey\n", "\n", "#### **Immediate Actions**\n", "1. **Export your code**: `tito module export 01_setup`\n", "2. **Test your installation**: \n", " ```python\n", " from tinytorch.core.setup import personal_info, system_info\n", " print(personal_info()) # Your personal details\n", " print(system_info()) # System information\n", " ```\n", "3. 
**Verify package integration**: Ensure your functions work in the tinytorch package\n", "\n", "#### **Looking Ahead**\n", "- **Module 1 (Tensor)**: Build the fundamental data structure for ML\n", "- **Module 2 (Activations)**: Add nonlinearity for complex learning\n", "- **Module 3 (Layers)**: Create the building blocks of neural networks\n", "- **Module 4 (Networks)**: Compose layers into powerful architectures\n", "\n", "#### **Course Progression**\n", "You're now ready to build a complete ML system from scratch:\n", "```\n", "Setup → Tensor → Activations → Layers → Networks → CNN → DataLoader → \n", "Autograd → Optimizers → Training → Compression → Kernels → Benchmarking → MLOps\n", "```\n", "\n", "### Professional Development Milestone\n", "\n", "You've taken your first step in ML systems engineering! This module taught you:\n", "- **System thinking**: Understanding hardware and software constraints\n", "- **Professional practices**: Proper attribution, testing, and documentation\n", "- **Tool mastery**: NBGrader workflow and package development\n", "- **Foundation building**: Creating reusable, tested, documented code\n", "\n", "**Ready for the next challenge?** Let's build the foundation of ML systems with tensors!" ] }, { "cell_type": "markdown", "id": "fdb8068c", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "## Step 4: Environment Validation\n", "\n", "### The Concept: Dependency Management in ML Systems\n", "**Environment validation** ensures your system has the necessary packages and versions for ML development. This is crucial because ML systems have complex dependency chains that can break in subtle ways.\n", "\n", "### Why Environment Validation Matters\n", "\n", "#### 1. **Compatibility Assurance**\n", "- **Version conflicts**: Different packages may require incompatible versions\n", "- **API changes**: New versions might break existing code\n", "- **Feature availability**: Some features require specific versions\n", "\n", "#### 2. **Reproducibility**\n", "- **Environment documentation**: Exact package versions for reproduction\n", "- **Dependency tracking**: Understanding what's installed and why\n", "- **Debugging support**: Version info helps troubleshoot issues\n", "\n", "#### 3. **Professional Development**\n", "- **Deployment safety**: Ensure development matches production\n", "- **Collaboration**: Team members need compatible environments\n", "- **Quality assurance**: Validate setup before beginning work\n", "\n", "### Essential ML Dependencies\n", "We'll check for core packages that ML systems depend on:\n", "- **numpy**: Fundamental numerical computing\n", "- **matplotlib**: Visualization and plotting\n", "- **psutil**: System information and monitoring\n", "- **jupyter**: Interactive development environment\n", "- **nbdev**: Package development tools\n", "- **pytest**: Testing framework\n", "\n", "### Real-World Applications\n", "- **Docker**: Container images include dependency validation\n", "- **CI/CD**: Automated testing validates environment setup\n", "- **MLflow**: Tracks package versions with experiment metadata\n", "- **Kaggle**: Validates package availability in competition environments\n", "\n", "Let's implement environment validation!"
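] }, { "cell_type": "markdown", "id": "a1b2c3d2", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "### The Core Pattern: Probing Imports Gracefully\n", "\n", "Environment validation rests on one trick: try to import a package and report the result instead of crashing. A minimal sketch for a single package (the exercise below generalizes this to a list):\n", "\n", "```python\n", "import importlib\n", "\n", "# Probe one package without letting a missing install crash the program\n", "try:\n", "    importlib.import_module(\"numpy\")\n", "    print(\"numpy is available\")\n", "except ImportError:\n", "    print(\"numpy is missing\")\n", "```\n", "\n", "Wrapping the import in try/except turns a hard failure into data you can collect and summarize."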
] }, { "cell_type": "code", "execution_count": null, "id": "7e36a801", "metadata": { "lines_to_next_cell": 1, "nbgrader": { "grade": false, "grade_id": "environment-validation", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "#| export\n", "import importlib\n", "import pkg_resources\n", "from typing import Dict, List, Optional\n", "\n", "def validate_environment() -> Dict[str, Any]:\n", " \"\"\"\n", " Validate ML development environment and check essential dependencies.\n", " \n", " This function checks that your system has the necessary packages for ML development.\n", " It's like a pre-flight check before you start building ML systems.\n", " \n", " TODO: Implement environment validation.\n", " \n", " STEP-BY-STEP IMPLEMENTATION:\n", " 1. Define list of essential ML packages to check\n", " 2. For each package, try to import it and get version\n", " 3. Track which packages are available vs missing\n", " 4. Calculate environment health score\n", " 5. 
Return comprehensive environment report\n", " \n", " ESSENTIAL PACKAGES TO CHECK:\n", " - numpy: Numerical computing foundation\n", " - matplotlib: Visualization and plotting\n", " - psutil: System monitoring\n", " - jupyter: Interactive development\n", " - nbdev: Package development\n", " - pytest: Testing framework\n", " \n", " IMPLEMENTATION HINTS:\n", " - Use try/except to handle missing packages gracefully\n", " - Use pkg_resources.get_distribution(package).version for versions\n", " - Calculate health_score as (available_packages / total_packages) * 100\n", " - Round health_score to 1 decimal place\n", " \"\"\"\n", " ### BEGIN SOLUTION\n", " essential_packages = [\n", " 'numpy', 'matplotlib', 'psutil', 'jupyter', 'nbdev', 'pytest'\n", " ]\n", " \n", " available = {}\n", " missing = []\n", " \n", " for package in essential_packages:\n", " try:\n", " # Try to import the package\n", " importlib.import_module(package)\n", " # Get version information\n", " version = pkg_resources.get_distribution(package).version\n", " available[package] = version\n", " except (ImportError, pkg_resources.DistributionNotFound):\n", " missing.append(package)\n", " \n", " # Calculate health score\n", " total_packages = len(essential_packages)\n", " available_packages = len(available)\n", " health_score = round((available_packages / total_packages) * 100, 1)\n", " \n", " return {\n", " 'available_packages': available,\n", " 'missing_packages': missing,\n", " 'health_score': health_score,\n", " 'total_checked': total_packages,\n", " 'status': 'healthy' if health_score >= 80 else 'needs_attention'\n", " }\n", " ### END SOLUTION" ] }, { "cell_type": "markdown", "id": "4547fb8d", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "## Step 5: Performance Benchmarking\n", "\n", "### The Concept: Hardware Performance Profiling\n", "**Performance benchmarking** measures your system's computational capabilities for ML workloads. 
This helps you understand your hardware's limits and optimize your development workflow.\n", "\n", "### Why Performance Benchmarking Matters\n", "\n", "#### 1. **Resource Planning**\n", "- **Training time estimation**: How long will model training take?\n", "- **Memory allocation**: What's the maximum batch size you can handle?\n", "- **Parallelization**: How many cores can you effectively use?\n", "\n", "#### 2. **Optimization Guidance**\n", "- **Bottleneck identification**: Is your system CPU-bound or memory-bound?\n", "- **Hardware upgrades**: What would improve performance most?\n", "- **Algorithm selection**: Which algorithms suit your hardware?\n", "\n", "#### 3. **Performance Comparison**\n", "- **Baseline establishment**: Track performance over time\n", "- **System comparison**: Compare different development environments\n", "- **Deployment planning**: Match development to production performance\n", "\n", "### Benchmarking Strategy\n", "We'll test key ML operations:\n", "- **CPU computation**: Matrix operations that stress the processor\n", "- **Memory bandwidth**: Large data transfers that test memory speed\n", "- **Overall system**: Combined CPU and memory performance\n", "\n", "### Real-World Applications\n", "- **MLPerf**: Industry-standard ML benchmarks\n", "- **Cloud providers**: Performance metrics for instance selection\n", "- **Hardware vendors**: Benchmark comparisons for purchasing decisions\n", "\n", "Let's implement performance benchmarking!" 
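] }, { "cell_type": "markdown", "id": "timing-sketch-md", "metadata": {}, "source": [ "Every benchmark below follows the same measurement pattern: read a clock, do the work, read the clock again. Here is a minimal, illustrative sketch of that pattern (the helper name `time_operation` is ours, not part of the module's API); `time.perf_counter` is the standard library's high-resolution clock intended for exactly this kind of timing:" ] }, { "cell_type": "code", "execution_count": null, "id": "timing-sketch", "metadata": {}, "outputs": [], "source": [ "import time\n", "\n", "def time_operation(fn):\n", " \"\"\"Run fn() once and return (result, elapsed_seconds).\"\"\"\n", " start = time.perf_counter()\n", " result = fn()\n", " elapsed = time.perf_counter() - start\n", " return result, elapsed\n", "\n", "# Example: time a small CPU-bound loop\n", "_, seconds = time_operation(lambda: sum(i * i for i in range(100_000)))\n", "print(f\"took {seconds:.4f}s\")"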
] }, { "cell_type": "code", "execution_count": null, "id": "c80ba038", "metadata": { "lines_to_next_cell": 1, "nbgrader": { "grade": false, "grade_id": "performance-benchmark", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "#| export\n", "import time\n", "import random\n", "\n", "def benchmark_performance() -> Dict[str, Any]:\n", " \"\"\"\n", " Benchmark system performance for ML workloads.\n", " \n", " This function measures computational performance to help you understand\n", " your system's capabilities and optimize your ML development workflow.\n", " \n", " TODO: Implement performance benchmarking.\n", " \n", " STEP-BY-STEP IMPLEMENTATION:\n", " 1. CPU Test: Time a computationally intensive operation\n", " 2. Memory Test: Time a memory-intensive operation\n", " 3. Calculate performance scores based on execution time\n", " 4. Determine overall system performance rating\n", " 5. Return comprehensive benchmark results\n", " \n", " BENCHMARK TESTS:\n", " - CPU: Nested loop calculation (computational intensity)\n", " - Memory: Large list operations (memory bandwidth)\n", " - Combined: Overall system performance score\n", " \n", " IMPLEMENTATION HINTS:\n", " - Use time.time() to measure execution time\n", " - CPU test: nested loops with mathematical operations\n", " - Memory test: large list creation and manipulation\n", " - Lower execution time = better performance\n", " - Calculate scores as inverse of time (e.g., 1/time * 1000)\n", " \"\"\"\n", " ### BEGIN SOLUTION\n", " benchmarks = {}\n", " \n", " # CPU Performance Test\n", " print(\"โšก Running CPU benchmark...\")\n", " start_time = time.time()\n", " \n", " # CPU-intensive calculation\n", " result = 0\n", " for i in range(100000):\n", " result += i * i + i / 2\n", " \n", " cpu_time = time.time() - start_time\n", " benchmarks['cpu_time'] = round(cpu_time, 3)\n", " benchmarks['cpu_score'] = round(1000 / cpu_time, 1)\n", " \n", " # Memory Performance Test\n", 
" print(\"๐Ÿง  Running memory benchmark...\")\n", " start_time = time.time()\n", " \n", " # Memory-intensive operations\n", " large_list = list(range(1000000))\n", " large_list.reverse()\n", " large_list.sort()\n", " \n", " memory_time = time.time() - start_time\n", " benchmarks['memory_time'] = round(memory_time, 3)\n", " benchmarks['memory_score'] = round(1000 / memory_time, 1)\n", " \n", " # Overall Performance Score\n", " overall_score = round((benchmarks['cpu_score'] + benchmarks['memory_score']) / 2, 1)\n", " benchmarks['overall_score'] = overall_score\n", " \n", " # Performance Rating\n", " if overall_score >= 80:\n", " rating = 'excellent'\n", " elif overall_score >= 60:\n", " rating = 'good'\n", " elif overall_score >= 40:\n", " rating = 'fair'\n", " else:\n", " rating = 'needs_optimization'\n", " \n", " benchmarks['performance_rating'] = rating\n", " \n", " return benchmarks\n", " ### END SOLUTION" ] }, { "cell_type": "markdown", "id": "666b386a", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "## Step 6: Development Environment Setup\n", "\n", "### The Concept: Professional Development Configuration\n", "**Development environment setup** configures essential tools and settings for professional ML development. This includes Git configuration, Jupyter settings, and other tools that make development more efficient.\n", "\n", "### Why Development Setup Matters\n", "\n", "#### 1. **Professional Standards**\n", "- **Version control**: Proper Git configuration for collaboration\n", "- **Code quality**: Consistent formatting and style\n", "- **Documentation**: Automatic documentation generation\n", "\n", "#### 2. **Productivity Optimization**\n", "- **Tool configuration**: Optimized settings for efficiency\n", "- **Workflow automation**: Reduce repetitive tasks\n", "- **Error prevention**: Catch issues before they become problems\n", "\n", "#### 3. 
**Collaboration Readiness**\n", "- **Team compatibility**: Consistent development environment\n", "- **Code sharing**: Proper attribution and commit messages\n", "- **Project standards**: Follow established conventions\n", "\n", "### Essential Development Tools\n", "We'll configure key tools for ML development:\n", "- **Git**: Version control and collaboration\n", "- **Jupyter**: Interactive development environment\n", "- **Python**: Code formatting and quality tools\n", "\n", "Let's implement development environment setup!" ] }, { "cell_type": "code", "execution_count": null, "id": "a34ebb28", "metadata": { "lines_to_next_cell": 1, "nbgrader": { "grade": false, "grade_id": "development-setup", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "#| export\n", "import subprocess\n", "import json\n", "from pathlib import Path\n", "\n", "def setup_development_environment() -> Dict[str, Any]:\n", " \"\"\"\n", " Configure development environment for professional ML development.\n", " \n", " This function sets up essential tools and configurations to make your\n", " development workflow more efficient and professional.\n", " \n", " TODO: Implement development environment setup.\n", " \n", " STEP-BY-STEP IMPLEMENTATION:\n", " 1. Check if Git is installed and configured\n", " 2. Verify Jupyter installation and configuration\n", " 3. Check Python development tools\n", " 4. Configure any missing tools\n", " 5. 
Return setup status and recommendations\n", " \n", " DEVELOPMENT TOOLS TO CHECK:\n", " - Git: Version control system\n", " - Jupyter: Interactive development\n", " - Python tools: Code quality and formatting\n", " \n", " IMPLEMENTATION HINTS:\n", " - Use subprocess.run() to check tool availability\n", " - Use try/except to handle missing tools gracefully\n", " - Provide helpful recommendations for missing tools\n", " - Focus on tools that improve ML development workflow\n", " \"\"\"\n", " ### BEGIN SOLUTION\n", " setup_status = {}\n", " recommendations = []\n", " \n", " # Check Git installation and configuration\n", " try:\n", " git_version = subprocess.run(['git', '--version'], \n", " capture_output=True, text=True, check=True)\n", " setup_status['git_installed'] = True\n", " setup_status['git_version'] = git_version.stdout.strip()\n", " \n", " # Check Git configuration\n", " try:\n", " git_name = subprocess.run(['git', 'config', 'user.name'], \n", " capture_output=True, text=True, check=True)\n", " git_email = subprocess.run(['git', 'config', 'user.email'], \n", " capture_output=True, text=True, check=True)\n", " setup_status['git_configured'] = True\n", " setup_status['git_name'] = git_name.stdout.strip()\n", " setup_status['git_email'] = git_email.stdout.strip()\n", " except subprocess.CalledProcessError:\n", " setup_status['git_configured'] = False\n", " recommendations.append(\"Configure Git: git config --global user.name 'Your Name'\")\n", " recommendations.append(\"Configure Git: git config --global user.email 'your.email@domain.com'\")\n", " \n", " except (subprocess.CalledProcessError, FileNotFoundError):\n", " setup_status['git_installed'] = False\n", " recommendations.append(\"Install Git: https://git-scm.com/downloads\")\n", " \n", " # Check Jupyter installation\n", " try:\n", " jupyter_version = subprocess.run(['jupyter', '--version'], \n", " capture_output=True, text=True, check=True)\n", " setup_status['jupyter_installed'] = True\n", " 
setup_status['jupyter_version'] = jupyter_version.stdout.strip()\n", " except (subprocess.CalledProcessError, FileNotFoundError):\n", " setup_status['jupyter_installed'] = False\n", " recommendations.append(\"Install Jupyter: pip install jupyter\")\n", " \n", " # Check Python tools\n", " python_tools = ['pip', 'python']\n", " for tool in python_tools:\n", " try:\n", " tool_version = subprocess.run([tool, '--version'], \n", " capture_output=True, text=True, check=True)\n", " setup_status[f'{tool}_installed'] = True\n", " setup_status[f'{tool}_version'] = tool_version.stdout.strip()\n", " except (subprocess.CalledProcessError, FileNotFoundError):\n", " setup_status[f'{tool}_installed'] = False\n", " recommendations.append(f\"Install {tool}: Check Python installation\")\n", " \n", " # Calculate setup health\n", " total_tools = 4 # git, jupyter, pip, python\n", " installed_tools = sum([\n", " setup_status.get('git_installed', False),\n", " setup_status.get('jupyter_installed', False),\n", " setup_status.get('pip_installed', False),\n", " setup_status.get('python_installed', False)\n", " ])\n", " \n", " setup_score = round((installed_tools / total_tools) * 100, 1)\n", " \n", " return {\n", " 'setup_status': setup_status,\n", " 'recommendations': recommendations,\n", " 'setup_score': setup_score,\n", " 'status': 'ready' if setup_score >= 75 else 'needs_configuration'\n", " }\n", " ### END SOLUTION" ] }, { "cell_type": "markdown", "id": "c27d83df", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "## Step 7: Comprehensive System Report\n", "\n", "### The Concept: Integrated System Analysis\n", "**Comprehensive system reporting** combines all your configuration and diagnostic information into a single, actionable report. This is like a \"health check\" for your ML development environment.\n", "\n", "### Why Comprehensive Reporting Matters\n", "\n", "#### 1. 
**Holistic View**\n", "- **Complete picture**: All system information in one place\n", "- **Dependency analysis**: How different components interact\n", "- **Performance context**: Understanding system capabilities\n", "\n", "#### 2. **Troubleshooting Support**\n", "- **Debugging aid**: Complete environment information for issue resolution\n", "- **Performance analysis**: Identify bottlenecks and optimization opportunities\n", "- **Compatibility checking**: Ensure all components work together\n", "\n", "#### 3. **Professional Documentation**\n", "- **Environment documentation**: Complete system specification\n", "- **Reproducibility**: All information needed to recreate environment\n", "- **Sharing**: Easy to share system information with collaborators\n", "\n", "Let's create a comprehensive system report!" ] }, { "cell_type": "code", "execution_count": null, "id": "89b9aac3", "metadata": { "lines_to_next_cell": 1, "nbgrader": { "grade": false, "grade_id": "system-report", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "#| export\n", "from datetime import datetime\n", "\n", "def generate_system_report() -> Dict[str, Any]:\n", " \"\"\"\n", " Generate comprehensive system report for ML development.\n", " \n", " This function combines all configuration and diagnostic information\n", " into a single, actionable report for your ML development environment.\n", " \n", " TODO: Implement comprehensive system reporting.\n", " \n", " STEP-BY-STEP IMPLEMENTATION:\n", " 1. Gather personal information\n", " 2. Collect system information\n", " 3. Validate environment\n", " 4. Run performance benchmarks\n", " 5. Check development setup\n", " 6. Generate overall health score\n", " 7. 
Create comprehensive report with recommendations\n", " \n", " REPORT SECTIONS:\n", " - Personal configuration\n", " - System specifications\n", " - Environment validation\n", " - Performance benchmarks\n", " - Development setup\n", " - Overall health assessment\n", " - Recommendations for improvement\n", " \n", " IMPLEMENTATION HINTS:\n", " - Call all previously implemented functions\n", " - Combine results into comprehensive report\n", " - Calculate overall health score from all components\n", " - Provide actionable recommendations\n", " \"\"\"\n", " ### BEGIN SOLUTION\n", " print(\"📊 Generating comprehensive system report...\")\n", " \n", " # Gather all information\n", " personal = personal_info()\n", " system = system_info()\n", " environment = validate_environment()\n", " performance = benchmark_performance()\n", " development = setup_development_environment()\n", " \n", " # Calculate overall health score (normalize performance score to 0-100 range)\n", " normalized_performance = min(performance['overall_score'], 100) # Cap at 100\n", " \n", " health_components = [\n", " environment['health_score'],\n", " normalized_performance,\n", " development['setup_score']\n", " ]\n", " \n", " overall_health = round(sum(health_components) / len(health_components), 1)\n", " \n", " # Generate status\n", " if overall_health >= 85:\n", " status = 'excellent'\n", " elif overall_health >= 70:\n", " status = 'good'\n", " elif overall_health >= 50:\n", " status = 'fair'\n", " else:\n", " status = 'needs_attention'\n", " \n", " # Compile recommendations\n", " recommendations = []\n", " \n", " if environment['health_score'] < 80:\n", " recommendations.extend([f\"Install missing package: {pkg}\" for pkg in environment['missing_packages']])\n", " \n", " if performance['overall_score'] < 50:\n", " recommendations.append(\"Consider hardware upgrade for better ML performance\")\n", " \n", " recommendations.extend(development['recommendations'])\n", " \n", " # Create comprehensive 
report\n", " report = {\n", " 'timestamp': datetime.now().isoformat(),\n", " 'personal_info': personal,\n", " 'system_info': system,\n", " 'environment_validation': environment,\n", " 'performance_benchmarks': performance,\n", " 'development_setup': development,\n", " 'overall_health': overall_health,\n", " 'status': status,\n", " 'recommendations': recommendations,\n", " 'report_version': '1.0.0'\n", " }\n", " \n", " return report\n", " ### END SOLUTION" ] }, { "cell_type": "markdown", "id": "9063a17e", "metadata": {}, "source": [ "\"\"\"\n", "## ๐Ÿงช Unit Test: Enhanced Setup Functions\n", "\n", "Test all the new enhanced setup functions:\n", "\"\"\"\n", "\n", "Old function removed - using shared test runner pattern" ] }, { "cell_type": "code", "execution_count": null, "id": "4b48e976", "metadata": { "lines_to_next_cell": 1 }, "outputs": [], "source": [ "def test_performance_benchmark():\n", " \"\"\"Test performance benchmarking function.\"\"\"\n", " print(\"๐Ÿ”ฌ Unit Test: Performance Benchmarking...\")\n", " \n", " benchmark_report = benchmark_performance()\n", " \n", " # Test return type and structure\n", " assert isinstance(benchmark_report, dict), \"benchmark_performance should return a dictionary\"\n", " \n", " # Test required keys\n", " required_keys = ['cpu_time', 'cpu_score', 'memory_time', 'memory_score', 'overall_score', 'performance_rating']\n", " for key in required_keys:\n", " assert key in benchmark_report, f\"Report should have '{key}' key\"\n", " \n", " # Test data types\n", " assert isinstance(benchmark_report['cpu_time'], (int, float)), \"cpu_time should be number\"\n", " assert isinstance(benchmark_report['cpu_score'], (int, float)), \"cpu_score should be number\"\n", " assert isinstance(benchmark_report['memory_time'], (int, float)), \"memory_time should be number\"\n", " assert isinstance(benchmark_report['memory_score'], (int, float)), \"memory_score should be number\"\n", " assert isinstance(benchmark_report['overall_score'], (int, 
float)), \"overall_score should be number\"\n", " assert isinstance(benchmark_report['performance_rating'], str), \"performance_rating should be string\"\n", " \n", " # Test reasonable values\n", " assert benchmark_report['cpu_time'] > 0, \"cpu_time should be positive\"\n", " assert benchmark_report['memory_time'] > 0, \"memory_time should be positive\"\n", " assert benchmark_report['cpu_score'] > 0, \"cpu_score should be positive\"\n", " assert benchmark_report['memory_score'] > 0, \"memory_score should be positive\"\n", " assert benchmark_report['overall_score'] > 0, \"overall_score should be positive\"\n", " \n", " valid_ratings = ['excellent', 'good', 'fair', 'needs_optimization']\n", " assert benchmark_report['performance_rating'] in valid_ratings, \"performance_rating should be valid\"\n", " \n", " print(\"โœ… Performance benchmark tests passed!\")\n", " print(f\"โœ… Performance rating: {benchmark_report['performance_rating']}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "7b09b6ad", "metadata": { "lines_to_next_cell": 1 }, "outputs": [], "source": [ "def test_development_setup():\n", " \"\"\"Test development environment setup function.\"\"\"\n", " print(\"๐Ÿ”ฌ Unit Test: Development Environment Setup...\")\n", " \n", " setup_report = setup_development_environment()\n", " \n", " # Test return type and structure\n", " assert isinstance(setup_report, dict), \"setup_development_environment should return a dictionary\"\n", " \n", " # Test required keys\n", " required_keys = ['setup_status', 'recommendations', 'setup_score', 'status']\n", " for key in required_keys:\n", " assert key in setup_report, f\"Report should have '{key}' key\"\n", " \n", " # Test data types\n", " assert isinstance(setup_report['setup_status'], dict), \"setup_status should be dict\"\n", " assert isinstance(setup_report['recommendations'], list), \"recommendations should be list\"\n", " assert isinstance(setup_report['setup_score'], (int, float)), \"setup_score should be 
number\"\n", " assert isinstance(setup_report['status'], str), \"status should be string\"\n", " \n", " # Test reasonable values\n", " assert 0 <= setup_report['setup_score'] <= 100, \"setup_score should be between 0 and 100\"\n", " assert setup_report['status'] in ['ready', 'needs_configuration'], \"status should be valid\"\n", " \n", " print(\"โœ… Development setup tests passed!\")\n", " print(f\"โœ… Setup score: {setup_report['setup_score']}%\")" ] }, { "cell_type": "code", "execution_count": null, "id": "68475c70", "metadata": {}, "outputs": [], "source": [ "def test_system_report():\n", " \"\"\"Test comprehensive system report function.\"\"\"\n", " print(\"๐Ÿ”ฌ Unit Test: System Report Generation...\")\n", " \n", " report = generate_system_report()\n", " \n", " # Test return type and structure\n", " assert isinstance(report, dict), \"generate_system_report should return a dictionary\"\n", " \n", " # Test required keys\n", " required_keys = ['timestamp', 'personal_info', 'system_info', 'environment_validation', \n", " 'performance_benchmarks', 'development_setup', 'overall_health', \n", " 'status', 'recommendations', 'report_version']\n", " for key in required_keys:\n", " assert key in report, f\"Report should have '{key}' key\"\n", " \n", " # Test data types\n", " assert isinstance(report['timestamp'], str), \"timestamp should be string\"\n", " assert isinstance(report['personal_info'], dict), \"personal_info should be dict\"\n", " assert isinstance(report['system_info'], dict), \"system_info should be dict\"\n", " assert isinstance(report['environment_validation'], dict), \"environment_validation should be dict\"\n", " assert isinstance(report['performance_benchmarks'], dict), \"performance_benchmarks should be dict\"\n", " assert isinstance(report['development_setup'], dict), \"development_setup should be dict\"\n", " assert isinstance(report['overall_health'], (int, float)), \"overall_health should be number\"\n", " assert isinstance(report['status'], str), 
\"status should be string\"\n", " assert isinstance(report['recommendations'], list), \"recommendations should be list\"\n", " assert isinstance(report['report_version'], str), \"report_version should be string\"\n", " \n", " # Test reasonable values\n", " assert 0 <= report['overall_health'] <= 100, \"overall_health should be between 0 and 100\"\n", " valid_statuses = ['excellent', 'good', 'fair', 'needs_attention']\n", " assert report['status'] in valid_statuses, \"status should be valid\"\n", " \n", " print(\"โœ… System report tests passed!\")\n", " print(f\"โœ… Overall system health: {report['overall_health']}%\")\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "id": "ba1bcd18", "metadata": { "lines_to_next_cell": 1 }, "outputs": [], "source": [ "def test_personal_info():\n", " \"\"\"Test personal information function comprehensively.\"\"\"\n", " personal = personal_info()\n", " assert isinstance(personal, dict), \"personal_info should return a dictionary\"\n", " assert 'developer' in personal, \"Dictionary should have 'developer' key\"\n", " assert '@' in personal['email'], \"Email should contain @ symbol\"\n", " print(\"โœ… Personal information function works\")\n", "\n", "def test_system_info():\n", " \"\"\"Test system information function comprehensively.\"\"\"\n", " system = system_info()\n", " assert isinstance(system, dict), \"system_info should return a dictionary\"\n", " assert 'python_version' in system, \"Dictionary should have 'python_version' key\"\n", " assert system['memory_gb'] > 0, \"Memory should be positive\"\n", " print(\"โœ… System information function works\")\n", "\n", "def test_environment_validation():\n", " \"\"\"Test environment validation function comprehensively.\"\"\"\n", " env = validate_environment()\n", " assert isinstance(env, dict), \"validate_environment should return a dictionary\"\n", " assert 'health_score' in env, \"Dictionary should have 'health_score' key\"\n", " print(\"โœ… Environment validation function 
works\")\n", "\n", "def test_performance_benchmark():\n", " \"\"\"Test performance benchmarking function comprehensively.\"\"\"\n", " perf = benchmark_performance()\n", " assert isinstance(perf, dict), \"benchmark_performance should return a dictionary\"\n", " assert 'cpu_score' in perf, \"Dictionary should have 'cpu_score' key\"\n", " print(\"โœ… Performance benchmarking function works\")\n", "\n", "def test_development_setup():\n", " \"\"\"Test development setup function comprehensively.\"\"\"\n", " dev = setup_development_environment()\n", " assert isinstance(dev, dict), \"setup_development_environment should return a dictionary\"\n", " assert 'setup_score' in dev, \"Dictionary should have 'setup_score' key\"\n", " print(\"โœ… Development setup function works\")\n", "\n", "def test_system_report():\n", " \"\"\"Test system report comprehensive function.\"\"\"\n", " report = generate_system_report()\n", " assert isinstance(report, dict), \"generate_system_report should return a dictionary\"\n", " assert 'overall_health' in report, \"Dictionary should have 'overall_health' key\"\n", " print(\"โœ… System report function works\")" ] }, { "cell_type": "markdown", "id": "2415d2ab", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "## ๐Ÿงช Module Testing\n", "\n", "Time to test your implementation! This section uses TinyTorch's standardized testing framework to ensure your implementation works correctly.\n", "\n", "**This testing section is locked** - it provides consistent feedback across all modules and cannot be modified." 
] }, { "cell_type": "code", "execution_count": null, "id": "526c9009", "metadata": { "nbgrader": { "grade": false, "grade_id": "standardized-testing", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# =============================================================================\n", "# STANDARDIZED MODULE TESTING - DO NOT MODIFY\n", "# This cell is locked to ensure consistent testing across all TinyTorch modules\n", "# =============================================================================\n", "\n", "if __name__ == \"__main__\":\n", " from tito.tools.testing import run_module_tests_auto\n", " \n", " # Automatically discover and run all tests in this module\n", " success = run_module_tests_auto(\"Setup\")" ] }, { "cell_type": "markdown", "id": "35feea10", "metadata": { "cell_marker": "\"\"\"" }, "source": [ "## ๐ŸŽฏ Module Summary: Development Environment Setup Complete!\n", "\n", "Congratulations! You've successfully set up your TinyTorch development environment:\n", "\n", "### What You've Accomplished\n", "โœ… **Personal Configuration**: Developer information and preferences\n", "โœ… **System Analysis**: Hardware and software environment validation\n", "โœ… **Environment Validation**: Python packages and dependencies\n", "โœ… **Performance Benchmarking**: CPU and memory performance testing\n", "โœ… **Development Setup**: IDE configuration and tooling\n", "โœ… **Comprehensive Reporting**: System health and recommendations\n", "\n", "### Key Concepts You've Learned\n", "- **Environment Management**: How to validate and configure development environments\n", "- **Performance Analysis**: Benchmarking system capabilities for ML workloads\n", "- **System Diagnostics**: Comprehensive health checking and reporting\n", "- **Development Best Practices**: Professional setup for ML development\n", "\n", "### Next Steps\n", "1. **Export your code**: `tito package nbdev --export 00_setup`\n", "2. 
**Test your implementation**: `tito test 00_setup`\n", "3. **Use your environment**: Start building with confidence in a validated setup\n", "4. **Move to Module 1**: Begin implementing the core tensor system!\n", "\n", "**Ready for the ML journey?** Your development environment is now optimized for building neural networks from scratch!" ] } ], "metadata": { "jupytext": { "main_language": "python" } }, "nbformat": 4, "nbformat_minor": 5 }