mirror of
https://github.com/harvard-edge/cs249r_book.git
synced 2026-04-28 16:48:30 -05:00
Refactors concept maps for volume 1 chapters
Updates the concept map YAML files for volume 1 chapters, including introduction, benchmarking, data engineering, data selection, frameworks, hardware acceleration, ML systems, MLOps, ML workflow, model serving, NN architectures, NN computation, optimizations, responsible engineering, and training. Replaces the old YAML structure with a leaner one organized around primary concepts, secondary concepts, technical terms, methodologies, and formulas, emphasizing each chapter's core concepts and their relationships. The generated_date fields are updated to 2026-02-19, a future date.
@@ -1,120 +1,26 @@
concept_map:
source: benchmarking.qmd
generated_date: 2025-01-12
generated_date: 2026-02-19
primary_concepts:
- AI Benchmarking
- Performance Evaluation
- Benchmark Design
- Evaluation Metrics
- System Performance
- Model Benchmarking
- Hardware Benchmarking
- Standardized Testing
- Performance Analysis
- Comparative Evaluation
- Three-Dimensional Benchmarking (System, Model, Data)
- MLPerf Standardized Testing
- Benchmarking Granularity (Micro, Macro, End-to-End)
- Benchmark vs Production Gap
secondary_concepts:
- Throughput Measurement
- Latency Analysis
- Accuracy Assessment
- Energy Efficiency
- Resource Utilization
- Scalability Testing
- Robustness Evaluation
- Fairness Assessment
- Generalization Testing
- Real-world Performance
- Benchmark Suites
- Test Datasets
- Evaluation Protocols
- Performance Baselines
- Regression Testing
- A/B Testing
- Statistical Significance
- Confidence Intervals
- Performance Variability
- Cross-platform Comparison
- Thermal Throttling impact
- Statistical Significance in ML
- Performance Regression detection
- Training vs Inference benchmarks
technical_terms:
- MLPerf
- SPEC benchmarks
- ImageNet
- GLUE/SuperGLUE
- BLEU Score
- F1 Score
- Mean Average Precision (mAP)
- Area Under Curve (AUC)
- Top-k Accuracy
- Perplexity
- Inference Time
- Training Time
- Memory Usage
- Power Consumption
- FLOPS (Floating Point Operations per Second)
- Queries Per Second (QPS)
- Samples Per Second
- Batch Size
- Model Size
- Parameter Count
- FLOPs per Operation
- Memory Bandwidth
- Cache Hit Rate
- Thermal Design Power (TDP)
- p50/p95/p99 Latency
- QPS / TPS
- TTFT (Time-to-First-Token)
- Jitter
methodologies:
- Benchmark Development
- Test Case Design
- Data Collection Protocols
- Statistical Analysis
- Performance Profiling
- Bottleneck Identification
- Comparative Analysis
- Trend Analysis
- Performance Modeling
- Workload Characterization
- Stress Testing
- Load Testing
- Endurance Testing
- Regression Analysis
- Variance Analysis
- Outlier Detection
- Performance Optimization
- Result Validation
- Reproducibility Testing
- Cross-validation
applications:
- Hardware Evaluation
- Model Selection
- System Optimization
- Performance Monitoring
- Quality Assurance
- Research Validation
- Product Development
- Procurement Decisions
- Performance Tracking
- Capacity Planning
- Resource Allocation
- Cost-Performance Analysis
- Competitive Analysis
- Technology Assessment
- Compliance Testing
- Standards Development
- Academic Research
- Industry Collaboration
- Certification Programs
- Performance Certification
keywords: [AI benchmarking, performance evaluation, MLPerf, benchmark design, evaluation metrics, throughput, latency, accuracy assessment, hardware benchmarking, system performance, standardized testing, comparative evaluation, performance analysis, energy efficiency, scalability testing]
topics_covered:
- topic: Benchmark Fundamentals
subtopics: [benchmarking principles, evaluation frameworks, metric selection, test design, data collection, analysis methods]
- topic: Performance Metrics
subtopics: [accuracy metrics, efficiency metrics, throughput measurement, latency analysis, resource utilization, energy consumption]
- topic: Benchmark Suites and Standards
subtopics: [MLPerf benchmarks, domain-specific benchmarks, standardized datasets, evaluation protocols, industry standards]
- topic: Hardware Benchmarking
subtopics: [processor evaluation, accelerator testing, memory performance, interconnect analysis, power efficiency, thermal characteristics]
- topic: Model and Algorithm Evaluation
subtopics: [model comparison, algorithm assessment, generalization testing, robustness evaluation, fairness analysis]
- topic: System-Level Benchmarking
subtopics: [end-to-end performance, scalability testing, distributed system evaluation, cloud benchmarking, edge performance]
- topic: Statistical Analysis and Interpretation
subtopics: [statistical methods, significance testing, confidence intervals, variance analysis, trend analysis, result interpretation]
- topic: Benchmark Design and Implementation
subtopics: [benchmark development, test case creation, validation procedures, reproducibility, standardization, best practices]
- Standardized evaluation protocols
- Power measurement boundaries
formulas:
- Scaling Efficiency
- Throughput valid under SLO
- Thermal performance reduction (%)
@@ -1,120 +1,24 @@
concept_map:
source: conclusion.qmd
generated_date: 2025-01-12
generated_date: 2026-02-19
primary_concepts:
- ML Systems Future
- Emerging Technologies
- AI Evolution
- System Integration
- Technological Convergence
- Future Challenges
- Research Directions
- Industry Trends
- Innovation Pathways
- System Maturity
- Synthesis of the AI Triad (Data, Algorithm, Machine)
- Systems-First Engineering Philosophy
- Emerging Paradigms (Inference-time compute, System 2)
- New Golden Age of ML Systems
secondary_concepts:
- Technology Roadmaps
- Research Frontiers
- Scaling Challenges
- Infrastructure Evolution
- Hardware Advances
- Software Evolution
- Algorithmic Progress
- System Optimization
- Performance Improvement
- Efficiency Gains
- Deployment Trends
- Adoption Patterns
- Market Evolution
- Ecosystem Development
- Standards Evolution
- Regulatory Landscape
- Educational Needs
- Skill Development
- Career Pathways
- Professional Development
- LLM Scaling Limits
- Tail at Scale (amplification effects)
- Technological Convergence
- Career Pathways in ML Systems
technical_terms:
- Quantum Machine Learning
- Neuromorphic Computing
- Brain-Computer Interfaces
- Edge-Cloud Continuum
- Autonomous Systems
- Human-AI Collaboration
- Multimodal AI
- Foundation Models
- Large Language Models
- Generative AI
- Self-Supervised Learning
- Meta-Learning
- Continual Learning
- Causal AI
- Embodied AI
- Swarm Intelligence
- Collective Intelligence
- Hybrid Intelligence
- Augmented Intelligence
- Computational Creativity
- AI Democratization
- No-Code/Low-Code AI
- Automated ML
- Neural Architecture Search
- System 2 Compute
- Hardware-Software Symbiosis
methodologies:
- Future Scenario Planning
- Technology Forecasting
- Trend Analysis
- Innovation Management
- Research Strategy
- Technology Assessment
- Roadmap Development
- Gap Analysis
- Opportunity Identification
- Risk Assessment
- Strategic Planning
- Investment Planning
- Resource Allocation
- Collaboration Strategies
- Partnership Development
- Ecosystem Building
- Community Development
- Knowledge Sharing
- Best Practice Development
- Continuous Learning
applications:
- Next-Generation Systems
- Intelligent Infrastructure
- Smart Environments
- Autonomous Ecosystems
- Personalized Services
- Adaptive Systems
- Predictive Systems
- Self-Healing Systems
- Context-Aware Computing
- Ambient Intelligence
- Digital Twins
- Virtual Assistants
- Intelligent Automation
- Cognitive Computing
- Decision Support Systems
- Knowledge Management
- Innovation Platforms
- Research Tools
- Educational Systems
- Healthcare Systems
keywords: [ML systems future, emerging technologies, AI evolution, technological convergence, research directions, industry trends, system integration, quantum ML, neuromorphic computing, autonomous systems, foundation models, generative AI, human-AI collaboration, innovation pathways]
topics_covered:
- topic: Technological Evolution and Trends
subtopics: [emerging technologies, hardware advances, software evolution, algorithmic progress, performance trends, efficiency improvements]
- topic: Research Frontiers and Innovation
subtopics: [research directions, scientific breakthroughs, innovation opportunities, technology roadmaps, investment priorities, collaboration strategies]
- topic: System Integration and Convergence
subtopics: [technology convergence, system integration, platform evolution, ecosystem development, standards evolution, interoperability]
- topic: Future Applications and Use Cases
subtopics: [next-generation applications, intelligent systems, autonomous systems, human-AI collaboration, societal applications, industry transformation]
- topic: Challenges and Opportunities
subtopics: [scaling challenges, technical barriers, resource requirements, skill gaps, regulatory challenges, ethical considerations]
- topic: Education and Workforce Development
subtopics: [skill requirements, educational needs, training programs, career pathways, professional development, lifelong learning]
- topic: Industry and Market Evolution
subtopics: [market trends, adoption patterns, business models, competitive landscape, investment patterns, commercialization strategies]
- topic: Societal Impact and Implications
subtopics: [social implications, economic impact, policy considerations, governance frameworks, global cooperation, sustainable development]
- Cross-stack optimization synthesis
- Future scenario planning
formulas:
- Tail Latency Ratio (P99/Mean)
- Technology Adoption Curve
@@ -1,117 +1,26 @@
concept_map:
source: data_engineering.qmd
generated_date: 2025-01-12
generated_date: 2026-02-19
primary_concepts:
- Data Engineering
- Data Pipelines
- Data Quality
- Data as Source Code
- Data Cascades
- Data Sources
- Data Ingestion
- Data Processing
- Data Labeling
- Data Storage
- Data Governance
secondary_concepts:
- Problem Definition
- Pipeline Basics
- ETL vs ELT
- Data Validation
- Error Management
- Cleaning Techniques
- Transformation Techniques
- Feature Engineering
- Annotation Techniques
- Label Quality Assessment
- Storage System Types
- Feature Storage
- Caching Techniques
- Data Access Patterns
- Privacy and Security
- Compliance and Regulation
- Documentation and Lineage
- Quality Monitoring
- Version Control
- Performance Optimization
technical_terms:
- Data Cascades
- Keyword Spotting (KWS)
- Web Scraping
- Crowdsourcing
- Synthetic Data Generation
- Anonymization Techniques
- Data Pipeline Architecture
- Batch Ingestion
- Stream Processing
- Data Validation Checks
- Data Gravity
- ETL vs. ELT patterns
- Feature Stores
- Data Warehouses
- Data Lakes
- Data Marts
- GDPR (General Data Protection Regulation)
- HIPAA (Health Insurance Portability and Accountability Act)
- Training-Serving Consistency
secondary_concepts:
- Data Locality
- Data Drift (detection)
- Idempotent transformations
technical_terms:
- Feature Store
- Data Lineage
- Data Versioning
- Data Drift
- Schema Evolution
- Data Profiling
- Data Catalog
- Metadata Management
- Data Lake
- Feature Catalog
- Data Quality (Four Pillars)
methodologies:
- Systematic Problem Definition
- Requirements Gathering
- Stakeholder Engagement
- Data Quality Assessment
- Pipeline Design Patterns
- Validation and Testing
- Error Handling Strategies
- Data Processing Workflows
- Labeling Workflows
- Quality Control Processes
- Storage Architecture Design
- Performance Optimization
- Governance Implementation
- Compliance Management
- Documentation Practices
- Monitoring and Alerting
- Data Lifecycle Management
- Backup and Recovery
- Access Control
- Privacy-Preserving Techniques
applications:
- Keyword Spotting Systems
- Voice Recognition
- Computer Vision
- Medical Image Analysis
- Recommendation Systems
- Fraud Detection
- Natural Language Processing
- Time Series Analysis
- IoT Data Processing
- Social Media Analytics
- E-commerce Systems
- Healthcare AI
- Autonomous Vehicles
- Financial Services
- Manufacturing Quality Control
- Customer Analytics
- Real-time Systems
- Batch Processing Systems
keywords: [data engineering, data pipelines, data quality, data cascades, data sources, data ingestion, data processing, data labeling, data storage, data governance, ETL, ELT, feature engineering, data validation, synthetic data, crowdsourcing, web scraping, keyword spotting, data warehouses, data lakes, GDPR, HIPAA, data lineage, metadata management]
topics_covered:
- topic: Problem Definition and Requirements
subtopics: [problem identification, clear objectives, success benchmarks, stakeholder engagement, constraints and limitations, keyword spotting example, iterative refinement]
- topic: Data Pipeline Architecture
subtopics: [pipeline basics, modular design, data flow, processing layers, governance integration, scalability considerations]
- topic: Data Sources and Collection
subtopics: [existing datasets, web scraping, crowdsourcing, synthetic data creation, anonymization techniques, data source evaluation, quality assessment]
- topic: Data Ingestion and Integration
subtopics: [ingestion patterns, ETL vs ELT, batch vs stream processing, source integration, validation techniques, error management]
- topic: Data Processing and Transformation
subtopics: [cleaning techniques, quality assessment, transformation methods, feature engineering, preprocessing workflows, performance optimization]
- topic: Data Labeling and Annotation
subtopics: [annotation techniques, quality assessment, AI-assisted labeling, labeling challenges, workflow management, quality control]
- topic: Data Storage and Management
subtopics: [storage system types, performance considerations, feature stores, caching strategies, access patterns, lifecycle management]
- topic: Data Governance and Compliance
subtopics: [privacy and security, compliance regulations, documentation practices, data lineage, quality monitoring, ethical considerations]
- Systematic data debugging
- Quality assurance pipelines
formulas:
- Energy-Movement Invariant (Emove >> Ecomp)
- Data Gravity transfer time
@@ -0,0 +1,27 @@
concept_map:
source: data_selection.qmd
generated_date: 2026-02-19
primary_concepts:
- Heterogeneity of Data Value
- Information-Compute Ratio (ICR)
- The Data Wall
- Coreset Selection
- Active Learning
- Curriculum Learning
secondary_concepts:
- Data selection and pruning
- Synthetic data generation
- Selection Inequality
- Foundation Model amortization
technical_terms:
- Deduplication
- Quality pruning
- Uncertainty sampling
- Core-set
methodologies:
- ICR-based data diet design
- Static vs dynamic selection strategies
formulas:
- Selection Overhead (Cost_train + Cost_select)
- Scaling Asymmetry (Compute vs Data growth)
- Data quality multiplier
@@ -1,119 +1,26 @@
concept_map:
source: frameworks.qmd
generated_date: 2025-01-12
generated_date: 2026-02-19
primary_concepts:
- Machine Learning Frameworks
- Computational Graphs
- Tensor Operations
- Automatic Differentiation
- Framework Evolution
- TensorFlow
- PyTorch
- JAX
- Framework Specialization
- Hardware Abstraction
- Execution Models (Eager vs Static vs JIT)
- Automatic Differentiation (Reverse-mode)
- Framework Abstractions (nn.Module)
- Dispatch Overhead
- Compilation Continuum
secondary_concepts:
- Static Graphs
- Dynamic Graphs
- Eager Execution
- Deferred Execution
- Device Placement
- Memory Management
- Distributed Computing
- Model Deployment
- Framework Selection
- Performance Optimization
- ONNX (Open Neural Network Exchange)
- TensorRT
- Keras
- Framework Comparison
- Hardware Acceleration
- Graph Optimization
- Model Serialization
- Runtime Environments
- Ecosystem Integration
- Developer Experience
technical_terms:
- BLAS (Basic Linear Algebra Subprograms)
- LAPACK
- NumPy
- SciPy
- Tensors
- Computational Graph
- Automatic Differentiation (Autodiff)
- Gradient Computation
- Backpropagation
- Static Computation Graph
- Dynamic Computation Graph
- Eager Execution
- Graph Mode
- JIT (Just-In-Time) Compilation
- XLA (Accelerated Linear Algebra)
- CUDA
- cuDNN
- OpenMP
- Device API
- Computational Graph (DAG)
- Kernel Fusion
- Memory Pool
- Graph Optimization
- Model Checkpointing
- Hardware-optimized BLAS
- Framework selection trade-offs
technical_terms:
- Tensors
- Autograd
- XLA / TorchCompile
- Graph Capture
- Kernel Launch
methodologies:
- Graph Construction
- Model Definition
- Training Loop Implementation
- Optimization Strategies
- Distributed Training
- Model Serving
- Framework Migration
- Performance Profiling
- Memory Optimization
- Hardware Acceleration
- Model Quantization
- Cross-Platform Deployment
- Framework Integration
- Debugging and Testing
- Model Versioning
- Experimentation Workflows
- Production Deployment
- Resource Management
- Scalability Planning
- Ecosystem Navigation
applications:
- Deep Learning Research
- Production ML Systems
- Computer Vision
- Natural Language Processing
- Recommendation Systems
- Time Series Analysis
- Reinforcement Learning
- Generative Models
- Cloud-Based ML
- Edge Computing
- Mobile ML Applications
- TinyML Systems
- Distributed Training
- Model Serving
- Real-time Inference
- Batch Processing
- Scientific Computing
- Research Prototyping
- Enterprise Solutions
- IoT Applications
keywords: [machine learning frameworks, TensorFlow, PyTorch, JAX, computational graphs, tensor operations, automatic differentiation, static graphs, dynamic graphs, ONNX, TensorRT, Keras, hardware acceleration, distributed computing, model deployment, framework selection, performance optimization, eager execution, JIT compilation, XLA, CUDA, memory management, device placement]
topics_covered:
- topic: Framework Evolution and History
subtopics: [early numerical libraries, BLAS and LAPACK, first-generation frameworks, deep learning frameworks, hardware impact, timeline progression]
- topic: Fundamental Framework Concepts
subtopics: [computational graphs, tensor operations, automatic differentiation, memory management, device abstraction, execution models]
- topic: Static vs Dynamic Graphs
subtopics: [static graph advantages, dynamic graph benefits, execution strategies, optimization trade-offs, development workflow impacts]
- topic: Major Framework Analysis
subtopics: [TensorFlow ecosystem, PyTorch ecosystem, JAX functional programming, framework comparison, strengths and limitations]
- topic: Framework Specialization
subtopics: [cloud-based frameworks, edge computing, mobile frameworks, TinyML systems, deployment considerations, hardware optimization]
- topic: Framework Selection and Optimization
subtopics: [model requirements, software dependencies, hardware constraints, performance optimization, deployment scalability, selection criteria]
- topic: System-Level Considerations
subtopics: [memory management, device placement, distributed execution, hardware acceleration, performance profiling, resource optimization]
- topic: Development and Production Workflows
subtopics: [research prototyping, production deployment, model serving, cross-platform compatibility, ecosystem integration, migration strategies]
- Mixed-precision implementation
- Operator fusion optimization
formulas:
- Dispatch tax (overhead vs compute time)
- Kernel launch latency (~5-10us)
@@ -1,120 +1,27 @@
concept_map:
source: hw_acceleration.qmd
generated_date: 2025-01-12
generated_date: 2026-02-19
primary_concepts:
- AI Hardware Acceleration
- Specialized Computing
- Domain-Specific Architectures
- AI Compute Primitives
- The Memory Wall
- Specialized Data Paths (Systolic Arrays, Tensor Cores)
- Dataflow Strategies (Stationary patterns)
- Amdahl's Law for AI
- Hardware-Software Co-design
- GPU Computing
- TPU Architecture
- FPGA Acceleration
- Neural Processing Units
- Edge AI Accelerators
secondary_concepts:
- Parallel Computing
- Vector Operations
- Matrix Multiplication Units
- Tensor Processing
- Memory Hierarchy
- Data Movement Optimization
- Compute-Memory Balance
- Bandwidth Optimization
- Latency Optimization
- Energy Efficiency
- Throughput Optimization
- Custom Silicon
- ASIC Design
- Neuromorphic Computing
- Quantum Computing
- In-Memory Computing
- Dataflow Architectures
- Systolic Arrays
- Pipeline Processing
- Heterogeneous Computing
- Domain-Specific Architectures (DSA)
- Memory Hierarchy and Bandwidth
- Interconnect Hierarchy
- Arithmetic Intensity vs Ridge Point
technical_terms:
- CUDA Cores
- Tensor Cores
- Streaming Multiprocessors
- Compute Units
- Processing Elements
- Memory Controllers
- Cache Hierarchy
- Register Files
- Shared Memory
- Global Memory
- High Bandwidth Memory (HBM)
- GDDR Memory
- PCIe Interface
- NVLink
- Infinity Fabric
- Interconnect Networks
- SIMD (Single Instruction Multiple Data)
- SIMT (Single Instruction Multiple Thread)
- Warp/Wavefront
- Thread Blocks
- Occupancy
- Memory Coalescing
- Bank Conflicts
- Compute Capability
- Compute Density
- SM (Streaming Multiprocessor)
- NVLink / PCIe
- High-Bandwidth Memory (HBM)
- MAC (Multiply-Accumulate)
methodologies:
- Performance Modeling
- Roofline Analysis
- Compute-bound vs Memory-bound Analysis
- Kernel Optimization
- Memory Access Pattern Optimization
- Data Layout Optimization
- Tiling Strategies
- Loop Unrolling
- Vectorization
- Parallelization Strategies
- Load Balancing
- Synchronization Optimization
- Pipeline Optimization
- Prefetching
- Caching Strategies
- Compression Techniques
- Approximate Computing
- Mixed Precision Computing
- Quantization Hardware Support
- Sparsity Acceleration
applications:
- Deep Learning Training
- Neural Network Inference
- Computer Vision
- Natural Language Processing
- Scientific Computing
- High Performance Computing
- Real-time AI Applications
- Autonomous Vehicles
- Robotics
- Edge AI Systems
- Mobile AI
- Cloud Computing
- Datacenter AI
- Supercomputing
- Cryptocurrency Mining
- Game Rendering
- Video Processing
- Signal Processing
- Financial Modeling
- Weather Simulation
keywords: [AI acceleration, GPU computing, TPU, FPGA, specialized computing, domain-specific architectures, parallel computing, vector operations, tensor processing, memory hierarchy, CUDA, hardware-software co-design, neural processing units, edge accelerators, compute primitives, throughput optimization, energy efficiency]
topics_covered:
- topic: Hardware Evolution and Specialization
subtopics: [computing evolution, specialized processors, domain-specific architectures, application-specific accelerators, hardware trends, future directions]
- topic: AI Compute Primitives
subtopics: [vector operations, matrix operations, tensor computations, primitive optimization, hardware mapping, execution models]
- topic: GPU Architecture and Programming
subtopics: [GPU architecture, CUDA programming, memory hierarchy, thread organization, kernel optimization, performance analysis]
- topic: Specialized AI Accelerators
subtopics: [TPU design, NPU architectures, FPGA implementations, custom silicon, neuromorphic chips, quantum accelerators]
- topic: Memory Systems and Data Movement
subtopics: [memory hierarchy, bandwidth optimization, data movement, caching strategies, memory access patterns, storage systems]
- topic: Performance Optimization
subtopics: [performance modeling, bottleneck analysis, optimization techniques, parallelization strategies, energy efficiency, throughput maximization]
- topic: Hardware-Software Co-design
subtopics: [co-design principles, compiler optimizations, runtime systems, programming models, abstraction layers, performance portability]
- topic: Deployment and Integration
subtopics: [system integration, deployment strategies, scalability considerations, cost-performance trade-offs, reliability, maintenance]
- Roofline modeling
- Hardware primitive alignment
formulas:
- Amdahl's Speedup
- Roofline Bound (min(Peak, BW*AI))
- Bandwidth taper ratios
@@ -1,129 +1,36 @@
|
||||
concept_map:
|
||||
source: introduction.qmd
|
||||
generated_date: 2025-01-12
|
||||
generated_date: 2026-02-19
|
||||
primary_concepts:
|
||||
- Machine Learning Systems Engineering
|
||||
- AI Pervasiveness
|
||||
- AI and ML Fundamentals
|
||||
- AI Evolution and History
|
||||
- AI Winters
|
||||
- Paradigm Shifts in AI
|
||||
- Production System Challenges
|
||||
- Research to Deployment Lifecycle
|
||||
|
||||
- AI Triad (Data, Algorithm, Machine)
|
||||
- D·A·M Taxonomy
|
||||
- Software 1.0 vs Software 2.0
|
||||
- The Bitter Lesson
|
||||
- Iron Law of ML Systems
|
||||
- Silent Degradation
|
||||
- Verification Gap
|
||||
- Five-Pillar Framework
|
||||
secondary_concepts:
|
||||
- Industrial Revolution comparison
|
||||
- Digital Revolution comparison
|
||||
- AI Revolution characteristics
|
||||
- Theoretical vs Practical AI
|
||||
- Intelligent Behavior
|
||||
- Pattern Recognition
|
||||
- Adaptive Systems
|
||||
- Historical AI Milestones
|
||||
- Societal Impact of AI
|
||||
- Global Scale Applications
|
||||
- Individual Level Applications
|
||||
- Organizational Transformation
|
||||
|
||||
- AI evolution history (Symbolic, Expert Systems, Statistical, Deep Learning)
|
||||
- Dual Mandate
|
||||
- Silicon Contract
|
||||
- Samples per Dollar
|
||||
- Verification Gap
|
||||
technical_terms:
|
||||
- Perceptron (1957)
|
||||
- ELIZA chatbot (1966)
|
||||
- Dartmouth Workshop (1956)
|
||||
- Turing Test (1950)
|
||||
- Deep Blue (1997)
|
||||
- AlphaGo (2016)
|
||||
- GPT-3 (2020)
|
||||
- GPT-4 (2023)
|
||||
- Neural Networks
|
||||
- Symbolic AI
|
||||
- Statistical Learning
|
||||
- Perceptron
|
||||
- Deep Learning
|
||||
- Expert Systems
|
||||
- Knowledge-Based Systems
|
||||
|
||||
- Neural Networks
|
||||
- Verification Gap
|
||||
- Samples per Dollar
|
||||
methodologies:
- System Version Management
- Data Quality Assurance
- Performance Monitoring
- Experimentation Frameworks
- Privacy Compliance
- Failure Recovery
- Traffic Scaling
- Resilient Architecture Design
- Development Lifecycle Management
- Production Deployment

applications:
- Medical Image Analysis
- Traffic Flow Management
- Power Grid Optimization
- Wireless Communication
- Scientific Discovery
- Space Exploration
- Molecular Simulation
- Disease Diagnosis
- Climate Change Modeling
- Drug Discovery
- Personalized Experiences
- Decision Support Systems

keywords:
- artificial intelligence
- machine learning
- systems engineering
- production deployment
- AI evolution
- paradigm shift
- perceptron
- neural networks
- deep learning
- AI winters
- Turing test
- intelligent systems
- pattern recognition
- adaptive behavior
- system reliability
- scalability
- data quality
- model deployment
- AI transformation
- societal impact

topics_covered:
- topic: AI and ML Foundations
subtopics:
- Definition of AI and ML
- Relationship between AI and ML
- Theoretical vs practical approaches
- Intelligence replication

- topic: Historical Evolution
subtopics:
- Timeline of AI development
- Key milestones (1950-2023)
- AI winters and resurgence
- Paradigm shifts in approach

- topic: Systems Engineering
subtopics:
- Research to production transition
- Data quality management
- System versioning
- Performance optimization
- Failure recovery
- Scalability challenges

- topic: Societal Impact
subtopics:
- Individual level applications
- Organizational transformation
- Global challenges
- Technological revolution comparison

- topic: Real-World Applications
subtopics:
- Healthcare and medicine
- Transportation systems
- Energy management
- Scientific research
- Communication networks
- D·A·M diagnosis
- Performance decomposition
formulas:
- Iron Law of ML Systems
- Degradation Equation
lighthouse_models:
- ResNet-50
- GPT-2 / Llama
- DLRM
- MobileNetV2
- Keyword Spotting

@@ -1,120 +1,28 @@
concept_map:
source: ops.qmd
generated_date: 2025-01-12
source: ml_ops.qmd
generated_date: 2026-02-19
primary_concepts:
- MLOps (Machine Learning Operations)
- ML System Operations
- Model Lifecycle Management
- Continuous Integration/Continuous Deployment (CI/CD)
- Model Monitoring
- Production ML Systems
- DevOps for ML
- Model Versioning
- Automated ML Pipelines
- Infrastructure as Code
- Silent Failure Management
- Operational Mismatch (ML vs. Traditional Software)
- MLOps Infinity Loop
- Retraining Cadence
- Deployment Patterns (Canary, Blue-Green, Shadow)
secondary_concepts:
- Feature Consistency (Feature Stores)
- Environment Parity
- Statistical Telemetry
- ML Node architecture
- MLOps maturity levels
technical_terms:
- Model Registry
- Feature Store
- Data Pipeline Management
- Model Serving
- A/B Testing
- Canary Deployments
- Blue-Green Deployments
- Model Rollback
- Performance Monitoring
- Data Drift Detection
- Model Drift Detection
- Alert Systems
- Logging and Observability
- Containerization
- Orchestration
- Microservices Architecture
- API Management
- Load Balancing
- Auto-scaling
- Resource Management
technical_terms:
- Docker Containers
- Kubernetes
- Apache Airflow
- Kubeflow
- MLflow
- DVC (Data Version Control)
- Model Artifacts
- Experiment Tracking
- Data Drift / Concept Drift
- Training-Serving Skew
- Metadata Management
- Service Mesh
- REST APIs
- gRPC
- Message Queues
- Event Streaming
- Batch Processing
- Real-time Processing
- Model Endpoints
- Health Checks
- Circuit Breakers
- Rate Limiting
- Caching Layers
- CDN (Content Delivery Network)
- Prometheus
- Grafana
methodologies:
- DevOps Principles
- Agile Development
- Infrastructure Automation
- Configuration Management
- Deployment Automation
- Testing Automation
- Monitoring and Alerting
- Incident Response
- Capacity Planning
- Performance Optimization
- Security Best Practices
- Compliance Management
- Change Management
- Release Management
- Rollback Strategies
- Disaster Recovery
- Backup and Restore
- Documentation
- Knowledge Management
- Team Collaboration
applications:
- Production ML Systems
- Real-time Inference Services
- Batch Prediction Systems
- Recommendation Engines
- Computer Vision Applications
- Natural Language Processing
- Time Series Forecasting
- Fraud Detection Systems
- Personalization Systems
- Search and Ranking
- Healthcare AI Systems
- Financial Services
- E-commerce Platforms
- Social Media Platforms
- Autonomous Systems
- IoT Applications
- Edge Computing
- Cloud Services
- Mobile Applications
- Web Applications
keywords: [MLOps, model lifecycle management, CI/CD, model monitoring, production ML systems, DevOps, model versioning, automated pipelines, infrastructure as code, model serving, A/B testing, deployment strategies, data drift, model drift, containerization, Kubernetes, monitoring, observability]
topics_covered:
- topic: MLOps Fundamentals
subtopics: [MLOps principles, lifecycle management, development practices, automation strategies, team collaboration, organizational aspects]
- topic: Model Development and Versioning
subtopics: [experiment tracking, model registry, version control, artifact management, reproducibility, collaboration tools]
- topic: CI/CD for Machine Learning
subtopics: [continuous integration, continuous deployment, automated testing, pipeline orchestration, quality gates, release management]
- topic: Model Deployment and Serving
subtopics: [deployment strategies, model serving, API management, load balancing, scaling, containerization, orchestration]
- topic: Monitoring and Observability
subtopics: [performance monitoring, data drift detection, model drift detection, alerting systems, logging, metrics collection, observability tools]
- topic: Infrastructure Management
subtopics: [infrastructure as code, resource management, auto-scaling, capacity planning, cost optimization, security management]
- topic: Data and Feature Management
subtopics: [feature stores, data pipelines, data quality, data governance, feature engineering, data lineage]
- topic: Operations and Maintenance
subtopics: [incident response, troubleshooting, maintenance procedures, backup and recovery, disaster recovery, compliance]
- Continuous monitoring and alerting
- Automated retraining pipelines
formulas:
- Retraining ROI
- Staleness Loss economics
- Drift divergence (KL)

@@ -1,152 +1,25 @@
concept_map:
source: ml_systems.qmd
generated_date: 2025-01-12
generated_date: 2026-02-19
primary_concepts:
- Cloud Machine Learning (Cloud ML)
- Edge Machine Learning (Edge ML)
- Mobile Machine Learning (Mobile ML)
- Tiny Machine Learning (TinyML)
- Hybrid Machine Learning systems
- Physical constraints (Speed of Light, Power Wall, Memory Wall)
- Deployment Spectrum (Cloud, Edge, Mobile, TinyML)
- Hybrid integration patterns
- Distributed Intelligence Spectrum
- ML System paradigms
- Deployment Spectrum Trade-offs
- Resource-Constrained Computing
- System Architecture Design

secondary_concepts:
- Data centers and cloud infrastructure
- Edge computing and local processing
- Neural Processing Units (NPUs)
- System-on-Chip (SoC) architectures
- Microcontrollers and embedded systems
- Federated learning
- Hierarchical processing
- Progressive deployment
- Collaborative learning
- Network connectivity requirements
- Thermal management
- Battery life optimization

- Bottleneck Principle
- Resource-Constrained Computing
- Deployment Spectrum Trade-offs
technical_terms:
- Tensor Processing Unit (TPU)
- Graphics Processing Units (GPUs)
- Application Programming Interfaces (APIs)
- Internet of Things (IoT)
- Machine learning inference
- Model quantization
- Model compression
- Energy efficiency
- Latency optimization
- Data privacy
- Hyperscale data centers
- Neural Engine
- Google Tensor chip
- Memory constraints (KB/MB/GB)
- Power consumption (mW/W/MW)

- TPU (Tensor Processing Unit)
- NPU (Neural Processing Unit)
- SoC (System on Chip)
- PUE (Power Usage Effectiveness)
- Latency vs. Throughput
methodologies:
- Train-serve split pattern
- Centralized vs decentralized processing
- Resource management strategies
- Model optimization techniques
- Power consumption optimization
- Real-time processing methods
- Offline capability implementation
- Scalability approaches
- Distributed training
- Edge caching strategies
- Model pruning
- Knowledge distillation

applications:
- Virtual assistants (Siri, Alexa)
- Recommendation systems
- Fraud detection
- Autonomous vehicles
- Smart homes and cities
- Industrial IoT and predictive maintenance
- Computational photography
- Voice recognition
- Health monitoring
- Environmental monitoring
- Anomaly detection
- Search engines (189K searches/sec)

keywords:
- machine learning systems
- cloud ML
- edge ML
- mobile ML
- TinyML
- distributed intelligence
- neural processing units
- model quantization
- federated learning
- IoT
- real-time processing
- energy efficiency
- latency
- data privacy
- system architecture
- resource management
- hybrid systems
- deployment spectrum
- computational constraints
- power efficiency
- memory limitations
- thermal constraints

topics_covered:
- topic: Cloud Machine Learning
subtopics:
- Data center infrastructure
- Scalable training
- Collaborative development
- Pay-as-you-go pricing
- Computational power
- Latency challenges

- topic: Edge Machine Learning
subtopics:
- Local processing
- Reduced latency
- Enhanced privacy
- Bandwidth reduction
- Edge devices
- IoT hubs

- topic: Mobile Machine Learning
subtopics:
- Smartphone processing
- NPU acceleration
- On-device inference
- Battery optimization
- Mobile frameworks
- Offline functionality

- topic: Tiny Machine Learning
subtopics:
- Microcontroller deployment
- Ultra-low power
- Resource constraints
- Embedded sensors
- Predictive maintenance
- Environmental monitoring

- topic: Hybrid ML Systems
subtopics:
- Design patterns
- Hierarchical processing
- Federated learning
- Progressive deployment
- Collaborative learning
- System integration

- topic: System Comparison and Trade-offs
subtopics:
- Performance characteristics
- Operational aspects
- Cost considerations
- Development complexity
- Deployment strategies
- Resource allocation
- Bottleneck diagnosis
- Paradigm selection
formulas:
- Power Efficiency (Performance/Watt)
- Arithmetic Intensity (FLOPs/Byte)
- Ridge Point

@@ -1,106 +1,25 @@
concept_map:
source: workflow.qmd
generated_date: 2025-01-12
generated_date: 2026-02-19
primary_concepts:
- Machine Learning Lifecycle
- AI Workflow
- Problem Definition
- Data Collection and Preparation
- Model Development and Training
- Evaluation and Validation
- Deployment and Integration
- Monitoring and Maintenance
- ML Lifecycle (6 stages)
- Feedback Loops
- Iterative Development
- Iteration Velocity
- Iteration Tax
- Constraint Propagation
secondary_concepts:
- Requirements Engineering
- Data Infrastructure
- Data Validation
- Model Requirements
- Development Workflow
- Scale and Distribution
- Robustness and Reliability
- Systems Thinking
- Lifecycle Implications
- Collaboration in AI
- Role Interplay
- Proactive Maintenance
- Performance Monitoring
- Continuous Integration
- Data Quality Assurance
- Deployment constraints back-propagation
- Iteration compounding
technical_terms:
- CRISP-DM (Cross-Industry Standard Process)
- MLOps (Machine Learning Operations)
- CI/CD for Machine Learning
- CRISP-DM
- Experiment Tracking
- Data Versioning
- Model Versioning
- Data Pipeline
- Feature Engineering
- Model Registry
- Deployment Pipeline
- A/B Testing
- Canary Deployment
- Blue-Green Deployment
- Data Drift
- Model Drift
- Performance Metrics
- KPIs (Key Performance Indicators)
- Model Validation
- Cross-validation
- Hyperparameter Tuning
- Model Selection
methodologies:
- Structured Development Process
- Iterative Experimentation
- Data-driven Decision Making
- Systematic Problem Definition
- Agile ML Development
- DevOps for ML
- Continuous Monitoring
- Automated Testing
- Model Governance
- Quality Assurance
- Risk Management
- Performance Optimization
- Scalability Planning
- Infrastructure as Code
- Containerization
- Microservices Architecture
- Data Pipeline Orchestration
- Feature Store Management
applications:
- Medical AI Systems
- Diabetic Retinopathy Screening
- Healthcare Image Analysis
- Production ML Systems
- Real-time Inference Systems
- Batch Processing Systems
- Computer Vision Applications
- Natural Language Processing
- Time Series Forecasting
- Recommendation Systems
- Fraud Detection Systems
- Autonomous Systems
- IoT and Edge Computing
- Mobile ML Applications
- Cloud-based ML Services
- Enterprise AI Solutions
keywords: [machine learning lifecycle, AI workflow, MLOps, deployment, monitoring, data engineering, model development, validation, continuous integration, feedback loops, iterative development, problem definition, data collection, model training, evaluation, maintenance, collaboration, systems thinking, production systems, healthcare AI]
topics_covered:
- topic: ML Lifecycle Overview
subtopics: [definition, traditional vs AI lifecycles, systematic approach, interconnected stages, feedback loops, continuous improvement]
- topic: Problem Definition
subtopics: [requirements engineering, system impact, definition workflow, scale considerations, systems thinking, lifecycle implications]
- topic: Data Collection and Preparation
subtopics: [data requirements, data infrastructure, data validation, scale and distribution, quality assurance, privacy considerations]
- topic: Model Development and Training
subtopics: [model requirements, development workflow, experimentation, scale and distribution, systems thinking, lifecycle implications]
- topic: Evaluation and Validation
subtopics: [performance metrics, validation strategies, robustness testing, system validation, quality assurance, regulatory compliance]
- topic: Deployment and Integration
subtopics: [deployment requirements, deployment workflow, scale considerations, robustness and reliability, systems thinking, production readiness]
- topic: Monitoring and Maintenance
subtopics: [monitoring requirements, maintenance workflow, proactive maintenance, performance tracking, system health, lifecycle management]
- topic: AI Lifecycle Roles and Collaboration
subtopics: [team collaboration, role interplay, interdisciplinary coordination, stakeholder engagement, communication strategies, project management]
- Systematic problem definition
- Agile ML development
formulas:
- Iron Law of Workflow
- Constraint Propagation ($2^{N-1}$ cost escalation)

28
book/quarto/contents/vol1/model_serving/serving_concepts.yml
Normal file
@@ -0,0 +1,28 @@
concept_map:
source: model_serving.qmd
generated_date: 2026-02-19
primary_concepts:
- Serving Inversion (Throughput to Latency)
- Latency Budget (Preprocessing, Inference, Postprocessing)
- Queuing Theory (Little's Law, M/M/1)
- Dynamic Batching
- Training-Serving Skew (Preprocessing divergence)
secondary_concepts:
- Deployment Spectrum (Cloud to TinyML)
- Cold Start Dynamics
- Resource Isolation (Pinning, Locking)
- Serialization Bottlenecks (JSON vs Protobuf)
- LLM Serving (TTFT, TPOT, PagedAttention)
technical_terms:
- SLO / SLA
- Inference Server (Triton, TF Serving)
- gRPC / REST
- NCHW / NHWC
- Zero-Copy Inference
methodologies:
- Capacity planning
- Tail-tolerant execution (Hedging, Canary)
formulas:
- Little's Law (L = λ * W)
- M/M/1 Wait Time
- p99 Latency (Tail explosion)
@@ -1,104 +1,25 @@
concept_map:
source: nn_architectures.qmd
generated_date: 2025-01-12
generated_date: 2026-02-19
primary_concepts:
- Inductive Biases (Spatial, Sequential, Relational)
- Representational Power vs. Efficiency
- Architectural Building Blocks (Skip connections, Normalization, Gating)
secondary_concepts:
- Multi-Layer Perceptrons (MLPs)
- Convolutional Neural Networks (CNNs)
- Recurrent Neural Networks (RNNs)
- Transformer Architecture
- Attention Mechanisms
- Dense Pattern Processing
- Spatial Pattern Processing
- Sequential Pattern Processing
- Dynamic Pattern Processing
- Universal Approximation Theorem
secondary_concepts:
- Fully-connected layers
- Convolution operations
- Feature maps and filters
- Pooling operations
- Recurrent connections
- Hidden states
- Self-attention
- Query-Key-Value mechanisms
- Multi-head attention
- Positional encoding
- Translation invariance
- Receptive fields
- Hierarchical feature extraction
- Temporal dependencies
- Memory states
- Gradient flow
- Layer normalization
- Residual connections
technical_terms:
- Feature extraction
- Translation invariance
- Receptive fields
- Sliding window operations
- Temporal dependencies
- Vanishing gradient problem
- LSTM (Long Short-Term Memory)
- GRU (Gated Recurrent Units)
- Attention weights
- Softmax normalization
- Scaled dot-product attention
- Layer normalization
- Residual connections
- Kernel size
- Stride
- Padding
- Activation functions
- Backpropagation through time
- Forget gates
- Input gates
- Output gates
- Cell state
- Self-attention
- Query-Key-Value mechanisms
methodologies:
- Dense connectivity patterns
- Spatial convolution operations
- Sequential state updates
- Parallel attention computation
- Matrix multiplication optimization
- Memory access pattern optimization
- Weight sharing and reuse
- Batch processing strategies
- Computational graph organization
- Hardware mapping techniques
- Feature map computation
- Pooling strategies
- Sequence modeling
- Attention scoring
- Multi-head computation
- Position embedding
applications:
- Image classification
- Object detection
- Computer vision tasks
- Natural language processing
- Machine translation
- Speech recognition
- Time series forecasting
- Sequence-to-sequence modeling
- Language modeling
- Graph analysis
- Protein structure prediction
- Medical imaging
- Video processing
- Audio signal processing
- Sentiment analysis
- Document classification
keywords: [deep learning architectures, CNNs, RNNs, transformers, attention mechanisms, MLPs, convolution, recurrent connections, spatial processing, sequential processing, dynamic processing, feature extraction, neural network design, computational patterns, system implications, matrix operations, memory management, parallel computation]
topics_covered:
- topic: Multi-Layer Perceptrons
subtopics: [dense pattern processing, algorithmic structure, computational mapping, system implications, memory requirements, computation needs, data movement, universal approximation]
- topic: Convolutional Neural Networks
subtopics: [spatial pattern processing, convolution operations, feature maps, pooling, translation invariance, hierarchical feature extraction, computational mapping, kernel operations]
- topic: Recurrent Neural Networks
subtopics: [sequential pattern processing, temporal dependencies, hidden states, recurrent connections, computational mapping, system implications, LSTM, GRU, memory mechanisms]
- topic: Attention Mechanisms and Transformers
subtopics: [dynamic pattern processing, self-attention, query-key-value, multi-head attention, scaled dot-product attention, computational patterns, parallel processing, position encoding]
- topic: Architectural Building Blocks
subtopics: [common components, design patterns, optimization strategies, trade-offs, scalability considerations, modularity principles]
- topic: System-Level Considerations
subtopics: [memory access patterns, computational characteristics, data movement requirements, resource utilization, hardware mapping, optimization strategies, performance analysis]
- Architecture selection framework
- Pareto efficiency analysis
formulas:
- Transformer quadratic scaling (O(n^2 * d))
- CNN kernel parameter count

@@ -1,95 +1,25 @@
concept_map:
source: nn_computation.qmd
generated_date: 2025-01-12
generated_date: 2026-02-19
primary_concepts:
- Deep Learning
- Artificial Neural Networks
- Biological Neural Networks
- Perceptron
- Multilayer Perceptrons (MLPs)
- Forward Propagation
- Backpropagation
- Training vs Inference
- Gradient Descent
- Activation Functions
secondary_concepts:
- Weight Matrices
- Bias Terms
- Network Topology
- Layer Architecture
- Feature Learning
- Mathematical Primitives (MAC atoms)
- Forward/Backpropagation
- Activation Functions (ReLU, Sigmoid, Tanh, Softmax)
- Training/Inference asymmetry
- Representation Learning
- Pattern Recognition
- Non-linear Transformations
- Loss Functions
- Neural Network Fundamentals
secondary_concepts:
- Gradient Instabilities (Vanishing/Exploding)
- Loss Functions (Cross-entropy)
- Optimization Process
- Batch Processing
- Learning Rate
- Parameter Initialization
- Memory Management
technical_terms:
- Neurons (artificial nodes)
- Synapses (weights)
- Soma (summation function)
- Axon (output)
- Dendrites (inputs)
- ReLU (Rectified Linear Unit)
- Sigmoid function
- Tanh function
- Softmax function
- Cross-entropy loss
- FLOPS (Floating Point Operations per Second)
- Matrix multiplication
- Vanishing gradients
- Exploding gradients
- Overfitting
- Epoch
- Mini-batch gradient descent
- Chain rule
- Computational graph
- Hyperparameters
- FLOPS
- Chain Rule
methodologies:
- Supervised Learning
- Feature Engineering vs Automatic Feature Learning
- Data preprocessing and normalization
- Model initialization techniques
- Numerical stability optimization
- Gradient computation
- Weight update algorithms
- Memory optimization
- Numerical precision optimization
- Confidence thresholding
- Data augmentation
- Model validation
- Pipeline optimization
applications:
- Handwritten digit recognition (MNIST)
- USPS ZIP code recognition
- Computer vision tasks
- Image classification
- Natural language processing
- Speech recognition
- Optical character recognition (OCR)
- Mail sorting automation
- Pattern recognition systems
- Real-time prediction systems
- Automated classification systems
- Industrial process automation
keywords: [deep learning, neural networks, perceptron, backpropagation, activation functions, gradient descent, feature learning, MNIST, biological inspiration, forward propagation, inference, training, weight matrices, bias terms, multilayer perceptrons, pattern recognition, supervised learning, computer vision, USPS case study, loss functions, optimization, batch processing, network architecture]
topics_covered:
- topic: Evolution to Deep Learning
subtopics: [rule-based programming, classical machine learning, representation learning, neural system implications, computational paradigm shift, scalability advantages]
- topic: Biological to Artificial Neurons
subtopics: [biological intelligence, transition to artificial neurons, computational translation, system requirements, parameter organization, energy efficiency]
- topic: Neural Network Fundamentals
subtopics: [basic architecture, neurons and activations, layers and connections, data flow and transformations, weight matrices, bias terms]
- topic: Network Topology and Design
subtopics: [basic structure, input/hidden/output layers, MNIST architecture example, design trade-offs, connection patterns, parameter considerations]
- topic: Learning Process
subtopics: [training overview, forward propagation, loss functions, backward propagation, gradient flow, optimization process]
- topic: Training vs Inference
subtopics: [computational differences, parameter freezing, memory requirements, resource optimization, deployment considerations, performance characteristics]
- topic: Complete ML Pipeline
subtopics: [preprocessing, neural computation, postprocessing, system integration, hybrid computing architectures, practical deployment]
- topic: USPS Case Study
subtopics: [real-world problem, system development, complete pipeline, results and impact, key takeaways, production deployment lessons]
formulas:
- Backprop memory cost (O(N * L))
- Computational Intensity (MatMul vs Element-wise)

@@ -1,121 +1,26 @@
concept_map:
source: optimizations.qmd
generated_date: 2025-01-12
source: model_compression.qmd
generated_date: 2026-02-19
primary_concepts:
- Model Optimization
- Neural Network Pruning
- Model Quantization
- Knowledge Distillation
- Model Compression
- Sparsity
- Numerical Precision
- Architectural Efficiency
- Model Acceleration
- Deployment Optimization
secondary_concepts:
- Structured Pruning
- Unstructured Pruning
- Magnitude-based Pruning
- Gradual Pruning
- Lottery Ticket Hypothesis
- Post-training Quantization
- Quantization-aware Training
- Mixed Precision Training
- Teacher-Student Networks
- Soft Targets
- Response-based Distillation
- Feature-based Distillation
- Attention Transfer
- Model Compression Ratio
- Inference Acceleration
- Memory Footprint Reduction
- Hardware-aware Optimization
- Edge Deployment
- Real-time Constraints
- Optimization Framework (Representation, Precision, Architecture)
- Accuracy-Efficiency Trade-offs
- Pruning (Structured vs Unstructured)
- Quantization (PTQ vs QAT)
- Knowledge Distillation
secondary_concepts:
- Magnitude-based pruning
- Lottery Ticket Hypothesis
- Teacher-Student Networks
- INT8/INT4 numerical formats
technical_terms:
- Weight Pruning
- Activation Pruning
- Gradient Pruning
- Sensitivity Analysis
- Pruning Ratio
- Sparsity Pattern
- Block Sparsity
- Channel Pruning
- Filter Pruning
- INT8 Quantization
- INT4 Quantization
- Binary Neural Networks
- Ternary Quantization
- Dynamic Quantization
- Static Quantization
- Calibration Dataset
- Quantization Error
- Bit-width Reduction
- Fixed-point Arithmetic
- Floating-point Precision
- Knowledge Transfer
- Temperature Scaling
- Dark Knowledge
- Model Ensemble
- Neural Architecture Search
- Sparsity
- Model compression ratio
- Calibration dataset
- Dark knowledge
- Soft targets
methodologies:
|
||||
- Iterative Pruning
|
||||
- One-shot Pruning
|
||||
- Global Pruning
|
||||
- Layer-wise Pruning
|
||||
- Importance Scoring
|
||||
- Sensitivity-based Pruning
|
||||
- Gradual Magnitude Pruning
|
||||
- SNIP (Single-shot Network Pruning)
|
||||
- GraSP (Gradient Signal Preservation)
|
||||
- Progressive Knowledge Distillation
|
||||
- Online Distillation
|
||||
- Self-distillation
|
||||
- Multi-teacher Distillation
|
||||
- Attention-guided Distillation
|
||||
- Feature Map Distillation
|
||||
- Compressed Sensing
|
||||
- Matrix Factorization
|
||||
- Low-rank Approximation
|
||||
- Huffman Coding
|
||||
- Vector Quantization
|
||||
applications:
|
||||
- Mobile AI Applications
|
||||
- Edge Computing Devices
|
||||
- IoT Systems
|
||||
- Real-time Inference
|
||||
- Resource-constrained Environments
|
||||
- Embedded Systems
|
||||
- Autonomous Vehicles
|
||||
- Computer Vision
|
||||
- Natural Language Processing
|
||||
- Speech Recognition
|
||||
- Recommendation Systems
|
||||
- Medical AI
|
||||
- Industrial Automation
|
||||
- Smart Cameras
|
||||
- Wearable Devices
|
||||
- Drone Applications
|
||||
- Robotics
|
||||
- Smart Home Devices
|
||||
- Surveillance Systems
|
||||
- Augmented Reality
|
||||
keywords: [model optimization, pruning, quantization, knowledge distillation, model compression, sparsity, neural network acceleration, deployment optimization, numerical precision, architectural efficiency, edge deployment, resource constraints, inference acceleration, memory optimization, hardware-aware optimization]
|
||||
topics_covered:
|
||||
- topic: Model Optimization Fundamentals
|
||||
subtopics: [optimization dimensions, accuracy-efficiency trade-offs, system constraints, deployment requirements, performance metrics, optimization frameworks]
|
||||
- topic: Neural Network Pruning
|
||||
subtopics: [structured vs unstructured pruning, magnitude-based pruning, gradual pruning, lottery ticket hypothesis, sensitivity analysis, pruning strategies]
|
||||
- topic: Model Quantization Techniques
|
||||
subtopics: [post-training quantization, quantization-aware training, mixed precision, INT8 quantization, binary networks, dynamic quantization]
|
||||
- topic: Knowledge Distillation
|
||||
subtopics: [teacher-student frameworks, soft targets, attention transfer, feature distillation, progressive distillation, self-distillation]
|
||||
- topic: Sparsity and Compression
|
||||
subtopics: [sparse neural networks, sparsity patterns, block sparsity, compression algorithms, storage optimization, sparse computations]
|
||||
- topic: Hardware-Aware Optimization
|
||||
subtopics: [hardware constraints, acceleration techniques, memory optimization, latency optimization, energy efficiency, deployment considerations]
|
||||
- topic: Advanced Optimization Strategies
|
||||
subtopics: [neural architecture search, automated optimization, multi-objective optimization, optimization pipelines, performance evaluation]
|
||||
- topic: Real-World Deployment
|
||||
subtopics: [mobile deployment, edge computing, IoT applications, real-time constraints, resource management, optimization validation]
|
||||
- Post-training quantization
|
||||
- Iterative pruning
|
||||
formulas:
|
||||
- Compression ratio
|
||||
- Energy reduction per operation (picojoule ratios)
|
||||
|
||||
@@ -1,111 +1,28 @@
concept_map:
source: responsible_engr.qmd
generated_date: 2026-01-07
generated_date: 2026-02-19
primary_concepts:
- Responsible ML Systems Engineering
- Silent Failure Modes
- Engineering Responsibility Gap
- Technical Correctness vs Responsible Outcomes
- Bias Amplification
- Fairness Metrics
- Environmental Impact of AI
- Model Documentation Standards
- Environmental Sustainability
- Total Cost of Ownership (TCO)

secondary_concepts:
- Disaggregated Evaluation
- Proxy Signals
- Feedback Loops in Recommendation
- Disparity in Error Rates
- Technical vs Social Objectives
- Reliability vs Safety (control loops)
- Responsibility Gap
- Green AI vs Red AI
- Brain Energy Efficiency
- Hierarchical Distributed Intelligence

- Data Governance and Documentation
technical_terms:
- Silent Bias
- Proxy Variables
- Demographic Parity
- Equal Opportunity
- Equalized Odds
- False Positive Rate (FPR)
- True Positive Rate (TPR)
- Confusion Matrix
- Model Cards
- Datasheets for Datasets
- TCO (Total Cost of Ownership)
- Green AI
- Carbon Footprint
- Disaggregated Metrics

- Equal Opportunity / Equalized Odds
- Carbon Footprint (CO2e)
- Model Cards / Datasheets
methodologies:
- Pre-Deployment Assessment
- Incident Response Preparation
- Continuous Fairness Monitoring
- Stratified Evaluation
- Intersectional Analysis
- Carbon-Aware Training
- Disaggregated Evaluation (Slicing)
- Adversarial Debiasing
- TCO Calculation Methodology

applications:
- Amazon Recruiting Tool (Gender Bias Case)
- COMPAS Recidivism Prediction
- YouTube Recommendation Feedback Loops
- Gender Shades Facial Recognition Study
- Twitter Image Cropping Analysis
- Loan Approval Fairness Analysis
- Medical Diagnosis Screening

keywords:
- responsible engineering
- AI ethics
- machine learning fairness
- silent failure
- bias amplification
- model cards
- demographic parity
- equal opportunity
- sustainability
- Green AI
- environmental impact
- carbon footprint
- total cost of ownership
- TCO
- disaggregated evaluation
- model documentation
- incident response
- feedback loops
- AI democratization

topics_covered:
- topic: Foundations of Responsible Engineering
subtopics:
- Definition of responsible engineering
- Difference between technical correctness and responsible outcomes
- Why engineers must lead on responsibility
- Silent failure modes in ML systems

- topic: Case Studies in Engineering Failures
subtopics:
- Amazon's biased recruiting tool
- COMPAS recidivism prediction disparities
- YouTube's recommendation feedback loops
- Gender Shades facial recognition disparities

- topic: Frameworks for Responsibility
subtopics:
- Pre-deployment assessment checklists
- Model cards for documentation
- Datasheets for datasets
- Incident response procedures

- topic: Quantitative Fairness Measurement
subtopics:
- Demographic parity
- Equality of opportunity
- Equalized odds
- Disaggregated metrics and relative disparity

- topic: Environmental and Economic Sustainability
subtopics:
- Computational resource costs
- Carbon footprint of training and inference
- Total Cost of Ownership (TCO) methodology
- Biological inspiration for energy efficiency (The Brain)
formulas:
- Disparate Impact Ratio
- Carbon Intensity (Energy * Intensity)
- Fairness-Accuracy Pareto Frontier

@@ -1,117 +1,27 @@
concept_map:
source: training.qmd
generated_date: 2025-01-12
generated_date: 2026-02-19
primary_concepts:
- AI Training Systems
- Gradient Descent
- Backpropagation
- Neural Network Computation
- Training Pipelines
- Distributed Training
- Training Optimization
- Memory Management
- Computational Efficiency
- Model Convergence
- Optimization Algorithms (SGD, Adam, AdamW)
- Distributed Training (Data, Model, Pipeline Parallelism)
- Memory/Throughput Trade-offs
- Mixed-precision Training
- Activation Checkpointing
secondary_concepts:
- Stochastic Gradient Descent (SGD)
- Batch Processing
- Mini-batch Training
- Learning Rate Scheduling
- Gradient Computation
- Forward Pass
- Backward Pass
- Parameter Updates
- Loss Functions
- Optimization Algorithms
- Data Parallelism
- Model Parallelism
- Pipeline Parallelism
- Training Bottlenecks
- Resource Utilization
- Numerical Stability
- Training Pipelines
- System Architecture
technical_terms:
- Matrix-Matrix Multiplication
- Tensor Operations
- Activation Functions
- Automatic Differentiation
- Computational Graph
- Gradient Accumulation
- Warmup strategies
- Convergence monitoring
- Training bottlenecks (Compute vs Memory vs Data bound)
technical_terms:
- All-Reduce
- Parameter Server
- Gradient Clipping
- Batch Normalization
- Dropout
- Regularization
- Weight Decay
- Momentum
- Adam Optimizer
- RMSprop
- Adagrad
- Learning Rate Decay
- Warmup Strategies
- Mixed Precision Training
- Gradient Synchronization
- All-Reduce Operations
- Parameter Servers
- Ring All-Reduce
- NCCL (NVIDIA Collective Communications Library)
methodologies:
- Training Loop Design
- Data Loading Strategies
- Memory Optimization Techniques
- Gradient Computation Methods
- Distributed Training Strategies
- Synchronous Training
- Asynchronous Training
- Federated Learning
- Transfer Learning
- Fine-tuning
- Curriculum Learning
- Progressive Training
- Checkpointing
- Model Validation
- Hyperparameter Tuning
- Performance Profiling
- Resource Scheduling
- Load Balancing
- Fault Tolerance
- Training Monitoring
applications:
- Deep Learning Model Training
- Large Language Models
- Computer Vision Models
- Convolutional Neural Networks
- Recurrent Neural Networks
- Transformer Models
- Generative Models
- Reinforcement Learning
- Multi-task Learning
- Self-supervised Learning
- Contrastive Learning
- Neural Architecture Search
- Adversarial Training
- Domain Adaptation
- Few-shot Learning
- Meta-learning
- Continual Learning
- Edge Model Training
- Mobile ML Training
- Scientific Computing
keywords: [AI training, gradient descent, backpropagation, distributed training, neural networks, optimization algorithms, SGD, batch processing, training pipelines, memory management, computational efficiency, model convergence, data parallelism, model parallelism, training systems, automatic differentiation, mixed precision, gradient synchronization]
topics_covered:
- topic: Training Systems Architecture
subtopics: [system evolution, hardware adaptation, computational requirements, memory hierarchies, resource coordination, performance optimization]
- topic: Mathematical Foundations
subtopics: [neural network computation, matrix operations, activation functions, gradient computation, automatic differentiation, numerical stability]
- topic: Training Algorithms and Optimization
subtopics: [gradient descent variants, optimization algorithms, learning rate scheduling, regularization techniques, convergence analysis, hyperparameter tuning]
- topic: Training Pipeline Design
subtopics: [data loading, preprocessing, forward pass, backward pass, parameter updates, validation, checkpointing, monitoring]
- topic: Memory and Computation Management
subtopics: [memory allocation, gradient accumulation, mixed precision training, memory optimization, computational efficiency, resource utilization]
- topic: Distributed and Parallel Training
subtopics: [data parallelism, model parallelism, pipeline parallelism, gradient synchronization, distributed architectures, scaling strategies]
- topic: Advanced Training Techniques
subtopics: [transfer learning, fine-tuning, curriculum learning, progressive training, adversarial training, self-supervised learning, meta-learning]
- topic: Training System Performance
subtopics: [bottleneck analysis, performance profiling, optimization strategies, scalability considerations, fault tolerance, monitoring and debugging]
- Profile-diagnose-fix-reprofile
- Pipeline overlapping
formulas:
- Effective Batch Size
- Training time estimation
- Roofline analysis in training
