Created subfolders within images/ based on filetype

Better organization for the future, e.g., for building a PDF, since images need to be pulled from the right filetype for quality rendering. The subfolders are not used yet but will be useful later, and the extra organization doesn't hurt; it only makes the "code" cleaner.
This commit is contained in:
Vijay Janapa Reddi
2023-12-10 15:19:47 -05:00
parent 594441bfd8
commit d3dc9f17d4
401 changed files with 364 additions and 364 deletions


@@ -1,6 +1,6 @@
# AI Frameworks
-![_DALL·E 3 Prompt: Illustration in a rectangular format, designed for a professional textbook, where the content spans the entire width. The vibrant chart represents training and inference frameworks for ML. Icons for TensorFlow, Keras, PyTorch, ONNX, and TensorRT are spread out, filling the entire horizontal space, and aligned vertically. Each icon is accompanied by brief annotations detailing their features. The lively colors like blues, greens, and oranges highlight the icons and sections against a soft gradient background. The distinction between training and inference frameworks is accentuated through color-coded sections, with clean lines and modern typography maintaining clarity and focus._](./images/cover_ml_frameworks.png)
+![_DALL·E 3 Prompt: Illustration in a rectangular format, designed for a professional textbook, where the content spans the entire width. The vibrant chart represents training and inference frameworks for ML. Icons for TensorFlow, Keras, PyTorch, ONNX, and TensorRT are spread out, filling the entire horizontal space, and aligned vertically. Each icon is accompanied by brief annotations detailing their features. The lively colors like blues, greens, and oranges highlight the icons and sections against a soft gradient background. The distinction between training and inference frameworks is accentuated through color-coded sections, with clean lines and modern typography maintaining clarity and focus._](./images/png/cover_ml_frameworks.png)
In this chapter, we explore the landscape of AI frameworks that serve as the foundation for developing machine learning systems. AI frameworks provide the essential tools, libraries, and environments necessary to design, train, and deploy machine learning models. We delve into the evolutionary trajectory of these frameworks, dissect the workings of TensorFlow, and provide insights into the core components and advanced features that define these frameworks.
@@ -68,7 +68,7 @@ Each generation of frameworks unlocked new capabilities that powered advancement
In recent years, the field has converged on a small number of frameworks. @fig-ml-framework shows that TensorFlow and PyTorch have become the overwhelmingly dominant ML frameworks, representing more than 95% of ML frameworks used in research and production. Keras was integrated into TensorFlow in 2019; Preferred Networks transitioned Chainer to PyTorch in 2019; and Microsoft stopped actively developing CNTK in 2022 in favor of supporting PyTorch on Windows.
-![Popularity of ML frameworks in the United States as measured by Google web searches](images/image6.png){#fig-ml-framework}
+![Popularity of ML frameworks in the United States as measured by Google web searches](images/png/image6.png){#fig-ml-framework}
However, a one-size-fits-all approach does not work well across the spectrum from cloud to tiny edge devices. Different frameworks represent various philosophies around graph execution, declarative versus imperative APIs, and more. A declarative API defines what the program should do, while an imperative API specifies how it should be done step by step. For instance, TensorFlow uses graph execution and declarative-style modeling, while PyTorch adopts eager execution and imperative modeling for more Pythonic flexibility. Each approach carries tradeoffs that we will discuss later in the Basic Components section.
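The contrast can be sketched in plain Python (a toy illustration, not actual TensorFlow or PyTorch code): the imperative version computes a result immediately, while the declarative version first builds a description of the computation as data and executes it later.

```python
# Imperative (eager) style: compute step by step; results are available immediately.
def eager_multiply(x, y):
    return x * y

# Declarative (graph) style: first describe the computation as data,
# then hand the whole graph to an executor to run later.
graph = [("mul", "x", "y", "z")]  # a one-node graph describing z = x * y

def run_graph(graph, inputs):
    env = dict(inputs)
    for op, a, b, out in graph:
        if op == "mul":
            env[out] = env[a] * env[b]
    return env

print(eager_multiply(3, 4))                    # 12
print(run_graph(graph, {"x": 3, "y": 4})["z"])  # 12
```

The graph form is what lets a framework optimize or compile the whole computation before running it, at the cost of the step-by-step flexibility the eager form gives.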
@@ -100,7 +100,7 @@ TensorFlow is both a training and inference framework and provides built-in func
9. [TensorFlow Extended (TFX)](https://www.tensorflow.org/tfx): end-to-end platform designed to deploy and manage machine learning pipelines in production settings. TFX encompasses components for data validation, preprocessing, model training, validation, and serving.
-![Architecture overview of TensorFlow 2.0 (Source: [Tensorflow](https://blog.tensorflow.org/2019/01/whats-coming-in-tensorflow-2-0.html))](images/tensorflow.png){#fig-tensorflow-architecture}
+![Architecture overview of TensorFlow 2.0 (Source: [Tensorflow](https://blog.tensorflow.org/2019/01/whats-coming-in-tensorflow-2-0.html))](images/png/tensorflow.png){#fig-tensorflow-architecture}
TensorFlow was developed to address the limitations of DistBelief [@abadi2016tensorflow]---the framework in use at Google from 2011 to 2015---by providing flexibility along three axes: 1) defining new layers, 2) refining training algorithms, and 3) defining new training algorithms. To understand what limitations in DistBelief led to the development of TensorFlow, we will first give a brief overview of the Parameter Server Architecture that DistBelief employed [@dean2012large].
@@ -169,7 +169,7 @@ Here's a summarizing comparative analysis:
To understand tensors, let us start from familiar concepts in linear algebra. As demonstrated in @fig-tensor-data-structure, vectors can be represented as a stack of numbers in a 1-dimensional array. Matrices follow the same idea: one can think of them as many vectors stacked on each other, making them 2-dimensional. Higher-dimensional tensors work the same way; a 3-dimensional tensor is simply a set of matrices stacked on top of each other in another direction. Therefore, vectors and matrices can be considered special cases of tensors, with 1 and 2 dimensions respectively.
-![Visualization of Tensor Data Structure](images/image2.png){#fig-tensor-data-structure}
+![Visualization of Tensor Data Structure](images/png/image2.png){#fig-tensor-data-structure}
Formally, in machine learning, a tensor is a multi-dimensional array of numbers, and the number of dimensions defines the rank of the tensor. As a generalization of linear algebra, the study of tensors is called multilinear algebra. There are noticeable similarities between matrices and higher-rank tensors. First, it is possible to extend definitions from linear algebra to tensors, such as eigenvalues, eigenvectors, and rank (in the linear algebra sense). Furthermore, with the way that we have defined tensors, it is possible to turn higher-dimensional tensors into matrices. This turns out to be very important in practice, as multiplication of higher-dimensional tensors is often performed by first converting them into matrices and multiplying those.
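A short NumPy sketch (assuming NumPy is available) makes the rank and matrix-flattening ideas concrete:

```python
import numpy as np

vector = np.arange(3)                      # rank-1 tensor, shape (3,)
matrix = np.arange(6).reshape(2, 3)        # rank-2 tensor, shape (2, 3)
tensor3 = np.arange(24).reshape(2, 3, 4)   # rank-3 tensor: two 3x4 matrices stacked

# Higher-rank tensors are often flattened into matrices for multiplication:
# collapse the leading axes of tensor3 into one, giving a (6, 4) matrix.
as_matrix = tensor3.reshape(-1, tensor3.shape[-1])

print(vector.ndim, matrix.ndim, tensor3.ndim)  # 1 2 3
print(as_matrix.shape)                         # (6, 4)
```

This is exactly the trick hinted at above: a batched tensor contraction becomes an ordinary matrix multiplication after reshaping.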
@@ -183,7 +183,7 @@ Computational graphs are a key component of deep learning frameworks like Tensor
For example, a node might represent a matrix multiplication operation, taking two input matrices (or tensors) and producing an output matrix (or tensor). To visualize this, consider the simple example in @fig-computational-graph. The directed acyclic graph above computes $z = x \times y$, where each of the variables are just numbers.
-![Basic Example of Computational Graph](images/image1.png){#fig-computational-graph width="50%" height="auto" align="center"}
+![Basic Example of Computational Graph](images/png/image1.png){#fig-computational-graph width="50%" height="auto" align="center"}
Under the hood, computational graphs represent abstractions for common layers like convolutional, pooling, recurrent, and dense layers, with data including activations, weights, and biases represented as tensors. Convolutional layers form the backbone of CNN models for computer vision; they detect spatial patterns in input data through learned filters. Recurrent layers like LSTMs and GRUs enable processing sequential data for tasks like language translation. Attention layers are used in transformers to draw global context from the entire input.
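The $z = x \times y$ graph can be made concrete with a toy scalar autodiff sketch (illustrative only; real frameworks use far more sophisticated machinery). A hypothetical `Node` class records the multiplication and propagates gradients backward through the graph:

```python
class Node:
    """A scalar value in a computational graph with reverse-mode autodiff."""
    def __init__(self, value, parents=(), grad_fns=()):
        self.value = value
        self.parents = parents    # upstream nodes in the graph
        self.grad_fns = grad_fns  # local gradient w.r.t. each parent
        self.grad = 0.0

    def __mul__(self, other):
        # d(x*y)/dx = y and d(x*y)/dy = x
        return Node(self.value * other.value, (self, other),
                    (lambda g: g * other.value, lambda g: g * self.value))

    def backward(self):
        self.grad = 1.0
        stack = [self]
        while stack:
            node = stack.pop()
            for parent, fn in zip(node.parents, node.grad_fns):
                parent.grad += fn(node.grad)
                stack.append(parent)

x, y = Node(3.0), Node(4.0)
z = x * y      # forward pass through the graph: z = 12
z.backward()   # backward pass: dz/dx = y, dz/dy = x
print(z.value, x.grad, y.grad)  # 12.0 4.0 3.0
```

Each node is an operation, each edge carries a tensor (here a scalar), and the backward pass walks the same graph in reverse, which is the essence of how frameworks implement backpropagation.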
@@ -370,7 +370,7 @@ While TPU's can drastically reduce training times, it also has disadvantages. Fo
Today, NVIDIA GPUs dominate training, aided by software libraries like [CUDA](https://developer.nvidia.com/cuda-toolkit), [cuDNN](https://developer.nvidia.com/cudnn), and [TensorRT](https://developer.nvidia.com/tensorrt). Frameworks also tend to include optimizations to maximize performance on these hardware types, like pruning unimportant connections and fusing layers. Combining these techniques with hardware acceleration provides greater efficiency. For inference, hardware is increasingly moving towards optimized ASICs and SoCs. Google's TPUs accelerate models in data centers. Apple, Qualcomm, and others now produce AI-focused mobile chips. The NVIDIA Jetson family targets autonomous robots.
-![Examples of machine learning hardware accelerators (Source: [365](https://www.info-assas-in.top/ProductDetail.aspx?iid=148457818&pr=40.88))](images/hardware_accelerator.png){#fig-hardware-accelerator}
+![Examples of machine learning hardware accelerators (Source: [365](https://www.info-assas-in.top/ProductDetail.aspx?iid=148457818&pr=40.88))](images/png/hardware_accelerator.png){#fig-hardware-accelerator}
## Advanced Features {#sec-ai_frameworks-advanced}
@@ -422,7 +422,7 @@ There are additional challenges associated with federated learning. The number o
The heterogeneity of device resources is another hurdle. Devices participating in federated learning can have varying computational power and memory capacity. This diversity makes it challenging to design algorithms that are efficient across all devices. Privacy and security are also not guaranteed in federated learning: techniques such as gradient inversion attacks can be used to extract information about the training data from the model parameters. Despite these challenges, the many potential benefits continue to make it a popular research area. Open source projects such as [Flower](https://flower.dev/) have been developed to make it simpler to implement federated learning with a variety of machine learning frameworks.
-![A centralized-server approach to federated learning (Source: [NVIDIA](https://blogs.nvidia.com/blog/what-is-federated-learning/))](images/federated_learning.png){#fig-federated-learning}
+![A centralized-server approach to federated learning (Source: [NVIDIA](https://blogs.nvidia.com/blog/what-is-federated-learning/))](images/png/federated_learning.png){#fig-federated-learning}
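As a rough sketch of the core server-side step in the centralized approach (federated averaging), not Flower's actual API, consider a hypothetical `federated_average` that weights each client's parameters by its number of training examples; raw data never leaves the devices:

```python
def federated_average(client_updates):
    """client_updates: list of (weights, num_examples) pairs from clients.

    Returns the example-weighted average of the clients' model weights,
    the aggregation rule used by FedAvg-style federated learning.
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    averaged = [0.0] * dim
    for weights, n in client_updates:
        for i, w in enumerate(weights):
            averaged[i] += w * n / total
    return averaged

# Two clients with different amounts of local data: the second client
# holds 3x the examples, so its weights count 3x in the average.
updates = [([0.0, 0.0], 1), ([4.0, 8.0], 3)]
avg = federated_average(updates)
print(avg)  # [3.0, 6.0]
```

The server would broadcast `avg` back to the clients for the next round; only model parameters cross the network, which is the privacy motivation discussed above.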
## Framework Specialization
@@ -600,7 +600,7 @@ Through all these various custom techniques like static compilation, model-based
Choosing the right machine learning framework for a given application requires carefully evaluating model, hardware, and software considerations. By analyzing these three aspects, ML engineers can select the optimal framework and customize it as needed for efficient and performant on-device ML applications. The goal is to balance model complexity, hardware limitations, and software integration to design a tailored ML pipeline for embedded and edge devices.
-![TensorFlow Framework Comparison - General](images/image4.png){#fig-tf-comparison width="100%" height="auto" align="center" caption="TensorFlow Framework Comparison - General"}
+![TensorFlow Framework Comparison - General](images/png/image4.png){#fig-tf-comparison width="100%" height="auto" align="center" caption="TensorFlow Framework Comparison - General"}
### Model
@@ -608,13 +608,13 @@ TensorFlow supports significantly more ops than TensorFlow Lite and TensorFlow L
### Software
-![TensorFlow Framework Comparison - Software](images/image5.png){#fig-tf-sw-comparison width="100%" height="auto" align="center" caption="TensorFlow Framework Comparison - Software"}
+![TensorFlow Framework Comparison - Software](images/png/image5.png){#fig-tf-sw-comparison width="100%" height="auto" align="center" caption="TensorFlow Framework Comparison - Software"}
Unlike TensorFlow and TensorFlow Lite, TensorFlow Lite Micro has no OS dependency, which reduces memory overhead, speeds up startup, and lowers energy consumption (@fig-tf-sw-comparison). TensorFlow Lite Micro can still be used in conjunction with real-time operating systems (RTOS) like FreeRTOS, Zephyr, and Mbed OS. TensorFlow Lite and TensorFlow Lite Micro support model memory mapping, allowing models to be accessed directly from flash storage rather than loaded into RAM, whereas TensorFlow does not. TensorFlow and TensorFlow Lite support accelerator delegation to schedule code to different accelerators, whereas TensorFlow Lite Micro does not, as embedded systems tend not to have a rich array of specialized accelerators.
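The memory-mapping idea can be illustrated with the OS-level primitive such frameworks build on. This is a generic Python `mmap` sketch over a stand-in file, not the TensorFlow Lite API: bytes are paged in on demand rather than copied wholesale into RAM, mirroring how a model can be executed directly from flash storage.

```python
import mmap
import os
import tempfile

# Write a stand-in "model file" (in practice this would be a flatbuffer).
path = os.path.join(tempfile.mkdtemp(), "model.bin")
with open(path, "wb") as f:
    f.write(bytes(range(16)))

# Memory-map the file read-only: accessing a byte faults in only the
# page that contains it, instead of loading the whole file into RAM.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as model:
        byte4 = model[4]   # read one byte without loading the whole file
        size = len(model)

print(byte4, size)  # 4 16
```

On a microcontroller the same effect is achieved even more directly, since the model array in flash is already addressable memory and needs no copy at all.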
### Hardware
-![TensorFlow Framework Comparison - Hardware](images/image3.png){#fig-tf-hw-comparison width="100%" height="auto" align="center" caption="TensorFlow Framework Comparison - Hardware"}
+![TensorFlow Framework Comparison - Hardware](images/png/image3.png){#fig-tf-hw-comparison width="100%" height="auto" align="center" caption="TensorFlow Framework Comparison - Hardware"}
TensorFlow Lite and TensorFlow Lite Micro have significantly smaller base binary sizes and base memory footprints than TensorFlow (@fig-tf-hw-comparison). For example, a typical TensorFlow Lite Micro binary is less than 200KB, whereas TensorFlow is much larger. This is due to the resource-constrained environments of embedded systems. TensorFlow provides support for x86, TPUs, and GPUs like NVIDIA, AMD, and Intel. TensorFlow Lite provides support for Arm Cortex-A and x86 processors commonly used in mobile phones and tablets; it is stripped of all training logic that is not necessary for on-device deployment. TensorFlow Lite Micro provides support for microcontroller-focused Arm Cortex-M cores like the M0, M3, M4, and M7, as well as DSPs like Hexagon and SHARC and MCUs like the STM32, NXP Kinetis, and Microchip AVR.
@@ -654,7 +654,7 @@ Community support plays another essential factor. Frameworks with active and eng
Currently, the ML system stack consists of four abstractions (@fig-mlsys-stack), namely (1) computational graphs, (2) tensor programs, (3) libraries and runtimes, and (4) hardware primitives.
-![Four Abstractions in Current ML System Stack](images/image8.png){#fig-mlsys-stack align="center" caption="Four Abstractions in Current ML System Stack"}
+![Four Abstractions in Current ML System Stack](images/png/image8.png){#fig-mlsys-stack align="center" caption="Four Abstractions in Current ML System Stack"}
This has led to vertical (i.e. between abstraction levels) and horizontal (i.e. library-driven vs. compilation-driven approaches to tensor computation) boundaries, which hinder innovation for ML. Future work in ML frameworks can look toward breaking these boundaries. In December 2021, [Apache TVM](https://tvm.apache.org/2021/12/15/tvm-unity) Unity was proposed, which aimed to facilitate interactions between the different abstraction levels (as well as the people behind them, such as ML scientists, ML engineers, and hardware engineers) and co-optimize decisions in all four abstraction levels.