Algorithm-accelerator Co-design for High-performance and Secure Deep Learning

Author : Weizhe Hua
Release : 2022

Algorithm-accelerator Co-design for High-performance and Secure Deep Learning, written by Weizhe Hua, was released in 2022 and is available in PDF, EPUB and Kindle. Deep learning has emerged as a new engine for many of today's artificial intelligence/machine learning systems, leading to several recent breakthroughs in vision and natural language processing tasks. However, as we move into the era of deep learning models with billions and even trillions of parameters, meeting the computational and memory requirements to train and serve state-of-the-art models has become extremely challenging. Optimizing the computational cost and memory footprint of deep learning models for better system performance is therefore critical to their widespread deployment. Moreover, a massive amount of sensitive and private user data is exposed to the deep learning system during training or serving, so it is essential to investigate potential vulnerabilities in existing deep learning hardware and then design secure deep learning systems that provide strong privacy guarantees for user data and for the models that learn from it. This dissertation proposes co-designing deep learning algorithms and hardware architectural techniques to improve both the performance and the security/privacy of deep learning systems. On high-performance deep learning, it first introduces the channel gating neural network (CGNet), which exploits input-dependent dynamic sparsity to reduce the computation of convolutional neural networks, together with a co-developed ASIC accelerator for CGNet that can turn the theoretical FLOP reduction into wall-clock speedup.
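The gating idea behind channel gating can be illustrated in a few lines: compute a partial sum over a "base" subset of input channels, and only spend the remaining compute where that partial sum suggests the ReLU would keep the output anyway. The NumPy sketch below is a minimal illustration of this idea in a 1x1-convolution setting; the function names, channel split, and threshold policy are illustrative assumptions, not the dissertation's exact formulation:

```python
import numpy as np

def channel_gated_conv(x, w_base, w_cond, threshold):
    """Sketch of channel gating on a 1x1 convolution.

    x:      input activations, shape (C_in, H, W)
    w_base: weights for the "base" subset of input channels, (C_out, C_base)
    w_cond: weights for the remaining channels,              (C_out, C_in - C_base)
    """
    c_base = w_base.shape[1]
    # 1. Partial sum over the base channel subset only.
    base = np.einsum('oc,chw->ohw', w_base, x[:c_base])
    # 2. Gate: where the partial sum is already below the threshold
    #    (likely zeroed by ReLU), skip the remaining channels.
    gate = base > threshold                      # boolean mask, (C_out, H, W)
    # 3. Conditional path. (A real accelerator would skip this work at the
    #    gated positions; in software we compute it everywhere and discard.)
    cond = np.einsum('oc,chw->ohw', w_cond, x[c_base:])
    out = np.where(gate, base + cond, base)
    skipped_frac = 1.0 - gate.mean()             # theoretical FLOP saving
    return np.maximum(out, 0.0), skipped_frac
```

The `skipped_frac` it returns is only a theoretical saving; the point of the co-designed ASIC is precisely to turn that fraction into real wall-clock speedup.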
Secondly, it presents Fast Linear Attention with a Single Head (FLASH), a state-of-the-art language model designed specifically for Google's TPUs that achieves transformer-level quality with linear complexity in the sequence length. In empirical studies on masked language modeling, auto-regressive language modeling, and fine-tuning for question answering, FLASH achieves quality at least comparable to the augmented transformer while being significantly faster (up to 12 times). On the security of deep learning, the dissertation studies the side-channel vulnerabilities of existing deep learning accelerators and then introduces GuardNN, a secure accelerator architecture for privacy-preserving deep learning. GuardNN provides a trusted execution environment (TEE) with specialized protection for deep learning, achieving both a small trusted computing base and low protection overhead. The FPGA prototype of GuardNN incurs a maximum performance overhead of 2.4% across four modern DNN models for ImageNet.
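FLASH's linear complexity comes from the broader family of linear-attention methods, which reorder the attention computation so that sequence length enters only linearly. The sketch below shows the generic (non-causal) kernel-feature trick that family builds on; it is not FLASH's actual gated/chunked formulation, and the elu+1 feature map is just a common illustrative choice:

```python
import numpy as np

def linear_attention(q, k, v):
    """Non-causal linear attention in O(n * d * d_v) time.

    Standard softmax attention materialises an (n, n) score matrix;
    here keys/values are summarised once, so cost is linear in n.
    q, k: (n, d)   v: (n, d_v)
    """
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # positive feature map
    q, k = phi(q), phi(k)
    kv = k.T @ v                 # (d, d_v) summary of all key/value pairs
    z = q @ k.sum(axis=0)        # per-query normaliser, shape (n,)
    return (q @ kv) / z[:, None]
```

Because `(q @ k.T) @ v == q @ (k.T @ v)`, this matches the explicit quadratic computation exactly (up to floating point) while never forming the n-by-n score matrix.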

Accelerator Architecture for Secure and Energy Efficient Machine Learning

Author : Mohammad Hossein Samavatian
Release : 2022
Genre : Computer architecture

Accelerator Architecture for Secure and Energy Efficient Machine Learning, written by Mohammad Hossein Samavatian, was released in 2022 and is available in PDF, EPUB and Kindle. ML applications are driving the next computing revolution, and in this context both performance and security are crucial. This dissertation proposes hardware/software co-design solutions addressing both. First, it proposes RNNFast, an accelerator for Recurrent Neural Networks (RNNs). RNNs are particularly well suited to machine learning problems in which context is important, such as language translation. RNNFast leverages an emerging class of non-volatile memory called domain-wall memory (DWM), which is well suited to RNN acceleration because of its very high density and low read/write energy. RNNFast is efficient and highly scalable, with a flexible mapping of logical neurons to RNN hardware blocks, and is designed to minimize data movement by closely interleaving DWM storage and computation. Compared with a state-of-the-art GPGPU, it delivers 21.8X higher performance with 70X lower energy. Second, the work brings ML security into ML accelerator design for greater efficiency and robustness. Deep Neural Networks (DNNs) are employed in an increasing number of applications, some of which are safety-critical. Unfortunately, DNNs are known to be vulnerable to so-called adversarial attacks. Existing defenses generally have high overhead, and some require attack-specific re-training of the model or careful tuning to adapt to different attacks. These approaches, while successful for a range of inputs, are insufficient against stronger, high-confidence adversarial attacks.
To address this, the dissertation proposes HASI and DNNShield, two hardware-accelerated defenses that adapt the strength of the response to the confidence of the adversarial input. Both techniques rely on approximation or random noise deliberately introduced into the model: HASI injects noise directly into the model at inference time, while DNNShield relies on dynamic, random sparsification of the DNN model to approximate inference efficiently and with fine-grained control over the approximation error. Both compare the output distribution of the noisy/sparsified inference against a baseline output to detect adversarial inputs. They achieve adversarial detection rates of 86% on VGG16 and 88% on ResNet50, exceeding the detection rates of state-of-the-art approaches at much lower overhead. A software/hardware-accelerated FPGA prototype further reduces the performance impact of HASI and DNNShield relative to software-only CPU and GPU implementations.
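The detection principle shared by both defenses, running the model several times under deliberate perturbation and flagging inputs whose predictions are unstable, can be shown with a toy sketch. Here the noise goes into the input rather than the model internals, and the function names, noise level, and agreement threshold are illustrative assumptions rather than HASI's or DNNShield's actual design:

```python
import numpy as np

def detect_adversarial(model, x, noise_std=0.1, n_runs=8,
                       agree_thresh=0.6, seed=0):
    """Flag `x` as adversarial if noisy re-runs disagree with the clean label.

    model: callable mapping an input array to a logits array.
    Runs inference n_runs times with Gaussian noise added to the input;
    benign inputs tend to keep their label, adversarial ones flip easily.
    """
    rng = np.random.default_rng(seed)
    clean_label = int(np.argmax(model(x)))
    agree = 0
    for _ in range(n_runs):
        noisy = x + rng.normal(0.0, noise_std, size=x.shape)
        if int(np.argmax(model(noisy))) == clean_label:
            agree += 1
    is_adversarial = (agree / n_runs) < agree_thresh
    return is_adversarial, agree / n_runs
```

An input far from the decision boundary keeps its label under every noisy run, while an input pushed just across a boundary (the typical adversarial case) flips labels frequently and is flagged.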

Deep Learning for Computer Architects

Author : Brandon Reagen
Release : 2017-08-22
Genre : Computers

Deep Learning for Computer Architects, written by Brandon Reagen, was released on 2017-08-22 and is available in PDF, EPUB and Kindle. This is a primer written for computer architects in the new and rapidly evolving field of deep learning. It reviews how machine learning has evolved since its inception in the 1960s and tracks the key developments leading up to the powerful deep learning techniques that emerged in the last decade. Machine learning, and specifically deep learning, has been hugely disruptive in many fields of computer science: the success of deep learning techniques in solving notoriously difficult classification and regression problems has led to their rapid adoption for real-world applications. The emergence of deep learning is widely attributed to a virtuous cycle in which fundamental advances in training deeper models were enabled by the availability of massive datasets and high-performance computer hardware. The book also reviews representative workloads, including the most commonly used datasets and seminal networks across a variety of domains, and details the most popular deep learning tools, showing how aspiring practitioners can use those tools with the workloads to characterize and optimize DNNs. The remainder of the book is dedicated to the design and optimization of hardware and architectures for machine learning. Because high-performance hardware was so instrumental in making machine learning a practical solution, it recounts a variety of recently proposed optimizations intended to further improve future designs.
Finally, it reviews recent research published in the area and offers a taxonomy to help readers understand how the various contributions fit in context.

Data Orchestration in Deep Learning Accelerators

Author : Tushar Krishna
Release : 2022-05-31
Genre : Technology & Engineering

Data Orchestration in Deep Learning Accelerators, written by Tushar Krishna, was released on 2022-05-31 and is available in PDF, EPUB and Kindle. This Synthesis Lecture focuses on techniques for efficient data orchestration within DNN accelerators. The end of Moore's Law, coupled with the rapid growth of deep learning and other AI applications, has led to the emergence of custom Deep Neural Network (DNN) accelerators for energy-efficient inference on edge devices. Modern DNNs have millions of parameters and involve billions of computations, which necessitates extensive data movement between memory and on-chip processing engines. It is well known that the cost of data movement today surpasses the cost of the actual computation; DNN accelerators therefore require careful orchestration of data across on-chip compute, network, and memory elements to minimize the number of accesses to external DRAM. The book covers DNN dataflows, data reuse, buffer hierarchies, networks-on-chip, and automated design-space exploration, and concludes with the data orchestration challenges posed by compressed and sparse DNNs along with future trends. The target audience is students, engineers, and researchers interested in designing high-performance and low-energy accelerators for DNN inference.
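Why data orchestration dominates accelerator design is easy to see with a back-of-the-envelope model. The sketch below counts DRAM traffic for a tiled matrix multiply with output tiles accumulated in an on-chip buffer; the counting model and names are illustrative simplifications, not taken from the book:

```python
import math

def tiled_matmul_dram_traffic(M, N, K, tile):
    """Elements moved to/from DRAM for C[M,N] = A[M,K] @ B[K,N]
    with square `tile`-sized tiles and C accumulated on chip across K."""
    tm = math.ceil(M / tile)
    tn = math.ceil(N / tile)
    tk = math.ceil(K / tile)
    traffic_a = tm * tk * tn * tile * tile   # each A tile refetched per N-tile
    traffic_b = tk * tn * tm * tile * tile   # each B tile refetched per M-tile
    traffic_c = tm * tn * tile * tile        # each C tile written once
    return traffic_a + traffic_b + traffic_c
```

For M = N = K = 512, growing the tile size from 8 to 64 cuts the estimated DRAM traffic from about 33.8M to about 4.5M elements: a larger on-chip buffer means more reuse per fetch, which is exactly the trade-off that dataflow and buffer-hierarchy design navigates.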

Algorithm-Centric Design of Reliable and Efficient Deep Learning Processing Systems

Author : Elbruz Ozen
Release : 2023

Algorithm-Centric Design of Reliable and Efficient Deep Learning Processing Systems, written by Elbruz Ozen, was released in 2023 and is available in PDF, EPUB and Kindle. Artificial intelligence techniques driven by deep learning have advanced significantly in the past decade. The use of deep learning methods has grown dramatically in practical application domains such as autonomous driving, healthcare, and robotics, which often impose strict requirements on hardware resource efficiency as well as on hardware safety and reliability. The increasing computational cost of deep learning models has traditionally been tackled through model compression and domain-specific accelerator design; because the cost of conventional fault-tolerance methods is often prohibitive in consumer electronics, however, the study of functional safety and reliability for deep learning hardware is still in its infancy. This dissertation outlines a novel approach that delivers dramatic improvements in hardware safety, reliability, and resource efficiency through a synergistic co-design paradigm. It first observes and exploits the unique algorithmic characteristics of deep neural networks, including plasticity in the design process, resilience to small numerical perturbations, and inherent redundancy, as well as unique micro-architectural properties of deep learning accelerators such as regularity.
The advocated approach reshapes deep neural networks, enhances deep neural network accelerators strategically, prioritizes overall functional correctness, and minimizes the associated costs by exploiting the statistical nature of deep neural networks. The analysis demonstrates that deep neural networks equipped with the proposed techniques can maintain accuracy gracefully even at extreme rates of hardware errors. As a result, the described methodology can embed strong safety and reliability characteristics in mission-critical deep learning applications at negligible cost, and it further offers a promising avenue for handling the micro-architectural challenges of deep neural network accelerators and boosting resource efficiency through the synergistic co-design of deep neural networks and hardware micro-architectures.
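The claim that networks with inherent redundancy tolerate hardware errors can be probed with a simple fault-injection experiment. The sketch below zeroes a random fraction of weights, a crude stand-in for faults that are detected and suppressed to zero, and averages several independently faulted replicas; all names and the error model are illustrative, not the dissertation's actual techniques:

```python
import numpy as np

def inject_faults(w, error_rate, rng):
    """Zero a random fraction of weights, emulating hardware faults
    that are detected and suppressed to zero (a common fail-safe)."""
    return w * (rng.random(w.shape) >= error_rate)

def redundancy_demo(error_rate, copies=16, seed=0):
    """Relative output error of a faulted, redundant linear model.

    Averages `copies` independently faulted replicas of a weight vector
    and rescales by the expected survival rate; redundancy plus
    averaging keeps the result close to the fault-free output.
    """
    rng = np.random.default_rng(seed)
    w = rng.normal(size=256)          # illustrative weight vector
    x = rng.normal(size=256)          # illustrative input
    ideal = w @ x
    faulted = np.mean([inject_faults(w, error_rate, rng) @ x
                       for _ in range(copies)], axis=0)
    recovered = faulted / (1.0 - error_rate)   # undo expected attenuation
    return abs(recovered - ideal) / abs(ideal)
```

Sweeping `error_rate` in such an experiment shows the graceful degradation the abstract describes: error grows smoothly with the fault rate rather than failing abruptly.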