Convergent and Efficient Methods to Optimize Deep Learning

Author : Mehdi Mashayekhi
Release : 2022
Genre : Deep learning (Machine learning)

Deep Learning Neural Networks (DLNNs) are flexible modeling methods, capable of generating predictions of both continuous and discrete outputs. These methods continue to make large contributions to people’s lives. Machine Learning (ML) algorithms are efficient in handling everyday problems, especially big-data ones. DLNNs have a variety of applications, such as recovering disrupted audio files, self-driving cars, YouTube thumbnails, and the list goes on. Nonetheless, the performance of DLNNs and ML algorithms in general depends upon a collection of choices made by their users. These decisions can be described using factors called “hyperparameters” or “generalized hyperparameters” and further categorized into three groups. We say “generalized” because some of the groups might not conventionally be optimized over. One group defines the structure of a DLNN, for instance, the number of layers, the activation functions, and the layer type. The second group comprises the parameters governing the optimization algorithms used to derive the weights that minimize the loss function. Some might argue that optimizing over these hyperparameters endangers convergence on training sets for the weight optimization. Yet, here we consider these hyperparameters to be fully adjustable because we argue that fostering test-set (unseen data) prediction accuracy is more important than the surrogate goal of achieving convergence on training sets. The third group of hyperparameters relates to controlling data preparation, including feature generation and the sampling of training sets.

The problem of optimally designing these generalized hyperparameter settings has received relatively little attention. In addition, DLNNs have large numbers of hyperparameters due to their structure. Here, we focus on optimization examples involving eight generalized hyperparameters. The large number of options makes the associated decision problem for DLNN design difficult. The common approaches to this problem include so-called grid searches, which are “full factorials” in the experimental design literature, and the “Tree Parzen Estimator (TPE)”, which is often one choice from a family of black-box optimization methods. In general, the problem of how to even formulate the generalized hyperparameter problem, including the desire to predict well on validation and test sets, has been minimally studied. Here, we propose two formulations relating to fostering the most accurate empirical models possible.

To effectively solve these formulations and foster accurate deep learning models, we explore two types of approaches. First, for problems with big data where testing is computationally expensive, single-shot Design of Experiments (DOE) approaches are explored systematically to support the generation of alternatives to grid search, which is equivalent to full factorial experimentation. A so-called “meta-experiment” is designed to study how choices about random sampling, the effects of different types of data, and different validation strategies impact the performance of DLNNs on twelve standard datasets from the literature.
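To make the DOE alternative to grid search concrete, here is a minimal sketch (my illustration, not the dissertation's code) of a 2^(8-2) resolution V fractional factorial for eight two-level hyperparameter factors, built with the textbook generators G = ABCD and H = ABEF; it prescribes 64 training runs instead of the 256 a full factorial (grid search) would require, while keeping all main effects and two-factor interactions estimable.

```python
# Minimal sketch: 2^(8-2) resolution V fractional factorial design.
# Factors A..F are varied over a full 2^6 factorial; the remaining two
# factors are generated as G = ABCD and H = ABEF (a standard choice
# giving resolution V). Each row is one DLNN configuration to train.
import itertools

base_factors = ["A", "B", "C", "D", "E", "F"]
runs = []
for levels in itertools.product([-1, +1], repeat=6):
    cfg = dict(zip(base_factors, levels))
    cfg["G"] = cfg["A"] * cfg["B"] * cfg["C"] * cfg["D"]  # generator G = ABCD
    cfg["H"] = cfg["A"] * cfg["B"] * cfg["E"] * cfg["F"]  # generator H = ABEF
    runs.append(cfg)

print(len(runs))  # 64 runs, versus 2**8 = 256 for the full factorial
```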
Findings from this study include the promising merits of resolution V fractional factorials and the general benefits of combining sampling with k-fold validation. Both random-effects and fixed-effects models for the meta-experiment are considered to explore the generality and practical implications of the findings.

Second, in problems for which large numbers of trained DLNNs are possible (e.g., over 30), hyperparameter optimization is explored formally as a simulation-optimization problem. Because simulation optimization can be viewed as the sequential design and analysis of experiments, this effort represents an extension of the study of single-shot DOE. An extension of a well-known algorithm is then proposed, called Randomized Balanced Explorative and Exploitative Search with Choice sets (R-BEESE-C). The intent is to capitalize on the fact that there are patterns in the solutions of past problems that give human intuition the ability to pick relevant “choice” sets for prioritization. R-BEESE-C is then compared to three other competitors, including the popular TPE method from Hyperopt (sketched below), on 12 test classification problems with three replicates, and the results are discussed. On seven of the test problems, R-BEESE-C achieves the highest average accuracy among all alternatives. Also, we prove that, with sufficiently high numbers of evaluations, R-BEESE-C converges almost surely to the globally optimal generalized hyperparameter solution for all datasets, with a few qualifications. Because of our computational results and the convergence guarantee, we believe that our recommended approaches, including resolution V fractional factorials and R-BEESE-C, offer the most relevant approaches for deep learning hyperparameter optimization presently available.

Finally, an application of DLNNs in social media is explored. R-BEESE-C is tested for optimizing the hyperparameters of Deep Learning (DL) in Bidirectional Encoder Representations from Transformers (BERT) to detect fake news regarding COVID-19. The study is based on 4,505 news articles in addition to posts on social media platforms such as Facebook. The results illustrate how the proposed optimal strategies from the meta-experiment (k-fold validation) and the hyperparameter optimization method (R-BEESE-C) can foster improved classification accuracy and automatic protection from misinformation.
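For reference, here is a minimal sketch of the TPE baseline from Hyperopt that R-BEESE-C is compared against; the search space and the toy objective are my own illustrative assumptions, not the dissertation's actual experimental setup.

```python
# Minimal sketch: Tree Parzen Estimator (TPE) hyperparameter search with
# Hyperopt. A real objective would train a DLNN and return its validation
# error; this analytic stand-in just makes the example self-contained.
from hyperopt import Trials, fmin, hp, tpe

space = {
    "layers": hp.choice("layers", [2, 3, 4]),      # network depth
    "units": hp.choice("units", [64, 128, 256]),   # width per layer
    "lr": hp.loguniform("lr", -9, -2),             # learning rate in [e^-9, e^-2]
}

def objective(params):
    # Hypothetical stand-in for training and validating a DLNN.
    return (params["lr"] - 0.001) ** 2 / (params["layers"] * params["units"])

trials = Trials()
best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=trials)
print(best)  # best setting found (choice parameters reported as indices)
```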

Deep Learning: Convergence to Big Data Analytics

Author : Murad Khan
Release : 2018-12-30
Genre : Computers

This book presents deep learning techniques, concepts, and algorithms to classify and analyze big data. Further, it offers an introductory-level understanding of the programming languages and tools used to analyze big data in real time, such as Hadoop, SPARK, and GRAPHX. Big data analytics using traditional techniques faces various challenges, such as the fast, accurate, and efficient processing of big data in real time. In addition, the Internet of Things is progressively expanding into various fields, such as smart cities, smart homes, and e-health. As the enormous number of connected devices generates huge amounts of data every day, we need sophisticated algorithms to handle, organize, and classify this data in less processing time and space. Similarly, existing deep learning techniques and algorithms in the big data field offer several advantages thanks to the two main branches of deep learning, i.e., convolutional and deep belief networks. This book offers insights into these techniques and applications based on these two types of deep learning. Further, it helps students, researchers, and newcomers understand big data analytics based on deep learning approaches. It also discusses various machine learning techniques in conjunction with the deep learning paradigm to support high-end data processing, data classification, and real-time data processing issues. The classification and presentation are kept quite simple to help readers and students grasp the basic concepts of the various deep learning paradigms and frameworks. It mainly focuses on theory rather than on the mathematical background of the deep learning concepts. The book consists of five chapters, beginning with an introductory explanation of big data and deep learning techniques, followed by the integration of big data and deep learning techniques, and ending with future directions.
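As a flavor of the kind of tooling the book introduces, here is a minimal PySpark sketch (my illustration, not from the book; the CSV file name is hypothetical) of a distributed aggregation over a large dataset:

```python
# Minimal sketch: load a (hypothetically large) CSV with Spark and run
# a group-by aggregation that executes in parallel across the cluster.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("big-data-demo").getOrCreate()
df = spark.read.csv("records.csv", header=True, inferSchema=True)  # hypothetical file
df.groupBy("label").count().show()  # counts computed per partition, then merged
spark.stop()
```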

Convergence of Deep Learning and Artificial Intelligence in Internet of Things

Author : Ajay Rana
Release : 2022-12-20
Genre : Computers

This book covers advances and applications of smart technologies, including the Internet of Things (IoT), artificial intelligence, and deep learning, in areas such as manufacturing, production, renewable energy, and healthcare. It also covers wearable and implantable biomedical devices for healthcare monitoring, smart surveillance, and monitoring applications such as the use of an autonomous drone for disaster management and rescue operations. It will serve as an ideal reference text for senior undergraduate and graduate students and academic researchers in areas such as electrical engineering, electronics and communications engineering, computer engineering, and information technology.

• Covers concepts, theories, and applications of artificial intelligence and deep learning from the perspective of the Internet of Things.
• Discusses predictive analysis, predictive maintenance, and automated processes for making manufacturing plants more efficient, profitable, and safe.
• Explores the importance of blockchain technology in addressing Internet of Things security issues.
• Discusses key deep learning concepts including trust management, identity management, security threats, access control, and privacy.
• Showcases the importance of intelligent algorithms for cloud-based Internet of Things applications.

This text emphasizes the importance of innovation and of improving the profitability of manufacturing plants using smart technologies such as artificial intelligence, deep learning, and the Internet of Things. It further discusses applications of smart technologies in diverse sectors such as agriculture, smart homes, production, manufacturing, transport, and healthcare.

Large Scale Optimization for Deep Learning

Author : Xiangru Lian
Release : 2019

"In the big data era, deep learning is often employed to solve all kinds of problems, from traditional classification to reinforcement learning. It often takes weeks or even months to train and tune the parameters of a deep neural network. Therefore, efficiency turns out to be a key bottleneck of deep learning. Parallel optimization has then emerged as an essential technology to solve computationally intensive problems. How to design efficient parallel systems and convergent algorithms becomes more and more important. In this dissertation we investigate how to improve the optimization for deep learning from the following aspects:

1. Asynchronous parallelism, for reducing the synchronization overhead in parallel computation.
2. Decentralized parallelism, to make parallel algorithms more feasible and robust to network topology, latency, and bandwidth.
3. Lossy compression in communication with error compensation, for reducing the communication cost without sacrificing the model's quality.
4. Compositional optimization, where the objective function is composed of multiple expectations of loss functions. Batch normalization can be formulated as a kind of compositional optimization.

We provide convergence analysis for all the algorithms we propose, and show when we should and should not use them" --Page vii.
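The third aspect, lossy compression with error compensation, can be illustrated with a minimal NumPy sketch (my illustration, not the dissertation's algorithm): the part of the gradient dropped by compression is remembered and added back on the next step, so communication cost falls without sacrificing the model's quality.

```python
# Minimal sketch: top-k gradient compression with error feedback.
import numpy as np

def compress_top_k(grad, k):
    """Keep the k largest-magnitude entries of grad; zero the rest."""
    out = np.zeros_like(grad)
    idx = np.argsort(np.abs(grad))[-k:]
    out[idx] = grad[idx]
    return out

rng = np.random.default_rng(0)
error = np.zeros(1000)                   # residual carried between iterations
for step in range(100):
    grad = rng.standard_normal(1000)     # stand-in for a stochastic gradient
    corrected = grad + error             # compensate with last step's residual
    sent = compress_top_k(corrected, k=10)  # what a worker would communicate
    error = corrected - sent             # remember what was dropped
    # ... `sent` would be aggregated and applied to the model here ...
```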

Sparsity Prior in Efficient Deep Learning Based Solvers and Models

Author : Xiaohan Chen
Release : 2022

Deep learning has been empirically successful in recent years thanks to extremely over-parameterized deep models and data-driven learning with enormous amounts of data. However, deep learning models are especially limited in terms of efficiency, which has a two-fold meaning. Firstly, many deep models are designed in a black-box manner, meaning they are unaware of prior knowledge about the structure of the problems of interest and hence cannot efficiently utilize it. Such unawareness can cause redundancy in parameterization and inferior performance compared to more dedicated methods. Secondly, the extreme over-parameterization itself is inefficient in terms of model storage, memory requirements, and computational complexity. This strictly constrains realistic applications of deep learning on mobile devices with limited resources. Moreover, the financial and environmental costs of training such enormous deep models are unreasonably high, which is exactly the opposite of the call for green AI. In this work, we strive to address the inefficiency of deep models by introducing sparsity as an important piece of prior knowledge for deep learning.

Our efforts fall into three sub-directions. In the first direction, we aim at accelerating the solving process for a specific type of optimization problem with sparsity constraints. Instead of designing black-box deep learning models, we derive new parameterizations by absorbing insights from the sparse optimization field, which result in compact deep-learning-based solvers with significantly reduced training costs but superior empirical performance. In the second direction, we introduce sparsity to deep neural networks via weight pruning. Pruning reduces redundancy in over-parameterized deep networks by removing superfluous weights, thus naturally compressing the model storage and computational costs. We aim at pushing pruning to the limit by combining it with other compression techniques to obtain extremely efficient deep models that can be deployed and fine-tuned on edge devices. In the third direction, we investigate what sparsity brings to deep networks. Creating sparsity in deep networks significantly changes the landscape of the loss function and thus the network's behavior during training. We aim at understanding what these changes are and how we can utilize them to train better sparse neural networks.

The main content of this work can be summarized as follows.

Sparsity Prior in Efficient Deep Solvers. We adopt the algorithm unrolling method to transform classic optimization algorithms into feed-forward deep neural networks that can accelerate convergence by over 100x. We also provide theoretical guarantees of linear convergence for the newly developed solvers, which is faster than the convergence rate achievable with classic optimization. Meanwhile, the number of parameters to be trained is reduced from millions to tens, and even to 3 hyperparameters, decreasing the training time from hours to 6 minutes.
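As a concrete (and much simplified) picture of algorithm unrolling, here is a minimal NumPy sketch, my illustration rather than the dissertation's solver: K iterations of ISTA for sparse coding written as a fixed-depth "network" whose step size and threshold are exactly the kind of parameters an unrolled solver would learn.

```python
# Minimal sketch: ISTA unrolled into K fixed "layers" for
# min_x 0.5*||Ax - y||^2 + lam*||x||_1 (theta plays the role of step*lam).
import numpy as np

def soft_threshold(x, theta):
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def unrolled_ista(A, y, K=16, step=0.1, theta=0.05):
    x = np.zeros(A.shape[1])
    for _ in range(K):   # each iteration is one layer of the unrolled net
        x = soft_threshold(x - step * A.T @ (A @ x - y), theta)
    return x

A = np.random.randn(20, 50)                       # under-determined system
x_true = (np.random.rand(50) < 0.1).astype(float) # sparse ground-truth signal
y = A @ x_true
print(np.count_nonzero(unrolled_ista(A, y)))      # recovered support size
```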
Sparsity Prior in Efficient Deep Learning. We investigate compressing deep networks by unifying pruning, quantization, and matrix factorization techniques to remove as much redundancy as possible, so that the resulting networks have low inference and/or training costs. The developed methods improve memory/storage efficiency and latency by at least 5x, varying over the data sets and models used.

Sparsity Prior in Sparse Neural Networks. We discuss the properties and behaviors of sparse deep networks with the tools of the lottery ticket hypothesis (LTH) and dynamic sparse training (DST), and explore their application for efficient training in computer vision, natural language processing, and Internet-of-Things (IoT) systems. With our developed sparse neural networks, performance loss is significantly mitigated while training far fewer parameters, bringing the benefits of saving computation costs in general and communication costs specifically for IoT systems.
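To make the pruning operation concrete, here is a minimal sketch (my illustration) of one-shot global magnitude pruning, the basic step behind the compression pipelines described above:

```python
# Minimal sketch: zero out the smallest-magnitude weights to reach a
# target sparsity; the returned mask can be kept fixed during fine-tuning.
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    threshold = np.quantile(np.abs(weights), sparsity)  # cut-off magnitude
    mask = np.abs(weights) > threshold
    return weights * mask, mask

w = np.random.randn(256, 256)    # stand-in for a trained layer's weights
w_sparse, mask = magnitude_prune(w)
print(1.0 - mask.mean())         # achieved sparsity, about 0.90
```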