Approximate Dynamic Programming for Weakly Coupled Markov Decision Processes with Perfect and Imperfect Information


Author : Mahshid Salemi Parizi
Release : 2018

A broad range of optimization problems in applications such as healthcare operations, revenue management, telecommunications, high-performance computing, logistics and transportation, business analytics, and defense have the following form. Heterogeneous service requests arrive dynamically and stochastically over slotted time. A request may require multiple resources to complete. The decision-maker may collect a reward on successfully completing a service request, and may also incur costs for rejecting requests or for delaying service. The decision-maker's goal is to dynamically allocate limited resources to the various service requests so as to optimize a performance metric. Despite the prevalence of these problems, the majority of existing research focuses on stylized models. While such stylized models are often insightful, several experts have noted in recent literature reviews that their applicability is limited in practice. More realistic models, on the other hand, are computationally difficult to solve owing to the curse of dimensionality.

The research objective of this dissertation is to build Markov decision process (MDP) models of four classes of dynamic resource allocation problems under uncertainty, and then to develop algorithms for their approximate solution. Most MDP models in the dissertation possess the so-called weakly coupled structure: the MDP is composed of several sub-MDPs; the reward is additively separable and the transition probabilities are multiplicatively separable over these sub-MDPs; and the sub-MDPs are joined only via linking constraints on the actions they choose. The dissertation proposes mathematical programming-based and simulation-based approximate dynamic programming methods for their solution, and compares these methods against one another and against heuristic resource allocation policies. An outline of the dissertation follows.

Chapter 1 investigates a class of scheduling problems where dynamically and stochastically arriving appointment requests are either rejected or booked into future slots. A customer may cancel an appointment, and a customer who does not cancel may fail to show up; the planner may overbook appointments to mitigate the detrimental effects of cancellations and no-shows. Each customer needs multiple renewable resources. The system receives a reward for providing service and incurs costs for rejecting requests, for appointment delays, and for overtime. Customers are heterogeneous in all problem parameters. The chapter provides a weakly coupled MDP formulation of these problems, whose exact solution is intractable. An approximate dynamic programming method rooted in Lagrangian relaxation, affine value function approximation, and constraint generation is applied to this weakly coupled MDP. The method is compared with a myopic scheduling heuristic on 1,800 problem instances; the difference in performance was statistically significant in 77% of the instances, and in 97% of those the Lagrangian method outperformed the myopic method. A minimal sketch of the Lagrangian decomposition appears below.
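To make the weakly coupled structure and the Lagrangian bound concrete, here is a minimal Python sketch. It relaxes the single linking constraint with a multiplier, solves each sub-MDP independently by value iteration on the penalized rewards, and grid-searches the multiplier for the tightest upper bound. All problem data, the single linking resource, and the grid search are illustrative assumptions, not details taken from the dissertation.

```python
# A minimal sketch of Lagrangian relaxation for a weakly coupled MDP.
# All problem data below are illustrative, not from the dissertation.
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.95          # discount factor
N, S, A = 3, 4, 2     # sub-MDPs, states and actions per sub-MDP
b = 1.0               # per-period budget in the linking constraint

# Random sub-MDP data: rewards r[i][s,a], transitions P[i][s,a,s'],
# and resource consumption c[i][a] in the constraint sum_i c_i(a_i) <= b.
r = [rng.uniform(0, 1, (S, A)) for _ in range(N)]
P = [rng.dirichlet(np.ones(S), (S, A)) for _ in range(N)]
c = [np.array([0.0, 1.0]) for _ in range(N)]  # action 1 consumes one unit

def solve_sub_mdp(ri, Pi, tol=1e-8):
    """Value iteration on one sub-MDP; returns its optimal value function."""
    V = np.zeros(S)
    while True:
        Q = ri + gamma * Pi @ V          # Q[s, a]
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < tol:
            return V_new
        V = V_new

def lagrangian_bound(lam, s0=0):
    """Upper bound on the coupled optimal value for a multiplier lam >= 0:
    relax the linking constraint, penalize resource use in each sub-MDP,
    and add back lam * b once per period (discounted)."""
    total = lam * b / (1 - gamma)
    for i in range(N):
        penalized = r[i] - lam * c[i][None, :]   # r_i(s,a) - lam * c_i(a)
        total += solve_sub_mdp(penalized, P[i])[s0]
    return total

# The bound is convex in lam; a coarse grid search stands in for the
# subgradient and linear programming methods used in practice.
lams = np.linspace(0.0, 2.0, 41)
best = min(lams, key=lagrangian_bound)
print(f"best multiplier {best:.2f}, bound {lagrangian_bound(best):.3f}")
```

Note how the decomposition exploits the weakly coupled structure directly: once the linking constraint is priced out, each sub-MDP is solved on its own small state space, so the work grows linearly in the number of sub-MDPs rather than exponentially.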
Chapter 2 focuses on a class of non-preemptive scheduling problems in which a decision-maker stochastically and dynamically receives requests to work on heterogeneous projects over discrete time. The projects comprise precedence-constrained tasks that require multiple resources with limited availabilities. Incomplete projects are held in virtual queues with finite capacities; when a queue is full, an arriving project must be rejected. Projects differ in their stochastic arrival patterns; completion rewards; rejection, waiting, and operating costs; activity-on-node networks and task durations; queue capacities; and resource requirements. The decision-maker's goal is to choose which tasks to start in each time slot so as to maximize the infinite-horizon discounted expected profit. The chapter provides a weakly coupled MDP formulation of such dynamic resource-constrained project scheduling problems (DRCPSPs). Mathematical programming-based approximate dynamic programming techniques similar to those in Chapter 1 are computationally tedious for DRCPSPs owing to their exceedingly large scale and complex combinatorial structure. The chapter therefore applies a simulation-based policy iteration method that uses least-squares fitting to tune the parameters of a value function approximation (a generic sketch of this idea follows below). Its performance is numerically compared against a myopic scheduling heuristic on 480 randomly generated problem instances; the difference between the two methods was statistically significant in about 60% of the instances, and the approximate policy iteration method outperformed the myopic heuristic in 74% of those.
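The following sketch illustrates the general shape of simulation-based approximate policy iteration with least-squares fitting: Monte Carlo returns under the current policy are regressed on state features, and the policy is improved greedily against the fitted value function. The toy MDP, the polynomial features, and the rollout counts are all assumptions made for illustration; the dissertation's method for DRCPSPs is necessarily far more elaborate.

```python
# A minimal sketch of simulation-based approximate policy iteration with
# least-squares value-function fitting; toy MDP, illustrative only.
import numpy as np

rng = np.random.default_rng(1)
gamma, S, A = 0.9, 20, 3
r = rng.uniform(0, 1, (S, A))                 # rewards r[s, a]
P = rng.dirichlet(np.ones(S), (S, A))         # transitions P[s, a, s']

def phi(s):
    """Features of a state; here a simple polynomial basis."""
    x = s / (S - 1)
    return np.array([1.0, x, x * x])

def rollout(policy, s, horizon=60):
    """Simulate one trajectory and return its discounted return."""
    G, disc = 0.0, 1.0
    for _ in range(horizon):
        a = policy[s]
        G += disc * r[s, a]
        disc *= gamma
        s = rng.choice(S, p=P[s, a])
    return G

policy = rng.integers(0, A, S)                # arbitrary initial policy
for _ in range(10):                           # approximate policy iteration
    # Policy evaluation: regress Monte Carlo returns on state features.
    X = np.array([phi(s) for s in range(S) for _ in range(30)])
    y = np.array([rollout(policy, s) for s in range(S) for _ in range(30)])
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    V = np.array([phi(s) @ theta for s in range(S)])
    # Policy improvement: greedy one-step lookahead against the fitted V.
    policy = (r + gamma * P @ V).argmax(axis=1)

print("greedy policy:", policy)
```

The appeal for large problems is that the regression compresses the value function into a handful of parameters, so evaluation cost is driven by the number of simulated trajectories rather than by the size of the state space.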
In Chapters 1 and 2, the decision-maker is assumed to know all parameters that describe the weakly coupled MDPs. Chapter 3 investigates an extension where the decision-maker has only imperfect information about the weakly coupled MDP. Rather than focusing on weakly coupled MDPs arising in specific applications as in Chapters 1 and 2, Chapter 3 works with general weakly coupled MDPs and studies two imperfect-information scenarios. In the first, the transition probabilities of each subproblem are parameterized and the decision-maker does not know the values of these parameters. The decision-maker begins with prior probabilistic beliefs about the parameters and updates these beliefs using Bayes' theorem as the state evolution is observed. This yields a Bayes-adaptive weakly coupled MDP formulation whose exact solution is intractable. Computationally tractable approximate dynamic programming methods that combine semi-stochastic certainty equivalent control or Thompson sampling with Lagrangian relaxation are proposed, applied to a class of dynamic stochastic resource allocation problems, and evaluated numerically. In the second scenario, the decision-maker cannot observe the actual state of the system and instead receives only a noisy signal about it, so the actual state must be inferred probabilistically. This yields a partially observable weakly coupled MDP formulation whose exact solution is also intractable. Computationally tractable approximate dynamic programming methods rooted in semi-stochastic certainty equivalent control and Thompson sampling are again proposed, applied to a restless multi-armed bandit problem, and evaluated numerically.

Chapter 4 investigates a class of sequential auction design problems under imperfect information. Here the resource is the seller's inventory on hand, which is to be allocated to dynamically and stochastically arriving buyers' bids. In particular, the seller must decide lot sizes in a sequential, multi-unit auction setting where bidder demand and bid distributions are not known in their entirety. The chapter formulates a Bayes-adaptive MDP to study a profit maximization problem in this scenario: the number of bidders is Poisson distributed with a Gamma prior on its mean, and the bid distribution is categorical with a Dirichlet prior. The seller updates these beliefs using data collected over auctions while simultaneously making lot-sizing decisions until all inventory is depleted. Exact solution of this Bayes-adaptive MDP is intractable, so the chapter proposes three approximation methods (semi-stochastic certainty equivalent control, knowledge gradient, and Thompson sampling) and compares them via numerical experiments. The belief updates and a Thompson-sampling step are sketched below.
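The conjugate Bayesian updates named above (Gamma-Poisson for the bidder count, Dirichlet-categorical for the bids) and one Thompson-sampling lot-sizing step can be sketched as follows. Only the prior and posterior families come from the abstract; the bid levels, the crude expected-revenue model, and the `thompson_lot_size` routine are illustrative assumptions.

```python
# A minimal sketch of the Bayesian belief updates and a Thompson-sampling
# lot-sizing step for the sequential auction setting. The revenue model
# below is a deliberately crude stand-in, illustrative only.
import numpy as np

rng = np.random.default_rng(2)
bid_levels = np.array([10.0, 20.0, 30.0])  # possible bid values (assumed)

# Conjugate priors: Gamma(a, b) on the Poisson mean number of bidders,
# Dirichlet(alpha) on the categorical bid distribution.
a, b = 2.0, 1.0
alpha = np.ones(len(bid_levels))

def update_beliefs(n_bidders, bids):
    """Conjugate posterior updates after observing one auction."""
    global a, b, alpha
    a += n_bidders                      # Gamma-Poisson: add observed count
    b += 1.0                            # ... and one more observation period
    for bid in bids:                    # Dirichlet-categorical: add counts
        alpha[np.searchsorted(bid_levels, bid)] += 1

def expected_revenue(lot_size, lam, p):
    """Crude plug-in model: roughly lam bidders each buy one unit at the
    mean bid, capped by the lot size. A real model would apply auction
    clearing rules; this is only a placeholder."""
    return min(lot_size, lam) * (p @ bid_levels)

def thompson_lot_size(inventory):
    """Sample one world from the posterior and act optimally in it."""
    lam = rng.gamma(a, 1.0 / b)         # sampled Poisson mean
    p = rng.dirichlet(alpha)            # sampled bid distribution
    return max(range(1, inventory + 1),
               key=lambda q: expected_revenue(q, lam, p))

# One round: choose a lot size, observe the auction, update beliefs.
q = thompson_lot_size(inventory=10)
n = rng.poisson(3.0)                    # "true" bidder count, unknown to us
observed_bids = rng.choice(bid_levels, size=n, p=[0.5, 0.3, 0.2])
update_beliefs(n, observed_bids)
print(f"lot size {q}, observed {n} bidders, posterior Gamma({a:.0f},{b:.0f})")
```

Thompson sampling handles the exploration-exploitation trade-off implicitly: early on the posterior is diffuse, so sampled worlds (and hence lot sizes) vary widely; as auction data accumulate, the posterior concentrates and the decisions settle down.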

Constrained Markov Decision Processes


Author : Eitan Altman
Release : 1999-03-30
Genre : Mathematics

This book provides a unified approach to the study of constrained Markov decision processes with a finite state space and unbounded costs. Unlike the single-objective case considered in many other books, the author considers a single controller with several objectives, such as minimizing delays and loss probabilities while maximizing throughput. It is then natural to design a controller that minimizes one cost objective subject to inequality constraints on the other cost objectives. This framework describes dynamic decision problems that arise frequently in many engineering fields; a thorough overview of these applications is presented in the introduction.

The book is divided into three parts that build on each other. The first part develops the theory for finite state spaces: the author characterizes the set of achievable expected occupation measures and performance vectors, and identifies simple classes of policies among which optimal policies exist. This allows the reduction of the original dynamic problem to a linear program; a Lagrangian approach is then used to derive the dual linear program via dynamic programming techniques. In the second part, these results are extended to infinite state and action spaces under two frameworks: costs bounded below, and the contracting framework. The third part builds on the first two and examines asymptotic results on the convergence of both values and policies in the time horizon and in the discount factor. Finally, several state truncation algorithms are given that approximate the solution of the original control problem via finite linear programs.
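The reduction to a linear program can be made concrete in the discounted finite case: optimize over normalized occupation measures rho(s, a) subject to flow-balance constraints, with the second cost entering as one extra inequality. The toy data, the discounted setting, and the use of scipy's linprog are assumptions made for this sketch; the book itself treats far more general settings, including unbounded costs and infinite state spaces.

```python
# A minimal discounted-cost instance of the occupation-measure linear
# program described above; toy data, scipy's linprog as the solver.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
gamma, S, A = 0.9, 5, 2
r = rng.uniform(0, 1, (S, A))              # reward to maximize
cost = rng.uniform(0, 1, (S, A))           # constrained second cost
d = 0.5                                    # expected-cost budget (assumed)
P = rng.dirichlet(np.ones(S), (S, A))      # transitions P[s, a, s']
mu0 = np.full(S, 1.0 / S)                  # initial state distribution

# Variables: rho[s, a] >= 0, flattened to length S*A. Flow constraints:
# sum_a rho(s',a) - gamma * sum_{s,a} P(s'|s,a) rho(s,a) = (1-gamma)*mu0(s').
# With this scaling rho sums to one, so rho-linear functionals are
# normalized discounted expectations.
A_eq = np.zeros((S, S * A))
for s in range(S):
    for a in range(A):
        col = s * A + a
        A_eq[s, col] += 1.0                # outflow of state s
        A_eq[:, col] -= gamma * P[s, a]    # discounted inflow to each s'
b_eq = (1 - gamma) * mu0

res = linprog(
    c=-r.ravel(),                          # linprog minimizes, so negate
    A_ub=cost.ravel()[None, :], b_ub=[d],  # expected-cost constraint
    A_eq=A_eq, b_eq=b_eq,
    bounds=(0, None),
)
rho = res.x.reshape(S, A)
# An optimal (possibly randomized) stationary policy follows from rho.
pi = rho / np.maximum(rho.sum(axis=1, keepdims=True), 1e-12)
print("optimal value:", -res.fun, "\npolicy:\n", np.round(pi, 3))
```

The extra inequality row is exactly where the constrained formulation departs from the unconstrained one, and it is why optimal policies here may need to randomize between actions in some states.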

Markov Decision Processes and Approximate Dynamic Programming Methods for Optimal Treatment Design


Author : Jennifer Elizabeth Mason
Release : 2012


Simulation-based Algorithms for Markov Decision Processes


Author : Hyeong Soo Chang
Release : 2007-05-01
Genre : Business & Economics

Markov decision process (MDP) models are widely used for modeling sequential decision-making problems that arise in engineering, economics, computer science, and the social sciences. This book brings the state-of-the-art research on simulation-based algorithms together for the first time, providing practical modeling and solution methods for many real-world problems whose high dimensionality or complexity has hitherto put them beyond the reach of exact MDP methods.

Markov Decision Processes with Their Applications


Author : Qiying Hu
Release : 2007-09-14
Genre : Business & Economics

Written by two leading researchers in the Far East, this text examines Markov decision processes (also called stochastic dynamic programming) and their applications in the optimal control of discrete event systems, optimal replacement, and optimal allocation in sequential online auctions.