Stochastic Optimization Methods for Policy Evaluation in Reinforcement Learning

Yi Zhou, Shaocong Ma

Format: Paperback
Publisher: now publishers Inc
Country: United States
Published: 15 August 2024
Pages: 60
ISBN: 9781638283706

This title is printed to order. It may have been self-published, in which case we cannot guarantee the quality of the content. Most books will have gone through an editing process, but some may not have, so please keep this in mind before ordering. If in doubt, check the author's or publisher's details, as we are unable to accept returns unless the book is faulty. Please contact us if you have any questions.

This monograph introduces value-based approaches to the policy evaluation problem in online reinforcement learning (RL). The goal of policy evaluation is to learn the value function associated with a specific policy under a single Markov decision process (MDP). The approaches differ depending on whether they are implemented in an on-policy or off-policy manner. In the on-policy setting, the policy is evaluated using data generated by that same policy, and classical techniques such as TD(0), TD(λ), and their extensions with function approximation or variance reduction are employed. In the off-policy setting, samples are collected under a different behavior policy, and this monograph introduces gradient-based two-timescale algorithms such as GTD2, TDC, and variance-reduced TDC. These algorithms minimize the mean-squared projected Bellman error (MSPBE) as their objective function. The monograph also discusses finite-sample convergence upper bounds and the sample complexity of these algorithms.
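As a rough illustration of the on-policy setting described above, the following is a minimal sketch of TD(0) with linear function approximation. It is not taken from the monograph: the five-state random walk, the one-hot features, the constant step size, and all names (`phi`, `step`, `td0`) are assumptions chosen to keep the example small.

```python
import random

N_STATES = 5   # non-terminal states 0..4; stepping off either end terminates
GAMMA = 1.0    # undiscounted episodic task (assumed)
ALPHA = 0.01   # constant step size (assumed, not tuned)


def phi(state):
    """One-hot features; tabular TD(0) is a special case of linear TD(0)."""
    return [1.0 if i == state else 0.0 for i in range(N_STATES)]


def value(theta, state):
    """Linear value estimate: inner product of weights and features."""
    return sum(t * f for t, f in zip(theta, phi(state)))


def step(state, rng):
    """Policy being evaluated: move left or right with equal probability."""
    nxt = state + rng.choice((-1, 1))
    if nxt < 0:
        return None, 0.0   # left exit, reward 0
    if nxt >= N_STATES:
        return None, 1.0   # right exit, reward 1
    return nxt, 0.0


def td0(num_episodes=20000, seed=0):
    rng = random.Random(seed)
    theta = [0.5] * N_STATES
    for _ in range(num_episodes):
        state = N_STATES // 2          # every episode starts in the middle
        while state is not None:
            nxt, reward = step(state, rng)
            v_next = 0.0 if nxt is None else value(theta, nxt)
            # TD error: one-step bootstrapped target minus current estimate
            delta = reward + GAMMA * v_next - value(theta, state)
            feats = phi(state)
            theta = [t + ALPHA * delta * f for t, f in zip(theta, feats)]
            state = nxt
    return theta


theta = td0()
# For this particular chain the exact values are 1/6, 2/6, ..., 5/6.
true_values = [(i + 1) / (N_STATES + 1) for i in range(N_STATES)]
for s, (est, true) in enumerate(zip(theta, true_values)):
    print(f"state {s}: estimate {est:.3f}, true {true:.3f}")
```

The data here are generated by the same policy whose value is being estimated, which is what makes this on-policy. The off-policy algorithms named above (GTD2, TDC) additionally maintain a second weight vector updated on a different timescale, which this sketch does not attempt to show.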

This item is not currently in stock. It can be ordered online and is expected to ship in 7-14 days.

Online: Out of stock
Carlton: Out of stock
- 03 9347 6633
- Woiwurrung Country, 309 Lygon St, Carlton, Victoria, 3053
Doncaster: Out of stock
Emporium: Out of stock
Hawthorn: Out of stock
- 03 9819 1917
- Woiwurrung Country, 687 Glenferrie Rd, Hawthorn, Victoria, 3122
Kids: Out of stock
- 03 9341 7730
- Woiwurrung Country, 315 Lygon St, Carlton 3053
Malvern: Out of stock
- 03 9509 1952
- Woiwurrung Country, 185 Glenferrie Rd, Malvern 3144
St Kilda: Out of stock
- 03 9525 3852
- Boonwurrung Country, 112 Acland St, St Kilda 3182
State Library: Out of stock

Our stock data is updated periodically, and availability may change throughout the day for in-demand items. Please call the relevant shop for the most current stock information. Prices are subject to change without notice.
