
624 research outputs found

Model and Reinforcement Learning for Markov Games with Risk Preferences

Authors: Hai Pham Viet; Haskell William B.; Huang Wenjie
Publication date: 21/11/2019

We motivate and propose a new model for non-cooperative Markov games that considers the interactions of risk-aware players. This model characterizes the time-consistent dynamic "risk" from both stochastic state transitions (inherent to the game) and randomized mixed strategies (due to all other players). An appropriate risk-aware equilibrium concept is proposed, and the existence of such equilibria in stationary strategies is demonstrated by an application of Kakutani's fixed point theorem. We further propose a simulation-based Q-learning type algorithm for risk-aware equilibrium computation. This algorithm works with a special form of minimax risk measures, which can naturally be written as saddle-point stochastic optimization problems and which covers many widely investigated risk measures. Finally, the almost sure convergence of this simulation-based algorithm to an equilibrium is demonstrated under some mild conditions. Our numerical experiments on a two-player queuing game validate the properties of our model and algorithm, and demonstrate their worth and applicability in real-life competitive decision-making. Comment: 38 pages, 6 tables, 5 figures

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Stochastic L-BFGS: Improved Convergence Rates and Practical Acceleration Strategies

Authors: Haskell William B.; Tan Vincent Y. F.; Zhao Renbo
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 24/10/2017

We revisit the stochastic limited-memory BFGS (L-BFGS) algorithm. By proposing a new framework for the convergence analysis, we prove improved convergence rates and computational complexities of the stochastic L-BFGS algorithms compared to previous works. In addition, we propose several practical acceleration strategies to speed up the empirical performance of such algorithms. We also provide theoretical analyses for most of the strategies. Experiments on large-scale logistic and ridge regression problems demonstrate that our proposed strategies yield significant improvements vis-à-vis competing state-of-the-art algorithms.

arXiv.org e-Print Archive

Crossref

An Inexact Primal-Dual Smoothing Framework for Large-Scale Non-Bilinear Saddle Point Problems

Authors: Haskell William B.; Hien Le Thi Khanh; Zhao Renbo
Publication date: 09/07/2020

We develop an inexact primal-dual first-order smoothing framework to solve a class of non-bilinear saddle point problems with primal strong convexity. Compared with existing methods, our framework yields a significant improvement over the primal oracle complexity, while it has competitive dual oracle complexity. In addition, we consider the situation where the primal-dual coupling term has a large number of component functions. To efficiently handle this situation, we develop a randomized version of our smoothing framework, which allows the primal and dual sub-problems in each iteration to be solved by randomized algorithms inexactly in expectation. The convergence of this framework is analyzed both in expectation and with high probability. In terms of the primal and dual oracle complexities, this framework significantly improves over its deterministic counterpart. As an important application, we adapt both frameworks for solving convex optimization problems with many functional constraints. To obtain an $\varepsilon$-optimal and $\varepsilon$-feasible solution, both frameworks achieve the best-known oracle complexities (in terms of their dependence on $\varepsilon$).

arXiv.org e-Print Archive

Scaling device for photographic images

Authors: Cox Robert B.; Haskell William D.; Rivera Jorge E.; Stevenson Charles G.; Youngquist Robert C.
Publication date: 10/05/2005

A scaling device projects a known optical pattern into the field of view of a camera; the pattern can then be employed as a reference scale in the resulting photograph of, for example, a remote object. The device comprises an optical beam projector that projects two or more spaced, parallel optical beams onto a surface of a remotely located object to be photographed. The resulting beam spots or lines on the object are spaced from one another by a known, predetermined distance. As a result, the size of other objects or features in the photograph can be determined by comparing their size to the known distance between the beam spots. Preferably, the device is a small, battery-powered unit that can be attached to a camera and employs one or more laser light sources and associated optics to generate the parallel light beams. In a first embodiment of the invention, a single laser light source is employed, and multiple parallel beams are generated from it through the use of beam-splitting optics. In another embodiment, multiple individual laser light sources are mounted in the device parallel to one another to generate the multiple parallel beams.
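
The size comparison described in this abstract can be illustrated with a minimal sketch. It assumes the feature of interest lies in roughly the same plane as the beam spots and that the image scale is uniform across that plane; the function name, parameter names, and numbers are hypothetical and do not come from the record.

```python
def estimate_feature_size(feature_px: float,
                          spot_separation_px: float,
                          spot_separation_mm: float) -> float:
    """Estimate a feature's real size from pixel measurements in a photograph.

    Assumption (not from the source): the feature and the beam spots lie in
    the same plane, so a single millimetres-per-pixel scale applies to both.
    """
    if spot_separation_px <= 0:
        raise ValueError("spot separation in pixels must be positive")
    # Known beam spacing divided by its measured pixel spacing gives the scale.
    mm_per_pixel = spot_separation_mm / spot_separation_px
    return feature_px * mm_per_pixel


# Hypothetical measurements: spots 50 mm apart appear 100 px apart,
# and the feature of interest spans 300 px in the same photograph.
size_mm = estimate_feature_size(feature_px=300.0,
                                spot_separation_px=100.0,
                                spot_separation_mm=50.0)
print(f"Estimated feature size: {size_mm} mm")
```

If the photographed surface is tilted relative to the camera, a single scale factor no longer holds and this simple ratio would only be an approximation.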

NASA Technical Reports Server