Search | arXiv e-print repository

Showing 1–9 of 9 results for author: Timbers, F

Searching in archive cs.
  1. arXiv:2206.15378  [pdf, other]

    cs.AI cs.GT cs.MA

    Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning

    Authors: Julien Perolat, Bart de Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, et al. (9 additional authors not shown)

    Abstract: We introduce DeepNash, an autonomous agent capable of learning to play the imperfect information game Stratego from scratch, up to a human expert level. Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. This popular game has an enormous game tree on the order of $10^{535}$ nodes, i.e., $10^{175}$ times larger than that of Go. It has the additiona…

    Submitted 30 June, 2022; originally announced June 2022.

  2. arXiv:2205.06760  [pdf, other]

    cs.AI cs.LG cs.MA

    Emergent Bartering Behaviour in Multi-Agent Reinforcement Learning

    Authors: Michael Bradley Johanson, Edward Hughes, Finbarr Timbers, Joel Z. Leibo

    Abstract: Advances in artificial intelligence often stem from the development of new environments that abstract real-world situations into a form where research can be done conveniently. This paper contributes such an environment based on ideas inspired by elementary Microeconomics. Agents learn to produce resources in a spatially complex world, trade them with one another, and consume those that they prefe…

    Submitted 13 May, 2022; originally announced May 2022.

  3. Reward-Respecting Subtasks for Model-Based Reinforcement Learning

    Authors: Richard S. Sutton, Marlos C. Machado, G. Zacharias Holland, David Szepesvari, Finbarr Timbers, Brian Tanner, Adam White

    Abstract: To achieve the ambitious goals of artificial intelligence, reinforcement learning must include planning with a model of the world that is abstract in state and time. Deep learning has made progress with state abstraction, but temporal abstraction has rarely been used, despite extensively developed theory based on the options framework. One reason for this is that the space of possible options is i…

    Submitted 16 September, 2023; v1 submitted 7 February, 2022; originally announced February 2022.

    Journal ref: Artificial Intelligence, first published online September 6, 2023

  4. arXiv:2112.03178  [pdf, other]

    cs.AI cs.GT cs.LG

    Student of Games: A unified learning algorithm for both perfect and imperfect information games

    Authors: Martin Schmid, Matej Moravcik, Neil Burch, Rudolf Kadlec, Josh Davidson, Kevin Waugh, Nolan Bard, Finbarr Timbers, Marc Lanctot, G. Zacharias Holland, Elnaz Davoodi, Alden Christianson, Michael Bowling

    Abstract: Games have a long history as benchmarks for progress in artificial intelligence. Approaches using search and learning produced strong performance across many perfect information games, and approaches using game-theoretic reasoning and learning demonstrated strong performance for specific imperfect information poker variants. We introduce Student of Games, a general-purpose algorithm that unifies p…

    Submitted 15 November, 2023; v1 submitted 6 December, 2021; originally announced December 2021.

    Comments: Published in Science Advances

    Journal ref: Science Advances 9, eadg3256 (2023)

  5. arXiv:2101.04237  [pdf, other]

    cs.AI cs.LG

    Solving Common-Payoff Games with Approximate Policy Iteration

    Authors: Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

    Abstract: For artificially intelligent learning systems to have widespread applicability in real-world settings, it is important that they be able to operate decentrally. Unfortunately, decentralized control is difficult -- computing even an epsilon-optimal joint policy is a NEXP-complete problem. Nevertheless, a recently rediscovered insight -- that a team of agents can coordinate via common knowledge -- h…

    Submitted 11 January, 2021; originally announced January 2021.

    Comments: AAAI 2021

  6. arXiv:2008.12234  [pdf, other]

    cs.AI cs.LG

    The Advantage Regret-Matching Actor-Critic

    Authors: Audrūnas Gruslys, Marc Lanctot, Rémi Munos, Finbarr Timbers, Martin Schmid, Julien Perolat, Dustin Morrill, Vinicius Zambaldi, Jean-Baptiste Lespiau, John Schultz, Mohammad Gheshlaghi Azar, Michael Bowling, Karl Tuyls

    Abstract: Regret minimization has played a key role in online learning, equilibrium computation in games, and reinforcement learning (RL). In this paper, we describe a general model-free RL method for no-regret learning based on repeated reconsideration of past behavior. We propose a model-free RL algorithm, the Advantage Regret-Matching Actor-Critic (ARMAC): rather than saving past state-action data, ARMAC…

    Submitted 27 August, 2020; originally announced August 2020.

  7. arXiv:2004.09677  [pdf, other]

    cs.LG stat.ML

    Approximate exploitability: Learning a best response in large games

    Authors: Finbarr Timbers, Nolan Bard, Edward Lockhart, Marc Lanctot, Martin Schmid, Neil Burch, Julian Schrittwieser, Thomas Hubert, Michael Bowling

    Abstract: Researchers have demonstrated that neural networks are vulnerable to adversarial examples and subtle environment changes, both of which one can view as a form of distribution shift. To humans, the resulting errors can look like blunders, eroding trust in these agents. In prior games research, agent evaluation often focused on the in-practice game outcomes. While valuable, such evaluation typically…

    Submitted 3 November, 2022; v1 submitted 20 April, 2020; originally announced April 2020.

  8. arXiv:1908.09453  [pdf, other]

    cs.LG cs.AI cs.GT cs.MA

    OpenSpiel: A Framework for Reinforcement Learning in Games

    Authors: Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinicius Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, et al. (2 additional authors not shown)

    Abstract: OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single- and multi-agent) zero-sum, cooperative and general-sum, one-shot and sequential, strictly turn-taking and simultaneous-move, perfect and imperfect information games, as well as traditional multiagent environments such as (partia…

    Submitted 26 September, 2020; v1 submitted 25 August, 2019; originally announced August 2019.

  9. arXiv:1903.05614  [pdf, other]

    cs.AI cs.GT cs.LG

    Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent

    Authors: Edward Lockhart, Marc Lanctot, Julien Pérolat, Jean-Baptiste Lespiau, Dustin Morrill, Finbarr Timbers, Karl Tuyls

    Abstract: In this paper, we present exploitability descent, a new algorithm to compute approximate equilibria in two-player zero-sum extensive-form games with imperfect information, by direct policy optimization against worst-case opponents. We prove that when following this optimization, the exploitability of a player's strategy converges asymptotically to zero, and hence when both players employ this opti…

    Submitted 12 June, 2020; v1 submitted 13 March, 2019; originally announced March 2019.

    Comments: IJCAI 2019, 11 pages, 1 figure