
    Open-ended Learning in Symmetric Zero-sum Games

    Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them 'winner' and 'loser'. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective: we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that PSRO_rN consistently outperforms the existing alternatives. Comment: ICML 2019, final version
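The distinction the abstract draws between transitive and nontransitive games can be illustrated with a small sketch. It is not code from the paper: the function names (`phi_rps`, `phi_transitive`, `has_dominant_agent`) and the integer "ratings" are hypothetical, chosen only to show a symmetric zero-sum game as an antisymmetric evaluation of agent pairs.

```python
# Hypothetical illustration: a symmetric zero-sum game viewed as an
# antisymmetric evaluation function phi(v, w) = -phi(w, v) on agent pairs.

RPS = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}


def phi_rps(v, w):
    """Return +1 if v beats w, -1 if w beats v, and 0 for a tie."""
    if v == w:
        return 0
    return 1 if BEATS[v] == w else -1


def phi_transitive(v, w):
    """A transitive game: agents are numeric ratings, the higher one wins."""
    return (v > w) - (v < w)


def payoff_matrix(agents, phi):
    """Evaluate every ordered pair of agents."""
    return [[phi(v, w) for w in agents] for v in agents]


def is_antisymmetric(matrix):
    """Check the zero-sum symmetry condition A[i][j] == -A[j][i]."""
    n = len(matrix)
    return all(matrix[i][j] == -matrix[j][i] for i in range(n) for j in range(n))


def has_dominant_agent(matrix):
    """True if some agent beats every other agent in the population."""
    n = len(matrix)
    return any(
        all(matrix[i][j] > 0 for j in range(n) if j != i) for i in range(n)
    )


rps_matrix = payoff_matrix(RPS, phi_rps)
rating_matrix = payoff_matrix([1, 2, 3], phi_transitive)

print(is_antisymmetric(rps_matrix), has_dominant_agent(rps_matrix))
print(is_antisymmetric(rating_matrix), has_dominant_agent(rating_matrix))
```

In the rating game one agent beats all others, so "stronger" is well defined; in rock-paper-scissors no such agent exists, which is the cyclic situation the abstract describes.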

    Arbitrarily primed PCR to type Vibrio spp. pathogenic for shrimp.

    A molecular typing study on Vibrio strains implicated in shrimp disease outbreaks in New Caledonia and Japan was conducted by using AP-PCR (arbitrarily primed PCR). It allowed rapid identification of isolates at the genospecies level and studies of infraspecific population structures of epidemiological interest. Clusters identified within the species Vibrio penaeicida were related to their area of origin, allowing discrimination between Japanese and New Caledonian isolates, as well as between those from two different bays in New Caledonia separated by only 50 km. Other subclusters of New Caledonian V. penaeicida isolates could be identified, but it was not possible to link those differences to accurate epidemiological features. This contribution of AP-PCR to the study of vibriosis in penaeid shrimps demonstrates its high discriminating power and the relevance of the epidemiological information provided. This approach would contribute to better knowledge of the ecology of Vibrio spp. and their implication in shrimp disease in aquaculture.

    Approximate dynamic programming for two-player zero-sum Markov games

    This paper provides an analysis of error propagation in Approximate Dynamic Programming applied to zero-sum two-player Stochastic Games. We provide a novel and unified error propagation analysis in $L_p$-norm of three well-known algorithms adapted to Stochastic Games (namely Approximate Value Iteration, Approximate Policy Iteration and Approximate Generalized Policy Iteration). We show that we can achieve a stationary policy which is $\frac{2\gamma\epsilon + \epsilon'}{(1-\gamma)^2}$-optimal, where $\epsilon$ is the value function approximation error and $\epsilon'$ is the approximate greedy operator error. In addition, we provide a practical algorithm (AGPI-Q) to solve infinite horizon γ-discounted two-player zero-sum Stochastic Games in a batch setting. It is an extension of the Fitted-Q algorithm (which solves Markov Decision Processes from data) and can be non-parametric. Finally, we demonstrate experimentally the performance of AGPI-Q on a simultaneous two-player game, namely Alesia.
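To get a feel for how such a bound scales with the discount factor, here is a minimal sketch. It assumes the bound in the abstract has the form (2γε + ε')/(1−γ)², with ε the value function approximation error and ε' the approximate greedy operator error; the function name `suboptimality_bound` and the sample numbers are hypothetical and not taken from the paper.

```python
def suboptimality_bound(gamma, eps_value, eps_greedy):
    """Evaluate (2*gamma*eps_value + eps_greedy) / (1 - gamma)**2.

    Assumed form of the bound; gamma is the discount factor in [0, 1).
    """
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    return (2.0 * gamma * eps_value + eps_greedy) / (1.0 - gamma) ** 2


# Same per-iteration errors, increasingly long effective horizons.
for gamma in (0.5, 0.9, 0.99):
    print(gamma, suboptimality_bound(gamma, eps_value=0.1, eps_greedy=0.1))
```

Under this assumed form, the same per-iteration errors are amplified by the 1/(1−γ)² factor as γ approaches 1, which is why error control matters more in long-horizon problems.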

    A multi-agent reinforcement learning model of common-pool resource appropriation

    Humanity faces numerous problems of common-pool resource appropriation. This class of multi-agent social dilemma includes the problems of ensuring sustainable use of fresh water, common fisheries, grazing pastures, and irrigation systems. Abstract models of common-pool resource appropriation based on non-cooperative game theory predict that self-interested agents will generally fail to find socially positive equilibria, a phenomenon called the tragedy of the commons. However, in reality, human societies are sometimes able to discover and implement stable cooperative solutions. Decades of behavioral game theory research have sought to uncover aspects of human behavior that make this possible. Most of that work was based on laboratory experiments where participants only make a single choice: how much to appropriate. Recognizing the importance of spatial and temporal resource dynamics, a recent trend has been toward experiments in more complex real-time video game-like environments. However, standard methods of non-cooperative game theory can no longer be used to generate predictions for this case. Here we show that deep reinforcement learning can be used instead. To that end, we study the emergent behavior of groups of independently learning agents in a partially observed Markov game modeling common-pool resource appropriation. Our experiments highlight the importance of trial-and-error learning in common-pool resource appropriation and shed light on the relationship between exclusion, sustainability, and inequality.

    Pulmonary haemorrhage as a predominant cause of death in leptospirosis in Seychelles

    We examined the cause of death during a 12-month period (1995/96) in all consecutive patients admitted to hospital with leptospiral infection in Seychelles (Indian Ocean), where the disease is endemic. Leptospirosis was diagnosed by use of the microscopic agglutination test and a specific polymerase chain reaction assay on serum samples. Seventy-five cases were diagnosed and 6 patients died, a case fatality of 8%. All 6 patients died within 9 days of onset of symptoms; 5 of them died within 2 days of admission and the 6th within 5 days. On autopsy, diffuse bilateral pulmonary haemorrhage (PH) was found in all fatalities. Renal, cardiac, digestive and cerebral haemorrhages were also found in 5, 3, 3 and 1 case(s), respectively. Incidentally, haemoptysis and lung infiltrate on chest radiographs, which suggest PH, were found in 8 of the 69 non-fatal cases. Dengue and hantavirus infections were ruled out. In conclusion, PH appeared to be a main cause of death in leptospirosis in this population, although haemorrhage in other organs may also have contributed to fatal outcomes. This cause of death contrasts with the findings generally reported in endemic settings.

    Navigating the Landscape of Multiplayer Games

    Multiplayer games have long been used as testbeds in artificial intelligence research, aptly referred to as the Drosophila of artificial intelligence. Traditionally, researchers have focused on using well-known games to build strong agents. This progress, however, can be better informed by characterizing games and their topological landscape. Tackling this latter question can facilitate understanding of agents and help determine what game an agent should target next as part of its training. Here, we show how network measures applied to response graphs of large-scale games enable the creation of a landscape of games, quantifying relationships between games of varying sizes and characteristics. We illustrate our findings in domains ranging from canonical games to complex empirical games capturing the performance of trained agents pitted against one another. Our results culminate in a demonstration leveraging this information to generate new and interesting games, including mixtures of empirical games synthesized from real world games

    A Generalised Method for Empirical Game Theoretic Analysis

    This paper provides theoretical bounds for empirical game theoretical analysis of complex multi-agent interactions. We provide insights into the empirical meta-game, showing that a Nash equilibrium of the meta-game is an approximate Nash equilibrium of the true underlying game. We investigate and show how many data samples are required to obtain a close enough approximation of the underlying game. Additionally, we extend the meta-game analysis methodology to asymmetric games. The state-of-the-art has only considered empirical games in which agents have access to the same strategy sets and the payoff structure is symmetric, implying that agents are interchangeable. Finally, we carry out an empirical illustration of the generalised method in several domains, illustrating the theory and evolutionary dynamics of several versions of the AlphaGo algorithm (symmetric), the dynamics of the Colonel Blotto game played by human players on Facebook (symmetric), and an example of a meta-game in Leduc Poker (asymmetric), generated by the PSRO multi-agent learning algorithm. Comment: will appear at AAMAS'1