55 research outputs found

    Learning Classical Planning Strategies with Policy Gradient

    A common paradigm in classical planning is heuristic forward search. Forward search planners often rely on simple best-first search, which remains fixed throughout the search process. In this paper, we introduce a novel search framework capable of alternating between several forward search approaches while solving a particular planning problem. The approach is selected by a trainable stochastic policy that maps the state of the search to a probability distribution over the approaches. This enables using policy gradient to learn search strategies tailored to specific distributions of planning problems and a selected performance metric, e.g. the IPC score. We instantiate the framework by constructing a policy space consisting of five search approaches and a two-dimensional representation of the planner's state. Then, we train the system on randomly generated problems from five IPC domains using three different performance metrics. Our experimental results show that the learner is able to discover domain-specific search strategies, improving the planner's performance relative to the baselines of plain best-first search and a uniform policy. Comment: Accepted for ICAPS 201
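
    The idea of a stochastic policy over search approaches can be illustrated with a minimal sketch. The abstract does not give the paper's policy parameterisation, features, or reward, so everything below is an assumption: a linear-softmax policy over five approaches, a two-dimensional state, and a plain REINFORCE update. All function names are hypothetical.

```python
import math
import random

NUM_APPROACHES = 5  # the abstract mentions five search approaches
STATE_DIM = 2       # and a two-dimensional representation of the planner's state


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy(weights, state):
    """Map a search-state vector to a distribution over approaches.

    Assumption: one linear score per approach, followed by a softmax.
    """
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def sample_action(probs, rng):
    """Draw the index of the approach to run next."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_update(weights, episode, episode_return, lr=0.1):
    """One REINFORCE step: w += lr * R * sum of grad log pi(a|s).

    `episode` is a list of (state, action) pairs; `episode_return` stands in
    for whatever performance metric is being optimised (hypothetical here).
    """
    grad = [[0.0] * len(row) for row in weights]
    for state, action in episode:
        probs = policy(weights, state)
        for a in range(len(weights)):
            # Gradient of log softmax w.r.t. logit a: 1[a == action] - probs[a]
            coeff = (1.0 if a == action else 0.0) - probs[a]
            for i, s in enumerate(state):
                grad[a][i] += coeff * s
    return [
        [w + lr * episode_return * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, grad)
    ]
```

    With all-zero weights the policy is uniform, which corresponds to the uniform-policy baseline named in the abstract; a positive return for an episode shifts probability mass toward the approaches chosen in that episode.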

    Towards learning domain-independent planning heuristics

    Automated planning remains one of the most general paradigms in Artificial Intelligence, providing means of solving problems from a wide variety of domains. One of the key factors restricting the applicability of planning is its computational complexity, which results from exponentially large search spaces. Heuristic approaches are necessary to solve all but the simplest problems. In this work, we explore the possibility of obtaining domain-independent heuristic functions using machine learning. This is part of a wider research program whose objective is to improve the practical applicability of planning in systems for which the planning domains evolve at run time. The challenge is therefore the learning of (corrections of) domain-independent heuristics that can be reused across different planning domains. Comment: Accepted for the IJCAI-17 Workshop on Architectures for Generality and Autonom

    An Inductive Approach for Modal Transition System Refinement

    Modal Transition Systems (MTSs) provide an appropriate framework for modelling software behaviour when only a partial specification is available. A key characteristic of an MTS is that it explicitly models events that a system is required to provide, events that it is proscribed from exhibiting, and events for which no specification is available, called maybe events. Incremental elaboration of maybe events into either required or proscribed events can be seen as a process of MTS refinement, resulting from extending a given partial specification with more information about the system behaviour. This paper focuses on providing automated support for computing strong refinements of an MTS with respect to event traces that describe required and proscribed behaviours, using a non-monotonic inductive logic programming technique. A real case study is used to illustrate the practical application of the approach.

    ω-regular Expression Synthesis from Transition-Based Büchi Automata

    A popular method for modelling reactive systems is to use ω-regular languages. These languages can be represented as nondeterministic Büchi automata (NBAs) or ω-regular expressions. Existing methods synthesise expressions from state-based NBAs. Synthesis from transition-based NBAs is traditionally done by transforming transition-based NBAs into state-based NBAs. This transformation, however, can increase the complexity of the synthesised expressions. This paper proposes a novel method for directly synthesising ω-regular expressions from transition-based NBAs. We prove that the method is sound and complete. Our empirical results show that the ω-regular expressions synthesised from transition-based NBAs are more compact than those synthesised from state-based NBAs. This is particularly the case for NBAs computed from obligation, reactivity, safety and recurrence-type LTL formulas, reporting in the latter case an average reduction of over 50%. We also show that our method successfully synthesises ω-regular expressions from more LTL formulas when using a transition-based instead of a state-based NBA.
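
    For readers unfamiliar with the notation, a standard textbook example (not taken from the paper) may help. Over the alphabet {a, b}, the infinite words containing infinitely many occurrences of a form an ω-regular language, and every ω-regular language can be written as a finite union of terms of the form U·V^ω with U and V regular and V not containing the empty word:

```latex
L_{\infty a} = (b^{*} a)^{\omega},
\qquad
L = \bigcup_{i=1}^{n} U_i \cdot V_i^{\omega}
\quad \text{with } U_i, V_i \text{ regular and } \varepsilon \notin V_i .
```

    The first language corresponds to the recurrence-type LTL formula "always eventually a".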

    Combining Experts’ Causal Judgments

    Consider a policymaker who wants to decide which intervention to perform in order to change a currently undesirable situation. The policymaker has at her disposal a team of experts, each with their own understanding of the causal dependencies between different factors contributing to the outcome. The policymaker has varying degrees of confidence in the experts’ opinions. She wants to combine their opinions in order to decide on the most effective intervention. We formally define the notion of an effective intervention, and then consider how experts’ causal judgments can be combined in order to determine the most effective intervention. We define a notion of two causal models being compatible, and show how compatible causal models can be merged. We then use this merging as the basis for combining experts’ causal judgments. We also provide a definition of decomposition for causal models to cater for cases when models are incompatible. We illustrate our approach on a number of real-life examples.

    Technical Communications of ICLP

    The need for systematic research into behavioural factors of individual terrorists has been highlighted by much recent work on terrorism. Many existing methods follow a hypothesis-testing approach in which statistical modelling and analysis of existing data is conducted to either confirm or refute a hypothesis. However, the initial construction of hypotheses is not trivial, nor is the decision as to which variables are to be considered relevant for the testing. It has been argued that the lack of a methodical approach to represent, analyse, interpret and infer from existing data presents a pressing challenge to the progress of lone-actor terrorism research in particular, and the terrorism field more generally. This paper sets a new agenda for such research. We propose the use of a logic programming approach to address the shortcomings of existing methodologies in the study of lone-actor terrorism. Our method is based on transforming characteristic and behavioural codes into a logic program and applying inductive logic programming to learn hypotheses about potentially relevant factors associated with terrorist behaviour, as well as the influence of specific factors on such behaviour. This paper is an exploratory study of 111 lone-actor terrorists' target selections (civilian vs. high-value targets) and the agency of their ideological orientation in determining their target choices.

    Adapting specifications for reactive controllers

    For systems to respond to scenarios that were unforeseen at design time, they must be capable of safely adapting, at runtime, the assumptions they make about the environment, the goals they are expected to achieve, and the strategy that guarantees the goals are fulfilled if the assumptions hold. Such adaptation often involves the system degrading its functionality, by weakening its environment assumptions and/or the goals it aims to meet, ideally in a graceful manner. However, finding, in a systematic and safe way, weaker assumptions that account for the unanticipated behaviour and goals that are achievable in the new environment remains an open challenge. In this paper, we propose a novel framework that supports assumption and, if necessary, goal degradation to allow systems to cope with runtime assumption violations. The framework, which integrates into the MORPH reference architecture, combines symbolic learning and reactive synthesis to compute implementable controllers that may be deployed safely. We describe and implement an algorithm that illustrates the working of this framework. We further demonstrate in our evaluation its effectiveness and applicability to a series of benchmarks from the literature. The results show that the algorithm successfully learns realizable specifications that accommodate previously violating environment behaviour in almost all cases. Exceptions are discussed in the evaluation.