55 research outputs found
Learning Classical Planning Strategies with Policy Gradient
A common paradigm in classical planning is heuristic forward search. Forward
search planners often rely on simple best-first search which remains fixed
throughout the search process. In this paper, we introduce a novel search
framework capable of alternating between several forward search approaches
while solving a particular planning problem. Selection of the approach is
performed using a trainable stochastic policy, mapping the state of the search
to a probability distribution over the approaches. This enables using policy
gradient to learn search strategies tailored to a specific distribution of
planning problems and a selected performance metric, e.g. the IPC score. We
instantiate the framework by constructing a policy space consisting of five
search approaches and a two-dimensional representation of the planner's state.
Then, we train the system on randomly generated problems from five IPC domains
using three different performance metrics. Our experimental results show that
the learner is able to discover domain-specific search strategies, improving
the planner's performance relative to the baselines of plain best-first search
and a uniform policy.
Comment: Accepted for ICAPS 201
Towards learning domain-independent planning heuristics
Automated planning remains one of the most general paradigms in Artificial
Intelligence, providing means of solving problems coming from a wide variety of
domains. One of the key factors restricting the applicability of planning is
its computational complexity resulting from exponentially large search spaces.
Heuristic approaches are necessary to solve all but the simplest problems. In
this work, we explore the possibility of obtaining domain-independent heuristic
functions using machine learning. This is a part of a wider research program
whose objective is to improve practical applicability of planning in systems
for which the planning domains evolve at run time. The challenge is therefore
the learning of (corrections of) domain-independent heuristics that can be
reused across different planning domains.
Comment: Accepted for the IJCAI-17 Workshop on Architectures for Generality and Autonom
An Inductive Approach for Modal Transition System Refinement
Modal Transition Systems (MTSs) provide an appropriate framework for modelling software behaviour when only a partial specification is available. A key characteristic of an MTS is that it explicitly models events that a system is required to provide and is proscribed from exhibiting, and those for which no specification is available, called maybe events. Incremental elaboration of maybe events into either required or proscribed events can be seen as a process of MTS refinement, resulting from extending a given partial specification with more information about the system behaviour. This paper focuses on providing automated support for computing strong refinements of an MTS with respect to event traces that describe required and proscribed behaviours using a non-monotonic inductive logic programming technique. A real case study is used to illustrate the practical application of the approach
ω-regular Expression Synthesis from Transition-Based Büchi Automata
A popular method for modelling reactive systems is to use ω-regular
languages. These languages can be represented as nondeterministic Büchi
automata (NBAs) or ω-regular expressions. Existing methods synthesise
expressions from state-based NBAs. Synthesis from transition-based NBAs is
traditionally done by transforming transition-based NBAs into state-based NBAs.
This transformation, however, can increase the complexity of the synthesised
expressions. This paper proposes a novel method for directly synthesising
ω-regular expressions from transition-based NBAs. We prove that the
method is sound and complete. Our empirical results show that the
ω-regular expressions synthesised from transition-based NBAs are more
compact than those synthesised from state-based NBAs. This is particularly the
case for NBAs computed from obligation, reactivity, safety and recurrence-type
LTL formulas, reporting in the latter case an average reduction of over 50%. We
also show that our method successfully synthesises ω-regular expressions
from more LTL formulas when using a transition-based instead of a state-based
NBA
Combining Experts’ Causal Judgments
Consider a policymaker who wants to decide which intervention to perform in order to change a currently undesirable situation. The policymaker has at her disposal a team of experts, each with their own understanding of the causal dependencies between different factors contributing to the outcome. The policymaker has varying degrees of confidence in the experts’ opinions. She wants to combine their opinions in order to decide on the most effective intervention. We formally define the notion of an effective intervention, and then consider how experts’ causal judgments can be combined in order to determine the most effective intervention. We define a notion of two causal models being compatible, and show how compatible causal models can be merged. We then use it as the basis for combining experts’ causal judgments. We also provide a definition of decomposition for causal models to cater for cases when models are incompatible. We illustrate our approach on a number of real-life examples
Technical Communications of ICLP
The need for systematic research into behavioural factors of individual terrorists has been highlighted by much recent work on terrorism. Many existing methods follow a hypothesis-testing approach in which statistical modelling and analysis of existing data is conducted to either confirm or refute a hypothesis. However, the initial construction of hypotheses is not trivial, nor is the decision about which variables are to be considered relevant for the testing. It has been argued that the lack of a methodical approach to represent, analyse, interpret and infer from existing data presents a pressing challenge to the progress of lone-actor terrorism research in particular, and the terrorism field more generally. This paper sets a new agenda for such research. We propose the use of a logic programming approach to address the shortcomings of existing methodologies in the study of lone-actor terrorism. Our method is based on transforming characteristic and behavioural codes into a logic program and applying inductive logic programming to learn hypotheses about potentially relevant factors associated with terrorist behaviour, as well as the influence of specific factors on such behaviour. This paper is an exploratory study of 111 lone-actor terrorists' target selections (civilian vs. high-value targets) and the agency of their ideological orientation in determining their target choices
Adapting specifications for reactive controllers
For systems to respond to scenarios that were unforeseen at design time, they must be capable of safely adapting, at runtime, the assumptions they make about the environment, the goals they are expected to achieve, and the strategy that guarantees the goals are fulfilled if the assumptions hold. Such adaptation often involves the system degrading its functionality, by weakening its environment assumptions and/or the goals it aims to meet, ideally in a graceful manner. However, finding, in a systematic and safe way, weaker assumptions that account for the unanticipated behaviour and goals that are achievable in the new environment remains an open challenge. In this paper, we propose a novel framework that supports assumption and, if necessary, goal degradation to allow systems to cope with runtime assumption violations. The framework, which integrates into the MORPH reference architecture, combines symbolic learning and reactive synthesis to compute implementable controllers that may be deployed safely. We describe and implement an algorithm that illustrates the working of this framework. We further demonstrate in our evaluation its effectiveness and applicability to a series of benchmarks from the literature. The results show that the algorithm successfully learns realizable specifications that accommodate previously violating environment behaviour in almost all cases. Exceptions are discussed in the evaluation
