539 research outputs found
Synthesizing Imperative Programs from Examples Guided by Static Analysis
We present a novel algorithm that synthesizes imperative programs for
introductory programming courses. Given a set of input-output examples and a
partial program, our algorithm generates a complete program that is consistent
with every example. Our key idea is to combine enumerative program synthesis
and static analysis, which aggressively prunes out a large search space while
guaranteeing to find, if any, a correct solution. We have implemented our
algorithm in a tool, called SIMPL, and evaluated it on 30 problems used in
introductory programming courses. The results show that SIMPL is able to solve
the benchmark problems in 6.6 seconds on average.Comment: The paper is accepted in Static Analysis Symposium (SAS) '17. The
submission version is somewhat different from the version in arxiv. The final
version will be uploaded after the camera-ready version is read
Reconstituting typeset Marriage Registers using simple software tools
In a world of fully integrated software applications, which can seem daunting to develop and to maintain, it is sometimes useful to recall that a system of loosely-linked software components can provide surprisingly powerful and flexible methods for software development.
This paper describes a project which aims to retypeset a series of volumes from the Phillimore Marriage Registers, first published in England around the turn of the last century. The source material is plain text derived from running Optical Character Recognition (OCR) on a set of page scans taken from the original printed volumes. The regular, tabular, structure of the Register pages allows us to automate the re-typesetting process.
The UNIX troff software and its tbl preprocessor are used for the typesetting itself, but a series of simple awk-based software tools, all of them parsers and code generators of one sort or another, is used to bring about the OCR-to-troff transformation.
By re-parsing the generated troff codes it is possible to
produce a surname index as a supplement to the retypeset
volume. Moreover, this second-stage parsing has been invaluable in discovering subtle ‘typos’ in the automatically generated material. With small adjustments to this parser it would be possible to output the complete marriage entries in standard XML or GEDCOM notations
A Computational Interpretation of Context-Free Expressions
We phrase parsing with context-free expressions as a type inhabitation
problem where values are parse trees and types are context-free expressions. We
first show how containment among context-free and regular expressions can be
reduced to a reachability problem by using a canonical representation of
states. The proofs-as-programs principle yields a computational interpretation
of the reachability problem in terms of a coercion that transforms the parse
tree for a context-free expression into a parse tree for a regular expression.
It also yields a partial coercion from regular parse trees to context-free
ones. The partial coercion from the trivial language of all words to a
context-free expression corresponds to a predictive parser for the expression
An automata characterisation for multiple context-free languages
We introduce tree stack automata as a new class of automata with storage and
identify a restricted form of tree stack automata that recognises exactly the
multiple context-free languages.Comment: This is an extended version of a paper with the same title accepted
at the 20th International Conference on Developments in Language Theory (DLT
2016
Linear Parsing Expression Grammars
PEGs were formalized by Ford in 2004, and have several pragmatic operators
(such as ordered choice and unlimited lookahead) for better expressing modern
programming language syntax. Since these operators are not explicitly defined
in the classic formal language theory, it is significant and still challenging
to argue PEGs' expressiveness in the context of formal language theory.Since
PEGs are relatively new, there are several unsolved problems.One of the
problems is revealing a subclass of PEGs that is equivalent to DFAs. This
allows application of some techniques from the theory of regular grammar to
PEGs. In this paper, we define Linear PEGs (LPEGs), a subclass of PEGs that is
equivalent to DFAs. Surprisingly, LPEGs are formalized by only excluding some
patterns of recursive nonterminal in PEGs, and include the full set of ordered
choice, unlimited lookahead, and greedy repetition, which are characteristic of
PEGs. Although the conversion judgement of parsing expressions into DFAs is
undecidable in general, the formalism of LPEGs allows for a syntactical
judgement of parsing expressions.Comment: Parsing expression grammars, Boolean finite automata, Packrat parsin
Reasoning about embedded dependencies using inclusion dependencies
The implication problem for the class of embedded dependencies is
undecidable. However, this does not imply lackness of a proof procedure as
exemplified by the chase algorithm. In this paper we present a complete
axiomatization of embedded dependencies that is based on the chase and uses
inclusion dependencies and implicit existential quantification in the
intermediate steps of deductions
A Rule-Based Approach to Analyzing Database Schema Objects with Datalog
Database schema elements such as tables, views, triggers and functions are
typically defined with many interrelationships. In order to support database
users in understanding a given schema, a rule-based approach for analyzing the
respective dependencies is proposed using Datalog expressions. We show that
many interesting properties of schema elements can be systematically determined
this way. The expressiveness of the proposed analysis is exemplarily shown with
the problem of computing induced functional dependencies for derived relations.
The propagation of functional dependencies plays an important role in data
integration and query optimization but represents an undecidable problem in
general. And yet, our rule-based analysis covers all relational operators as
well as linear recursive expressions in a systematic way showing the depth of
analysis possible by our proposal. The analysis of functional dependencies is
well-integrated in a uniform approach to analyzing dependencies between schema
elements in general.Comment: Pre-proceedings paper presented at the 27th International Symposium
on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur,
Belgium, 10-12 October 2017 (arXiv:1708.07854
Efficient FPT algorithms for (strict) compatibility of unrooted phylogenetic trees
In phylogenetics, a central problem is to infer the evolutionary
relationships between a set of species ; these relationships are often
depicted via a phylogenetic tree -- a tree having its leaves univocally labeled
by elements of and without degree-2 nodes -- called the "species tree". One
common approach for reconstructing a species tree consists in first
constructing several phylogenetic trees from primary data (e.g. DNA sequences
originating from some species in ), and then constructing a single
phylogenetic tree maximizing the "concordance" with the input trees. The
so-obtained tree is our estimation of the species tree and, when the input
trees are defined on overlapping -- but not identical -- sets of labels, is
called "supertree". In this paper, we focus on two problems that are central
when combining phylogenetic trees into a supertree: the compatibility and the
strict compatibility problems for unrooted phylogenetic trees. These problems
are strongly related, respectively, to the notions of "containing as a minor"
and "containing as a topological minor" in the graph community. Both problems
are known to be fixed-parameter tractable in the number of input trees , by
using their expressibility in Monadic Second Order Logic and a reduction to
graphs of bounded treewidth. Motivated by the fact that the dependency on
of these algorithms is prohibitively large, we give the first explicit dynamic
programming algorithms for solving these problems, both running in time
, where is the total size of the input.Comment: 18 pages, 1 figur
An approach to computing downward closures
The downward closure of a word language is the set of all (not necessarily
contiguous) subwords of its members. It is well-known that the downward closure
of any language is regular. While the downward closure appears to be a powerful
abstraction, algorithms for computing a finite automaton for the downward
closure of a given language have been established only for few language
classes.
This work presents a simple general method for computing downward closures.
For language classes that are closed under rational transductions, it is shown
that the computation of downward closures can be reduced to checking a certain
unboundedness property.
This result is used to prove that downward closures are computable for (i)
every language class with effectively semilinear Parikh images that are closed
under rational transductions, (ii) matrix languages, and (iii) indexed
languages (equivalently, languages accepted by higher-order pushdown automata
of order 2).Comment: Full version of contribution to ICALP 2015. Comments welcom
Graph-Based Shape Analysis Beyond Context-Freeness
We develop a shape analysis for reasoning about relational properties of data
structures. Both the concrete and the abstract domain are represented by
hypergraphs. The analysis is parameterized by user-supplied indexed graph
grammars to guide concretization and abstraction. This novel extension of
context-free graph grammars is powerful enough to model complex data structures
such as balanced binary trees with parent pointers, while preserving most
desirable properties of context-free graph grammars. One strength of our
analysis is that no artifacts apart from grammars are required from the user;
it thus offers a high degree of automation. We implemented our analysis and
successfully applied it to various programs manipulating AVL trees,
(doubly-linked) lists, and combinations of both
- …
