Work Analysis with Resource-Aware Session Types
While there exist several successful techniques for supporting programmers in
deriving static resource bounds for sequential code, analyzing the resource
usage of message-passing concurrent processes poses additional challenges. To
meet these challenges, this article presents an analysis for statically
deriving worst-case bounds on the total work performed by message-passing
processes. To decompose interacting processes into components that can be
analyzed in isolation, the analysis is based on novel resource-aware session
types, which describe protocols and resource contracts for inter-process
communication. A key innovation is that both messages and processes carry
potential to share and amortize cost while communicating. To symbolically
express resource usage in a setting without static data structures and
intrinsic sizes, resource contracts describe bounds that are functions of
interactions between processes. Resource-aware session types combine standard
binary session types and type-based amortized resource analysis in a linear
type system. This type system is formulated for a core session-type calculus of
the language SILL and proved sound with respect to a multiset-based operational
cost semantics that tracks the total number of messages that are exchanged in a
system. The effectiveness of the analysis is demonstrated by analyzing standard
examples from amortized analysis and the literature on session types and by a
comparative performance analysis of different concurrent programs implementing
the same interface.
Comment: 25 pages, 2 pages of references, 11 pages of appendix, Accepted at
LICS 201
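
The message-counting cost model and the amortized reasoning mentioned in the abstract above can be illustrated with the classic binary counter, a standard example from amortized analysis. The following is a minimal sketch, not the paper's type system: the function name `count_messages` and the list-of-bits representation are assumptions made for illustration. Each bit is imagined as a process, an increment is a message to the lowest bit, and a carry is a message forwarded to the next bit. If every bit holding a 1 stores one unit of potential, then two units per increment pay for all messages.

```python
def count_messages(num_increments, num_bits=32):
    """Simulate a chain of bit processes and count every message exchanged.

    Hypothetical illustration: bits[i] stands for the state of bit process i.
    """
    bits = [0] * num_bits
    messages = 0
    for _ in range(num_increments):
        i = 0
        messages += 1  # the client's increment message to bit 0
        while i < num_bits and bits[i] == 1:
            bits[i] = 0      # a 1-bit flips to 0 and spends its stored potential
            i += 1
            messages += 1    # the carry is forwarded to the next bit process
        if i < num_bits:
            bits[i] = 1      # the receiving bit becomes 1 and stores potential
    return messages, bits


if __name__ == "__main__":
    n = 1000
    total, bits = count_messages(n)
    # The worst-case bound on total work is linear in the number of increments.
    print(total, "messages for", n, "increments; bound is", 2 * n)
```

A single increment can trigger many carries, but the total over n increments never exceeds 2n messages, which is the kind of bound a resource contract would state as a function of the interactions.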
Database Theory in Action: Search-Based Program Optimization
Recent work in programming languages developed an approach to term rewriting based on equality saturation (EqSat), which, instead of applying rewrite rules destructively, maintains all equivalent expressions in a structure called an E-graph. This paper describes two surprising connections between EqSat and databases, going both ways. On the one hand, equality saturation can be viewed as a query evaluation problem, with great benefits. On the other hand, the most sophisticated SQL query optimizers are based on the Volcano/Cascades framework which, we explain, is a variant of EqSat.
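
The idea of keeping every equivalent expression instead of rewriting destructively can be sketched in a few lines. The code below is a deliberately naive stand-in: it stores equivalent terms in a plain Python set rather than a compact E-graph, and the rule set, term encoding, and function names are all assumptions chosen for illustration (the rule x/x -> 1 additionally assumes x is nonzero). It saturates the set under the rules and then extracts the smallest term.

```python
def rewrites_at_root(t):
    """Return all terms obtained by applying one rule at the root of t."""
    out = []
    if isinstance(t, tuple):
        op, x, y = t
        if op == "/" and isinstance(x, tuple) and x[0] == "*":
            out.append(("*", x[1], ("/", x[2], y)))  # (a*b)/c -> a*(b/c)
        if op == "/" and x == y:
            out.append("1")                          # x/x -> 1 (x nonzero)
        if op == "*" and y == "1":
            out.append(x)                            # x*1 -> x
        if op == "*" and y == "2":
            out.append(("<<", x, "1"))               # x*2 -> x<<1
    return out


def one_step(t):
    """Return all terms reachable from t by one rewrite at any position."""
    results = list(rewrites_at_root(t))
    if isinstance(t, tuple):
        op, x, y = t
        results += [(op, x2, y) for x2 in one_step(x)]
        results += [(op, x, y2) for y2 in one_step(y)]
    return results


def saturate(start, max_iters=10):
    """Grow the set of equivalent terms until no rule adds anything new."""
    seen, frontier = {start}, {start}
    for _ in range(max_iters):
        new = {u for t in frontier for u in one_step(t)} - seen
        if not new:
            break
        seen |= new
        frontier = new
    return seen


def size(t):
    """Count the nodes of a term; used as the extraction cost."""
    return 1 if not isinstance(t, tuple) else 1 + size(t[1]) + size(t[2])


if __name__ == "__main__":
    start = ("/", ("*", "a", "2"), "2")  # (a*2)/2
    print(min(saturate(start), key=size))
```

A destructive rewriter that greedily applied x*2 -> x<<1 first would get stuck at (a<<1)/2; keeping both alternatives lets the search still reach the simplified term. A real E-graph shares subterms between equivalent expressions, which this set-based sketch does not.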
Small Proofs from Congruence Closure
Satisfiability Modulo Theory (SMT) solvers and equality saturation engines
must generate proof certificates from e-graph-based congruence closure
procedures to enable verification and conflict clause generation. Smaller proof
certificates speed up these activities. Though the problem of generating proofs
of minimal size is known to be NP-complete, existing proof minimization
algorithms for congruence closure generate unnecessarily large proofs and
introduce asymptotic overhead over the core congruence closure procedure. In
this paper, we introduce an O(n^5) time algorithm which generates optimal
proofs under a new relaxed "proof tree size" metric that directly bounds proof
size. We then relax this approach further to a practical O(n \log(n)) greedy
algorithm which generates small proofs with no asymptotic overhead. We
implemented our techniques in the egg equality saturation toolkit, yielding the
first certifying equality saturation engine. We show that our greedy approach
in egg quickly generates substantially smaller proofs than the state-of-the-art
Z3 SMT solver on a corpus of 3760 benchmarks.
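
For readers unfamiliar with the underlying procedure, congruence closure can be sketched as a union-find structure plus a loop that merges function applications whose arguments are already known to be equal. This is a minimal, unoptimized sketch that does not produce proofs and is not the algorithm from the paper or from egg; the term encoding (nested tuples) and all names are assumptions made for illustration.

```python
class UnionFind:
    """Union-find over hashable terms, with path halving."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def subterms(t):
    """Yield t and all of its subterms; applications are tuples (f, arg, ...)."""
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)


def congruence_closure(terms, equalities):
    """Close the given equalities under congruence: equal args imply equal apps."""
    uf = UnionFind()
    universe = {s for t in terms for s in subterms(t)}
    for a, b in equalities:
        universe |= set(subterms(a)) | set(subterms(b))
    for a, b in equalities:
        uf.union(a, b)
    apps = [t for t in universe if isinstance(t, tuple)]
    changed = True
    while changed:  # naive fixpoint; each merge reduces the number of classes
        changed = False
        signatures = {}
        for t in apps:
            key = (t[0],) + tuple(uf.find(a) for a in t[1:])
            if key in signatures:
                changed |= uf.union(t, signatures[key])
            else:
                signatures[key] = t
    return uf


if __name__ == "__main__":
    def f(x):
        return ("f", x)

    a = "a"
    # Classic example: f(f(f(a))) = a and f(f(f(f(f(a))))) = a imply f(a) = a.
    uf = congruence_closure([], [(f(f(f(a))), a), (f(f(f(f(f(a))))), a)])
    print(uf.find(f(a)) == uf.find(a))
```

A certifying engine would additionally record why each merge happened so that an explanation of any derived equality can be reconstructed; the size of that explanation is what the abstract above is concerned with.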
babble: Learning Better Abstractions with E-Graphs and Anti-Unification
Library learning compresses a given corpus of programs by extracting common
structure from the corpus into reusable library functions. Prior work on
library learning suffers from two limitations that prevent it from scaling to
larger, more complex inputs. First, it explores too many candidate library
functions that are not useful for compression. Second, it is not robust to
syntactic variation in the input.
We propose library learning modulo theory (LLMT), a new library learning
algorithm that additionally takes as input an equational theory for a given
problem domain. LLMT uses e-graphs and equality saturation to compactly
represent the space of programs equivalent modulo the theory, and uses a novel
e-graph anti-unification technique to find common patterns in the corpus more
directly and efficiently.
We implemented LLMT in a tool named BABBLE. Our evaluation shows that BABBLE
achieves better compression orders of magnitude faster than the state of the
art. We also provide a qualitative evaluation showing that BABBLE learns
reusable functions on inputs previously out of reach for library learning.
Comment: POPL 202
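
Anti-unification, the operation that the abstract above lifts to e-graphs, computes the most specific pattern that generalizes two terms. The sketch below shows only the ordinary syntactic version on plain terms, not the e-graph variant used in BABBLE; the tuple encoding, the `?x` variable naming, and the function name are assumptions for illustration.

```python
def anti_unify(s, t, table=None):
    """Return the least general generalization of terms s and t.

    Terms are atoms (strings) or applications written as tuples (f, arg, ...).
    Each distinct mismatching pair of subterms is replaced by one pattern
    variable, so repeated mismatches share the same variable.
    """
    if table is None:
        table = {}
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        # Same head symbol and arity: generalize argument by argument.
        return (s[0],) + tuple(
            anti_unify(a, b, table) for a, b in zip(s[1:], t[1:])
        )
    if (s, t) not in table:
        table[(s, t)] = f"?x{len(table)}"
    return table[(s, t)]


if __name__ == "__main__":
    left = ("f", "a", ("g", "b"))
    right = ("f", "c", ("g", "b"))
    print(anti_unify(left, right))  # the shared structure, with a hole for a/c
```

The resulting pattern is a candidate library function: its variables become parameters, and each original term is recovered by applying it to the mismatching subterms.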
Synthesizing Structured CAD Models with Equality Saturation and Inverse Transformations
Recent program synthesis techniques help users customize CAD models (e.g., for
3D printing) by decompiling low-level triangle meshes to Constructive Solid
Geometry (CSG) expressions. Without loops or functions, editing CSG can require
many coordinated changes, and existing mesh decompilers use heuristics that can
obfuscate high-level structure.
This paper proposes a second decompilation stage to robustly "shrink"
unstructured CSG expressions into more editable programs with map and fold
operators. We present Szalinski, a tool that uses Equality Saturation with
semantics-preserving CAD rewrites to efficiently search for smaller equivalent
programs. Szalinski relies on inverse transformations, a novel way for solvers
to speculatively add equivalences to an E-graph. We qualitatively evaluate
Szalinski in case studies, show how it composes with an existing mesh
decompiler, and demonstrate that Szalinski can shrink large models in seconds.
Comment: 14 page
Better Together: Unifying Datalog and Equality Saturation
We present egglog, a fixpoint reasoning system that unifies Datalog and
equality saturation (EqSat). Like Datalog, it supports efficient incremental
execution, cooperating analyses, and lattice-based reasoning. Like EqSat, it
supports term rewriting, efficient congruence closure, and extraction of
optimized terms.
We identify two recent applications--a unification-based pointer analysis in
Datalog and an EqSat-based floating-point term rewriter--that have been
hampered by features missing from Datalog but found in EqSat or vice-versa. We
evaluate egglog by reimplementing those projects in egglog. The resulting
systems in egglog are faster, simpler, and fix bugs found in the original
systems.
Comment: PLDI 202
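
The Datalog half of this combination rests on incremental fixpoint evaluation, which can be sketched with the textbook transitive-closure program. The code below is a minimal semi-naive evaluation loop written directly in Python; it is not egglog syntax and does not model rewriting or congruence closure, and the function and relation names are assumptions for illustration.

```python
def transitive_closure(edges):
    """Evaluate the Datalog program
           path(x, y) :- edge(x, y).
           path(x, z) :- path(x, y), edge(y, z).
    to a fixpoint, joining only newly derived facts in each round."""
    path = set(edges)
    delta = set(edges)  # facts derived in the previous round
    while delta:
        new = {
            (x, z)
            for (x, y) in delta
            for (y2, z) in edges
            if y == y2
        } - path
        path |= new
        delta = new
    return path


if __name__ == "__main__":
    print(sorted(transitive_closure({(1, 2), (2, 3), (3, 4)})))
```

Restricting each round to the facts in `delta` is what makes the evaluation incremental: facts already joined in earlier rounds are never reconsidered.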
Practical and Flexible Equality Saturation
Thesis (Ph.D.)--University of Washington, 2021
Programming language tools like compilers, optimizers, verifiers, and synthesizers rely on term rewriting to effectively manipulate programs. While powerful and well-studied, term rewriting traditionally suffers from a critical stumbling block: users must choose when and how to apply the right rewrite, and the quality of the results hinges on this difficult decision. A recent technique called equality saturation mitigates this “rewrite choice” issue by allowing many rewrites to apply simultaneously. Despite its promise, the technique’s applicability has been limited by lack of flexibility and poor scalability. This thesis offers theoretical and practical advances that make equality saturation fast and flexible enough to use in real-world applications in any domain. On the theoretical side, this work contributes two techniques to make e-graphs, the data structure underlying equality saturation, better suited to the algorithm’s needs. A new amortized invariant restoration technique called rebuilding takes advantage of equality saturation’s distinct workload, providing asymptotic speedups over current techniques in practice. A general mechanism called e-class analyses integrates domain-specific analyses into the e-graph, reducing the need for ad hoc manipulation. We implemented these techniques in a new open-source library called egg. egg has been used to achieve state-of-the-art results in many domains, including floating point accuracy, automatic vectorization, deep learning compute graphs, 3D CAD decompilation, and linear algebra kernels, among others. We present case studies that highlight how egg’s performance and flexibility helped these projects succeed, making the case that equality saturation is ready for a wide variety of real-world use cases.
Design and Implementation of Concurrent C0
We describe Concurrent C0, a type-safe C-like language with contracts and session-typed communication over channels. Concurrent C0 supports an operation called forwarding, which allows channels to be combined in a well-defined way. The language's type system enables elegant expression of session types and message-passing concurrent programs. We provide a Go-based implementation with language-based optimizations that outperforms traditional message-passing techniques.
