On the Importance of Registers for Computability
All consensus hierarchies in the literature assume that we have, in addition
to copies of a given object, an unbounded number of registers. But why do we
really need these registers?
This paper considers what would happen if one attempts to solve consensus
using various objects but without any registers. We show that under a
reasonable assumption, objects like queues and stacks cannot emulate the
missing registers. We also show that, perhaps surprisingly, initialization,
shown to have no computational consequences when registers are readily
available, is crucial in determining the synchronization power of objects when
no registers are allowed. Finally, we show that without registers, the number
of available objects affects the level of consensus that can be solved.
Our work thus raises the question of whether consensus hierarchies which
assume an unbounded number of registers truly capture synchronization power,
and begins a line of research aimed at better understanding the interaction
between read-write memory and the powerful synchronization operations available
on modern architectures.
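To see why registers appear in such constructions, consider the classical wait-free two-process consensus algorithm built from a single pre-initialized queue plus two read-write registers. The Python sketch below is our own illustration of that folklore construction (the names and encoding are not from the paper): the registers carry the proposals, and the queue decides the winner.

    from queue import Queue, Empty

    class TwoProcessConsensus:
        """Folklore two-process consensus from one initialized queue plus two
        single-writer registers (illustrative sketch, not the paper's code)."""

        def __init__(self):
            self.q = Queue()
            self.q.put("WINNER")            # the queue must start non-empty (initialized)
            self.registers = [None, None]   # register i holds process i's proposal

        def decide(self, pid, value):
            self.registers[pid] = value     # announce my proposal first
            try:
                token = self.q.get_nowait() # then race on the shared queue
            except Empty:
                token = None
            if token == "WINNER":
                return value                # I dequeued the marker: my value is decided
            return self.registers[1 - pid]  # I lost: adopt the winner's announced value

Dropping the two registers breaks this particular algorithm, since the losing process has no other way to learn the winner's proposal; whether queue-like objects can somehow emulate that missing read-write channel is exactly the kind of question the paper studies.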
How Many Cooks Spoil the Soup?
In this work, we study the following basic question: "How much parallelism
does a distributed task permit?" Our definition of parallelism (or symmetry)
here is not in terms of speed, but in terms of identical roles that processes
have at the same time in the execution. We initiate this study in population
protocols, a very simple model that not only allows for a straightforward
definition of what a role is, but also encloses the challenge of isolating the
properties that are due to the protocol from those that are due to the
adversary scheduler, who controls the interactions between the processes. We
(i) give a partial characterization of the set of predicates on input
assignments that can be stably computed with maximum symmetry, i.e., Θ(N_min), where N_min is the minimum multiplicity of a state in the initial configuration, and (ii) we turn our attention to the remaining
predicates and prove a strong impossibility result for the parity predicate:
the inherent symmetry of any protocol that stably computes it is upper bounded
by a constant that depends on the size of the protocol.
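For concreteness, the parity predicate itself does admit a simple stable protocol in the standard model, with no symmetry requirement. The Python sketch below simulates the textbook construction, in which "token" agents accumulate the XOR of the inputs and broadcast the running parity; the state encoding and the scheduler loop are our own illustrative choices, unrelated to the symmetry bounds discussed above.

    import random

    def parity_protocol(inputs, interactions=200_000, seed=0):
        """Textbook population protocol that stably computes the parity of the
        number of 1-inputs (illustrative sketch). Each state is (token, bit)."""
        rng = random.Random(seed)
        states = [(1, b) for b in inputs]          # every agent starts as a token
        n = len(states)
        for _ in range(interactions):
            i, j = rng.sample(range(n), 2)         # uniform random scheduler picks a pair
            (ti, bi), (tj, bj) = states[i], states[j]
            if ti and tj:                          # two tokens merge and XOR their bits
                states[i], states[j] = (1, bi ^ bj), (0, bi ^ bj)
            elif ti:                               # a token overwrites a non-token's output
                states[j] = (0, bi)
            elif tj:
                states[i] = (0, bj)
        return [bit for _, bit in states]          # each agent's current output bit

    if __name__ == "__main__":
        rng = random.Random(1)
        inputs = [rng.randint(0, 1) for _ in range(50)]
        print("true parity:", sum(inputs) % 2, "agent outputs:", set(parity_protocol(inputs)))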
Of Choices, Failures and Asynchrony: The Many Faces of Set Agreement
Set agreement is a fundamental problem in distributed computing in which processes collectively choose a small subset of values from a larger set of proposals. The impossibility of fault-tolerant set agreement in asynchronous networks is one of the seminal results in distributed computing. The complexity of set agreement in synchronous networks has also been a significant research challenge. Real systems, however, are neither purely synchronous nor purely asynchronous. Rather, they tend to alternate between periods of synchrony and periods of asynchrony. In this paper, we analyze the complexity of set agreement in such a "partially synchronous" setting, presenting the first (asymptotically) tight bound on the complexity of set agreement in such systems. We introduce a novel technique for simulating, in fault-prone asynchronous shared memory, executions of an asynchronous and failure-prone message-passing system in which some fragments appear synchronous to some processes. We use this technique to derive a lower bound on the round complexity of set agreement in a partially synchronous system by a reduction from asynchronous wait-free set agreement. We also present an asymptotically matching algorithm that relies on a distributed asynchrony detection mechanism to decide as soon as possible during periods of synchrony. By relating environments with differing degrees of synchrony, our simulation technique is of independent interest. In particular, it allows us to obtain a new lower bound on the complexity of early-deciding k-set agreement complementary to that of [12], and to re-derive the combinatorial topology lower bound of [13] in an algorithmic way.
Wait-free Trees with Asymptotically-Efficient Range Queries
Tree data structures, such as red-black trees, quad trees, treaps, or tries, are fundamental tools in computer science. A classical problem in concurrency is to obtain expressive, efficient, and scalable versions of practical tree data structures. We are interested in concurrent trees supporting range queries, i.e., queries that involve multiple consecutive data items. Existing implementations with this capability can list keys in a specific range, but do not support aggregate range queries: for instance, if we want to calculate the number of keys in a range, the only choice is to retrieve a whole list and return its size. This is suboptimal: in the sequential setting, one can augment a balanced search tree with counters and, consequently, perform these aggregate requests in logarithmic rather than linear time. In this paper, we propose a generic approach to implement a broad class of range queries on concurrent trees in a way that is wait-free, asymptotically efficient, and practically scalable. The key idea is a new mechanism for maintaining metadata concurrently at tree nodes, which can be seen as a wait-free variant of hand-over-hand locking (which we call hand-over-hand helping). We built a preliminary implementation of the wait-free binary search tree, and initial experiments indicate the soundness of our approach.
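The sequential baseline mentioned above is easy to make concrete. The sketch below is a minimal, single-threaded Python illustration (ours, not the paper's data structure): a search tree whose nodes carry subtree-size counters, so counting the keys in a range costs time proportional to the tree height. Balancing and all of the concurrency machinery, including the hand-over-hand helping the paper introduces, are deliberately omitted.

    def _size(node):
        return node.size if node else 0

    class Node:
        __slots__ = ("key", "left", "right", "size")
        def __init__(self, key):
            self.key, self.left, self.right, self.size = key, None, None, 1

    class AugmentedBST:
        """Unbalanced BST augmented with subtree sizes: count_range runs in
        O(height) time instead of time linear in the size of the answer."""
        def __init__(self):
            self.root = None

        def insert(self, key):
            def rec(node):
                if node is None:
                    return Node(key)
                if key < node.key:
                    node.left = rec(node.left)
                elif key > node.key:
                    node.right = rec(node.right)
                node.size = 1 + _size(node.left) + _size(node.right)
                return node
            self.root = rec(self.root)

        def _count_below(self, bound, inclusive):
            """Number of keys < bound (or <= bound if inclusive)."""
            node, count = self.root, 0
            while node:
                if node.key < bound or (inclusive and node.key == bound):
                    count += 1 + _size(node.left)  # this node and its left subtree qualify
                    node = node.right
                else:
                    node = node.left
            return count

        def count_range(self, lo, hi):
            """Number of keys k with lo <= k <= hi, via two root-to-leaf descents."""
            return self._count_below(hi, True) - self._count_below(lo, False)

    tree = AugmentedBST()
    for k in (5, 12, 17, 3, 20, 42, 11):
        tree.insert(k)
    print(tree.count_range(10, 20))   # -> 4 (the keys 11, 12, 17, 20)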
Fast Approximate Counting and Leader Election in Populations
We study the problems of leader election and population size counting for population protocols: networks of finite-state anonymous agents that interact randomly under a uniform random scheduler. We show a protocol for leader election that terminates in O(log_m n · log n) parallel time, where m is a parameter, using O(max{m, log n}) states. By adjusting the parameter m between a constant and n, we obtain a single leader election protocol whose time and space can be smoothly traded off between O(log² n) to O(log n) time and O(log n) to O(n) states. Finally, we give a protocol which provides an upper bound n̂ of the size n of the population, where n̂ is at most n^a for some constant a > 1. This protocol assumes the existence of a unique leader in the population and stabilizes in Θ(log n) parallel time, using a constant number of states in every node, except the unique leader, which is required to use Θ(log² n) states.
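To make the model concrete, the Python sketch below simulates the folklore two-state leader-election protocol (every agent starts as a leader; when two leaders meet, one survives) under a uniform random scheduler, measuring parallel time as interactions divided by n. This is only the classical Θ(n)-parallel-time baseline shown for illustration, not the parameterized protocol described above.

    import random

    def folklore_leader_election(n, seed=0):
        """Two-state baseline: all agents start as leaders; when two leaders
        interact, the responder drops out. Returns the parallel time
        (interactions / n) until a single leader remains. Illustrative sketch."""
        rng = random.Random(seed)
        leader = [True] * n
        remaining, interactions = n, 0
        while remaining > 1:
            i, j = rng.sample(range(n), 2)   # uniform random scheduler picks a pair
            interactions += 1
            if leader[i] and leader[j]:
                leader[j] = False            # the responder stops being a leader
                remaining -= 1
        return interactions / n

    if __name__ == "__main__":
        for n in (64, 256, 1024):
            print(n, round(folklore_leader_election(n), 1))   # grows roughly linearly with n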
Lock-Free Algorithms under Stochastic Schedulers
In this work, we consider the following random process, motivated by the analysis of lock-free concurrent algorithms under high memory contention. In each round, a new scheduling step is allocated to one of n threads, according to a distribution p = (p₁, p₂, ..., pₙ), where thread i is scheduled with probability pᵢ. When some thread first reaches a set threshold of executed steps, it registers a win, completing its current operation, and resets its step count to 1. At the same time, threads whose step count was close to the threshold also get reset because of the win, but to 0 steps, being penalized for almost winning. We are interested in two questions: how often does some thread complete an operation (system latency), and how often does a specific thread complete an operation (individual latency)? We provide asymptotically tight bounds for the system and individual latency of this general concurrency pattern, for arbitrary scheduling distributions p. Surprisingly, a simple characterization exists: in expectation, the system will complete a new operation every Θ(1/‖p‖₂) steps, while thread i will complete a new operation every Θ(‖p‖₂/pᵢ²) steps. The proof is interesting in its own right, as it requires a careful analysis of how the higher norms of the vector p influence the thread step counts and latencies in this random process. Our result offers a simple connection between the scheduling distribution and the average performance of concurrent algorithms, which has several applications.
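The process above is simple to simulate directly. In the Python sketch below, `threshold` stands in for the step threshold and `window` models the unspecified notion of being "close to the threshold"; both parameter names and the default window of one step are our own reading of the description, not the paper's formalization.

    import random
    from collections import Counter

    def simulate(p, threshold=8, window=1, rounds=1_000_000, seed=0):
        """Simulate the scheduling process: each round one thread receives a step;
        a thread reaching `threshold` steps registers a win and resets to 1, and
        any other thread within `window` of the threshold is reset to 0."""
        rng = random.Random(seed)
        n = len(p)
        steps = [0] * n
        wins = Counter()
        for _ in range(rounds):
            i = rng.choices(range(n), weights=p)[0]    # thread i is scheduled this round
            steps[i] += 1
            if steps[i] >= threshold:                  # thread i completes an operation
                wins[i] += 1
                for j in range(n):
                    if j != i and steps[j] >= threshold - window:
                        steps[j] = 0                   # near-winners are penalized
                steps[i] = 1
        return wins

    if __name__ == "__main__":
        p = [0.4, 0.3, 0.2, 0.1]
        wins = simulate(p)
        print("system latency :", round(1_000_000 / sum(wins.values()), 1), "steps/operation")
        for i, w in sorted(wins.items()):
            print(f"thread {i} latency: {1_000_000 / w:.1f} steps/operation")

Sweeping over different distributions p and comparing the measured latencies against the Θ(1/‖p‖₂) and Θ(‖p‖₂/pᵢ²) expressions above gives a quick empirical sanity check of the characterization.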
Quantized stochastic gradient descent: communication versus convergence
Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to the excellent scalability properties of this algorithm and to its efficiency in the context of training deep neural networks. A fundamental barrier to parallelizing large-scale SGD is the fact that the cost of communicating the gradient updates between nodes can be very large. Consequently, lossy compression heuristics have been proposed, by which nodes only communicate quantized gradients. Although effective in practice, these heuristics do not always provably converge, and it is not clear whether they are optimal. In this paper, we propose Quantized SGD (QSGD), a family of compression schemes which allow the compression of gradient updates at each node, while guaranteeing convergence under standard assumptions. QSGD allows the user to trade off compression and convergence time: it can communicate a sublinear number of bits per iteration in the model dimension, and can achieve asymptotically optimal communication cost. We complement our theoretical results with empirical data, showing that QSGD can significantly reduce communication cost while remaining competitive with standard uncompressed techniques on a variety of real tasks.
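As a rough illustration of the kind of scheme involved, the Python sketch below implements an unbiased stochastic quantizer in the spirit of QSGD: each coordinate is rounded at random to one of s+1 levels relative to the vector's Euclidean norm, so the quantized gradient equals the original in expectation. The function name and parameters are our own, and the lossless encoding/bit-packing step that produces the actual communication savings is omitted.

    import numpy as np

    def quantize(v, s, rng=None):
        """Stochastic quantization sketch in the spirit of QSGD: map each
        coordinate of v to a level in {0, 1/s, ..., 1} scaled by ||v||_2,
        rounding up with probability equal to the fractional part so that
        the result is an unbiased estimator of v."""
        if rng is None:
            rng = np.random.default_rng()
        norm = np.linalg.norm(v)
        if norm == 0.0:
            return np.zeros_like(v)
        scaled = np.abs(v) / norm * s                 # coordinate positions in [0, s]
        lower = np.floor(scaled)
        levels = lower + (rng.random(v.shape) < (scaled - lower))
        return np.sign(v) * norm * levels / s

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        g = rng.normal(size=1_000)                    # stand-in for a gradient vector
        avg = np.mean([quantize(g, s=4, rng=rng) for _ in range(2_000)], axis=0)
        print("mean |bias|:", float(np.abs(avg - g).mean()))   # close to 0: unbiased

Unbiasedness, together with a bound on the variance the rounding adds, is the property that convergence guarantees for such schemes rely on.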
