21,171 research outputs found

    On statistics, computation and scalability

    Full text link
    How should statistical procedures be designed so as to be scalable computationally to the massive datasets that are increasingly the norm? When coupled with the requirement that an answer to an inferential question be delivered within a certain time budget, this question has significant repercussions for the field of statistics. With the goal of identifying "time-data tradeoffs," we investigate some of the statistical consequences of computational perspectives on scability, in particular divide-and-conquer methodology and hierarchies of convex relaxations.Comment: Published in at http://dx.doi.org/10.3150/12-BEJSP17 the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm

    Bayesian inference for queueing networks and modeling of internet services

    Get PDF
    Modern Internet services, such as those at Google, Yahoo!, and Amazon, handle billions of requests per day on clusters of thousands of computers. Because these services operate under strict performance requirements, a statistical understanding of their performance is of great practical interest. Such services are modeled by networks of queues, where each queue models one of the computers in the system. A key challenge is that the data are incomplete, because recording detailed information about every request to a heavily used system can require unacceptable overhead. In this paper we develop a Bayesian perspective on queueing models in which the arrival and departure times that are not observed are treated as latent variables. Underlying this viewpoint is the observation that a queueing model defines a deterministic transformation between the data and a set of independent variables called the service times. With this viewpoint in hand, we sample from the posterior distribution over missing data and model parameters using Markov chain Monte Carlo. We evaluate our framework on data from a benchmark Web application. We also present a simple technique for selection among nested queueing models. We are unaware of any previous work that considers inference in networks of queues in the presence of missing data.Comment: Published in at http://dx.doi.org/10.1214/10-AOAS392 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Adiabatic optimization without local minima

    Full text link
    Several previous works have investigated the circumstances under which quantum adiabatic optimization algorithms can tunnel out of local energy minima that trap simulated annealing or other classical local search algorithms. Here we investigate the even more basic question of whether adiabatic optimization algorithms always succeed in polynomial time for trivial optimization problems in which there are no local energy minima other than the global minimum. Surprisingly, we find a counterexample in which the potential is a single basin on a graph, but the eigenvalue gap is exponentially small as a function of the number of vertices. In this counterexample, the ground state wavefunction consists of two "lobes" separated by a region of exponentially small amplitude. Conversely, we prove if the ground state wavefunction is single-peaked then the eigenvalue gap scales at worst as one over the square of the number of vertices.Comment: 20 pages, 1 figure. Journal versio
    corecore