
    String Synchronizing Sets: Sublinear-Time BWT Construction and Optimal LCE Data Structure

    The Burrows-Wheeler transform (BWT) is an invertible text transformation that, given a text T of length n, permutes its symbols according to the lexicographic order of the suffixes of T. The BWT is one of the most heavily studied algorithms in data compression, with numerous applications in indexing, sequence analysis, and bioinformatics. Its construction is a bottleneck in many scenarios, and settling the complexity of this task is one of the most important unsolved problems in sequence analysis, open for 25 years. Given a binary string of length n, occupying O(n/log n) machine words, the BWT construction algorithm due to Hon et al. (SIAM J. Comput., 2009) runs in O(n) time and O(n/log n) space. Recent advancements (Belazzougui, STOC 2014, and Munro et al., SODA 2017) focus on removing the alphabet-size dependency in the time complexity, but they still require Ω(n) time. In this paper, we propose the first algorithm that breaks the O(n)-time barrier for BWT construction. Given a binary string of length n, our procedure builds the Burrows-Wheeler transform in O(n/√(log n)) time and O(n/log n) space. We complement this result with a conditional lower bound proving that any further progress in the time complexity of BWT construction would yield faster algorithms for the very well studied problem of counting inversions: it would improve the state-of-the-art O(m√(log m))-time solution by Chan and Pătraşcu (SODA 2010). Our algorithm is based on a novel concept of string synchronizing sets, which is of independent interest. As one of the applications, we show that this technique lets us design a data structure of optimal size O(n/log n) that answers Longest Common Extension queries (LCE queries) in O(1) time and, furthermore, can be deterministically constructed in the optimal O(n/log n) time.
    Comment: Full version of a paper accepted to STOC 2019
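The transform itself is simple to state: sort all rotations of the text and read off the last column. The naive construction below is only an illustration of that definition (it runs in roughly O(n² log n) time, nowhere near the sublinear-time algorithm the paper develops); the sentinel symbol is an assumption added to make all rotations distinct.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all rotations of text+sentinel, take the last column."""
    t = text + sentinel  # sentinel sorts before all regular symbols
    rotations = sorted(t[i:] + t[:i] for i in range(len(t)))
    return "".join(r[-1] for r in rotations)

print(bwt("banana"))  # -> annb$aa
```

The last column clusters symbols with similar right-contexts together, which is what makes the transform so effective as a preprocessing step for compression and indexing.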

    Replica Placement on Bounded Treewidth Graphs

    We consider the replica placement problem: given a graph with clients and nodes, place replicas on a minimum set of nodes to serve all the clients; each client is associated with a request and a maximum distance it can travel to get served, and there is a maximum limit (capacity) on the amount of request a replica can serve. The problem falls under the general framework of capacitated set covering. It admits an O(log n)-approximation, and it is NP-hard to approximate within a factor of o(log n). We study the problem in terms of the treewidth t of the graph and present an O(t)-approximation algorithm.
    Comment: An abridged version of this paper is to appear in the proceedings of WADS'17
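For context, the O(log n)-approximation mentioned for the set-covering framework is achieved by the classical greedy rule: repeatedly pick the set covering the most uncovered elements. The sketch below shows that generic (uncapacitated) greedy, not the paper's treewidth-based O(t)-approximation, which requires a dynamic program over a tree decomposition.

```python
def greedy_set_cover(universe, sets):
    """Classical greedy set cover: O(log n)-approximate for the
    uncapacitated problem; each step takes the set with the most
    still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sets, key=lambda s: len(uncovered & s))
        if not (uncovered & best):
            raise ValueError("universe is not coverable by the given sets")
        chosen.append(best)
        uncovered -= best
    return chosen

cover = greedy_set_cover({1, 2, 3, 4, 5}, [{1, 2, 3}, {2, 4}, {4, 5}, {3, 5}])
```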

    Approximate Near Neighbors for General Symmetric Norms

    We show that every symmetric normed space admits an efficient nearest neighbor search data structure with doubly-logarithmic approximation. Specifically, for every n, d = n^{o(1)}, and every d-dimensional symmetric norm ‖·‖, there exists a data structure for poly(log log n)-approximate nearest neighbor search over ‖·‖ for n-point datasets achieving n^{o(1)} query time and n^{1+o(1)} space. The main technical ingredient of the algorithm is a low-distortion embedding of a symmetric norm into a low-dimensional iterated product of top-k norms. We also show that our techniques cannot be extended to general norms.
    Comment: 27 pages, 1 figure
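The top-k norms used as the embedding's building blocks are easy to state: the top-k norm of a vector is the sum of its k largest coordinates in absolute value, interpolating between ℓ_∞ (k = 1) and ℓ_1 (k = d). A minimal sketch:

```python
import numpy as np

def top_k_norm(x: np.ndarray, k: int) -> float:
    """Top-k norm: sum of the k largest coordinates in absolute value.
    k=1 gives the l_inf norm; k=len(x) gives the l_1 norm."""
    return float(np.sort(np.abs(x))[-k:].sum())

v = np.array([3.0, -1.0, 4.0, 1.0, -5.0])
top_k_norm(v, 1)  # l_inf norm: 5.0
top_k_norm(v, 2)  # 5 + 4 = 9.0
top_k_norm(v, 5)  # l_1 norm: 14.0
```

Every top-k norm is symmetric (invariant under permuting coordinates and flipping signs), which is why iterated products of them are a natural target space for embedding general symmetric norms.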

    Testing Conditional Independence of Discrete Distributions

    We study the problem of testing conditional independence for discrete distributions. Specifically, given samples from a discrete random variable (X, Y, Z) on domain [ℓ_1]×[ℓ_2]×[n], we want to distinguish, with probability at least 2/3, between the case that X and Y are conditionally independent given Z and the case that (X, Y, Z) is ε-far, in ℓ_1-distance, from every distribution that has this property. Conditional independence is a concept of central importance in probability and statistics, with a range of applications in various scientific domains. As such, the statistical task of testing conditional independence has been extensively studied in various forms within the statistics and econometrics communities for nearly a century. Perhaps surprisingly, this problem has not been previously considered in the framework of distribution property testing, and in particular no tester with sublinear sample complexity is known, even for the important special case that the domains of X and Y are binary. The main algorithmic result of this work is the first conditional independence tester with sublinear sample complexity for discrete distributions over [ℓ_1]×[ℓ_2]×[n]. To complement our upper bounds, we prove information-theoretic lower bounds establishing that the sample complexity of our algorithm is optimal, up to constant factors, for a number of settings. Specifically, for the prototypical setting when ℓ_1, ℓ_2 = O(1), we show that the sample complexity of testing conditional independence (upper bound and matching lower bound) is Θ(max(n^{1/2}/ε^2, min(n^{7/8}/ε, n^{6/7}/ε^{8/7}))).
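To make the property concrete: X ⟂ Y | Z holds exactly when the conditional mutual information I(X;Y|Z) is zero. The sketch below estimates I(X;Y|Z) with the naive plug-in estimator on synthetic data where X and Y each depend on Z but are conditionally independent given it. This is only the textbook statistic for intuition, not the paper's tester, which achieves far better (sublinear) sample complexity.

```python
import numpy as np
from collections import Counter

def cond_mutual_info(samples):
    """Plug-in estimate of I(X;Y|Z) in nats from a list of (x, y, z) tuples.
    Non-negative; zero in the population iff X and Y are conditionally
    independent given Z."""
    n = len(samples)
    pxyz = Counter(samples)
    pxz = Counter((x, z) for x, y, z in samples)
    pyz = Counter((y, z) for x, y, z in samples)
    pz = Counter(z for x, y, z in samples)
    mi = 0.0
    for (x, y, z), c in pxyz.items():
        # p(x,y,z) * log( p(x,y,z) p(z) / (p(x,z) p(y,z)) )
        mi += (c / n) * np.log(c * pz[z] / (pxz[(x, z)] * pyz[(y, z)]))
    return mi

rng = np.random.default_rng(0)
z = rng.integers(0, 2, 5000)
x = (z + (rng.random(5000) < 0.25).astype(int)) % 2  # X = Z xor noisy bit
y = (z + (rng.random(5000) < 0.25).astype(int)) % 2  # Y = Z xor independent noisy bit
data = list(zip(x.tolist(), y.tolist(), z.tolist()))
cond_mutual_info(data)  # close to 0: X and Y are conditionally independent given Z
```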

    Quantum singular value transformation and beyond: exponential improvements for quantum matrix arithmetics

    Quantum computing is powerful because unitary operators describing the time evolution of a quantum system have a size exponential in the number of qubits present in the system. We develop a new "singular value transformation" algorithm capable of harnessing this exponential advantage, which can apply polynomial transformations to the singular values of a block of a unitary, generalizing the optimal Hamiltonian simulation results of Low and Chuang. The proposed quantum circuits have a very simple structure, often give rise to optimal algorithms, and have appealing constant factors, while usually using only a constant number of ancilla qubits. We show that singular value transformation leads to novel algorithms. We give an efficient solution to a certain "non-commutative" measurement problem and propose a new method for singular value estimation. We also show how to exponentially improve the complexity of implementing fractional queries to unitaries with a gapped spectrum. Finally, as a quantum machine learning application, we show how to efficiently implement principal component regression. Singular value transformation is conceptually simple and efficient, and leads to a unified framework of quantum algorithms incorporating a variety of quantum speed-ups. We illustrate this by showing how it generalizes a number of prominent quantum algorithms, including: optimal Hamiltonian simulation, implementing the Moore-Penrose pseudoinverse with exponential precision, fixed-point amplitude amplification, robust oblivious amplitude amplification, fast QMA amplification, the fast quantum OR lemma, certain quantum walk results, and several quantum machine learning algorithms. In order to exploit the strengths of the presented method it is useful to know its limitations too; therefore we also prove a lower bound on the efficiency of singular value transformation, which often gives optimal bounds.
    Comment: 67 pages, 1 figure
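The core operation, "apply a polynomial to the singular values of a block of a unitary", can be illustrated classically: take a unitary U, extract a block A, and form W·p(Σ)·V† from the SVD A = WΣV†. The sketch below does this with explicit linear algebra; the point of QSVT is that a quantum circuit achieves the same transformation implicitly, without ever computing an SVD. The 4×4 size and the particular odd polynomial are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random 4x4 unitary via QR decomposition of a complex Gaussian matrix
g = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
q, r = np.linalg.qr(g)
u = q * (np.diag(r) / np.abs(np.diag(r)))  # fix column phases; u is unitary

a = u[:2, :2]                  # top-left block; its singular values lie in [0, 1]
w, s, vh = np.linalg.svd(a)

p = lambda x: 3 * x - 4 * x**3  # an odd polynomial, as used in sign-function approximations
a_transformed = w @ np.diag(p(s)) @ vh  # same singular vectors, singular values p(s)
```

Working with a block of a unitary is what the "block-encoding" picture formalizes: any matrix of spectral norm at most 1 can sit inside a larger unitary, and QSVT manipulates it there.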

    Optimal solution of nonlinear equations

    We survey recent worst-case complexity results for the solution of nonlinear equations. Notes on worst- and average-case analysis of iterative algorithms and a bibliography of the subject are also included.
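The iterative algorithms whose worst- and average-case behaviour such surveys analyze include Newton-type methods. A minimal sketch of the basic Newton iteration for a single equation f(x) = 0 (the stopping rule and iteration cap here are arbitrary illustrative choices):

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton's method for f(x) = 0. Converges quadratically near a simple
    root, but its worst-case behaviour depends strongly on the start point,
    which is exactly what worst-case complexity analyses quantify."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        x -= fx / df(x)  # Newton step: linearize f at x and solve
    return x

root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)  # approximates sqrt(2)
```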

    Quantum Weak Coin Flipping

    We investigate weak coin flipping, a fundamental cryptographic primitive where two distrustful parties need to remotely establish a shared random bit. A cheating player can try to bias the output bit towards a preferred value. In weak coin flipping the players have known opposite preferred values. A weak coin-flipping protocol has bias ε if neither player can force the outcome towards their preferred value with probability more than 1/2 + ε. While it is known that all classical protocols have ε = 1/2, Mochon showed in 2007 [arXiv:0711.4114] that quantum weak coin flipping can be achieved with arbitrarily small bias (near perfect), but the best known explicit protocol has bias 1/6 (also due to Mochon, 2005 [Phys. Rev. A 72, 022341]). We propose a framework to construct new explicit protocols achieving biases below 1/6. In particular, we construct explicit unitaries for protocols with bias approaching 1/10. To go below, we introduce what we call the Elliptic Monotone Align (EMA) algorithm which, together with the framework, allows us to numerically construct protocols with arbitrarily small biases.
    Comment: 98 pages split into 3 parts, 10 figures; for updates and contact information see https://atulsingharora.github.io/WCF. Version 2 has minor improvements. arXiv admin note: text overlap with arXiv:1402.7166 by other authors
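The classical ε = 1/2 barrier is easy to see on a toy protocol: if the outcome is the XOR of two announced bits, honest play yields a fair coin, but whichever party speaks last can force their preferred outcome with certainty. The simulation below uses this illustrative XOR protocol (an assumption for the example, not a protocol from the paper):

```python
import random

def xor_protocol(alice_bit: int, bob_bit: int) -> int:
    """Toy classical protocol: outcome is the XOR of the announced bits.
    Alice prefers outcome 0, Bob prefers outcome 1."""
    return alice_bit ^ bob_bit

# Honest play: both bits uniform, so the outcome is a fair coin.
honest = sum(xor_protocol(random.randint(0, 1), random.randint(0, 1))
             for _ in range(10_000)) / 10_000

# A cheating Bob who hears Alice's bit before announcing his own can force
# outcome 1 every time, i.e. win with probability 1 = 1/2 + 1/2, so bias 1/2.
def cheating_bob(alice_bit: int) -> int:
    return alice_bit ^ 1

forced = all(xor_protocol(a, cheating_bob(a)) == 1 for a in (0, 1))
```

Quantum protocols evade this because a cheating party cannot measure the joint state mid-protocol without risking detection, which is what lets the bias be driven arbitrarily close to 0.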

    The systematic guideline review: method, rationale, and test on chronic heart failure

    Background: Evidence-based guidelines have the potential to improve healthcare. However, their de novo development requires substantial resources, especially for complex conditions, and adaptation may be biased by contextually influenced recommendations in source guidelines. In this paper we describe a new approach to guideline development, the systematic guideline review (SGR) method, and its application in the development of an evidence-based guideline for family physicians on chronic heart failure (CHF). Methods: A systematic search for guidelines was carried out. Evidence-based guidelines on CHF management in adults in ambulatory care, published in English or German between 2000 and 2004, were included. Guidelines on acute or right heart failure were excluded. Eligibility was assessed by two reviewers, the methodological quality of selected guidelines was appraised using the AGREE instrument, and a framework of relevant clinical questions for diagnostics and treatment was derived. Data were extracted into evidence tables, systematically compared by means of a consistency analysis, and synthesized in a preliminary draft. The most relevant primary sources were re-assessed to verify the cited evidence. Evidence and recommendations were summarized in a draft guideline. Results: Of 16 included guidelines, five were of good quality. A total of 35 recommendations were systematically compared: 25/35 were consistent, 9/35 inconsistent, and 1/35 un-rateable (derived from a single guideline). Of the 25 consistencies, 14 were based on consensus, seven on evidence, and four differed in grading. Major inconsistencies were found in 3/9 of the inconsistent recommendations. We re-evaluated the evidence for 17 recommendations (evidence-based, differing evidence levels, and minor inconsistencies); the majority was congruent. Incongruity was found where the stated evidence could not be verified in the cited primary sources, or where the evaluation in the source guidelines focused on treatment benefits and underestimated the risks. The draft guideline was completed in 8.5 person-months. The main limitation of this study was the lack of a second reviewer. Conclusion: The systematic guideline review, including framework development, consistency analysis, and validation, is an effective, valid, and resource-saving approach to the development of evidence-based guidelines.

    Nonlinear convex and concave relaxations for the solutions of parametric ODEs

    Convex and concave relaxations for the parametric solutions of ordinary differential equations (ODEs) are central to deterministic global optimization methods for nonconvex dynamic optimization and open-loop optimal control problems with control parametrization. Given a general system of ODEs with parameter dependence in the initial conditions and right-hand sides, this work derives sufficient conditions under which an auxiliary system of ODEs describes convex and concave relaxations of the parametric solutions, pointwise in the independent variable. Convergence results for these relaxations are also established. A fully automatable procedure for constructing an appropriate auxiliary system has been developed previously by the authors. Thus, the developments here lead to an efficient, automatic method for computing convex and concave relaxations for the parametric solutions of a very general class of nonlinear ODEs. The proposed method is presented in detail for a simple example problem.
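The pair "convex underestimator plus concave overestimator" is easiest to see on a static nonconvex term before any ODE machinery is involved. The sketch below gives the classical McCormick relaxations of the bilinear term x·y on a box, a standard building block in this relaxation literature (the static example is an illustration of the concept, not the paper's auxiliary-ODE construction):

```python
def mccormick_bilinear(x, y, xl, xu, yl, yu):
    """McCormick convex underestimator and concave overestimator of x*y
    on the box [xl, xu] x [yl, yu]; both are tight at the box corners."""
    under = max(xl * y + x * yl - xl * yl, xu * y + x * yu - xu * yu)
    over = min(xu * y + x * yl - xu * yl, xl * y + x * yu - xl * yu)
    return under, over

# At the center of the unit box, x*y = 0.25 is bracketed by the relaxations.
lo, hi = mccormick_bilinear(0.5, 0.5, 0.0, 1.0, 0.0, 1.0)
```

The dynamic analogue replaces "evaluate the relaxation at a point" with "integrate an auxiliary ODE whose solution bounds the parametric solution pointwise in time".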