23,953 research outputs found

    Theoretical and Experimental Analysis of a Randomized Algorithm for Sparse Fourier Transform Analysis

    Full text link
    We analyze a sublinear RAlSFA (Randomized Algorithm for Sparse Fourier Analysis) that finds a near-optimal B-term Sparse Representation R for a given discrete signal S of length N, in time and space poly(B,log(N)), following the approach given in \cite{GGIMS}. Its time cost poly(log(N)) should be compared with the superlinear O(N log N) time requirement of the Fast Fourier Transform (FFT). A straightforward implementation of the RAlSFA, as presented in the theoretical paper \cite{GGIMS}, turns out to be very slow in practice. Our main result is a greatly improved and practical RAlSFA. We introduce several new ideas and techniques that speed up the algorithm. Both rigorous and heuristic arguments for parameter choices are presented. Our RAlSFA constructs, with probability at least 1-delta, a near-optimal B-term representation R in time poly(B)log(N)log(1/delta)/ epsilon^{2} log(M) such that ||S-R||^{2}<=(1+epsilon)||S-R_{opt}||^{2}. Furthermore, this RAlSFA implementation already beats the FFTW for not unreasonably large N. We extend the algorithm to higher dimensional cases both theoretically and numerically. The crossover point lies at N=70000 in one dimension, and at N=900 for data on a N*N grid in two dimensions for small B signals where there is noise.Comment: 21 pages, 8 figures, submitted to Journal of Computational Physic

    Approximate Sparse Recovery: Optimizing Time and Measurements

    Full text link
    An approximate sparse recovery system consists of parameters k,Nk,N, an mm-by-NN measurement matrix, Φ\Phi, and a decoding algorithm, D\mathcal{D}. Given a vector, xx, the system approximates xx by x^=D(Φx)\widehat x =\mathcal{D}(\Phi x), which must satisfy x^x2Cxxk2\| \widehat x - x\|_2\le C \|x - x_k\|_2, where xkx_k denotes the optimal kk-term approximation to xx. For each vector xx, the system must succeed with probability at least 3/4. Among the goals in designing such systems are minimizing the number mm of measurements and the runtime of the decoding algorithm, D\mathcal{D}. In this paper, we give a system with m=O(klog(N/k))m=O(k \log(N/k)) measurements--matching a lower bound, up to a constant factor--and decoding time O(klogcN)O(k\log^c N), matching a lower bound up to log(N)\log(N) factors. We also consider the encode time (i.e., the time to multiply Φ\Phi by xx), the time to update measurements (i.e., the time to multiply Φ\Phi by a 1-sparse xx), and the robustness and stability of the algorithm (adding noise before and after the measurements). Our encode and update times are optimal up to log(N)\log(N) factors

    Non-Symbolic Fragmentation

    Get PDF
    This paper reports on the use of non-symbolic fragmentation of data for securing communications. Non-symbolic fragmentation, or NSF, relies on breaking up data into non-symbolic fragments, which are (usually irregularly-sized) chunks whose boundaries do not necessarily coincide with the boundaries of the symbols making up the data. For example, ASCII data is broken up into fragments which may include 8-bit fragments but also include many other sized fragments. Fragments are then separated with a form of path diversity. The secrecy of the transmission relies on the secrecy of one or more of a number of things: the ordering of the fragments, the sizes of the fragments, and the use of path diversity. Once NSF is in place, it can help secure many forms of communication, and is useful for exchanging sensitive information, and for commercial transactions. A sample implementation is described with an evaluation of the technology

    Three-dimensional dynamic rupture simulation with a high-order discontinuous Galerkin method on unstructured tetrahedral meshes

    Get PDF
    Accurate and efficient numerical methods to simulate dynamic earthquake rupture and wave propagation in complex media and complex fault geometries are needed to address fundamental questions in earthquake dynamics, to integrate seismic and geodetic data into emerging approaches for dynamic source inversion, and to generate realistic physics-based earthquake scenarios for hazard assessment. Modeling of spontaneous earthquake rupture and seismic wave propagation by a high-order discontinuous Galerkin (DG) method combined with an arbitrarily high-order derivatives (ADER) time integration method was introduced in two dimensions by de la Puente et al. (2009). The ADER-DG method enables high accuracy in space and time and discretization by unstructured meshes. Here we extend this method to three-dimensional dynamic rupture problems. The high geometrical flexibility provided by the usage of tetrahedral elements and the lack of spurious mesh reflections in the ADER-DG method allows the refinement of the mesh close to the fault to model the rupture dynamics adequately while concentrating computational resources only where needed. Moreover, ADER-DG does not generate spurious high-frequency perturbations on the fault and hence does not require artificial Kelvin-Voigt damping. We verify our three-dimensional implementation by comparing results of the SCEC TPV3 test problem with two well-established numerical methods, finite differences, and spectral boundary integral. Furthermore, a convergence study is presented to demonstrate the systematic consistency of the method. To illustrate the capabilities of the high-order accurate ADER-DG scheme on unstructured meshes, we simulate an earthquake scenario, inspired by the 1992 Landers earthquake, that includes curved faults, fault branches, and surface topography

    Reallocation Problems in Scheduling

    Full text link
    In traditional on-line problems, such as scheduling, requests arrive over time, demanding available resources. As each request arrives, some resources may have to be irrevocably committed to servicing that request. In many situations, however, it may be possible or even necessary to reallocate previously allocated resources in order to satisfy a new request. This reallocation has a cost. This paper shows how to service the requests while minimizing the reallocation cost. We focus on the classic problem of scheduling jobs on a multiprocessor system. Each unit-size job has a time window in which it can be executed. Jobs are dynamically added and removed from the system. We provide an algorithm that maintains a valid schedule, as long as a sufficiently feasible schedule exists. The algorithm reschedules only a total number of O(min{log^* n, log^* Delta}) jobs for each job that is inserted or deleted from the system, where n is the number of active jobs and Delta is the size of the largest window.Comment: 9 oages, 1 table; extended abstract version to appear in SPAA 201

    List decoding of noisy Reed-Muller-like codes

    Full text link
    First- and second-order Reed-Muller (RM(1) and RM(2), respectively) codes are two fundamental error-correcting codes which arise in communication as well as in probabilistically-checkable proofs and learning. In this paper, we take the first steps toward extending the quick randomized decoding tools of RM(1) into the realm of quadratic binary and, equivalently, Z_4 codes. Our main algorithmic result is an extension of the RM(1) techniques from Goldreich-Levin and Kushilevitz-Mansour algorithms to the Hankel code, a code between RM(1) and RM(2). That is, given signal s of length N, we find a list that is a superset of all Hankel codewords phi with dot product to s at least (1/sqrt(k)) times the norm of s, in time polynomial in k and log(N). We also give a new and simple formulation of a known Kerdock code as a subcode of the Hankel code. As a corollary, we can list-decode Kerdock, too. Also, we get a quick algorithm for finding a sparse Kerdock approximation. That is, for k small compared with 1/sqrt{N} and for epsilon > 0, we find, in time polynomial in (k log(N)/epsilon), a k-Kerdock-term approximation s~ to s with Euclidean error at most the factor (1+epsilon+O(k^2/sqrt{N})) times that of the best such approximation

    Bias in culture-independent assessments of microbial biodiversity in the global ocean

    Get PDF
    On the basis of 16S rRNA gene sequencing, the SAR11 clade of marine bacteria has almost universal distribution, being detected as abundant sequences in all marine provinces. Yet SAR11 sequences are rarely detected in fosmid libraries, suggesting that the widespread abundance may be an artefact of PCR cloning and that SAR 11 has a relatively low abundance. Here the relative abundance of SAR11 is explored in both a fosmid library and a metagenomic sequence data set from the same biological community taken from fjord surface water from Bergen, Norway. Pyrosequenced data and 16S clone data confirmed an 11-15% relative abundance of SAR11 within the community. In contrast not a single SAR11 fosmid was identified in a pooled shotgun sequenced data set of 100 fosmid clones. This under-representation was evidenced by comparative abundances of SAR11 sequences assessed by taxonomic annotation; functional metabolic profiling and fragment recruitment. Analysis revealed a similar under-representation of low-GC Flavobacteriaceae. We speculate that the fosmid bias may be due to DNA fragmentation during preparation due to the low GC content of SAR11 sequences and other underrepresented taxa. This study suggests that while fosmid libraries can be extremely useful, caution must be used when directly inferring community composition from metagenomic fosmid libraries
    corecore