
    Metrics for Graph Comparison: A Practitioner's Guide

    Comparison of graph structure is a ubiquitous task in data analysis and machine learning, with diverse applications in fields such as neuroscience, cyber security, social network analysis, and bioinformatics, among others. Discovery and comparison of structures such as modular communities, rich clubs, hubs, and trees in data in these fields yields insight into the generative mechanisms and functional properties of the graph. Often, two graphs are compared via a pairwise distance measure, with a small distance indicating structural similarity and vice versa. Common choices include spectral distances (also known as $\lambda$ distances) and distances based on node affinities. However, there has as yet been no comparative study of the efficacy of these distance measures in discerning between common graph topologies and different structural scales. In this work, we compare commonly used graph metrics and distance measures, and demonstrate their ability to discern between common topological features found in both random graph models and empirical datasets. We put forward a multi-scale picture of graph structure, in which the effect of global and local structure upon the distance measures is considered. We make recommendations on the applicability of different distance measures to empirical graph data problems based on this multi-scale view. Finally, we introduce the Python library NetComp, which implements the graph distances used in this work.
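
    As a rough illustration of the spectral ($\lambda$) distances mentioned in this abstract, the sketch below compares two undirected graphs by the Euclidean distance between their sorted adjacency eigenvalues. It uses only NumPy and does not reproduce the NetComp API; the names `spectral_distance`, `path_graph`, and `cycle_graph` are hypothetical helpers introduced here, and the choice of the adjacency matrix (rather than a Laplacian) is an assumption.

```python
import numpy as np

def spectral_distance(A1, A2, k=None):
    """Euclidean distance between the k largest adjacency eigenvalues of two graphs."""
    # eigvalsh assumes symmetric input, i.e. undirected graphs.
    ev1 = np.sort(np.linalg.eigvalsh(np.asarray(A1, dtype=float)))[::-1]
    ev2 = np.sort(np.linalg.eigvalsh(np.asarray(A2, dtype=float)))[::-1]
    if k is None:
        k = min(len(ev1), len(ev2))
    return float(np.linalg.norm(ev1[:k] - ev2[:k]))

def path_graph(n):
    """Adjacency matrix of a path on n nodes (hypothetical helper)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A

def cycle_graph(n):
    """Adjacency matrix of a cycle on n nodes (hypothetical helper)."""
    A = path_graph(n)
    A[0, n - 1] = 1.0
    A[n - 1, 0] = 1.0
    return A

d_same = spectral_distance(path_graph(6), path_graph(6))
d_diff = spectral_distance(path_graph(6), cycle_graph(6))
```

    Truncating to the `k` largest eigenvalues is one way to emphasise global structure; which part of the spectrum to keep is a design choice and not something this listing specifies.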

    Non-Asymptotic Analysis of Tangent Space Perturbation

    Constructing an efficient parameterization of a large, noisy data set of points lying close to a smooth manifold in high dimension remains a fundamental problem. One approach consists in recovering a local parameterization using the local tangent plane. Principal component analysis (PCA) is often the tool of choice, as it returns an optimal basis in the case of noise-free samples from a linear subspace. To process noisy data samples from a nonlinear manifold, PCA must be applied locally, at a scale small enough such that the manifold is approximately linear, but at a scale large enough such that structure may be discerned from noise. Using eigenspace perturbation theory and non-asymptotic random matrix theory, we study the stability of the subspace estimated by PCA as a function of scale, and bound (with high probability) the angle it forms with the true tangent space. By adaptively selecting the scale that minimizes this bound, our analysis reveals an appropriate scale for local tangent plane recovery. We also introduce a geometric uncertainty principle quantifying the limits of noise-curvature perturbation for stable recovery. With the purpose of providing perturbation bounds that can be used in practice, we propose plug-in estimates that make it possible to directly apply the theoretical results to real data sets.
    Comment: 53 pages. Revised manuscript with new content addressing application of results to real data sets
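
    The idea of applying PCA locally and measuring the angle to the true tangent space can be sketched on a toy example. The code below is only an illustration under stated assumptions: the manifold is a unit circle with Gaussian noise, the scale is a Euclidean neighbourhood radius, and the function names `local_pca_tangent` and `angle_to` are hypothetical. It does not implement the paper's bounds or its adaptive scale selection.

```python
import numpy as np

def local_pca_tangent(points, center, radius, dim=1):
    """Estimate a tangent basis at `center` from the points within `radius`."""
    nbrs = points[np.linalg.norm(points - center, axis=1) <= radius]
    if len(nbrs) <= dim:
        raise ValueError("too few neighbours at this scale")
    centered = nbrs - nbrs.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:dim]

def angle_to(direction, estimate):
    """Angle in radians between two unit vectors, ignoring sign."""
    c = abs(float(np.dot(direction, estimate)))
    return float(np.arccos(np.clip(c, 0.0, 1.0)))

# Toy data (assumption): noisy samples of the unit circle.
rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, size=2000)
noisy = np.column_stack([np.cos(theta), np.sin(theta)])
noisy = noisy + 0.01 * rng.normal(size=noisy.shape)

center = np.array([1.0, 0.0])
true_tangent = np.array([0.0, 1.0])  # tangent of the circle at (1, 0)

# Estimated-vs-true tangent angle at a few neighbourhood radii.
angles = {
    r: angle_to(true_tangent, local_pca_tangent(noisy, center, r)[0])
    for r in (0.05, 0.3, 1.5)
}
```

    Inspecting `angles` for different radii, noise levels, and manifolds gives a feel for the scale trade-off the abstract describes; the values depend on the sample and are not stated here.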

    Lagrangian Cascade in Three-Dimensional Homogeneous and Isotropic Turbulence

    In this work, the scaling statistics of the dissipation along Lagrangian trajectories are investigated by using fluid tracer particles obtained from a high resolution direct numerical simulation with $Re_{\lambda}=400$. Both the energy dissipation rate $\epsilon$ and the local time averaged $\epsilon_{\tau}$ agree rather well with the lognormal distribution hypothesis. Several statistics are then examined. It is found that the autocorrelation function $\rho(\tau)$ of $\ln(\epsilon(t))$ and variance $\sigma^2(\tau)$ of $\ln(\epsilon_{\tau}(t))$ obey a log-law with scaling exponent $\beta'=\beta=0.30$ compatible with the intermittency parameter $\mu=0.30$. The $q$th-order moment of $\epsilon_{\tau}$ has a clear power-law on the inertial range $10<\tau/\tau_{\eta}<100$. The measured scaling exponent $K_L(q)$ agrees remarkably with $q-\zeta_L(2q)$ where $\zeta_L(2q)$ is the scaling exponent estimated using the Hilbert methodology. All these results suggest that the dissipation along Lagrangian trajectories could be modelled by a multiplicative cascade.
    Comment: 10 pages with 7 figures accepted for Journal of Fluid Mechanics as Rapids

    Driving a car with custom-designed fuzzy inferencing VLSI chips and boards

    Vehicle control in a-priori unknown, unpredictable, and dynamic environments requires many calculational and reasoning schemes to operate on the basis of very imprecise, incomplete, or unreliable data. For such systems, in which all the uncertainties cannot be engineered away, approximate reasoning may provide an alternative to the complexity and computational requirements of conventional uncertainty analysis and propagation techniques. Two types of computer boards including custom-designed VLSI chips were developed to add a fuzzy inferencing capability to real-time control systems. All inferencing rules on a chip are processed in parallel, allowing execution of the entire rule base in about 30 microseconds and therefore making control of 'reflex-type' motions conceivable. The use of these boards, and the approach of superposing elemental sensor-based behaviors to develop qualitative reasoning schemes that emulate human-like navigation in a-priori unknown environments, are discussed first. The installation of the human-like navigation scheme, implemented on one of the qualitative inferencing boards, on a test-bed platform is then described; the platform was used to investigate two control modes for driving a car in a-priori unknown environments on the basis of sparse and imprecise sensor data. In the first mode, the car navigates fully autonomously, while in the second mode the system acts as a driver's aid, providing the driver with linguistic (fuzzy) commands to turn left or right and speed up or slow down depending on the obstacles perceived by the sensors. Experiments with both modes of control are described in which the system uses only three acoustic range (sonar) sensor channels to perceive the environment. Simulation results as well as indoor and outdoor experiments are presented and discussed to illustrate the feasibility and robustness of autonomous navigation and/or a safety-enhancing driver's aid using the new fuzzy inferencing hardware system and some human-like reasoning schemes, which may include as few as six elemental behaviors embodied in fourteen qualitative rules.
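
    To make the notion of fuzzy rule-based steering concrete, here is a minimal software sketch. It is not the rule base or the hardware described in the abstract: the two-sensor input, the linear `near`/`far` memberships, the three rules, and the weighted-average defuzzification are all assumptions chosen for brevity, and every name is hypothetical.

```python
def near(distance, d_max=3.0):
    """Membership of 'obstacle is near': 1 at contact, 0 at d_max or beyond."""
    return min(1.0, max(0.0, (d_max - distance) / d_max))

def far(distance, d_max=3.0):
    """Membership of 'obstacle is far', taken as the complement of `near`."""
    return 1.0 - near(distance, d_max)

def steering_command(left, right, d_max=3.0):
    """Return a steering value in [-1, 1]; positive means turn right."""
    # Each rule is (firing strength, output); min acts as the fuzzy AND.
    rules = [
        (min(near(left, d_max), far(right, d_max)), 1.0),   # obstacle on the left: turn right
        (min(near(right, d_max), far(left, d_max)), -1.0),  # obstacle on the right: turn left
        (min(far(left, d_max), far(right, d_max)), 0.0),    # both sides clear: go straight
    ]
    total = sum(weight for weight, _ in rules)
    if total == 0.0:
        return 0.0  # no rule fires, e.g. blocked at contact on both sides
    # Weighted average of rule outputs (a simple defuzzification).
    return sum(weight * out for weight, out in rules) / total

turn = steering_command(left=0.5, right=3.0)  # obstacle close on the left
```

    All rules are evaluated on every call and then blended, which mirrors in software the idea of processing the whole rule base at once; a driver's-aid variant could map the numeric output to a linguistic command such as "turn right".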

    A causal multifractal stochastic equation and its statistical properties

    Multiplicative cascades have been introduced in turbulence to generate random or deterministic fields having intermittent values and long-range power-law correlations. Generally this is done using discrete construction rules leading to discrete cascades. Here a causal log-normal stochastic process is introduced; its multifractal properties are demonstrated together with other properties such as the composition rule for scale dependence and stochastic differential equations for time and scale evolutions. This multifractal stochastic process is continuous in scale ratio and in time. It has a simple generating equation and can be used to sequentially generate time series of any length.
    Comment: Eur. Phys. J. B (in press)
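
    For contrast with the continuous process introduced in this abstract, the sketch below builds the kind of discrete multiplicative cascade that the abstract says is generally used. It is an assumption-laden illustration: the binary splitting, the log-normal weights with unit mean, and the name `lognormal_cascade` are choices made here, not the paper's generating equation.

```python
import numpy as np

def lognormal_cascade(levels, sigma=0.3, seed=0):
    """Discrete binary multiplicative cascade with log-normal weights.

    Returns a positive field of length 2**levels.
    """
    rng = np.random.default_rng(seed)
    field = np.ones(1)
    for _ in range(levels):
        # Split every cell in two, then multiply each child by a random weight.
        field = np.repeat(field, 2)
        # exp(sigma * g - sigma**2 / 2) has unit mean for standard normal g.
        weights = np.exp(sigma * rng.normal(size=field.size) - 0.5 * sigma**2)
        field = field * weights
    return field

field = lognormal_cascade(levels=10)
```

    Because the construction doubles the resolution at each step, the field exists only at scale ratios that are powers of two, which is the discreteness the continuous-in-scale process is designed to avoid.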

    Perturbation of the Eigenvectors of the Graph Laplacian: Application to Image Denoising

    The original contributions of this paper are twofold: a new understanding of the influence of noise on the eigenvectors of the graph Laplacian of a set of image patches, and an algorithm to estimate a denoised set of patches from a noisy image. The algorithm relies on the following two observations: (1) the low-index eigenvectors of the diffusion, or graph Laplacian, operators are very robust to random perturbations of the weights and random changes in the connections of the patch-graph; and (2) patches extracted from smooth regions of the image are organized along smooth low-dimensional structures in the patch-set, and therefore can be reconstructed with few eigenvectors. Experiments demonstrate that our denoising algorithm outperforms the denoising gold-standards.

    Ex-ante evaluation of conditional cash transfer programs: the case of Bolsa Escola

    Cash transfers targeted to poor people, but conditional on some behavior on their part, such as school attendance or regular visits to health care facilities, are being adopted in a growing number of developing countries. Even where ex-post impact evaluations have been conducted, a number of policy-relevant counterfactual questions have remained unanswered. These are questions about the potential impact of changes in program design, such as benefit levels or the choice of the means-test, on both the current welfare and the behavioral response of household members. This paper proposes a method to simulate the effects of those alternative program designs on welfare and behavior, based on microeconometrically estimated models of household behavior. In an application to Brazil's recently introduced federal Bolsa Escola program, the authors find a surprisingly strong effect of the conditionality on school attendance, but a muted impact of the transfers on the reduction of current poverty and inequality levels.
    Keywords: Environmental Economics & Policies, Services & Transfers to Poor, Poverty Monitoring & Analysis, Public Health Promotion, Scientific Research & Science Parks, Youth and Governance, Street Children, Poverty Assessment

    Time dependent intrinsic correlation analysis of temperature and dissolved oxygen time series using empirical mode decomposition

    In the marine environment, many fields fluctuate over a large range of spatial and temporal scales. These quantities can be nonlinear and non-stationary, and often interact with each other. A good method to study the multiple-scale dynamics of such time series, and their correlations, is needed. In this paper, an application of empirical mode decomposition based time-dependent intrinsic correlation to two coastal oceanic time series, temperature and dissolved oxygen (saturation percentage), is presented. The two time series were recorded every 20 minutes for 7 years, from 2004 to 2011. The application of the Empirical Mode Decomposition to such time series is illustrated, and the power spectra of the time series are estimated using the Hilbert transform (Hilbert spectral analysis). Power-law regimes are found with slopes of 1.33 for dissolved oxygen and 1.68 for temperature at high frequencies (time scales between 1.2 and 12 hours), with both close to 1.9 for lower frequencies (time scales from 2 to 100 days). Moreover, the time evolution and scale dependence of cross correlations between both series are considered. The trends are perfectly anti-correlated. The modes with mean periods of 3 years and 1 year are also negatively correlated, whereas higher-frequency modes have a much smaller correlation. The estimation of time-dependent intrinsic correlations helps to show patterns of correlations at different scales, for different modes.
    Comment: 35 pages with 22 figures
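
    A much simplified stand-in for a time-dependent correlation is a Pearson correlation computed in a sliding window. The sketch below shows only that idea: it applies no empirical mode decomposition and uses a fixed window rather than one tied to each mode's local period, so it should not be read as the time-dependent intrinsic correlation method itself. The function name and the synthetic test signals are hypothetical.

```python
import numpy as np

def sliding_correlation(x, y, window):
    """Pearson correlation of x and y in each length-`window` sliding window."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("x and y must be 1-D arrays of equal length")
    if not 2 <= window <= len(x):
        raise ValueError("window must be between 2 and len(x)")
    out = np.empty(len(x) - window + 1)
    for i in range(len(out)):
        # corrcoef returns a 2x2 matrix; the off-diagonal entry is r(x, y).
        out[i] = np.corrcoef(x[i:i + window], y[i:i + window])[0, 1]
    return out

# Synthetic, perfectly anti-correlated signals (assumption, not the paper's data).
t = np.linspace(0.0, 10.0, 500)
r = sliding_correlation(np.sin(t), -np.sin(t), window=50)
```

    In a fuller treatment, each series would first be decomposed into modes and the window length would follow the local period of the mode being compared.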