Metrics for Graph Comparison: A Practitioner's Guide
Comparison of graph structure is a ubiquitous task in data analysis and
machine learning, with diverse applications in fields such as neuroscience,
cyber security, social network analysis, and bioinformatics, among others.
Discovery and comparison of structures such as modular communities, rich clubs,
hubs, and trees in data in these fields yield insight into the generative
mechanisms and functional properties of the graph.
Often, two graphs are compared via a pairwise distance measure, with a small
distance indicating structural similarity and vice versa. Common choices
include spectral distances (also known as $\lambda$ distances) and distances
based on node affinities. However, there has as yet been no comparative study
of the efficacy of these distance measures in discerning between common graph
topologies and different structural scales.
In this work, we compare commonly used graph metrics and distance measures,
and demonstrate their ability to discern between common topological features
found in both random graph models and empirical datasets. We put forward a
multi-scale picture of graph structure, in which the effect of global and local
structure upon the distance measures is considered. We make recommendations on
the applicability of different distance measures to empirical graph data
problems based on this multi-scale view. Finally, we introduce the Python
library NetComp, which implements the graph distances used in this work.
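As a rough illustration of the spectral distances mentioned in this abstract, the sketch below compares two undirected graphs through the sorted eigenvalues of their adjacency matrices. It uses plain NumPy rather than NetComp, and the names `adjacency_spectral_distance`, `ring`, and `star` are illustrative assumptions, not the library's API.

```python
import numpy as np

def adjacency_spectral_distance(A1, A2, k=None):
    """Euclidean distance between the sorted adjacency spectra of two graphs."""
    # eigvalsh assumes symmetric matrices, i.e. undirected graphs
    ev1 = np.sort(np.linalg.eigvalsh(np.asarray(A1, dtype=float)))[::-1]
    ev2 = np.sort(np.linalg.eigvalsh(np.asarray(A2, dtype=float)))[::-1]
    if k is None:
        k = min(len(ev1), len(ev2))  # compare the k largest eigenvalues
    return float(np.linalg.norm(ev1[:k] - ev2[:k]))

def ring(n):
    """Adjacency matrix of a cycle on n nodes (hypothetical helper)."""
    A = np.zeros((n, n))
    idx = np.arange(n)
    A[idx, (idx + 1) % n] = 1
    A[(idx + 1) % n, idx] = 1
    return A

def star(n):
    """Adjacency matrix of a star with one hub and n - 1 leaves (hypothetical helper)."""
    A = np.zeros((n, n))
    A[0, 1:] = 1
    A[1:, 0] = 1
    return A

d_same = adjacency_spectral_distance(ring(8), ring(8))
d_diff = adjacency_spectral_distance(ring(8), star(8))
```

Because the spectrum does not depend on node labels, relabelling the nodes of one graph leaves this distance unchanged, which is one reason spectral distances are convenient when no node correspondence is known.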
Non-Asymptotic Analysis of Tangent Space Perturbation
Constructing an efficient parameterization of a large, noisy data set of
points lying close to a smooth manifold in high dimension remains a fundamental
problem. One approach consists in recovering a local parameterization using the
local tangent plane. Principal component analysis (PCA) is often the tool of
choice, as it returns an optimal basis in the case of noise-free samples from a
linear subspace. To process noisy data samples from a nonlinear manifold, PCA
must be applied locally, at a scale small enough such that the manifold is
approximately linear, but at a scale large enough such that structure may be
discerned from noise. Using eigenspace perturbation theory and non-asymptotic
random matrix theory, we study the stability of the subspace estimated by PCA
as a function of scale, and bound (with high probability) the angle it forms
with the true tangent space. By adaptively selecting the scale that minimizes
this bound, our analysis reveals an appropriate scale for local tangent plane
recovery. We also introduce a geometric uncertainty principle quantifying the
limits of noise-curvature perturbation for stable recovery. With the purpose of
providing perturbation bounds that can be used in practice, we propose plug-in
estimates that make it possible to directly apply the theoretical results to
real data sets.
Comment: 53 pages. Revised manuscript with new content addressing application of results to real data sets.
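The trade-off described in this abstract, a scale small enough for the manifold to look linear yet large enough to average out noise, can be explored with a small experiment. The sketch below is not the paper's estimator or its bound: it runs local PCA at a few radii on a synthetic noisy parabola and measures the angle to the known tangent. The function names, the test curve, and the radii are all illustrative assumptions.

```python
import numpy as np

def local_pca_tangent(points, center, radius, d):
    """Estimate a d-dimensional tangent basis from the neighbors within `radius` of `center`."""
    nbrs = points[np.linalg.norm(points - center, axis=1) <= radius]
    X = nbrs - nbrs.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:d].T  # shape (D, d), orthonormal columns

def largest_principal_angle(U, V):
    """Largest principal angle (radians) between the column spaces of U and V."""
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return float(np.arccos(np.clip(s.min(), -1.0, 1.0)))

rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, size=2000)
# Noisy samples of the parabola y = t^2 / 2; the tangent at the origin is the x-axis
pts = np.column_stack([t, 0.5 * t**2]) + 0.01 * rng.standard_normal((2000, 2))
true_tangent = np.array([[1.0], [0.0]])

# Angle between the estimated and true tangent at three assumed scales
angles = {
    r: largest_principal_angle(local_pca_tangent(pts, np.zeros(2), r, 1), true_tangent)
    for r in (0.02, 0.3, 1.0)
}
```

At the smallest radius the neighborhood is comparable in size to the noise, so the estimated direction is typically unreliable; the paper's analysis is what turns this qualitative picture into a bound and a principled choice of scale.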
Lagrangian Cascade in Three-Dimensional Homogeneous and Isotropic Turbulence
In this work, the scaling statistics of the dissipation along Lagrangian
trajectories are investigated by using fluid tracer particles obtained from a
high resolution direct numerical simulation. Both the
energy dissipation rate $\epsilon$ and its local time average $\epsilon_\tau$
agree rather well with the lognormal distribution hypothesis.
Several statistics are then examined. It is found that the autocorrelation
function of $\ln \epsilon$ and the variance of $\ln \epsilon_\tau$
obey a log-law with a scaling exponent
compatible with the intermittency parameter $\mu$. The
$q$th-order moment of $\epsilon_\tau$ has a clear power-law on the inertial
range. The measured scaling exponent agrees
remarkably with a prediction based on the scaling exponent
estimated using the Hilbert methodology. All these results suggest that the
dissipation along Lagrangian trajectories could be modelled by a multiplicative
cascade.
Comment: 10 pages with 7 figures, accepted for Journal of Fluid Mechanics as Rapids
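The moment analysis in this abstract can be sketched in a few lines: average a positive series over windows of increasing length and fit the slope of the log-moment against the log-window. The code below is a minimal sketch on synthetic, uncorrelated lognormal data, not on simulation output, so it shows the mechanics only and no cascade scaling. The function names and window sizes are assumptions, and the sign convention of the exponent in the paper may differ from the raw slope returned here.

```python
import numpy as np

def local_time_average(eps, window):
    """Moving average of a positive series over `window` consecutive samples."""
    kernel = np.ones(window) / window
    return np.convolve(eps, kernel, mode="valid")

def moment_scaling_exponent(eps, windows, q):
    """Least-squares slope of log <eps_tau^q> against log tau."""
    moments = [np.mean(local_time_average(eps, w) ** q) for w in windows]
    slope, _ = np.polyfit(np.log(windows), np.log(moments), 1)
    return float(slope)

rng = np.random.default_rng(0)
# Synthetic stand-in for a dissipation record: independent lognormal samples
eps = rng.lognormal(mean=0.0, sigma=0.5, size=20000)
windows = [2, 4, 8, 16]

slope_q1 = moment_scaling_exponent(eps, windows, 1.0)  # the mean is preserved by averaging
slope_q2 = moment_scaling_exponent(eps, windows, 2.0)  # averaging reduces the second moment
```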
Driving a car with custom-designed fuzzy inferencing VLSI chips and boards
Vehicle control in a priori unknown, unpredictable, and dynamic environments requires many calculational and reasoning schemes to operate on the basis of very imprecise, incomplete, or unreliable data. For such systems, in which all the uncertainties cannot be engineered away, approximate reasoning may provide an alternative to the complexity and computational requirements of conventional uncertainty analysis and propagation techniques. Two types of computer boards including custom-designed VLSI chips were developed to add a fuzzy inferencing capability to real-time control systems. All inferencing rules on a chip are processed in parallel, allowing execution of the entire rule base in about 30 microseconds and therefore making control of 'reflex-type' motions feasible. The use of these boards, and the approach of superposing elemental sensor-based behaviors to develop qualitative reasoning schemes that emulate human-like navigation in a priori unknown environments, are discussed first. The human-like navigation scheme implemented on one of the qualitative inferencing boards was then installed on a test-bed platform to investigate two control modes for driving a car in a priori unknown environments on the basis of sparse and imprecise sensor data. In the first mode, the car navigates fully autonomously, while in the second mode, the system acts as a driver's aid, providing the driver with linguistic (fuzzy) commands to turn left or right and speed up or slow down depending on the obstacles perceived by the sensors. Experiments with both modes of control are described in which the system uses only three acoustic range (sonar) sensor channels to perceive the environment.
Simulation results as well as indoor and outdoor experiments are presented and discussed to illustrate the feasibility and robustness of autonomous navigation and/or a safety-enhancing driver's aid using the new fuzzy inferencing hardware system and some human-like reasoning schemes, which may include as few as six elemental behaviors embodied in fourteen qualitative rules.
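To give a feel for rule-based fuzzy steering from three range channels, here is a minimal software sketch. It is not the rule base or the inference scheme implemented on the chips: the membership function, the five rules, and the weighted-average (zero-order Sugeno-style) defuzzification are all assumptions chosen for brevity, and the names `ramp_up`, `near`, and `steer` are hypothetical.

```python
def ramp_up(x, lo, hi):
    """Membership rising linearly from 0 at `lo` to 1 at `hi`."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def near(distance_m):
    """Degree to which a range reading (metres) counts as 'near' (assumed thresholds)."""
    return 1.0 - ramp_up(distance_m, 0.5, 2.0)

def steer(left, center, right):
    """Return a steering value in [-1, 1]; negative means left, positive means right."""
    # Each rule is (firing strength, crisp output it votes for); all rules are
    # evaluated independently, mirroring the idea of parallel rule processing.
    rules = [
        (near(left), 1.0),                        # obstacle near on the left: turn right
        (near(right), -1.0),                      # obstacle near on the right: turn left
        (min(near(center), near(left)), 1.0),     # blocked ahead and left: turn right
        (min(near(center), near(right)), -1.0),   # blocked ahead and right: turn left
        (1.0 - max(near(left), near(center), near(right)), 0.0),  # all clear: go straight
    ]
    total = sum(w for w, _ in rules)
    if total == 0.0:
        return 0.0  # no rule fires (e.g. obstacle dead ahead only): no steering preference
    return sum(w * out for w, out in rules) / total

command = steer(left=0.8, center=3.0, right=4.0)
```

A driver's-aid mode could map the numeric result back to a linguistic command such as "turn right" by thresholding it, which is again an assumption about how such a mapping might be done.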
A causal multifractal stochastic equation and its statistical properties
Multiplicative cascades have been introduced in turbulence to generate random
or deterministic fields having intermittent values and long-range power-law
correlations. Generally this is done using discrete construction rules leading
to discrete cascades. Here a causal log-normal stochastic process is
introduced; its multifractal properties are demonstrated together with other
properties such as the composition rule for scale dependence and stochastic
differential equations for time and scale evolutions. This multifractal
stochastic process is continuous in scale ratio and in time. It has a simple
generating equation and can be used to generate sequentially time series of any
length.
Comment: Eur. Phys. J. B (in press)
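For contrast with the continuous process introduced in this abstract, the discrete construction it refers to can be sketched as a binary cascade with log-normal weights. This is a standard textbook-style construction written under stated assumptions, not the paper's causal process: the branching ratio of 2, the unit-mean normalization, the value `mu = 0.25`, and the function name are all illustrative choices.

```python
import numpy as np

def discrete_lognormal_cascade(levels, mu, rng):
    """Return a field of length 2**levels built by a binary multiplicative cascade."""
    # Unit-mean log-normal weights: ln W ~ N(-s2 / 2, s2) with s2 = mu * ln 2,
    # so that <W^2> = 2**mu at each cascade step.
    s2 = mu * np.log(2.0)
    field = np.ones(1)
    for _ in range(levels):
        field = np.repeat(field, 2)  # split every cell in two
        weights = np.exp(rng.normal(-0.5 * s2, np.sqrt(s2), size=field.size))
        field = field * weights      # multiply each cell by an independent weight
    return field

rng = np.random.default_rng(0)
field = discrete_lognormal_cascade(levels=10, mu=0.25, rng=rng)
```

The fixed scale ratio of 2 between levels is what makes this cascade discrete in scale, and the whole field must be built at once; the abstract's process is instead continuous in scale ratio and can be generated sequentially in time.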
Perturbation of the Eigenvectors of the Graph Laplacian: Application to Image Denoising
The original contributions of this paper are twofold: a new understanding of
the influence of noise on the eigenvectors of the graph Laplacian of a set of
image patches, and an algorithm to estimate a denoised set of patches from a
noisy image. The algorithm relies on the following two observations: (1) the
low-index eigenvectors of the diffusion, or graph Laplacian, operators are very
robust to random perturbations of the weights and random changes in the
connections of the patch-graph; and (2) patches extracted from smooth regions
of the image are organized along smooth low-dimensional structures in the
patch-set, and therefore can be reconstructed with few eigenvectors.
Experiments demonstrate that our denoising algorithm outperforms the denoising
gold standards.
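The second observation in this abstract, that smooth patch sets can be reconstructed from few eigenvectors, suggests a simple projection step. The sketch below builds a Gaussian-weighted graph on a set of patches, forms the symmetric normalized Laplacian, and projects the patches onto its lowest-index eigenvectors. It is only an illustration of that idea, not the paper's denoising algorithm; the kernel width, the number of eigenvectors, the 1-D test signal, and the function name are assumptions.

```python
import numpy as np

def laplacian_lowpass(patches, sigma, k):
    """Project patches (rows) onto the k lowest-index eigenvectors of a patch-graph Laplacian."""
    # Pairwise squared Euclidean distances between patches
    sq = np.sum(patches**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * patches @ patches.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma**2))          # Gaussian edge weights
    deg = W.sum(axis=1)
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}
    L = np.eye(len(patches)) - W / np.sqrt(np.outer(deg, deg))
    _, vecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
    basis = vecs[:, :k]                         # low-index eigenvectors
    return basis @ (basis.T @ patches)

rng = np.random.default_rng(0)
n, width = 200, 8
clean_signal = np.sin(2 * np.pi * np.arange(n + width - 1) / 50)
noisy_signal = clean_signal + 0.2 * rng.standard_normal(clean_signal.size)
# Overlapping 1-D patches stand in for image patches in this toy example
clean = np.array([clean_signal[i:i + width] for i in range(n)])
noisy = np.array([noisy_signal[i:i + width] for i in range(n)])

smoothed = laplacian_lowpass(noisy, sigma=1.0, k=5)
err_before = np.linalg.norm(noisy - clean)
err_after = np.linalg.norm(smoothed - clean)
```

Comparing `err_before` with `err_after` gives a quick check of whether the projection helps on a given signal; the outcome depends on the assumed `sigma` and `k`.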
Ex-ante evaluation of conditional cash transfer programs: the case of Bolsa Escola
Cash transfers targeted to poor people, but conditional on some behavior on their part, such as school attendance or regular visits to health care facilities, are being adopted in a growing number of developing countries. Even where ex-post impact evaluations have been conducted, a number of policy-relevant counterfactual questions have remained unanswered. These are questions about the potential impact of changes in program design, such as benefit levels or the choice of the means-test, on both the current welfare and the behavioral response of household members. This paper proposes a method to simulate the effects of those alternative program designs on welfare and behavior, based on microeconometrically estimated models of household behavior. In an application to Brazil's recently introduced federal Bolsa Escola program, the authors find a surprisingly strong effect of the conditionality on school attendance, but a muted impact of the transfers on the reduction of current poverty and inequality levels.
Keywords: Environmental Economics & Policies, Services & Transfers to Poor, Poverty Monitoring & Analysis, Public Health Promotion, Scientific Research & Science Parks, Youth and Governance, Street Children, Poverty Assessment
Time dependent intrinsic correlation analysis of temperature and dissolved oxygen time series using empirical mode decomposition
In the marine environment, many fields have fluctuations over a large range
of different spatial and temporal scales. These quantities can be nonlinear
and non-stationary, and often interact with each other. A good method to
study the multiple scale dynamics of such time series, and their correlations,
is needed. In this paper an application of an empirical mode decomposition
based time dependent intrinsic correlation of two coastal oceanic time
series, temperature and dissolved oxygen (saturation percentage), is presented.
The two time series are recorded every 20 minutes for 7 years, from 2004
to 2011. The application of the Empirical Mode Decomposition on such time
series is illustrated, and the power spectra of the time series are estimated
using the Hilbert transform (Hilbert spectral analysis). Power-law regimes are
found with slopes of 1.33 for dissolved oxygen and 1.68 for temperature at high
frequencies (between 1.2 and 12 hours), with both close to 1.9 for lower
frequencies (time scales from 2 to 100 days). Moreover, the time evolution and
scale dependence of cross correlations between both series are considered. The
trends are perfectly anti-correlated. The modes with mean periods of 3 years and 1 year are
also negatively correlated, whereas higher frequency modes have a much smaller
correlation. The estimation of time-dependent intrinsic correlations helps to
show patterns of correlations at different scales, for different modes.
Comment: 35 pages with 22 figures
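The idea of a correlation that varies in time can be sketched with a plain sliding-window Pearson correlation between two series (for example, two modes of similar time scale). This is a simplification under stated assumptions: the time dependent intrinsic correlation in the abstract is built on empirical mode decomposition, which is not implemented here, and the fixed window length, the synthetic test signals, and the function name are illustrative choices.

```python
import numpy as np

def sliding_correlation(x, y, window):
    """Pearson correlation of x and y over centred windows of odd length `window`."""
    half = window // 2
    out = np.full(len(x), np.nan)  # edges without a full window stay NaN
    for i in range(half, len(x) - half):
        xs = x[i - half:i + half + 1]
        ys = y[i - half:i + half + 1]
        out[i] = np.corrcoef(xs, ys)[0, 1]
    return out

t = np.linspace(0.0, 10.0, 501)
x = np.sin(2 * np.pi * t)   # synthetic stand-in for one mode
y = -x                      # perfectly anti-correlated partner
r = sliding_correlation(x, y, window=51)  # window spans one period of x
```

Repeating the calculation for several window lengths, or for each pair of modes, would give a scale-by-scale picture of the kind of correlation pattern the abstract describes.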
