Significance analysis and statistical mechanics: an application to clustering
This paper addresses the statistical significance of structures in random
data: Given a set of vectors and a measure of mutual similarity, how likely
is it that a subset of these vectors forms a cluster with enhanced similarity among
its elements? The computation of this cluster p-value for randomly distributed
vectors is mapped onto a well-defined problem of statistical mechanics. We
solve this problem analytically, establishing a connection between the physics
of quenched disorder and multiple testing statistics in clustering and related
problems. In an application to gene expression data, we find a remarkable link
between the statistical significance of a cluster and the functional
relationships between its genes. Comment: to appear in Phys. Rev. Lett.
A New Approach to Time Domain Classification of Broadband Noise in Gravitational Wave Data
Broadband noise in gravitational wave (GW) detectors, also known as triggers,
can often be a deterrent to the efficiency with which astrophysical search
pipelines detect sources. It is important to understand their instrumental or
environmental origin so that they can be eliminated or accounted for in the
data. Since the number of triggers is large, data mining approaches such as
clustering and classification are useful tools for this task. Classification of
triggers based on a handful of discrete properties has been done in the past. A
rich information content is available in the waveform or 'shape' of the
triggers that has had a rather restricted exploration so far. This paper
presents a new way to classify triggers deriving information from both trigger
waveforms as well as their discrete physical properties using a sequential
combination of the Longest Common Sub-Sequence (LCSS) and LCSS coupled with
Fast Time Series Evaluation (FTSE) for waveform classification and the
multidimensional hierarchical classification (MHC) analysis for the grouping
based on physical properties. A generalized k-means algorithm is used with the
LCSS (and LCSS+FTSE) for clustering the triggers using a validity measure to
determine the correct number of clusters in the absence of any prior knowledge. The
results have been demonstrated by simulations and by application to a segment
of real LIGO data from the sixth science run. Comment: 16 pages, 16 figures
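The LCSS measure named in the abstract above can be illustrated with a minimal sketch for real-valued time series. This is a generic textbook-style dynamic program, not the authors' pipeline: the function names, the matching threshold `epsilon`, and the optional index window `delta` are assumptions made for illustration, and the FTSE speed-up is not shown.

```python
def lcss_length(a, b, epsilon, delta=None):
    """Length of the longest common subsequence of two real-valued series.

    Samples a[i] and b[j] count as a match when they differ by less than
    epsilon and, if delta is given, their positions differ by at most delta.
    Both epsilon and delta are illustrative parameters (assumptions).
    """
    n, m = len(a), len(b)
    # table[i][j] holds the LCSS length of the prefixes a[:i] and b[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            in_window = delta is None or abs(i - j) <= delta
            if in_window and abs(a[i - 1] - b[j - 1]) < epsilon:
                # Matching samples extend the best subsequence of the shorter prefixes.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # Otherwise carry over the better result from dropping one sample.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcss_similarity(a, b, epsilon, delta=None):
    """LCSS length divided by the shorter series length, a value in [0, 1]."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, epsilon, delta) / min(len(a), len(b))
```

A similarity of this kind could serve as the pairwise measure inside a k-means-style clustering loop; how the paper combines it with a validity measure is not specified in the abstract.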
Population genetics of the highly polymorphic RPP8 gene family
Plant nucleotide-binding domain and leucine-rich repeat containing (NLR) genes provide some of the most extreme examples of polymorphism in eukaryotic genomes, rivalling even the vertebrate major histocompatibility complex. Surprisingly, this is also true in Arabidopsis thaliana, a predominantly selfing species with low heterozygosity. Here, we investigate how gene duplication and intergenic exchange contribute to this extraordinary variation. RPP8 is a three-locus system that is configured chromosomally as either a direct-repeat tandem duplication or as a single copy locus, plus a locus 2 Mb distant. We sequenced 48 RPP8 alleles from 37 accessions of A. thaliana and 12 RPP8 alleles from Arabidopsis lyrata to investigate the patterns of interlocus shared variation. The tandem duplicates display fixed differences and share less variation with each other than either shares with the distant paralog. A high level of shared polymorphism among alleles at one of the tandem duplicates, the single-copy locus and the distal locus, must involve both classical crossing over and intergenic gene conversion. Despite these polymorphism-enhancing mechanisms, the observed nucleotide diversity could not be replicated under neutral forward-in-time simulations. Only by adding balancing selection to the simulations do they approach the level of polymorphism observed at RPP8. In this NLR gene triad, genetic architecture, gene function and selection all combine to generate diversity.
Long-lived, long-period radial velocity variations in Aldebaran: A planetary companion and stellar activity
We investigate the nature of the long-period radial velocity variations in
Alpha Tau first reported over 20 years ago. We analyzed precise stellar radial
velocity measurements for Alpha Tau spanning over 30 years. An examination of
the Halpha and Ca II 8662 spectral lines, and Hipparcos photometry was also
done to help discern the nature of the long-period radial velocity variations.
Our radial velocity data show that the long-period, low amplitude radial
velocity variations are long-lived and coherent. Furthermore, Halpha equivalent
width measurements and Hipparcos photometry show no significant variations with
this period. Another investigation of this star established that there was no
variability in the spectral line shapes with the radial velocity period. An
orbital solution results in a period of P = 628.96 +/- 0.90 d, eccentricity, e
= 0.10 +/- 0.05, and a radial velocity amplitude, K = 142.1 +/- 7.2 m/s.
Evolutionary tracks yield a stellar mass of 1.13 +/- 0.11 M_sun, which
corresponds to a minimum companion mass of 6.47 +/- 0.53 M_Jup with an orbital
semi-major axis of a = 1.46 +/- 0.27 AU. After removing the orbital motion of
the companion, an additional period of ~ 520 d is found in the radial velocity
data, but only in some time spans. A similar period is found in the variations
in the equivalent width of Halpha and Ca II. Variations at one-third of this
period are also found in the spectral line bisector measurements. The 520 d
period is interpreted as the rotation modulation by stellar surface structure.
Its presence, however, may not be long-lived, and it only appears in epochs of
the radial velocity data separated by 10 years. This might be due to an
activity cycle. The data presented here provide further evidence of a planetary
companion to Alpha Tau, as well as activity-related radial velocity variations. Comment: 18 pages, 14 figures. Accepted for publication in Astronomy and
Astrophysics
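As a rough check on the orbital solution quoted above, the minimum companion mass can be estimated from the period, eccentricity, velocity amplitude, and stellar mass with the standard radial-velocity mass function. The sketch below assumes the companion is much less massive than the star; the function name and the physical constants are not from the abstract and are supplied here for illustration.

```python
import math

# Physical constants in SI units (standard reference values, assumed here).
G = 6.67430e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.98847e30     # solar mass, kg
M_JUP = 1.89813e27     # Jupiter mass, kg
DAY = 86400.0          # seconds per day


def minimum_companion_mass(period_days, k_ms, ecc, mstar_msun):
    """Return m*sin(i) in Jupiter masses, assuming m << M_star.

    Uses m sin i = K * sqrt(1 - e^2) * (P / (2 pi G))^(1/3) * M_star^(2/3).
    """
    period_s = period_days * DAY
    mstar_kg = mstar_msun * M_SUN
    m_sin_i = (
        k_ms
        * math.sqrt(1.0 - ecc ** 2)
        * (period_s / (2.0 * math.pi * G)) ** (1.0 / 3.0)
        * mstar_kg ** (2.0 / 3.0)
    )
    return m_sin_i / M_JUP


# Central values from the abstract: P = 628.96 d, K = 142.1 m/s, e = 0.10, M = 1.13 M_sun.
aldebaran_msini = minimum_companion_mass(628.96, 142.1, 0.10, 1.13)
```

With the central values, the estimate comes out close to the quoted 6.47 M_Jup; the uncertainties given in the abstract are not propagated in this sketch.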
The Kepler Follow-up Observation Program
The Kepler Mission was launched on March 6, 2009 to perform a photometric
survey of more than 100,000 dwarf stars to search for terrestrial-size planets
with the transit technique. Follow-up observations of planetary candidates
identified by detection of transit-like events are needed both for
identification of astrophysical phenomena that mimic planetary transits and for
characterization of the true planets and planetary systems found by Kepler. We
have developed techniques and protocols for detection of false planetary
transits and are currently conducting observations on 177 Kepler targets that
have been selected for follow-up. A preliminary estimate indicates that between
24% and 62% of planetary candidates selected for follow-up will turn out to be
true planets. Comment: 12 pages, submitted to the Astrophysical Journal Letters
Exploring The Frequency Of Close-In Jovian Planets Around M Dwarfs
We discuss our high precision radial velocity results of a sample of 90 M dwarfs observed with the Hobby-Eberly Telescope and the Harlan J. Smith 2.7 m Telescope at McDonald Observatory, as well as the ESO VLT and the Keck I telescopes, within the context of the overall frequency of Jupiter-mass planetary companions to main sequence stars. None of the stars in our sample show variability indicative of a giant planet in a short period orbit, with a 3.8 M_Jup and a 3.5 M_Jup and a < 0.7 AU. Our results point toward a generally lower frequency of close-in Jovian planets for M dwarfs as compared to FGK-type stars. This is an important piece of information for our understanding of the process of planet formation as a function of stellar mass.
Statistical M-Estimation and Consistency in Large Deformable Models for Image Warping
The problem of defining appropriate distances between shapes or images and modeling the variability of natural images by group transformations is at the heart of modern image analysis. A current trend is the study of probabilistic and statistical aspects of deformation models, and the development of consistent statistical procedures for the estimation of template images. In this paper, we consider a set of images randomly warped from a mean template which has to be recovered. For this, we define an appropriate statistical parametric model to generate random diffeomorphic deformations in two dimensions. Then, we focus on the problem of estimating the mean pattern when the images are observed with noise. This problem is challenging both from a theoretical and a practical point of view. M-estimation theory enables us to build an estimator defined as a minimizer of a well-tailored empirical criterion. We prove the convergence of this estimator and propose a gradient descent algorithm to compute this M-estimator in practice. Simulations of template extraction and an application to image clustering and classification are also provided.
Autonomous clustering using rough set theory
This paper proposes a clustering technique that minimises the need for subjective
human intervention and is based on elements of rough set theory. The proposed algorithm is
unified in its approach to clustering and makes use of both local and global data properties to
obtain clustering solutions. It handles single-type and mixed-attribute data sets with ease.
Results from three data sets of single and mixed attribute types are used to illustrate the
technique and establish its efficiency.
Propagation of an Earth-directed coronal mass ejection in three dimensions
Solar coronal mass ejections (CMEs) are the most significant drivers of
adverse space weather at Earth, but the physics governing their propagation
through the heliosphere is not well understood. While stereoscopic imaging of
CMEs with the Solar Terrestrial Relations Observatory (STEREO) has provided
some insight into their three-dimensional (3D) propagation, the mechanisms
governing their evolution remain unclear due to difficulties in reconstructing
their true 3D structure. Here we use a new elliptical tie-pointing technique to
reconstruct a full CME front in 3D, enabling us to quantify its deflected
trajectory from high latitudes along the ecliptic, and measure its increasing
angular width and propagation from 2-46 solar radii (approximately 0.2 AU).
Beyond 7 solar radii, we show that its motion is determined by an aerodynamic
drag in the solar wind and, using our reconstruction as input for a 3D
magnetohydrodynamic simulation, we determine an accurate arrival time at the
Lagrangian L1 point near Earth. Comment: 5 figures, 2 supplementary movies
