Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms
Clustering non-Euclidean data is difficult, and one of the most used
algorithms besides hierarchical clustering is the popular algorithm
Partitioning Around Medoids (PAM), also simply referred to as k-medoids. In
Euclidean geometry the mean (as used in k-means) is a good estimator for the
cluster center, but this does not hold for arbitrary dissimilarities. PAM uses
the medoid instead, the object with the smallest dissimilarity to all others in
the cluster. This notion of centrality can be used with any (dis-)similarity,
and thus is of high relevance to many domains such as biology that require the
use of Jaccard, Gower, or more complex distances.
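The medoid definition above can be illustrated with a minimal sketch. It assumes a cluster of small sets compared with the Jaccard dissimilarity; the names `jaccard_distance` and `medoid` are hypothetical helpers chosen for this illustration and are not taken from the paper.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Jaccard dissimilarity 1 - |a & b| / |a | b| (0.0 for two empty sets)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def medoid(cluster: Sequence[T], dist: Callable[[T, T], float]) -> T:
    """Return the member with the smallest total dissimilarity to all others."""
    if not cluster:
        raise ValueError("cluster must not be empty")
    # dist(c, c) is 0 for a proper dissimilarity, so including c itself is harmless.
    return min(cluster, key=lambda c: sum(dist(c, other) for other in cluster))


# Toy data (an assumption for illustration): four sets of characters.
cluster = [frozenset("abc"), frozenset("abd"), frozenset("ab"), frozenset("xyz")]
center = medoid(cluster, jaccard_distance)
```

Because only pairwise dissimilarities are used, the same function works unchanged with Gower or any other dissimilarity; no vector-space mean is ever computed.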
A key issue with PAM is its high run time cost. We propose modifications to
the PAM algorithm that achieve an O(k)-fold speedup in the second (SWAP) phase
of the algorithm while still finding the same results as the original PAM
algorithm. If we slightly relax the choice of swaps performed (at comparable
quality), we can further accelerate the algorithm by performing up to k swaps
in each iteration. With the substantially faster SWAP, we can now also explore
alternative strategies for choosing the initial medoids. We also show how the
CLARA and CLARANS algorithms benefit from these modifications. These can easily
be combined with earlier approaches to use PAM and CLARA on big data (some of
which use PAM as a subroutine, hence can immediately benefit from these
improvements), where the performance with high k becomes increasingly
important.
In experiments on real data with k=100, we observed a 200-fold speedup
compared to the original PAM SWAP algorithm, making PAM applicable to larger
data sets as long as we can afford to compute a distance matrix, and in
particular to higher k (at k=2, the new SWAP was only 1.5 times faster, as the
speedup is expected to increase with k).
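For orientation, the baseline being accelerated can be sketched as follows. This is a deliberately naive version of a SWAP phase on a precomputed distance matrix, written under the assumption that each candidate swap is scored by recomputing the total deviation from scratch; it is not the faster variant proposed in the abstract, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def total_deviation(D: Matrix, medoids: Sequence[int]) -> float:
    """Sum over all objects of the distance to their nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in D)


def naive_swap_step(D: Matrix, medoids: List[int]) -> Tuple[List[int], bool]:
    """Score every (medoid, non-medoid) swap and apply the best improving one."""
    best_cost = total_deviation(D, medoids)
    best = None
    for i in range(len(medoids)):
        for x in range(len(D)):
            if x in medoids:
                continue
            candidate = medoids[:i] + [x] + medoids[i + 1:]
            cost = total_deviation(D, candidate)
            if cost < best_cost:  # strict improvement guarantees termination
                best_cost, best = cost, candidate
    if best is None:
        return medoids, False
    return best, True


def naive_pam(D: Matrix, initial_medoids: Sequence[int]) -> List[int]:
    """Repeat the swap step until no swap lowers the total deviation."""
    medoids, improved = list(initial_medoids), True
    while improved:
        medoids, improved = naive_swap_step(D, medoids)
    return sorted(medoids)


# Toy data (an assumption for illustration): six points on a line, k = 2.
points = [0, 1, 2, 10, 11, 12]
D = [[abs(a - b) for b in points] for a in points]
result = naive_pam(D, [0, 1])
```

Each step here evaluates k(n-k) candidate swaps and rescans all n objects for every one of them, which is the cost that motivates speeding up SWAP for large k.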
Discovery of the progenitor of the type Ia supernova 2007on
Type Ia supernovae are exploding stars that are used to measure the
accelerated expansion of the Universe and are responsible for most of the iron
ever produced. Although there is general agreement that the exploding star is a
white dwarf in a binary system, the exact configuration and trigger of the
explosion is unclear, which could hamper their use for precision cosmology. Two
families of progenitor models have been proposed. In the first, a white dwarf
accretes material from a companion until it exceeds the Chandrasekhar mass,
collapses and explodes. Alternatively, two white dwarfs merge, again causing
catastrophic collapse and an explosion. It has hitherto been impossible to
determine if either model is correct. Here we report the discovery of an object
in pre-supernova archival X-ray images at the position of the recent type Ia
supernova (2007on) in the elliptical galaxy NGC 1404. Deep optical images (also
archival) show no sign of this object. From this we conclude that the X-ray
source is the progenitor of the supernova, which favours the accretion model
for this supernova, although the host galaxy is older (6-9 Gyr) than the age at
which the explosions are predicted in the accreting models.
Comment: Published in Nature. See also the two follow-up papers: Roelofs,
Bassa, Voss, Nelemans; Nelemans, Voss, Roelofs, Bassa; both on astro-ph
02/15/0
Mechanical Systems with Symmetry, Variational Principles, and Integration Algorithms
This paper studies variational principles for mechanical systems with symmetry and their applications to integration algorithms. We recall some general features of how to reduce variational principles in the presence of a symmetry group along with general features of integration algorithms for mechanical systems. Then we describe some integration algorithms based directly on variational principles using a
discretization technique of Veselov. The general idea for these variational integrators is to directly discretize Hamilton’s principle rather than the equations of motion in a way that preserves the original system's invariants, notably the symplectic form and, via a discrete version of Noether’s theorem, the momentum map. The resulting mechanical integrators are second-order accurate, implicit, symplectic-momentum algorithms. We apply these integrators to the rigid body and the double spherical pendulum to show that the techniques are competitive with existing integrators.
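The idea of discretizing Hamilton's principle directly can be sketched on a much simpler system than those treated in the paper. The example below assumes a unit-mass planar pendulum with V(q) = -cos(q) and the discrete Lagrangian L_d(q0, q1) = h[((q1 - q0)/h)^2 / 2 - (V(q0) + V(q1))/2]; both choices are illustrative assumptions. For this separable case the discrete Euler-Lagrange equations happen to be explicit (they reduce to the Störmer-Verlet recurrence), unlike the implicit integrators described in the abstract.

```python
import math
from typing import List


def grad_potential(q: float) -> float:
    """dV/dq for the assumed pendulum potential V(q) = -cos(q)."""
    return math.sin(q)


def variational_pendulum(q0: float, v0: float, h: float, steps: int) -> List[float]:
    """Integrate by solving the discrete Euler-Lagrange equations for q_{k+1}."""
    # First step from the discrete Legendre transform p0 = -D1 L_d(q0, q1).
    q_prev = q0
    q_curr = q0 + h * v0 - 0.5 * h * h * grad_potential(q0)
    trajectory = [q_prev, q_curr]
    for _ in range(steps - 1):
        # D2 L_d(q_prev, q_curr) + D1 L_d(q_curr, q_next) = 0, solved for q_next.
        q_next = 2.0 * q_curr - q_prev - h * h * grad_potential(q_curr)
        trajectory.append(q_next)
        q_prev, q_curr = q_curr, q_next
    return trajectory


def energies(trajectory: List[float], h: float) -> List[float]:
    """Energy at interior nodes, using a central-difference velocity estimate."""
    out = []
    for k in range(1, len(trajectory) - 1):
        v = (trajectory[k + 1] - trajectory[k - 1]) / (2.0 * h)
        out.append(0.5 * v * v - math.cos(trajectory[k]))
    return out


h = 0.01
trajectory = variational_pendulum(q0=1.0, v0=0.0, h=h, steps=10_000)
drift = max(abs(e - (-math.cos(1.0))) for e in energies(trajectory, h))
```

The quantity `drift` measures how far the energy wanders from its initial value over the run; for a symplectic scheme of this kind it is expected to stay small and bounded rather than grow steadily.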
Flavor conversion of cosmic neutrinos from hidden jets
High energy cosmic neutrino fluxes can be produced inside relativistic jets
under the envelopes of collapsing stars. In the energy range E ~ (0.3 - 1e5)
GeV, flavor conversion of these neutrinos is modified by various matter effects
inside the star and the Earth. We present a comprehensive (both analytic and
numerical) description of the flavor conversion of these neutrinos which
includes: (i) oscillations inside jets, (ii) flavor-to-mass state transitions
in an envelope, (iii) loss of coherence on the way to observer, and (iv)
oscillations of the mass states inside the Earth. We show that conversion has
several new features which are not realized in other objects, in particular
interference effects ("L- and H- wiggles") induced by the adiabaticity
violation. Neutrino-neutrino scattering inside the jet and inelastic neutrino
interactions in the envelope may produce some additional features at E > 1e4
GeV. We study the dependence of the probabilities and flavor ratios in the
matter-affected region on angles theta13 and theta23, on the CP-phase delta, as
well as on the initial flavor content and density profile of the star. We show
that measurements of the energy dependence of the flavor ratios will, in
principle, allow independent determination of the neutrino and astrophysical
parameters.
Comment: 56 pages, 19 figures. Minor changes. Accepted by JHEP
A computational approach to chemical etiologies of diabetes.
Computational meta-analysis can link environmental chemicals to genes and proteins involved in human diseases, thereby elucidating possible etiologies and pathogeneses of non-communicable diseases. We used an integrated computational systems biology approach to examine possible pathogenetic linkages in type 2 diabetes (T2D) through genome-wide associations, disease similarities, and published empirical evidence. Ten environmental chemicals were found to be potentially linked to T2D; the highest scores were observed for arsenic, 2,3,7,8-tetrachlorodibenzo-p-dioxin, hexachlorobenzene, and perfluorooctanoic acid. For these substances we integrated disease and pathway annotations on top of protein interactions to reveal possible pathogenetic pathways that deserve empirical testing. The approach is general and can address other public health concerns in addition to identifying diabetogenic chemicals, and thus offers promising guidance for future research in regard to the etiology and pathogenesis of complex diseases.
A meta-analysis of long-term effects of conservation agriculture on maize grain yield under rain-fed conditions
Conservation agriculture involves reduced tillage, permanent soil cover and crop rotations to enhance soil fertility and to supply food from a dwindling land resource. Recently, conservation agriculture has been promoted in Southern Africa, mainly for maize-based farming systems. However, maize yields under rain-fed conditions are often variable. There is therefore a need to identify factors that influence crop yield under conservation agriculture and rain-fed conditions. Here, we studied maize grain yield data from experiments lasting 5 years and more under rain-fed conditions. We assessed the effect of long-term tillage and residue retention on maize grain yield under contrasting soil textures, nitrogen input and climate. Yield variability was measured by stability analysis. Our results show an increase in maize yield over time with conservation agriculture practices that include rotation and high input use in low rainfall areas. But we observed no difference in system stability under those conditions. We observed a strong relationship between maize grain yield and annual rainfall. Our meta-analysis gave the following findings: (1) 92% of the data show that mulch cover in high rainfall areas leads to lower yields due to waterlogging; (2) 85% of data show that soil texture is important in the temporal development of conservation agriculture effects, with improved yields likely on well-drained soils; (3) 73% of the data show that conservation agriculture practices require high inputs especially N for improved yield; (4) 63% of data show that increased yields are obtained with rotation but calculations often do not include the variations in rainfall within and between seasons; (5) 56% of the data show that reduced tillage with no mulch cover leads to lower yields in semi-arid areas; and (6) when adequate fertiliser is available, rainfall is the most important determinant of yield in southern Africa.
It is clear from our results that conservation agriculture needs to be targeted and adapted to specific biophysical conditions for improved impact.
Performance of CMS muon reconstruction in pp collision events at sqrt(s) = 7 TeV
The performance of muon reconstruction, identification, and triggering in CMS
has been studied using 40 inverse picobarns of data collected in pp collisions
at sqrt(s) = 7 TeV at the LHC in 2010. A few benchmark sets of selection
criteria covering a wide range of physics analysis needs have been examined.
For all considered selections, the efficiency to reconstruct and identify a
muon with a transverse momentum pT larger than a few GeV is above 95% over the
whole region of pseudorapidity covered by the CMS muon system, abs(eta) < 2.4,
while the probability to misidentify a hadron as a muon is well below 1%. The
efficiency to trigger on single muons with pT above a few GeV is higher than
90% over the full eta range, and typically substantially better. The overall
momentum scale is measured to a precision of 0.2% with muons from Z decays. The
transverse momentum resolution varies from 1% to 6% depending on pseudorapidity
for muons with pT below 100 GeV and, using cosmic rays, it is shown to be
better than 10% in the central region up to pT = 1 TeV. Observed distributions
of all quantities are well reproduced by the Monte Carlo simulation.
Comment: Replaced with published version. Added journal reference and DO
Autoimmune gastrointestinal complications in patients with Systemic Lupus Erythematosus: case series and literature review
The association of systemic lupus erythematosus (SLE) with gastrointestinal autoimmune diseases is rare, but has been described in the literature, mostly as case reports. However, some of these diseases may be very severe, so correct and early diagnosis with appropriate management is fundamental. We have analysed our data from the SLE patient cohort at University College Hospital London, established in 1978, identifying those patients with an associated autoimmune gastrointestinal disease. We have also undertaken a review of the literature describing the major autoimmune gastrointestinal pathologies which may be coincident with SLE, focusing on the incidence, clinical and laboratory (particularly antibody) findings, common aetiopathogenesis and complications.
