1,126 research outputs found

    Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms

    Clustering non-Euclidean data is difficult, and one of the most used algorithms besides hierarchical clustering is the popular algorithm Partitioning Around Medoids (PAM), also simply referred to as k-medoids. In Euclidean geometry the mean, as used in k-means, is a good estimator for the cluster center, but this does not hold for arbitrary dissimilarities. PAM uses the medoid instead, the object with the smallest dissimilarity to all others in the cluster. This notion of centrality can be used with any (dis-)similarity, and thus is of high relevance to many domains such as biology that require the use of Jaccard, Gower, or more complex distances. A key issue with PAM is its high run time cost. We propose modifications to the PAM algorithm that achieve an O(k)-fold speedup in the second (SWAP) phase of the algorithm while still finding the same results as the original PAM algorithm. If we slightly relax the choice of swaps performed (at comparable quality), we can further accelerate the algorithm by performing up to k swaps in each iteration. With the substantially faster SWAP, we can now also explore alternative strategies for choosing the initial medoids. We also show how the CLARA and CLARANS algorithms benefit from these modifications. These modifications can easily be combined with earlier approaches to use PAM and CLARA on big data (some of which use PAM as a subroutine, hence can immediately benefit from these improvements), where the performance with high k becomes increasingly important. In experiments on real data with k=100, we observed a 200-fold speedup compared to the original PAM SWAP algorithm, making PAM applicable to larger data sets as long as we can afford to compute a distance matrix, and in particular to higher k (at k=2, the new SWAP was only 1.5 times faster, as the speedup is expected to increase with k).
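
To make the SWAP phase concrete, here is a minimal sketch of the naive baseline that the abstract describes as slow. It is not the accelerated variant proposed in the paper. The function names `total_deviation` and `naive_pam_swap`, the toy data, and the choice of initial medoids are all assumptions made for illustration; only a precomputed dissimilarity matrix is required, so any (dis-)similarity can be plugged in.

```python
import numpy as np


def total_deviation(dist, medoids):
    """Sum of dissimilarities from every object to its nearest medoid."""
    return float(dist[:, medoids].min(axis=1).sum())


def naive_pam_swap(dist, medoids, max_iter=100):
    """Naive SWAP: apply the single best (medoid, non-medoid) swap per
    iteration until no swap lowers the total deviation.

    Every candidate swap is re-evaluated from scratch, which is the
    expensive behaviour that faster variants aim to avoid.
    """
    medoids = list(medoids)
    n = dist.shape[0]
    best = total_deviation(dist, medoids)
    for _ in range(max_iter):
        best_swap = None
        for i in range(len(medoids)):
            for candidate in range(n):
                if candidate in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = candidate
                cost = total_deviation(dist, trial)
                if cost < best:
                    best, best_swap = cost, (i, candidate)
        if best_swap is None:
            break  # no improving swap left: local optimum reached
        medoids[best_swap[0]] = best_swap[1]
    return sorted(medoids), best


# Toy example (hypothetical data): two well-separated groups on a line.
points = np.array([0, 1, 2, 10, 11, 12])
dist = np.abs(points[:, None] - points[None, :])
medoids, cost = naive_pam_swap(dist, [0, 1])
print(medoids, cost)
```

Starting from the poor initial medoids 0 and 1, the loop ends with the middle object of each group as a medoid. Because each candidate swap recomputes the full assignment, the cost per iteration grows quickly with both the number of objects and k.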

    Discovery of the progenitor of the type Ia supernova 2007on

    Type Ia supernovae are exploding stars that are used to measure the accelerated expansion of the Universe and are responsible for most of the iron ever produced. Although there is general agreement that the exploding star is a white dwarf in a binary system, the exact configuration and trigger of the explosion are unclear, which could hamper their use for precision cosmology. Two families of progenitor models have been proposed. In the first, a white dwarf accretes material from a companion until it exceeds the Chandrasekhar mass, collapses and explodes. Alternatively, two white dwarfs merge, again causing catastrophic collapse and an explosion. It has hitherto been impossible to determine if either model is correct. Here we report the discovery of an object in pre-supernova archival X-ray images at the position of the recent type Ia supernova (2007on) in the elliptical galaxy NGC 1404. Deep optical images (also archival) show no sign of this object. From this we conclude that the X-ray source is the progenitor of the supernova, which favours the accretion model for this supernova, although the host galaxy is older (6-9 Gyr) than the age at which the explosions are predicted in the accreting models.
    Comment: Published in Nature. See also the two follow-up papers: Roelofs, Bassa, Voss, Nelemans; and Nelemans, Voss, Roelofs, Bassa; both on astro-ph 02/15/0

    Mechanical Systems with Symmetry, Variational Principles, and Integration Algorithms

    This paper studies variational principles for mechanical systems with symmetry and their applications to integration algorithms. We recall some general features of how to reduce variational principles in the presence of a symmetry group along with general features of integration algorithms for mechanical systems. Then we describe some integration algorithms based directly on variational principles using a discretization technique of Veselov. The general idea for these variational integrators is to directly discretize Hamilton’s principle rather than the equations of motion in a way that preserves the original system’s invariants, notably the symplectic form and, via a discrete version of Noether’s theorem, the momentum map. The resulting mechanical integrators are second-order accurate, implicit, symplectic-momentum algorithms. We apply these integrators to the rigid body and the double spherical pendulum to show that the techniques are competitive with existing integrators.
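
The idea of discretizing Hamilton's principle directly can be illustrated on a much simpler system than those treated in the paper. The sketch below assumes a planar pendulum with Lagrangian L = (1/2) q'^2 + (g/l) cos q (per unit m l^2) and a trapezoidal discrete Lagrangian; for this separable case the discrete Euler-Lagrange equation can be solved explicitly, unlike the implicit algorithms described in the abstract. The function names, step size, and initial data are assumptions chosen for illustration.

```python
import math


def pendulum_variational_steps(q0, q1, h, n_steps, g_over_l=1.0):
    """Iterate the discrete Euler-Lagrange equation of a pendulum.

    With L_d(q_k, q_{k+1}) = h * [ ((q_{k+1} - q_k) / h)**2 / 2
                                   + (g/l) * (cos q_k + cos q_{k+1}) / 2 ],
    stationarity of the discrete action gives
        q_{k+1} = 2 q_k - q_{k-1} - h**2 * (g/l) * sin(q_k).
    """
    qs = [q0, q1]
    for _ in range(n_steps):
        q_prev, q_curr = qs[-2], qs[-1]
        qs.append(2.0 * q_curr - q_prev - h * h * g_over_l * math.sin(q_curr))
    return qs


def energy(q_a, q_b, h, g_over_l=1.0):
    """Energy estimate on one step: finite-difference velocity, midpoint angle."""
    velocity = (q_b - q_a) / h
    q_mid = 0.5 * (q_a + q_b)
    return 0.5 * velocity * velocity - g_over_l * math.cos(q_mid)


# Hypothetical run: release from about 1 rad with (nearly) zero velocity.
h = 0.05
qs = pendulum_variational_steps(1.0, 1.0, h, 2000)
e_start = energy(qs[0], qs[1], h)
drift = max(abs(energy(a, b, h) - e_start) for a, b in zip(qs, qs[1:]))
print(drift)
```

The energy estimate is not conserved exactly, but in this sketch it oscillates within a small band rather than drifting over long runs, which is the qualitative behaviour expected of symplectic schemes.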

    Flavor conversion of cosmic neutrinos from hidden jets

    High energy cosmic neutrino fluxes can be produced inside relativistic jets under the envelopes of collapsing stars. In the energy range E ~ (0.3 - 1e5) GeV, flavor conversion of these neutrinos is modified by various matter effects inside the star and the Earth. We present a comprehensive (both analytic and numerical) description of the flavor conversion of these neutrinos which includes: (i) oscillations inside jets, (ii) flavor-to-mass state transitions in an envelope, (iii) loss of coherence on the way to the observer, and (iv) oscillations of the mass states inside the Earth. We show that conversion has several new features which are not realized in other objects, in particular interference effects ("L- and H- wiggles") induced by the adiabaticity violation. The neutrino-neutrino scattering inside the jet and inelastic neutrino interactions in the envelope may produce some additional features at E > 1e4 GeV. We study the dependence of the probabilities and flavor ratios in the matter-affected region on the angles theta13 and theta23, on the CP-phase delta, as well as on the initial flavor content and density profile of the star. We show that measurements of the energy dependence of the flavor ratios will, in principle, allow the neutrino and astrophysical parameters to be determined independently.
    Comment: 56 pages, 19 figures. Minor changes. Accepted by JHEP

    A computational approach to chemical etiologies of diabetes.

    Computational meta-analysis can link environmental chemicals to genes and proteins involved in human diseases, thereby elucidating possible etiologies and pathogeneses of non-communicable diseases. We used an integrated computational systems biology approach to examine possible pathogenetic linkages in type 2 diabetes (T2D) through genome-wide associations, disease similarities, and published empirical evidence. Ten environmental chemicals were found to be potentially linked to T2D; the highest scores were observed for arsenic, 2,3,7,8-tetrachlorodibenzo-p-dioxin, hexachlorobenzene, and perfluorooctanoic acid. For these substances we integrated disease and pathway annotations on top of protein interactions to reveal possible pathogenetic pathways that deserve empirical testing. The approach is general and can address other public health concerns in addition to identifying diabetogenic chemicals, and thus offers promising guidance for future research into the etiology and pathogenesis of complex diseases.

    A meta-analysis of long-term effects of conservation agriculture on maize grain yield under rain-fed conditions

    Conservation agriculture involves reduced tillage, permanent soil cover and crop rotations to enhance soil fertility and to supply food from a dwindling land resource. Recently, conservation agriculture has been promoted in Southern Africa, mainly for maize-based farming systems. However, maize yields under rain-fed conditions are often variable. There is therefore a need to identify factors that influence crop yield under conservation agriculture and rain-fed conditions. Here, we studied maize grain yield data from experiments lasting 5 years or more under rain-fed conditions. We assessed the effect of long-term tillage and residue retention on maize grain yield under contrasting soil textures, nitrogen input and climate. Yield variability was measured by stability analysis. Our results show an increase in maize yield over time with conservation agriculture practices that include rotation and high input use in low rainfall areas. However, we observed no difference in system stability under those conditions. We observed a strong relationship between maize grain yield and annual rainfall. Our meta-analysis gave the following findings: (1) 92% of the data show that mulch cover in high rainfall areas leads to lower yields due to waterlogging; (2) 85% of the data show that soil texture is important in the temporal development of conservation agriculture effects, with improved yields likely on well-drained soils; (3) 73% of the data show that conservation agriculture practices require high inputs, especially N, for improved yield; (4) 63% of the data show that increased yields are obtained with rotation, but calculations often do not include the variations in rainfall within and between seasons; (5) 56% of the data show that reduced tillage with no mulch cover leads to lower yields in semi-arid areas; and (6) when adequate fertiliser is available, rainfall is the most important determinant of yield in southern Africa. It is clear from our results that conservation agriculture needs to be targeted and adapted to specific biophysical conditions for improved impact.

    Performance of CMS muon reconstruction in pp collision events at sqrt(s) = 7 TeV

    The performance of muon reconstruction, identification, and triggering in CMS has been studied using 40 inverse picobarns of data collected in pp collisions at sqrt(s) = 7 TeV at the LHC in 2010. A few benchmark sets of selection criteria covering a wide range of physics analysis needs have been examined. For all considered selections, the efficiency to reconstruct and identify a muon with a transverse momentum pT larger than a few GeV is above 95% over the whole region of pseudorapidity covered by the CMS muon system, abs(eta) < 2.4, while the probability to misidentify a hadron as a muon is well below 1%. The efficiency to trigger on single muons with pT above a few GeV is higher than 90% over the full eta range, and typically substantially better. The overall momentum scale is measured to a precision of 0.2% with muons from Z decays. The transverse momentum resolution varies from 1% to 6% depending on pseudorapidity for muons with pT below 100 GeV and, using cosmic rays, it is shown to be better than 10% in the central region up to pT = 1 TeV. Observed distributions of all quantities are well reproduced by the Monte Carlo simulation.
    Comment: Replaced with published version. Added journal reference and DO


    Autoimmune gastrointestinal complications in patients with Systemic Lupus Erythematosus: case series and literature review

    The association of systemic lupus erythematosus (SLE) with gastrointestinal autoimmune diseases is rare, but has been described in the literature, mostly as case reports. However, some of these diseases may be very severe, so correct and early diagnosis with appropriate management is fundamental. We have analysed our data from the SLE patient cohort at University College Hospital London, established in 1978, identifying those patients with an associated autoimmune gastrointestinal disease. We have also undertaken a review of the literature describing the major autoimmune gastrointestinal pathologies which may be coincident with SLE, focusing on the incidence, clinical and laboratory (particularly antibody) findings, common aetiopathogenesis and complications.