3,511 research outputs found

    Is "Better Data" Better than "Better Data Miners"? (On the Benefits of Tuning SMOTE for Defect Prediction)

    We report and fix an important systematic error in prior studies that ranked classifiers for software analytics. Those studies did not (a) assess classifiers on multiple criteria, nor did they (b) study how variations in the data affect the results. Hence, this paper applies (a) multi-criteria tests while (b) fixing the weaker regions of the training data (using SMOTUNED, which is a self-tuning version of SMOTE). This approach leads to dramatically large increases in software defect predictions. When applied in a 5*5 cross-validation study for 3,681 JAVA classes (containing over a million lines of code) from open source systems, SMOTUNED increased AUC and recall by 60% and 20% respectively. These improvements are independent of the classifier used to predict for quality. The same pattern of improvement was observed when SMOTE and SMOTUNED were compared against the most recent class imbalance technique. In conclusion, for software analytics tasks like defect prediction, (1) data pre-processing can be more important than classifier choice, (2) ranking studies are incomplete without such pre-processing, and (3) SMOTUNED is a promising candidate for pre-processing. Comment: 10 pages + 2 references. Accepted to International Conference of Software Engineering (ICSE), 201
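As a rough illustration of the oversampling idea behind SMOTE, the sketch below interpolates between a minority-class point and one of its nearest minority neighbours. It is a minimal sketch, not the paper's implementation: the function name `smote`, its parameters `n_new`, `k` and `seed`, and the use of squared Euclidean distance are all assumptions made for this example, and the abstract does not say which parameters SMOTUNED tunes.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points built from the minority-class samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    Requires at least two minority points. All names here are illustrative.
    """
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance (an assumption; other metrics are possible).
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base point, excluding the base itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: dist2(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic
```

A self-tuning variant would wrap a call like `smote(points, n_new, k)` in a search over values such as `k` and `n_new`, scoring each setting on held-out data.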

    The Minimum Shared Edges Problem on Grid-like Graphs

    We study the NP-hard Minimum Shared Edges (MSE) problem on graphs: decide whether it is possible to route p paths from a start vertex to a target vertex in a given graph while using at most k edges more than once. We show that MSE can be decided on bounded (i.e. finite) grids in linear time when both dimensions are either small or large compared to the number p of paths. On the contrary, we show that MSE remains NP-hard on subgraphs of bounded grids. Finally, we study MSE from a parametrised complexity point of view. It is known that MSE is fixed-parameter tractable with respect to the number p of paths. We show that, under standard complexity-theoretical assumptions, the problem parametrised by the combined parameter k, p, maximum degree, diameter, and treewidth does not admit a polynomial-size problem kernel, even when restricted to planar graphs.
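To make the MSE decision question concrete, here is a brute-force checker for very small undirected graphs. It is only a sketch of the problem definition (route p paths, count edges used more than once) and runs in exponential time; it is not one of the algorithms from the paper. The names `simple_paths` and `mse_brute_force` and the adjacency-dictionary input are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations_with_replacement


def simple_paths(adj, s, t, seen=()):
    """Yield every simple path from s to t as a tuple of vertices."""
    seen = seen + (s,)
    if s == t:
        yield seen
        return
    for v in adj[s]:
        if v not in seen:
            yield from simple_paths(adj, v, t, seen)


def mse_brute_force(adj, s, t, p, k):
    """Return True if p s-t paths exist that use at most k edges more than once.

    adj maps each vertex to a list of neighbours (undirected graph).
    The same path may be chosen several times.
    """
    paths = list(simple_paths(adj, s, t))
    for choice in combinations_with_replacement(paths, p):
        # Count how many of the chosen paths use each undirected edge.
        use = Counter(
            frozenset(edge) for path in choice for edge in zip(path, path[1:])
        )
        shared = sum(1 for count in use.values() if count > 1)
        if shared <= k:
            return True
    return False
```

On a 4-cycle with s and t at opposite corners, two paths can be routed with no shared edges, while a third path must reuse both edges of one side.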

    The Complexity of Routing with Few Collisions

    We study the computational complexity of routing multiple objects through a network in such a way that only few collisions occur: Given a graph G with two distinct terminal vertices and two positive integers p and k, the question is whether one can connect the terminals by at least p routes (e.g. paths) such that at most k edges are time-wise shared among them. We study three types of routes: traverse each vertex at most once (paths), each edge at most once (trails), or no such restrictions (walks). We prove that for paths and trails the problem is NP-complete on undirected and directed graphs even if k is constant or the maximum vertex degree in the input graph is constant. For walks, however, it is solvable in polynomial time on undirected graphs for arbitrary k and on directed graphs if k is constant. We additionally study for all route types a variant of the problem where the maximum length of a route is restricted by some given upper bound. We prove that this length-restricted variant has the same complexity classification with respect to paths and trails, but for walks it becomes NP-complete on undirected graphs.

    Does rapid urbanization aggravate health disparities? Reflections on the epidemiological transition in Pune, India

    Background: Rapid urbanization in low- and middle-income countries reinforces risk and epidemiological transition in urban societies, which are characterized by high socioeconomic gradients. Limited availability of disaggregated morbidity data in these settings impedes research on epidemiological profiles of different population subgroups. Objective: The study aimed to analyze the epidemiological transition in the emerging megacity of Pune with respect to changing morbidity and mortality patterns, also taking into consideration health disparities among different socioeconomic groups. Design: A mixed-methods approach was used, comprising secondary analysis of mortality data, a survey among 900 households in six neighborhoods with different socioeconomic profiles, 46 in-depth interviews with laypeople, and expert interviews with 37 health care providers and 22 other health care workers. Results: The mortality data account for an epidemiological transition with an increasing number of deaths due to non-communicable diseases (NCDs) in Pune. The share of deaths due to infectious and parasitic diseases remained nearly constant, though the cause of deaths changed considerably within this group. The survey data and expert interviews indicated a slightly higher prevalence of diabetes and hypertension among higher socioeconomic groups, but a higher incidence and more frequent complications and comorbidities in lower socioeconomic groups. Although the self-reported morbidity for malaria, gastroenteritis, and tuberculosis did not show a socioeconomic pattern, experts estimated the prevalence in lower socioeconomic groups to be higher, though all groups in Pune would be affected. Conclusions: The rising burden of NCDs among all socioeconomic groups and the concurrent persistence of communicable diseases pose a major challenge for public health. 
Improvement of urban health requires a stronger focus on health promotion and disease prevention for all socioeconomic groups with a holistic understanding of urban health. In order to derive evidence-based solutions and interventions, routine surveillance data become indispensable.

    Application of the Shiono and Knight Method in asymmetric compound channels with different side slopes of the internal wall

    The Shiono and Knight Method (SKM) is widely used to predict the lateral distribution of depth-averaged velocity and boundary shear stress for flows in compound channels. Three calibrating coefficients need to be estimated to apply the SKM, namely the eddy viscosity coefficient (λ), the friction factor (f) and the secondary flow coefficient (k). There are several tested methods which can satisfactorily be used to estimate λ and f. However, calibrating the secondary flow coefficient k so that secondary flow effects are accounted for correctly is still problematic. In this paper, the secondary flow coefficients are calibrated using two approaches to estimate values of k for simulating asymmetric compound channels with different side slopes of the internal wall. The first approach is based on Abril and Knight (2004), who suggest fixed values for the main channel and floodplain regions. In the second approach, the equations developed by Devi and Khatua (2017), which relate the variation of the secondary flow coefficients to the relative depth (β) and width ratio (α), are used. The results indicate that the calibration method developed by Devi and Khatua (2017) is a better choice for calibrating the secondary flow coefficients than the first approach, which assumes a fixed value of k for different flow depths. The results also indicate that the boundary condition based on shear force continuity can successfully be used for simulating rectangular compound channels, while continuity of the depth-averaged velocity and its gradient is an accepted boundary condition in simulations of trapezoidal compound channels. However, the SKM performance for predicting the boundary shear stress over the shear layer region may not be improved by only imposing suitably calibrated values of the secondary flow coefficients. This is because of the difficulty of modelling the complex interaction that develops in this region between the flows in the main channel and on the floodplain.

    The nuclear receptors of Biomphalaria glabrata and Lottia gigantea: Implications for developing new model organisms

    © 2015 Kaur et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Nuclear receptors (NRs) are transcription regulators involved in an array of diverse physiological functions including key roles in endocrine and metabolic function. The aim of this study was to identify nuclear receptors in the fully sequenced genome of the gastropod snail Biomphalaria glabrata, intermediate host for Schistosoma mansoni, and compare these to known vertebrate NRs, with a view to assessing the snail's potential as an invertebrate model organism for endocrine function, both as a prospective new test organism and to elucidate the fundamental genetic and mechanistic causes of disease. For comparative purposes, the genome of a second gastropod, the owl limpet Lottia gigantea, was also investigated for nuclear receptors. Thirty-nine and thirty-three putative NRs were identified from the B. glabrata and L. gigantea genomes respectively, based on the presence of a conserved DNA-binding domain and/or ligand-binding domain. Nuclear receptor transcript expression was confirmed and sequences were subjected to a comparative phylogenetic analysis, which demonstrated that these molluscs have representatives of all the major NR subfamilies (1-6). Many of the identified NRs are conserved between vertebrates and invertebrates; however, differences exist, most notably the absence of receptors of Group 3C, which includes some of the vertebrate endocrine hormone targets. The mollusc genomes also contain NR homologues that are present in insects and nematodes but not in vertebrates, such as Group 1J (HR48/DAF12/HR96). The identification of many shared receptors between humans and molluscs indicates the potential for molluscs as model organisms; however, the absence of several steroid hormone receptors indicates snail endocrine systems are fundamentally different. Funding: The National Centre for the Replacement, Refinement and Reduction of Animals in Research, Grant Ref: G0900802 to CSJ, LRN, SJ & EJR [www.nc3rs.org.uk]

    Observation of mesoscopic crystalline structures in a two-dimensional Rydberg gas

    The ability to control and tune interactions in ultracold atomic gases has paved the way towards the realization of new phases of matter. Whereas experiments have so far achieved a high degree of control over short-ranged interactions, the realization of long-range interactions would open up a whole new realm of many-body physics and has become a central focus of research. Rydberg atoms are very well-suited to achieve this goal, as the van der Waals forces between them are many orders of magnitude larger than for ground state atoms. Consequently, the mere laser excitation of ultracold gases can cause strongly correlated many-body states to emerge directly when atoms are transferred to Rydberg states. Key examples are quantum crystals, composed of coherent superpositions of different spatially ordered configurations of collective excitations. Here we report on the direct measurement of strong correlations in a laser-excited two-dimensional atomic Mott insulator using high-resolution, in-situ Rydberg atom imaging. The observations reveal the emergence of spatially ordered excitation patterns in the high-density components of the prepared many-body state. They have random orientation but well-defined geometry, forming mesoscopic crystals of collective excitations delocalised throughout the gas. Our experiment demonstrates the potential of Rydberg gases to realise exotic phases of matter, thereby laying the basis for quantum simulations of long-range interacting quantum magnets. Comment: 10 pages, 7 figures

    Structural, elastic, mechanical and thermodynamic properties of Terbium oxide: First-principles investigations

    First-principles investigations of the structural, elastic, mechanical and thermodynamic properties of terbium oxide (TbO) are performed. The investigations employ the full-potential linearized augmented plane wave (FP-LAPW) method within density functional theory (DFT) as implemented in the WIEN2k package. The exchange-correlation energy functional, a part of the total energy functional, is treated through the Perdew-Burke-Ernzerhof scheme of the generalized gradient approximation (PBE-GGA). The ground-state structural parameters, namely the lattice constants a0, bulk moduli B and their pressure derivatives B′, are calculated for the rock-salt (RS), zinc-blende (ZB), cesium chloride (CsCl), wurtzite (WZ) and nickel arsenide (NiAs) polymorphs of the TbO compound. The elastic constants (C11, C12, C13, C33, and C44) and mechanical properties (Young's modulus Y, shear modulus S, Poisson's ratio σ, anisotropic ratio A and compressibility β) were also calculated to assess its potential for applications. From our calculations, the RS phase of TbO was found to be the mechanically strongest of the studied cubic structures, whereas among the hexagonal phases the NiAs-type structure was found to be stronger than the WZ phase. To analyze the ductility of the different structures of TbO, Pugh's rule (B/S_H) and the Cauchy pressure (C12–C44) are used. It was found that the ZB, CsCl and WZ type structures of TbO are ductile, with ionic bonding clearly dominant, while the RS and NiAs structures are brittle, with covalent bonding dominant. Moreover, the Debye temperature was calculated for both the cubic and hexagonal structures of TbO by averaging the computed sound velocities.
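The two ductility indicators named in this abstract can be evaluated from elastic constants with a few lines of arithmetic. The sketch below covers cubic crystals only and rests on several assumptions not stated in the abstract: the shear modulus is taken as the Hill average of the Voigt and Reuss bounds, the Pugh threshold is the commonly quoted value of 1.75, and the function name `cubic_ductility` and the input numbers in the usage note are hypothetical, not results from the paper.

```python
def cubic_ductility(c11, c12, c44, pugh_threshold=1.75):
    """Pugh ratio and Cauchy pressure for a cubic crystal.

    Inputs are elastic constants in a common unit (for example GPa).
    The threshold of 1.75 is the commonly used Pugh value (an assumption here).
    """
    bulk = (c11 + 2.0 * c12) / 3.0                       # bulk modulus B
    shear_voigt = (c11 - c12 + 3.0 * c44) / 5.0          # Voigt upper bound
    shear_reuss = (
        5.0 * c44 * (c11 - c12) / (4.0 * c44 + 3.0 * (c11 - c12))
    )                                                    # Reuss lower bound
    shear_hill = (shear_voigt + shear_reuss) / 2.0       # Hill average
    pugh_ratio = bulk / shear_hill
    cauchy_pressure = c12 - c44
    return {
        "bulk": bulk,
        "shear_hill": shear_hill,
        "pugh_ratio": pugh_ratio,
        "cauchy_pressure": cauchy_pressure,
        "ductile_by_pugh": pugh_ratio > pugh_threshold,
        "ductile_by_cauchy": cauchy_pressure > 0.0,
    }
```

For instance, `cubic_ductility(200.0, 100.0, 50.0)` uses made-up constants and reports a ductile material under both indicators; hexagonal structures such as WZ and NiAs need different averaging formulas that also involve C13 and C33.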