266 research outputs found

    Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline

    From medical charts to national census, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, healthcare now generates an incredible amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While a significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitalized patient records, it is important to remember that volume represents only one aspect of the data. In fact, drawing on data from an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter focuses on highlighting such challenges, and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques. Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20 Pages, 1 Figur

    Design of Experiments for Screening

    The aim of this paper is to review methods of designing screening experiments, ranging from designs originally developed for physical experiments to those especially tailored to experiments on numerical models. The strengths and weaknesses of the various designs for screening variables in numerical models are discussed. First, classes of factorial designs for experiments to estimate main effects and interactions through a linear statistical model are described, specifically regular and nonregular fractional factorial designs, supersaturated designs and systematic fractional replicate designs. Generic issues of aliasing, bias and cancellation of factorial effects are discussed. Second, group screening experiments are considered including factorial group screening and sequential bifurcation. Third, random sampling plans are discussed including Latin hypercube sampling and sampling plans to estimate elementary effects. Fourth, a variety of modelling methods commonly employed with screening designs are briefly described. Finally, a novel study demonstrates six screening methods on two frequently-used exemplars, and their performances are compared
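
As a rough illustration of one of the random sampling plans named in this abstract, Latin hypercube sampling can be sketched as follows. This is a minimal sketch on the unit hypercube, not the implementation used in the paper; the function name `latin_hypercube` and its parameters are hypothetical.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=None):
    """Draw n_samples points in [0, 1)^n_dims with one point per stratum per axis.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # Split [0, 1) into n_samples equal-width strata and draw one
        # uniform value inside each stratum.
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so that strata are paired at random across dimensions.
        rng.shuffle(col)
        columns.append(col)
    # Transpose the per-dimension columns into a list of points.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


if __name__ == "__main__":
    for point in latin_hypercube(5, 2, seed=0):
        print(point)
```

Each input variable is covered evenly across its range with only `n_samples` model runs, which is the property that makes such plans attractive for screening experiments on numerical models.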

    Influences on gum feeding in primates

    This chapter reviews the factors that may affect patterns of gum feeding by primates. These are then examined for mixed-species troops of saddleback (S. fuscicollis) and mustached (S. mystax) tamarins. An important distinction is made between gums produced by tree trunks and branches as a result of damage and those produced by seed pods as part of a dispersal strategy, as these may be expected to differ in their biochemistry. Feeding on fruit and Parkia seed pod exudates was more prevalent in the morning whereas other exudates were eaten in the afternoon. This itinerary may represent a deliberate strategy to retain trunk gums in the gut overnight, thus maximising the potential for microbial fermentation of their β-linked oligosaccharides. Both types of exudates were eaten more in the dry than the wet season. Consumption was linked to seasonal changes in resource availability and not the tamarins’ reproductive status, providing no support for the suggestion that gums are eaten as a primary calcium source in the later stages of gestation and lactation. The role of availability in determining patterns of consumption is further supported by the finding that dietary overlap for the trunk gums eaten was greater between species within mixed-species troops within years than it was within species between years. These data and those for pygmy marmosets (Cebuella pygmaea) suggest that patterns of primate gummivory may reflect the interaction of preference and availability for both those able to stimulate gum production and those not

    Statistical process control of mortality series in the Australian and New Zealand Intensive Care Society (ANZICS) adult patient database: implications of the data generating process

    for the ANZICS Centre for Outcome and Resource Evaluation (CORE) of the Australian and New Zealand Intensive Care Society (ANZICS). BACKGROUND: Statistical process control (SPC), an industrial sphere initiative, has recently been applied in health care and public health surveillance. SPC methods assume independent observations and process autocorrelation has been associated with increase in false alarm frequency. METHODS: Monthly mean raw mortality (at hospital discharge) time series, 1995–2009, at the individual Intensive Care unit (ICU) level, were generated from the Australia and New Zealand Intensive Care Society adult patient database. Evidence for series (i) autocorrelation and seasonality was demonstrated using (partial)-autocorrelation ((P)ACF) function displays and classical series decomposition and (ii) “in-control” status was sought using risk-adjusted (RA) exponentially weighted moving average (EWMA) control limits (3 sigma). Risk adjustment was achieved using a random coefficient (intercept as ICU site and slope as APACHE III score) logistic regression model, generating an expected mortality series. Application of time-series to an exemplar complete ICU series (1995-(end)2009) was via Box-Jenkins methodology: autoregressive moving average (ARMA) and (G)ARCH ((Generalised) Autoregressive Conditional Heteroscedasticity) models, the latter addressing volatility of the series variance. RESULTS: The overall data set, 1995-2009, consisted of 491324 records from 137 ICU sites; average raw mortality was 14.07%; average (SD) raw and expected mortalities ranged from 0.012 (0.113) and 0.013 (0.045) to 0.296 (0.457) and 0.278 (0.247) respectively. For the raw mortality series: 71 sites had continuous data for assessment up to or beyond lag 40 and 35% had autocorrelation through to lag 40; and of 36 sites with continuous data for ≥ 72 months, all demonstrated marked seasonality. Similar numbers and percentages were seen with the expected series. Out-of-control signalling was evident for the raw mortality series with respect to RA-EWMA control limits; a seasonal ARMA model, with GARCH effects, displayed white-noise residuals which were in-control with respect to EWMA control limits and one-step prediction error limits (3SE). The expected series was modelled with a multiplicative seasonal autoregressive model. CONCLUSIONS: The data generating process of monthly raw mortality series at the ICU level displayed autocorrelation, seasonality and volatility. False-positive signalling of the raw mortality series was evident with respect to RA-EWMA control limits. A time series approach using residual control charts resolved these issues. John L Moran, Patricia J Solomo
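
To make the "EWMA control limits (3 sigma)" mentioned in this abstract concrete, a plain EWMA control chart can be sketched as follows. This is a minimal sketch under stated assumptions, not the risk-adjusted (RA-EWMA) procedure of the study: the in-control mean `target` and standard deviation `sigma` are assumed known, observations are assumed independent, and the function names, the smoothing weight `lam=0.2`, and the example numbers are hypothetical.

```python
import math


def ewma_chart(series, target, sigma, lam=0.2, width=3.0):
    """Return (ewma, lower, upper) lists for a basic EWMA control chart.

    Hypothetical helper: plain EWMA with known in-control mean and
    standard deviation, not a risk-adjusted chart.
    """
    z = target  # the EWMA statistic starts at the in-control mean
    ewma, lower, upper = [], [], []
    for t, x in enumerate(series, start=1):
        z = lam * x + (1.0 - lam) * z
        # Time-varying standard deviation of the EWMA statistic,
        # valid only for independent observations.
        sd = sigma * math.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t)))
        ewma.append(z)
        lower.append(target - width * sd)
        upper.append(target + width * sd)
    return ewma, lower, upper


def out_of_control(series, target, sigma, lam=0.2, width=3.0):
    """Indices at which the EWMA statistic falls outside the control limits."""
    ewma, lower, upper = ewma_chart(series, target, sigma, lam, width)
    return [i for i, (z, lo, hi) in enumerate(zip(ewma, lower, upper))
            if z < lo or z > hi]


if __name__ == "__main__":
    # Made-up monthly mortality proportions: stable, then a sustained shift.
    monthly = [0.14] * 10 + [0.30] * 10
    print(out_of_control(monthly, target=0.14, sigma=0.02))
```

The limits above rest on the independence assumption that the abstract identifies as the problem: with autocorrelated or seasonal series the variance formula no longer holds, which is consistent with the false alarms the authors report and with their choice to chart model residuals instead.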

    Hydrokinetic Turbine Effects on Fish Swimming Behaviour

    Hydrokinetic turbines, targeting the kinetic energy of fast-flowing currents, are under development with some turbines already deployed at ocean sites around the world. It remains virtually unknown as to how these technologies affect fish, and rotor collisions have been postulated as a major concern. In this study the effects of a vertical axis hydrokinetic rotor with rotational speeds up to 70 rpm were tested on the swimming patterns of naturally occurring fish in a subtropical tidal channel. Fish movements were recorded with and without the rotor in place. Results showed that no fish collided with the rotor and only a few specimens passed through rotor blades. Overall, fish reduced their movements through the area when the rotor was present. This deterrent effect on fish increased with current speed. Fish that passed the rotor avoided the near-field, about 0.3 m from the rotor for benthic reef fish. Large predatory fish were particularly cautious of the rotor and never moved closer than 1.7 m in current speeds above 0.6 m s⁻¹. The effects of the rotor differed among taxa and feeding guilds and it is suggested that fish boldness and body shape influenced responses. In conclusion, the tested hydrokinetic turbine rotor proved non-hazardous to fish during the investigated conditions. However, the results indicate that arrays comprising multiple turbines may restrict fish movements, particularly for large species, with possible effects on habitat connectivity if migration routes are exploited. Arrays of the investigated turbine type and comparable systems should therefore be designed with gaps of several metres width to allow large fish to pass through. In combination with further research the insights from this study can be used for guiding the design of hydrokinetic turbine arrays where needed, so preventing ecological impacts

    Role of MC1R variants in uveal melanoma

    Variants of the melanocortin-1 receptor (MC1R) gene have been linked to sun-sensitive skin types and hair colour, and may independently play a role in susceptibility to cutaneous melanoma. To assess the role of MC1R variants in uveal melanoma, we have analysed a cohort of 350 patients for the changes within the major region of the gene displaying sequence variation. Eight variants were detected (V60L, D84E, V92M, R151C, I155T, R160W, R163Q and D294H), with 63% of these patients being hetero- or homozygous for at least one variant. Standard melanoma risk factor data were available on 119 of the patients. MC1R variants were significantly associated with hair colour (P = 0.03) but not skin or eye colour. The frequency of the variants detected in the 350 patients was comparable with those in the general population, and comparison of the cumulative tumour distribution by age at diagnosis in carriers and noncarriers provided no evidence that MC1R variants confer an increased risk of uveal melanoma. We interpret the data as indicating that MC1R variants do not appear to be major determinants of susceptibility to uveal melanoma. © 2003 Cancer Research U

    Seismic air gun exposure during early-stage embryonic development does not negatively affect spiny lobster Jasus edwardsii larvae (Decapoda:Palinuridae)

    Marine seismic surveys are used to explore for sub-seafloor oil and gas deposits. These surveys are conducted using air guns, which release compressed air to create intense sound impulses, which are repeated around every 8-12 seconds and can travel large distances in the water column. Considering the ubiquitous worldwide distribution of seismic surveys, the potential impact of exposure on marine invertebrates is poorly understood. In this study, egg-bearing female spiny lobsters (Jasus edwardsii) were exposed to signals from three air gun configurations, all of which exceeded sound exposure levels (SEL) of 185 dB re 1 µPa²·s. Lobsters were maintained until their eggs hatched and the larvae were then counted for fecundity, assessed for abnormal morphology using measurements of larval length and width, tested for larval competency using an established activity test and measured for energy content. Overall there were no differences in the quantity or quality of hatched larvae, indicating that the condition and development of spiny lobster embryos were not adversely affected by air gun exposure. These results suggest that embryonic spiny lobster are resilient to air gun signals and highlight the caution necessary in extrapolating results from the laboratory to real world scenarios or across life history stages

    Rapid response of Helheim Glacier in Greenland to climate variability over the past century

    Author Posting. © The Author(s), 2011. This is the author's version of the work. It is posted here by permission of Nature Publishing Group for personal use, not for redistribution. The definitive version was published in Nature Geoscience 5 (2012): 37-41, doi:10.1038/ngeo1349. During the early 2000s the Greenland Ice Sheet experienced the largest ice mass loss observed on the instrumental record¹, largely as a result of the acceleration, thinning and retreat of major outlet glaciers in West and Southeast Greenland²⁻⁵. The quasi-simultaneous change in the glaciers suggests a common climate forcing and increasing air⁶ and ocean⁷⁻⁸ temperatures have been indicated as potential triggers. Here, we present a new record of calving activity of Helheim Glacier, East Greenland, extending back to c. 1890 AD. This record was obtained by analysing sedimentary deposits from Sermilik Fjord, where Helheim Glacier terminates, and uses the annual deposition of sand grains as a proxy for iceberg discharge. The 120-year-long record reveals large fluctuations in calving rates, but that the present high rate was reproduced only in the 1930s. A comparison with climate indices indicates that high calving activity coincides with increased Atlantic Water and decreased Polar Water influence on the shelf, warm summers and a negative phase of the North Atlantic Oscillation. Our analysis provides evidence that Helheim Glacier responds to short-term (3-10 years) large-scale oceanic and atmospheric fluctuations. This study has been supported by Geocenter Denmark in financial support to the SEDIMICE project. CSA was supported by the Danish Council for Independent Research | Nature and Universe (Grant no. 09-064954/FNU). FSt was supported by NSF ARC 0909373 and by WHOI’s Ocean and Climate Change Institute and MHRI was supported by the Danish Agency for Science, Technology and Innovation. 2012-06-1