2,885 research outputs found

    elPrep 4 : a multithreaded framework for sequence analysis

    Get PDF
    We present elPrep 4, a reimplementation from scratch of the elPrep framework for processing sequence alignment map files in the Go programming language. elPrep 4 includes multiple new features allowing us to process all of the preparation steps defined by the GATK Best Practice pipelines for variant calling. This includes new and improved functionality for sorting, (optical) duplicate marking, base quality score recalibration, BED and VCF parsing, and various filtering options. The implementations of these options in elPrep 4 faithfully reproduce the outcomes of their counterparts in GATK 4, SAMtools, and Picard, even though the underlying algorithms are redesigned to take advantage of elPrep's parallel execution framework to vastly improve the runtime and resource use compared to these tools. Our benchmarks show that elPrep executes the preparation steps of the GATK Best Practices up to 13x faster on WES data, and up to 7.4x faster for WGS data compared to running the same pipeline with GATK 4, while utilizing fewer compute resources

    Halvade: scalable sequence analysis with MapReduce

    Get PDF
    Motivation: Post-sequencing DNA analysis typically consists of read mapping followed by variant calling. Especially for whole genome sequencing, this computational step is very time-consuming, even when using multithreading on a multi-core machine. Results: We present Halvade, a framework that enables sequencing pipelines to be executed in parallel on a multi-node and/or multi-core compute infrastructure in a highly efficient manner. As an example, a DNA sequencing analysis pipeline for variant calling has been implemented according to the GATK Best Practices recommendations, supporting both whole genome and whole exome sequencing. Using a 15-node computer cluster with 360 CPU cores in total, Halvade processes the NA12878 dataset (human, 100 bp paired-end reads, 50x coverage) in <3 h with very high parallel efficiency. Even on a single, multi-core machine, Halvade attains a significant speedup compared with running the individual tools with multithreading. Availability and implementation: Halvade is written in Java and uses the Hadoop MapReduce 2.0 API. It supports a wide range of distributions of Hadoop, including Cloudera and Amazon EMR

    Virtual Machine Support for Many-Core Architectures: Decoupling Abstract from Concrete Concurrency Models

    Get PDF
    The upcoming many-core architectures require software developers to exploit concurrency to utilize available computational power. Today's high-level language virtual machines (VMs), which are a cornerstone of software development, do not provide sufficient abstraction for concurrency concepts. We analyze concrete and abstract concurrency models and identify the challenges they impose for VMs. To provide sufficient concurrency support in VMs, we propose to integrate concurrency operations into VM instruction sets. Since there will always be VMs optimized for special purposes, our goal is to develop a methodology to design instruction sets with concurrency support. Therefore, we also propose a list of trade-offs that have to be investigated to advise the design of such instruction sets. As a first experiment, we implemented one instruction set extension for shared memory and one for non-shared memory concurrency. From our experimental results, we derived a list of requirements for a full-grown experimental environment for further research

    Multithreaded variant calling in elPrep 5

    Get PDF
    We present elPrep 5, which updates the elPrep framework for processing sequencing alignment/map files with variant calling. elPrep 5 can now execute the full pipeline described by the GATK Best Practices for variant calling, which consists of PCR and optical duplicate marking, sorting by coordinate order, base quality score recalibration, and variant calling using the haplotype caller algorithm. elPrep 5 produces identical BAM and VCF output as GATK4 while significantly reducing the runtime by parallelizing and merging the execution of the pipeline steps. Our benchmarks show that elPrep 5 speeds up the runtime of the variant calling pipeline by a factor 8-16x on both whole-exome and whole-genome data while using the same hardware resources as GATK4. This makes elPrep 5 a suitable drop-in replacement for GATK4 when faster execution times are needed

    An Introduction to Context-Oriented Programming with ContextS

    Full text link

    A note on comonotonicity and positivity of the control components of decoupled quadratic FBSDE

    Get PDF
    In this small note we are concerned with the solution of Forward-Backward Stochastic Differential Equations (FBSDE) with drivers that grow quadratically in the control component (quadratic growth FBSDE or qgFBSDE). The main theorem is a comparison result that allows comparing componentwise the signs of the control processes of two different qgFBSDE. As a byproduct one obtains conditions that allow establishing the positivity of the control process.Comment: accepted for publicatio

    Evaluation of Enterprise Ireland Research, Development and Innovation Programme. ESRI General Papers March 2021.

    Get PDF
    This independent evaluation examined the impact of direct financial supports to firms through the Enterprise Ireland Research, Development and Innovation Programme on the innovation and economic performance of beneficiaries over the period 2007-2018

    Severe early onset preeclampsia: short and long term clinical, psychosocial and biochemical aspects

    Get PDF
    Preeclampsia is a pregnancy specific disorder commonly defined as de novo hypertension and proteinuria after 20 weeks gestational age. It occurs in approximately 3-5% of pregnancies and it is still a major cause of both foetal and maternal morbidity and mortality worldwide1. As extensive research has not yet elucidated the aetiology of preeclampsia, there are no rational preventive or therapeutic interventions available. The only rational treatment is delivery, which benefits the mother but is not in the interest of the foetus, if remote from term. Early onset preeclampsia (<32 weeks’ gestational age) occurs in less than 1% of pregnancies. It is, however often associated with maternal morbidity as the risk of progression to severe maternal disease is inversely related with gestational age at onset2. Resulting prematurity is therefore the main cause of neonatal mortality and morbidity in patients with severe preeclampsia3. Although the discussion is ongoing, perinatal survival is suggested to be increased in patients with preterm preeclampsia by expectant, non-interventional management. This temporising treatment option to lengthen pregnancy includes the use of antihypertensive medication to control hypertension, magnesium sulphate to prevent eclampsia and corticosteroids to enhance foetal lung maturity4. With optimal maternal haemodynamic status and reassuring foetal condition this results on average in an extension of 2 weeks. Prolongation of these pregnancies is a great challenge for clinicians to balance between potential maternal risks on one the eve hand and possible foetal benefits on the other. Clinical controversies regarding prolongation of preterm preeclamptic pregnancies still exist – also taking into account that preeclampsia is the leading cause of maternal mortality in the Netherlands5 - a debate which is even more pronounced in very preterm pregnancies with questionable foetal viability6-9. Do maternal risks of prolongation of these very early pregnancies outweigh the chances of neonatal survival? Counselling of women with very early onset preeclampsia not only comprises of knowledge of the outcome of those particular pregnancies, but also knowledge of outcomes of future pregnancies of these women is of major clinical importance. This thesis opens with a review of the literature on identifiable risk factors of preeclampsia

    Impacts of the Tropical Pacific/Indian Oceans on the Seasonal Cycle of the West African Monsoon

    Get PDF
    The current consensus is that drought has developed in the Sahel during the second half of the twentieth century as a result of remote effects of oceanic anomalies amplified by local land–atmosphere interactions. This paper focuses on the impacts of oceanic anomalies upon West African climate and specifically aims to identify those from SST anomalies in the Pacific/Indian Oceans during spring and summer seasons, when they were significant. Idealized sensitivity experiments are performed with four atmospheric general circulation models (AGCMs). The prescribed SST patterns used in the AGCMs are based on the leading mode of covariability between SST anomalies over the Pacific/Indian Oceans and summer rainfall over West Africa. The results show that such oceanic anomalies in the Pacific/Indian Ocean lead to a northward shift of an anomalous dry belt from the Gulf of Guinea to the Sahel as the season advances. In the Sahel, the magnitude of rainfall anomalies is comparable to that obtained by other authors using SST anomalies confined to the proximity of the Atlantic Ocean. The mechanism connecting the Pacific/Indian SST anomalies with West African rainfall has a strong seasonal cycle. In spring (May and June), anomalous subsidence develops over both the Maritime Continent and the equatorial Atlantic in response to the enhanced equatorial heating. Precipitation increases over continental West Africa in association with stronger zonal convergence of moisture. In addition, precipitation decreases over the Gulf of Guinea. During the monsoon peak (July and August), the SST anomalies move westward over the equatorial Pacific and the two regions where subsidence occurred earlier in the seasons merge over West Africa. The monsoon weakens and rainfall decreases over the Sahel, especially in August.Peer reviewe
    corecore