767 research outputs found
Integrative Bioinformatics Approaches toward Systems-level Understanding of Breast Cancers
Genome-wide expression profiling technologies, such as microarrays and next-generation sequencing, have created unprecedented opportunities to study complex diseases at the systems level. However, the ever-increasing volume of high-throughput genomic data is extremely heterogeneous, and each individual experiment captures a different aspect of the phenotype of interest. Meanwhile, numerous bioinformatics tools have been developed for genomic data analysis. Current methods for integrating data and analysis tools are scattered. Developing a systematic approach for the integration of various experiments would greatly benefit the research community.
In this thesis, we present a variety of integrative, data-driven bioinformatics strategies to facilitate data analysis and derive biological knowledge from gene expression data of breast cancer. First, we show that meta-analysis of multiple microarray profiles of estrogen receptor positive and estrogen receptor negative breast cancers reveals important biological functions not found by the individual analyses. By applying a network analysis, we identify the change of gene expression between Luminal A and Luminal B breast cancer subtypes and the genes representing that change. Next, we demonstrate a bioinformatics strategy to detect genes that play important roles in endocrine resistance in estrogen receptor positive breast cancers. By combining the analyses of differentially expressed genes, enriched gene sets, co-expressed genes, the expression of drug-treated cancer cell lines, and clinical information, we demonstrate how our proposed strategy identifies the key genes in tamoxifen-resistant tumors and potential new therapeutics against the resistance. Lastly, by using matched mRNA and microRNA expression, we develop an integrative approach for the prediction of important transcription factors and microRNAs that are involved in dysregulated pathways in breast cancer. Our method employs random forests and robust rank aggregation to derive a reliable importance ranking for candidate regulators predicted by other bioinformatics tools. In conclusion, this thesis demonstrates that the proposed integrative bioinformatics strategies can efficiently combine heterogeneous genomic data and provide new insights on breast cancers.
Automated Force Field Parameterization for Nonpolarizable and Polarizable Atomic Models Based on Ab Initio Target Data
Classical molecular dynamics (MD) simulations based on atomistic models are increasingly used to study a wide range of biological systems. A prerequisite for meaningful results from such simulations is an accurate molecular mechanical force field. Most biomolecular simulations are currently based on the widely used AMBER and CHARMM force fields, which were parametrized and optimized to cover a small set of basic compounds corresponding to the natural amino acids and nucleic acid bases. Atomic models of additional compounds are commonly generated by analogy to the parameter set of a given force field. While this procedure yields models that are internally consistent, the accuracy of the resulting models can be limited. In this work, we propose a method, general automated atomic model parameterization (GAAMP), for automatically generating the parameters of atomic models of small molecules using the results from ab initio quantum mechanical (QM) calculations as target data. Force fields that were previously developed for a wide range of model compounds serve as initial guesses, although any of the final parameters can be optimized. The electrostatic parameters (partial charges, polarizabilities, and shielding) are optimized on the basis of the QM electrostatic potential (ESP) and, if applicable, the interaction energies between the compound and water molecules. The soft dihedrals are automatically identified and parametrized by targeting QM dihedral scans as well as the energies of stable conformers. To validate the approach, the solvation free energy is calculated for more than 200 small molecules and MD simulations of three different proteins are carried out.
Multimedia Model for Polycyclic Aromatic Hydrocarbons (PAHs) and Nitro-PAHs in Lake Michigan
Polycyclic aromatic hydrocarbon (PAH) contamination in the U.S. Great Lakes has long been of concern, but information regarding the current sources, distribution, and fate of PAH contamination is lacking, and very little information exists for the potentially more toxic nitro-derivatives of PAHs (NPAHs). This study uses fugacity, food web, and Monte Carlo models to examine 16 PAHs and five NPAHs in Lake Michigan, and to derive PAH and NPAH emission estimates. Good agreement was found between predicted and measured PAH concentrations in air, but concentrations in water and sediment were generally underpredicted, possibly due to incorrect parameter estimates for degradation rates, discharges to water, or inputs from tributaries. The food web model matched measurements of heavier PAHs (≥5 rings) in lake trout, but lighter PAHs (≤4 rings) were overpredicted, possibly due to overestimates of metabolic half-lives or gut/gill absorption efficiencies. Derived PAH emission rates peaked in the 1950s, and rates now approach those in the mid-19th century. The derived emission rates far exceed those in the source inventories, suggesting the need to reconcile differences and reduce uncertainties. Although additional measurements and physicochemical data are needed to reduce uncertainties and for validation purposes, the models illustrate the behavior of PAHs and NPAHs in Lake Michigan, and they provide useful and potentially diagnostic estimates of emission rates.
Langevin Dynamics Simulations of the Diffusion of Molecular Knots in Tensioned Polymer Chains
Motivated by recent experiments in which knots have been tied in individual biopolymer molecules, we use Langevin dynamics simulations to study the diffusion of a knot along a tensioned polymer chain. We find that the dependence of the knot diffusion coefficient on the tension can be non-monotonic. This behavior can be explained by a model in which the motion of the knot involves cooperative displacement of a local knot region. At low tension, the overall viscous drag force that acts on the knot region is proportional to the number N of monomers that participate in the knot, which decreases as the tension is increased, leading to faster diffusion. At high tension, the knot becomes tight and its dynamics are dominated by the chain's internal friction, which increases with increasing tension, thereby slowing down the knot diffusion. This model is further supported by the observation that the knot diffusion coefficient measured across a set of different knot types is inversely proportional to N. We propose that the lack of tension dependence of the knot diffusion coefficients measured in recent experiments is due to the fact that the experimental values of the tension are close to the turnover between the high- and low-force regimes.
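The low-tension argument in this abstract (drag proportional to N, hence a diffusion coefficient inversely proportional to N) can be illustrated with a minimal sketch. It is not the authors' knotted-chain simulation: it assumes the knot region behaves as a single free particle in one dimension undergoing overdamped Langevin (Brownian) motion, with a hypothetical per-monomer friction `zeta0`, thermal energy `kT`, and time step `dt` chosen only for illustration. The diffusion coefficient is estimated from the mean squared displacement, MSD = 2 D t.

```python
import math
import random


def estimate_diffusion(n_monomers, n_walkers=2000, n_steps=200, dt=0.01,
                       kT=1.0, zeta0=1.0, seed=0):
    """Estimate D for a free 1D Brownian particle whose friction is N * zeta0.

    All parameter values are illustrative assumptions, not taken from the source.
    """
    rng = random.Random(seed)
    zeta = n_monomers * zeta0          # assumed drag: proportional to N
    d_input = kT / zeta                # Einstein relation D = kT / zeta
    sigma = math.sqrt(2.0 * d_input * dt)  # std. dev. of one overdamped step
    total_sq = 0.0
    for _ in range(n_walkers):
        x = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, sigma)  # random thermal kick, no external force
        total_sq += x * x
    msd = total_sq / n_walkers
    return msd / (2.0 * n_steps * dt)  # invert MSD = 2 D t


d10 = estimate_diffusion(10, seed=0)
d20 = estimate_diffusion(20, seed=1)
print(f"D(N=10) ~ {d10:.4f}, D(N=20) ~ {d20:.4f}, ratio ~ {d10 / d20:.2f}")
```

Under these assumptions, doubling N should roughly halve the estimated diffusion coefficient. The high-tension regime described in the abstract, where internal friction dominates, is not captured by this sketch.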
DataSheet1_Effects of total dissolved gas supersaturation and sediment on environmental DNA persistence of grass carp (Ctenopharyngodon idella) in water.pdf
Environmental DNA (eDNA) technology has become an alternative tool for monitoring aquatic communities because it is sensitive, economical, and non-invasive. However, the application of this technique is often limited by the complexity of environmental conditions, which often poses a barrier to the transmission of biological information. Here, we conducted a series of experiments with grass carp as the target species to evaluate the effects of total dissolved gas (TDG) supersaturation and sediment on the persistence of eDNA under different flow conditions. The results showed that TDG supersaturation promoted eDNA decay in still water but had no significant effect in flowing water, where TDG dissipated rapidly. The presence of sediment accelerated the decay of eDNA regardless of the flow conditions. The grass carp eDNA showed an exponential decay pattern in water, and the decay rate constant decreased gradually with time. Our study highlights the importance of integrating experimental results with the natural environment and provides an important reference for species monitoring using eDNA technology in aquatic ecosystems where high dams have been built.
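The exponential decay pattern mentioned in this abstract is commonly written as C(t) = C0 · exp(−k t). A minimal sketch of estimating k by least squares on ln C versus t follows. The time points, concentrations, and rate constant are synthetic values invented for illustration, not data from the study, and a single constant k is a simplification, since the abstract reports that the decay rate constant decreased over time.

```python
import math


def fit_decay_rate(times, concentrations):
    """Fit ln(C) = ln(C0) - k * t by ordinary least squares; return (k, C0)."""
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return -slope, math.exp(intercept)


# Synthetic, noise-free example (hypothetical values, hours and copies per litre).
times = [0, 6, 12, 24, 48, 72]
true_k, true_c0 = 0.05, 1000.0
conc = [true_c0 * math.exp(-true_k * t) for t in times]

k, c0 = fit_decay_rate(times, conc)
half_life = math.log(2) / k  # time for the concentration to halve
print(f"k = {k:.4f} per hour, C0 = {c0:.1f}, half-life = {half_life:.1f} h")
```

To probe a time-varying rate constant, one option would be to apply the same fit separately to early and late time windows and compare the resulting k values.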
Data_Sheet_1_Magnitude and determinants of plant root hydraulic redistribution: A global synthesis analysis.xls
Plant root hydraulic redistribution (HR) has been widely recognized as a phenomenon that helps alleviate vegetation drought stress. However, a systematic assessment of the magnitude of HR and its drivers at the global scale is lacking. We collected 37 peer-reviewed papers (comprising 47 research sites) published between 1900 and 2018 and comprehensively analyzed the magnitude of HR and its underlying factors. We used a weighting method to analyze HR magnitude and its effect on plant transpiration. Machine learning algorithms (boosted regression trees) and structural equation modeling were used to determine the influence of each factor on HR magnitude. We found that the magnitude of HR was 0.249 mm H2O d−1 (95% CI, 0.113–0.384) and its contribution to plant transpiration was 27.4% (3–79%). HR varied significantly among different terrestrial biomes and mainly occurred in forests with drier conditions, such as temperate forest ecosystems (HR = 0.502 mm H2O d−1), where HR was significantly higher than in other ecosystems (p < 0.01). The magnitude of HR in angiosperms was significantly higher than that in gymnosperms (p < 0.05). The mean magnitude of HR first increased and then decreased with an increase in humidity index; conversely, the mean magnitude of HR decreased with an increase in water table depth. HR was significantly positively correlated with root length and transpiration. Plant characteristics and environmental factors jointly accounted for 61.0% of the variation in HR. Plant transpiration was the major factor that directly influenced HR (43.1% relative importance; p < 0.001), and soil texture was an important indirect driver of HR. Our synthesis offers a comprehensive perspective of how plant characteristics and environmental factors influence HR magnitude.
