Connectionist Theory Refinement: Genetically Searching the Space of Network Topologies
An algorithm that learns from a set of examples should ideally be able to
exploit the available resources of (a) abundant computing power and (b)
domain-specific knowledge to improve its ability to generalize. Connectionist
theory-refinement systems, which use background knowledge to select a neural
network's topology and initial weights, have proven to be effective at
exploiting domain-specific knowledge; however, most do not exploit available
computing power. This weakness occurs because they lack the ability to refine
the topology of the neural networks they produce, thereby limiting
generalization, especially when given impoverished domain theories. We present
the REGENT algorithm, which uses (a) domain-specific knowledge to help create an
initial population of knowledge-based neural networks and (b) genetic operators
of crossover and mutation (specifically designed for knowledge-based networks)
to continually search for better network topologies. Experiments on three
real-world domains indicate that our new algorithm is able to significantly
increase generalization compared to a standard connectionist theory-refinement
system, as well as our previous algorithm for growing knowledge-based networks.
Comment: See http://www.jair.org/ for any accompanying file
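To make the search loop described in this abstract concrete, here is a minimal sketch of a genetic search over network topologies. It is not REGENT: the abstract says REGENT's crossover and mutation operators are specifically designed for knowledge-based networks and that its initial population comes from a domain theory, and neither is reproduced here. The tuple-of-layer-sizes encoding, `toy_fitness`, and all function names are assumptions made for illustration only.

```python
import random


def toy_fitness(topology):
    # Hypothetical stand-in for validation accuracy (an assumption, not from
    # the paper): favors about 8 units per hidden layer and fewer layers.
    return -sum((n - 8) ** 2 for n in topology) - 2 * len(topology)


def crossover(a, b, rng):
    # Splice a prefix of one parent onto a suffix of the other.
    child = a[: rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
    return child if child else a  # always keep at least one hidden layer


def mutate(topology, rng):
    # Grow or shrink one randomly chosen hidden layer by a single unit.
    layers = list(topology)
    i = rng.randrange(len(layers))
    layers[i] = max(1, layers[i] + rng.choice([-1, 1]))
    return tuple(layers)


def genetic_search(initial_population, fitness, generations=50, seed=0):
    # Expects at least two candidate topologies in the initial population.
    rng = random.Random(seed)
    population = list(initial_population)
    size = len(population)
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism), refill with offspring.
        population.sort(key=fitness, reverse=True)
        parents = population[: max(2, size // 2)]
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = parents + children
    return max(population, key=fitness)


# Hypothetical starting topologies: tuples of hidden-layer sizes.
start = [(2,), (20, 3), (5, 5, 5), (12,)]
best = genetic_search(start, toy_fitness)
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases. In a real system the fitness call would train and validate each candidate network, which is where the abundant computing power mentioned in the abstract would be spent.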
Genetic Variants Improve Breast Cancer Risk Prediction on Mammograms
Several recent genome-wide association studies have identified genetic variants associated with breast cancer. However, how much these genetic variants may help advance breast cancer risk prediction based on other clinical features, like mammographic findings, is unknown. We conducted a retrospective case-control study, collecting mammographic findings and high-frequency/low-penetrance genetic variants from an existing personalized medicine data repository. A Bayesian network was developed using Tree Augmented Naive Bayes (TAN) by training on the mammographic findings, with and without the 22 genetic variants collected. We analyzed the predictive performance using the area under the ROC curve, and found that the genetic variants significantly improved breast cancer risk prediction on mammograms. We also identified the interaction effect between the genetic variants and collected mammographic findings in an attempt to link genotype to mammographic phenotype to better understand disease patterns, mechanisms, and/or natural history.
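The evaluation metric named in this abstract, the area under the ROC curve, can be sketched in a few lines. The sketch below uses the pairwise (Mann-Whitney) formulation: the AUC equals the fraction of case-control pairs in which the case receives the higher score, with ties counted as half. It does not implement the Tree Augmented Naive Bayes model, and the labels and scores are invented for illustration; they are not data or results from the study.

```python
def roc_auc(labels, scores):
    # Split scores by class: 1 = case, 0 = control.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Count case-control pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data (an assumption, not from the study): the same six
# subjects scored by two hypothetical models.
labels = [1, 1, 1, 0, 0, 0]
scores_model_a = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
scores_model_b = [0.9, 0.7, 0.6, 0.5, 0.3, 0.2]

auc_a = roc_auc(labels, scores_model_a)
auc_b = roc_auc(labels, scores_model_b)
```

Comparing two models trained on different feature sets then reduces to comparing `auc_a` with `auc_b` on the same held-out subjects. Whether a difference is significant requires a separate statistical test, which this sketch does not include.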
Investigating Protein Evolution Through Sequence Space Using a Biophysical Lens
How do the underlying biophysical properties of proteins dictate the “rules” that govern molecular evolution? Understanding the principles and mechanisms that determine which evolutionary trajectories proteins take is crucial to protecting humans against viral protein evolution and to developing custom therapeutic drugs through protein engineering. Although many approaches have been developed to investigate the process of protein evolution, a deep understanding of the relationship between sequence space and protein biophysics can alleviate key deficiencies in our knowledge. What is the underlying distribution of functional proteins in sequence space? Do specific biophysical properties dictate the interconnectedness of these functional proteins? How does the protein energy landscape change across evolutionary time, and how can that inform our understanding of evolution? This dissertation will explore two methods of answering these questions: (1) high-throughput mutagenesis and phenotype characterization to explore sequence space using fluorescent proteins, and (2) ancestral sequence reconstruction linked to a biophysical lens using protein energy landscapes.
The Immortal Life of Henrietta Lacks: How a Best-Seller Diffused Online
This study describes how information spread on the internet by examining diffusion, framing and source use surrounding coverage of the 2010 best-selling book, The Immortal Life of Henrietta Lacks. The book presented a rare opportunity to view how a story about science, discovery and race became a best-seller within weeks after its publication. Through a mixed-methods and case study approach, the author examines patterns of coverage using Google Alerts that traced the book's online coverage in the first six months of its release. The author found that online information clustered around several themes with the most prominent describing aspects of science and scientific discovery, followed by the book's characterization as a best seller or good read. Another recurring theme centered on issues surrounding exploitation in human research. In addition, the study reveals that sources who set the frame for coverage were most likely to be media figures, including Oprah Winfrey, Alan Ball and HBO films, in addition to newspapers and individual journalists and science writers. By examining the relationship of online frames with sources, the author found that a diversity of frames is paired with key sources: that is, multiple themes co-occur with source mentions, although the themes may not have been generated by the sources themselves. Rather, sources are linked to narrative frames by others who generate online coverage. The author concludes that, while key sources initially set a message's frame, once diffused, the message may take on other qualities.
