Modeling truncated pixel values of faint reflections in MicroED images.
The weak pixel counts surrounding the Bragg spots in a diffraction image are important for establishing a model of the background underneath the peak and estimating the reliability of the integrated intensities. Under certain circumstances, particularly with equipment not optimized for low-intensity measurements, these pixel values may be corrupted by corrections applied to the raw image. This can lead to truncation of low pixel counts, resulting in anomalies in the integrated Bragg intensities, such as systematically higher signal-to-noise ratios. A correction for this effect can be approximated by a three-parameter lognormal distribution fitted to the weakly positive-valued pixels at similar scattering angles. The procedure is validated by the improved refinement of an atomic model against structure factor amplitudes derived from corrected micro-electron diffraction (MicroED) images.
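As a rough illustration of what fitting a three-parameter lognormal involves, the sketch below estimates a threshold (shift) `theta` together with the log-scale mean `mu` and standard deviation `sigma` by a profile-likelihood grid search. It is not the procedure from the paper: the function names, the grid range (between the sample minimum and one median-to-minimum distance below it), and the number of grid steps are assumptions made for this example only.

```python
import math
import statistics


def profile_loglik(data, theta):
    """Log-likelihood of a shifted lognormal at threshold theta.

    For a fixed theta, the maximum-likelihood mu and sigma are simply the
    mean and (population) standard deviation of log(x - theta).
    """
    logs = [math.log(x - theta) for x in data]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    loglik = -sum(logs) - 0.5 * n * math.log(2 * math.pi * var) - 0.5 * n
    return loglik, mu, math.sqrt(var)


def fit_lognormal3(data, steps=200):
    """Return (theta, mu, sigma) maximizing the profile likelihood on a grid.

    Assumption for this sketch: theta lies below the sample minimum by at
    most the distance between the sample median and the sample minimum.
    """
    x_min = min(data)
    width = statistics.median(data) - x_min
    if width <= 0:
        raise ValueError("data must not be concentrated at its minimum")
    best = None
    for k in range(1, steps + 1):
        theta = x_min - width * k / steps  # always strictly below x_min
        loglik, mu, sigma = profile_loglik(data, theta)
        if best is None or loglik > best[0]:
            best = (loglik, theta, mu, sigma)
    return best[1], best[2], best[3]
```

Calling `fit_lognormal3(values)` on a list of pixel values returns the three estimates. The grid stops short of the sample minimum on purpose, because the three-parameter lognormal likelihood is unbounded as the threshold approaches the smallest observation.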
PMLB: A Large Benchmark Suite for Machine Learning Evaluation and Comparison
The selection, development, or comparison of machine learning methods in data
mining can be a difficult task based on the target problem and goals of a
particular study. Numerous publicly available real-world and simulated
benchmark datasets have emerged from different sources, but their organization
and adoption as standards have been inconsistent. As such, selecting and
curating specific benchmarks remains an unnecessary burden on machine learning
practitioners and data scientists. The present study introduces an accessible,
curated, and developing public benchmark resource to facilitate identification
of the strengths and weaknesses of different machine learning methodologies. We
compare meta-features among the current set of benchmark datasets in this
resource to characterize the diversity of available data. Finally, we apply a
number of established machine learning methods to the entire benchmark suite
and analyze how datasets and algorithms cluster in terms of performance. This
work is an important first step towards understanding the limitations of
popular benchmarking suites and developing a resource that connects existing
benchmarking standards to more diverse and efficient standards in the future.
Comment: 14 pages, 5 figures, submitted for review to JML
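To make the idea of dataset meta-features concrete, the sketch below computes a few simple ones for a labeled dataset. The function name and the particular features chosen (instance count, feature count, class count, and the fraction of samples in the largest class) are assumptions for illustration; the abstract does not specify which meta-features the study compares.

```python
from collections import Counter


def meta_features(rows, labels):
    """Summarize a classification dataset with a few basic meta-features.

    rows   : list of equal-length feature vectors
    labels : list of class labels, one per row
    """
    if not rows or len(rows) != len(labels):
        raise ValueError("rows and labels must be non-empty and equal in length")
    class_counts = Counter(labels)
    n_instances = len(rows)
    return {
        "n_instances": n_instances,
        "n_features": len(rows[0]),
        "n_classes": len(class_counts),
        # Share of samples in the most common class (1/n_classes if balanced).
        "majority_fraction": max(class_counts.values()) / n_instances,
    }
```

Computing such a summary for each dataset in a suite gives one vector per dataset, and those vectors can then be compared or clustered to gauge how diverse the collection is.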
Automating biomedical data science through tree-based pipeline optimization
Over the past decade, data science and machine learning have grown from a
mysterious art form to a staple tool across a variety of fields in academia,
business, and government. In this paper, we introduce the concept of tree-based
pipeline optimization for automating one of the most tedious parts of machine
learning---pipeline design. We implement a Tree-based Pipeline Optimization
Tool (TPOT) and demonstrate its effectiveness on a series of simulated and
real-world genetic data sets. In particular, we show that TPOT can build
machine learning pipelines that achieve competitive classification accuracy and
discover novel pipeline operators---such as synthetic feature
constructors---that significantly improve classification accuracy on these data
sets. We also highlight the current challenges to pipeline optimization, such
as the tendency to produce pipelines that overfit the data, and suggest future
research paths to overcome these challenges. As such, this work represents an
early step toward fully automating machine learning pipeline design.
Comment: 16 pages, 5 figures, to appear in EvoBIO 2016 proceeding
AWARE: Platform for Autonomous self-deploying and operation of Wireless sensor-actuator networks cooperating with unmanned AeRial vehiclEs
This paper presents the AWARE platform, which seeks to enable the cooperation of autonomous aerial vehicles with ground wireless sensor-actuator networks comprising both static and mobile nodes carried by vehicles or people. In particular, the paper presents the middleware, the wireless sensor network, the node deployment by means of an autonomous helicopter, and the surveillance and tracking functionalities of the platform. Furthermore, the paper presents the first general experiments of the AWARE project, which took place in March 2007 with the assistance of the Seville fire brigades.
MicroED data collection and processing.
MicroED, a method at the intersection of X-ray crystallography and electron cryo-microscopy, has rapidly progressed by exploiting advances in both fields and has already been successfully employed to determine the atomic structures of several proteins from sub-micron-sized, three-dimensional crystals. A major limiting factor in X-ray crystallography is the requirement for large and well ordered crystals. By permitting electron diffraction patterns to be collected from much smaller crystals, or even single well ordered domains of large crystals composed of several small mosaic blocks, MicroED has the potential to overcome the limiting size requirement and enable structural studies on difficult-to-crystallize samples. This communication details the steps for sample preparation, data collection and reduction necessary to obtain refined, high-resolution, three-dimensional models by MicroED, and presents some of its unique challenges.
