Natural Notation for the Domestic Internet of Things
This study explores the use of natural language to give instructions that
might be interpreted by Internet of Things (IoT) devices in a domestic `smart
home' environment. We start from the proposition that reminders can be
considered as a type of end-user programming, in which the executed actions
might be performed either by an automated agent or by the author of the
reminder. We conducted an experiment in which people wrote sticky notes
specifying future actions in their home. In different conditions, these notes
were addressed to themselves, to others, or to a computer agent. We analyse the
linguistic features and strategies that are used to achieve these tasks,
including the use of graphical resources as an informal visual language. The
findings provide a basis for design guidance related to end-user development
for the Internet of Things.
Comment: Proceedings of the 5th International Symposium on End-User
Development (IS-EUD), Madrid, Spain, May, 201
Longest Common Extensions in Sublinear Space
The longest common extension problem (LCE problem) is to construct a data
structure for an input string $T$ of length $n$ that supports LCE$(i,j)$
queries. Such a query returns the length of the longest common prefix of the
suffixes starting at positions $i$ and $j$ in $T$. This classic problem has a
well-known solution that uses $O(n)$ space and $O(1)$ query time. In this paper
we show that for any trade-off parameter $1 \leq \tau \leq n$, the problem can
be solved in $O(n/\tau)$ space and $O(\tau)$ query time. This
significantly improves the previously best known time-space trade-offs, and
almost matches the best known time-space product lower bound.
Comment: An extended abstract of this paper has been accepted to CPM 201
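To make the query concrete, here is a minimal sketch of an LCE query answered by direct character comparison. It is not the data structure from the paper: it stores nothing beyond the string itself and can take time linear in the string length per query, so it only illustrates what a query returns. The function name `lce` and the 0-based positions are assumptions made for this example.

```python
def lce(text: str, i: int, j: int) -> int:
    """Return the length of the longest common prefix of text[i:] and text[j:].

    Naive baseline (an illustration, not the paper's method): no preprocessing,
    and up to O(len(text)) character comparisons per query.
    Positions are 0-based, which is an assumption of this sketch.
    """
    n = len(text)
    length = 0
    # Extend the match one character at a time until a mismatch or the end.
    while i + length < n and j + length < n and text[i + length] == text[j + length]:
        length += 1
    return length


if __name__ == "__main__":
    # Suffixes "abracadabra" and "abra" share the prefix "abra".
    print(lce("abracadabra", 0, 7))
```

The trade-off described in the abstract sits between this extreme and the classic linear-space, constant-time solution.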
Improved Algorithms for Approximate String Matching (Extended Abstract)
The problem of approximate string matching is important in many different
areas such as computational biology, text processing and pattern recognition. A
great effort has been made to design efficient algorithms addressing several
variants of the problem, including comparison of two strings, approximate
pattern identification in a string or calculation of the longest common
subsequence that two strings share.
We designed an output-sensitive algorithm solving the edit distance problem
between two strings of lengths n and m respectively in time
O((s-|n-m|)min(m,n,s)+m+n) and linear space, where s is the edit distance
between the two strings. This worst-case time bound sets the quadratic factor
of the algorithm independent of the longest string length and improves existing
theoretical bounds for this problem. The implementation of our algorithm also
excels in practice, especially in cases where the two strings compared differ
significantly in length. Source code of our algorithm is available at
http://www.cs.miami.edu/\~dimitris/edit_distance
Comment: 10 page
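For reference, the edit distance s that appears in the bound can be computed with the textbook dynamic program sketched below. This is the standard quadratic-time recurrence, not the output-sensitive algorithm of the abstract. It assumes unit costs for insertion, deletion and substitution, and the function name `edit_distance` is chosen for this example only.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance via the classic two-row dynamic program.

    Runs in O(len(a) * len(b)) time and O(min(len(a), len(b))) space.
    This is a baseline sketch, not the output-sensitive algorithm.
    """
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in b so each row stays small
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Because each insertion or deletion changes the length by one, s is never smaller than |n-m|, which is why the factor s-|n-m| in the stated bound is non-negative.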
The hidden perils of read mapping as a quality assessment tool in genome sequencing
This article provides a comparative analysis of the various methods of genome sequencing, focusing on verification of assembly quality. The results of a comparative assessment of various de novo assembly tools, as well as sequencing technologies, are presented using a recently completed sequence of the genome of Lactobacillus fermentum 3872. In particular, the quality of assemblies is assessed using CLC Genomics Workbench read mapping and optical mapping developed by OpGen. Over-extension of contigs without prior knowledge of contig location can lead to misassembled contigs, even when commonly used quality indicators such as read mapping suggest that a contig is well assembled. Precautions must also be taken when using long-read sequencing technology, which may also lead to misassembled contigs.
Improving Phrap-Based Assembly of the Rat Using “Reliable” Overlaps
The assembly methods used for whole-genome shotgun (WGS) data have a major impact on the quality of resulting draft genomes. We present a novel algorithm to generate a set of “reliable” overlaps based on identifying repeat k-mers. To demonstrate the benefits of using reliable overlaps, we have created a version of the Phrap assembly program that uses only overlaps from a specific list. We call this version PhrapUMD. Integrating PhrapUMD and our “reliable-overlap” algorithm with the Baylor College of Medicine assembler, Atlas, we assemble the BACs from the Rattus norvegicus genome project. Starting with the same data as the Nov. 2002 Atlas assembly, we compare our results and the Atlas assembly to the 4.3 Mb of rat sequence in the 21 BACs that have been finished. Our version of the draft assembly of the 21 BACs increases the coverage of finished sequence from 93.4% to 96.3%, while simultaneously reducing the base error rate from 4.5 to 1.1 errors per 10,000 bases. There are a number of ways of assessing the relative merits of assemblies when the finished sequence is available. If one views the overall quality of an assembly as proportional to the inverse of the product of the error rate and sequence missed, then the assembly presented here is seven times better. The UMD Overlapper with options for reliable overlaps is available from the authors at http://www.genome.umd.edu. We also provide the changes to the Phrap source code enabling it to use only the reliable overlaps
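The "seven times better" figure can be checked from the numbers quoted in the abstract. The sketch below assumes that "sequence missed" means 100% minus the coverage of finished sequence, which is an interpretation rather than something the abstract states. The function and variable names are illustrative.

```python
def assembly_quality(errors_per_10k: float, coverage_percent: float) -> float:
    """Quality taken as the inverse of (error rate x sequence missed).

    Assumption of this sketch: sequence missed = 100 - coverage (in percent).
    Units cancel when two assemblies are compared as a ratio.
    """
    missed_percent = 100.0 - coverage_percent
    return 1.0 / (errors_per_10k * missed_percent)


# Figures quoted in the abstract for the 21 finished BACs.
atlas = assembly_quality(errors_per_10k=4.5, coverage_percent=93.4)
phrap_umd = assembly_quality(errors_per_10k=1.1, coverage_percent=96.3)

ratio = phrap_umd / atlas  # (4.5 * 6.6) / (1.1 * 3.7)

if __name__ == "__main__":
    print(f"quality ratio: {ratio:.2f}")
```

Under that assumption the ratio is (4.5 × 6.6) / (1.1 × 3.7) ≈ 7.3, consistent with the abstract's "seven times better".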
Insights into enterotoxigenic Escherichia coli diversity in Bangladesh utilizing genomic epidemiology
Enzyme‐assisted aqueous extraction of Kalahari melon seed oil: optimization using response surface methodology
Enzymatic extraction of oil from Kalahari melon seeds was investigated and evaluated by response surface methodology (RSM). Two commercial protease enzyme products were used separately: Neutrase® 0.8 L and Flavourzyme® 1000 L from Novozymes (Bagsvaerd, Denmark). RSM was applied to model and optimize the reaction conditions, namely concentration of enzyme (20–50 g kg−1 of seed mass), initial pH of mixture (pH 5–9), incubation temperature (40–60 °C), and incubation time (12–36 h). Well-fitting models were successfully established for both enzymes: Neutrase 0.8 L (R² = 0.9410) and Flavourzyme 1000 L (R² = 0.9574) through multiple linear regressions with backward elimination. Incubation time was the most significant reaction factor on oil yield for both enzymes. The optimal conditions for Neutrase 0.8 L were: an enzyme concentration of 25 g kg−1, an initial pH of 7, a temperature of 58 °C and an incubation time of 31 h with constant shaking at 100 rpm. Centrifuging the mixture at 8,000g for 20 min separated the oil with a recovery of 68.58 ± 3.39%. The optimal conditions for Flavourzyme 1000 L were: an enzyme concentration of 21 g kg−1, an initial pH of 6, a temperature of 50 °C and an incubation time of 36 h. These optimum conditions yielded a 71.55 ± 1.28% oil recovery.
Conducting a large, multi-site survey about patients' views on broad consent: challenges and solutions
BACKGROUND: As biobanks play an increasing role in the genomic research that will lead to precision medicine, input from diverse and large populations of patients in a variety of health care settings will be important in order to successfully carry out such studies. One important topic is participants’ views towards consent and data sharing, especially since the 2011 Advanced Notice of Proposed Rulemaking (ANPRM), and subsequently the 2015 Notice of Proposed Rulemaking (NPRM) were issued by the Department of Health and Human Services (HHS) and Office of Science and Technology Policy (OSTP). These notices required that participants consent to research uses of their de-identified tissue samples and most clinical data, and allowed such consent to be obtained in a one-time, open-ended or “broad” fashion. Conducting a survey across multiple sites provides clear advantages over either a single-site survey or the use of a large online database, and is a potentially powerful way of understanding the views of diverse populations on this topic.
METHODS: A workgroup of the Electronic Medical Records and Genomics (eMERGE) Network, a national consortium of 9 sites (13 separate institutions, 11 clinical centers) supported by the National Human Genome Research Institute (NHGRI) that combines DNA biorepositories with electronic medical record (EMR) systems for large-scale genetic research, conducted a survey to understand patients’ views on consent, sample and data sharing for future research, biobank governance, data protection, and return of research results.
RESULTS: Working across 9 sites to design and conduct a national survey presented challenges in organization, meeting human subjects guidelines at each institution, and survey development and implementation. The challenges were met through a committee structure to address each aspect of the project with representatives from all sites. Each committee’s output was integrated into the overall survey plan. A number of site-specific issues were successfully managed allowing the survey to be developed and implemented uniformly across 11 clinical centers.
CONCLUSIONS: Conducting a survey across a number of institutions with different cultures and practices is a methodological and logistical challenge. With a clear infrastructure, collaborative attitudes, excellent lines of communication, and the right expertise, this can be accomplished successfully.
Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding
We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics.
Dynamical Boson Stars
The idea of stable, localized bundles of energy has strong appeal as a model
for particles. In the 1950s John Wheeler envisioned such bundles as smooth
configurations of electromagnetic energy that he called geons, but none
were found. Instead, particle-like solutions were found in the late 1960s with
the addition of a scalar field, and these were given the name boson
stars. Since then, boson stars have found use in a wide variety of models as sources
of dark matter, as black hole mimickers, in simple models of binary systems,
and as a tool in finding black holes in higher dimensions with only a single
Killing vector. We discuss important varieties of boson stars, their dynamic
properties, and some of their uses, concentrating on recent efforts.
Comment: 79 pages, 25 figures, invited review for Living Reviews in
Relativity; major revision in 201
