Online Pattern Matching for String Edit Distance with Moves
Edit distance with moves (EDM) is a string-to-string distance measure that
includes substring moves in addition to ordinary editing operations to turn one
string into the other. Although optimizing EDM is intractable, it has many
applications, especially in error detection. Edit sensitive parsing (ESP) is an
efficient parsing algorithm that guarantees an upper bound of parsing
discrepancies between different appearances of the same substrings in a string.
ESP can be used for computing an approximate EDM as the L1 distance between
characteristic vectors built by node labels in parsing trees. However, ESP is
not applicable to streaming text data, where the whole text is unknown in
advance. We present an online ESP (OESP) that enables online pattern
matching for EDM. OESP builds a parse tree for a streaming text and computes
the L1 distance between characteristic vectors in an online manner. For the
space-efficient computation of EDM, OESP directly encodes the parse tree into a
succinct representation by leveraging the idea behind recent results of a
dynamic succinct tree. We experimentally test OESP on the ability to compute
EDM in an online manner on benchmark datasets, and we show OESP's efficiency.
Comment: This paper has been accepted to the 21st edition of the International
Symposium on String Processing and Information Retrieval (SPIRE 2014).
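The distance step described in this abstract can be illustrated with a minimal sketch. It shows only the L1 distance between two sparse count vectors; it does not implement ESP or OESP parsing. The node labels (`X1`, `X2`, ...) and the function name `l1_distance` are hypothetical stand-ins for the label counts a parse tree would produce.

```python
from collections import Counter

def l1_distance(u, v):
    """L1 distance between two sparse count vectors; missing labels count as 0."""
    labels = set(u) | set(v)
    return sum(abs(u.get(label, 0) - v.get(label, 0)) for label in labels)

# Hypothetical node-label multisets, standing in for the labels of two parse trees.
vec_a = Counter(["X1", "X1", "X2", "X3"])
vec_b = Counter(["X1", "X2", "X2", "X4"])

print(l1_distance(vec_a, vec_b))
```

Keeping the vectors sparse (a mapping from label to count) means the cost depends on the number of distinct labels rather than on the size of the label universe.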
Composite repetition-aware data structures
In highly repetitive strings, like collections of genomes from the same
species, distinct measures of repetition all grow sublinearly in the length of
the text, and indexes targeted to such strings typically depend only on one of
these measures. We describe two data structures whose size depends on multiple
measures of repetition at once, and that provide competitive tradeoffs between
the time for counting and reporting all the exact occurrences of a pattern, and
the space taken by the structure. The key component of our constructions is the
run-length encoded BWT (RLBWT), which takes space proportional to the number of
BWT runs: rather than augmenting RLBWT with suffix array samples, we combine it
with data structures from LZ77 indexes, which take space proportional to the
number of LZ77 factors, and with the compact directed acyclic word graph
(CDAWG), which takes space proportional to the number of extensions of maximal
repeats. The combination of CDAWG and RLBWT enables also a new representation
of the suffix tree, whose size depends again on the number of extensions of
maximal repeats, and that is powerful enough to support matching statistics and
constant-space traversal.
Comment: (the name of the third co-author was inadvertently omitted from the
previous version)
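To make the notion of "BWT runs" concrete, here is a small sketch that builds the BWT by sorting rotations and then run-length encodes it. This is a quadratic-space teaching construction, not the index described in the abstract; the function names and the `$` sentinel are assumptions made for illustration.

```python
from itertools import groupby

def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (illustrative, not efficient)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def run_length_encode(s):
    """Collapse each maximal run of equal characters into a (char, length) pair."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]

def rlbwt(text):
    """Run-length encoded BWT: its length is the number of BWT runs."""
    return run_length_encode(bwt(text))

print(bwt("banana"))
print(rlbwt("banana"))
```

On repetitive inputs the number of pairs returned by `rlbwt` tends to grow much more slowly than the text length, which is the quantity the RLBWT's space is proportional to.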
One-dimensional staged self-assembly
17th International Conference, DNA 17, Pasadena, CA, USA, September 19-23, 2011. Proceedings. We introduce the problem of staged self-assembly of one-dimensional nanostructures, which becomes interesting when the elements are labeled (e.g., representing functional units that must be placed at specific locations). In a restricted model in which each operation has a single terminal assembly, we prove that assembling a given string of labels with the fewest stages is equivalent, up to constant factors, to compressing the string to be uniquely derived from the smallest possible context-free grammar (a well-studied O(log n)-approximable problem). Without this restriction, we show that the optimal assembly can be substantially smaller than the optimal context-free grammar, by a factor of Ω(√(n/log n)) even for binary strings of length n. Fortunately, we can bound this separation in model power by a quadratic function in the number of distinct glues or tiles allowed in the assembly, which is typically small in practice.
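As a rough illustration of a context-free grammar that derives exactly one string, the sketch below expands a small straight-line grammar. The grammar, its symbol names, and the `expand` helper are hypothetical examples, not taken from the paper, and the sketch says nothing about the staged-assembly model itself.

```python
def expand(grammar, symbol):
    """Expand a symbol of a straight-line grammar into the unique string it derives."""
    memo = {}

    def go(sym):
        if sym not in grammar:
            return sym  # terminals are symbols with no rule
        if sym not in memo:
            memo[sym] = "".join(go(child) for child in grammar[sym])
        return memo[sym]

    return go(symbol)

# Hypothetical grammar: 4 rules with 9 right-hand-side symbols derive 17 characters.
grammar = {
    "A": ["a", "b"],
    "B": ["A", "A"],
    "C": ["B", "B"],
    "S": ["C", "C", "a"],
}

print(expand(grammar, "S"))
```

Because each rule reuses earlier nonterminals, the derived string can be much longer than the grammar, which is what makes grammar size a useful compression measure.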
On the maximal number of cubic subwords in a string
We investigate the problem of the maximum number of cubic subwords (of the
form www) in a given word. We also consider square subwords (of the form
ww). The problem of the maximum number of squares in a word is not well
understood. Several new results related to this problem are produced in the
paper. We consider two simple problems related to the maximum number of
subwords which are squares or which are highly repetitive; then we provide a
nontrivial estimation for the number of cubes. We show that the maximum number
of squares such that is not a primitive word (nonprimitive squares) in
a word of length is exactly , and the
maximum number of subwords of the form , for , is exactly .
In particular, the maximum number of cubes in a word is not greater than
either. Using very technical properties of occurrences of cubes, we improve
this bound significantly. We show that the maximum number of cubes in a word of
length n is between and . (In particular, we improve the
lower bound from the conference version of the paper.)
Comment: 14 pages
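For small words, the quantities discussed in this abstract can be checked by brute force. The sketch below assumes that "number of cubes" means the number of distinct subwords of the form www (and likewise ww for squares); the function name `distinct_powers` is hypothetical and the method is a plain enumeration, not the paper's technique.

```python
def distinct_powers(word, k):
    """Return the set of distinct subwords of `word` that are k-th powers of a nonempty root."""
    found = set()
    n = len(word)
    for start in range(n):
        # The root length is limited so that k copies fit inside the word.
        for period in range(1, (n - start) // k + 1):
            root = word[start:start + period]
            if word[start:start + k * period] == root * k:
                found.add(root * k)
    return found

print(sorted(distinct_powers("aaaaaa", 3)))  # cubes
print(sorted(distinct_powers("abab", 2)))    # squares
```

The enumeration takes roughly cubic time in the word length, so it is only suitable for sanity-checking bounds on short words.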
Fingerprints in Compressed Strings
The Karp-Rabin fingerprint of a string is a type of hash value that, due to its strong properties, has been used in many string algorithms. In this paper we show how to construct a data structure for a string S of size N compressed by a context-free grammar of size n that answers fingerprint queries. That is, given indices i and j, the answer to a query is the fingerprint of the substring S[i,j]. We present the first O(n) space data structures that answer fingerprint queries without decompressing any characters. For Straight Line Programs (SLP) we get O(log N) query time, and for Linear SLPs (an SLP derivative that captures LZ78 compression and its variations) we get O(log log N) query time. Hence, our data structures have the same time and space complexity as for random access in SLPs. We utilize the fingerprint data structures to solve the longest common extension problem in query time O(log N log l) and O(log l log log l + log log N) for SLPs and Linear SLPs, respectively. Here, l denotes the length of the LCE.
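The fingerprint query described here can be sketched on an uncompressed string, where the fingerprint of any substring is obtained from two prefix fingerprints. This is not the grammar-based structure from the paper; the modulus `P`, the base `BASE`, and the function names are assumptions chosen for illustration, and indices are 0-based and half-open rather than the abstract's S[i,j].

```python
P = (1 << 61) - 1  # assumed modulus: the Mersenne prime 2^61 - 1
BASE = 256         # assumed base; a random base is the usual choice in practice

def prefix_fingerprints(s):
    """Return (prefix, power) where prefix[k] is the fingerprint of s[:k] and power[k] = BASE^k mod P."""
    prefix = [0]
    power = [1]
    for ch in s:
        prefix.append((prefix[-1] * BASE + ord(ch)) % P)
        power.append((power[-1] * BASE) % P)
    return prefix, power

def fingerprint(prefix, power, i, j):
    """Fingerprint of s[i:j], computed from prefix fingerprints without rereading s."""
    return (prefix[j] - prefix[i] * power[j - i]) % P

s = "abracadabra"
prefix, power = prefix_fingerprints(s)
print(fingerprint(prefix, power, 0, 4) == fingerprint(prefix, power, 7, 11))  # both are "abra"
```

Equal substrings always get equal fingerprints; unequal substrings may collide with small probability, which is why longest-common-extension queries built on fingerprints are typically combined with a search over the extension length.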
Willow short-rotation production systems in Canada and Northern United States: A review
Willow short rotation coppice (SRC) systems are becoming an attractive practice because they are a sustainable system fulfilling multiple ecological objectives with significant environmental benefits. A sustainable supply of bioenergy feedstock can be produced by willow on marginal land using well-adapted or tolerant cultivars. Across Canada and northern U.S.A., there are millions of hectares of available degraded land that have the potential for willow SRC biomass production, with a C sequestration potential capable of offsetting an appreciable amount of anthropogenic greenhouse gas emissions. A fundamental question concerning sustainable SRC willow yields was whether long-term soil productivity is maintained within a multi-rotation SRC system, given the rapid growth rate and associated nutrient exports offsite when harvesting the willow biomass after repeated short rotations. Based on early results from the first willow SRC rotation, it was found that willow systems are relatively low nutrient-demanding, with minimal nutrient output other than in harvested biomass.
The overall aim of this manuscript is to summarize the literature and present findings and data from ongoing research trials across Canada and northern U.S.A. examining willow SRC system establishment and viability. The research areas of interest presented here are the crop production of willow SRC systems, above- and below-ground biomass dynamics and the C budget, comprehensive soil-willow system nutrient budget, and soil nutrient amendments (via fertilization) in willow SRC systems. Areas of existing research gaps were also identified for the Canadian context.
Cryptosporidium Priming Is More Effective than Vaccine for Protection against Cryptosporidiosis in a Murine Protein Malnutrition Model
Cryptosporidium is a major cause of severe diarrhea, especially in malnourished children. Using a murine model of C. parvum oocyst challenge that recapitulates clinical features of severe cryptosporidiosis during malnutrition, we interrogated the effect of protein malnutrition (PM) on primary and secondary responses to C. parvum challenge, and tested the differential ability of mucosal priming strategies to overcome the PM-induced susceptibility. We determined that while PM fundamentally alters systemic and mucosal primary immune responses to Cryptosporidium, priming with C. parvum (10^6 oocysts) provides robust protective immunity against re-challenge despite ongoing PM. C. parvum priming restores mucosal Th1-type effectors (CD3+CD8+CD103+ T-cells) and cytokines (IFNγ and IL12p40) that otherwise decrease with ongoing PM. Vaccination strategies with Cryptosporidium antigens expressed in the S. Typhi vector 908htr, however, do not enhance Th1-type responses to C. parvum challenge during PM, even though vaccination strongly boosts immunity in challenged fully nourished hosts. Remote non-specific exposures to the attenuated S. Typhi vector alone or the TLR9 agonist CpG ODN-1668 can partially attenuate C. parvum severity during PM, but neither as effectively as viable C. parvum priming. We conclude that although PM interferes with basal and vaccine-boosted immune responses to C. parvum, sustained reductions in disease severity are possible through mucosal activators of host defenses, and specifically C. parvum priming can elicit impressively robust Th1-type protective immunity despite ongoing protein malnutrition. These findings add insight into potential correlates of Cryptosporidium immunity and future vaccine strategies in malnourished children.
The effectiveness of ω-3 polyunsaturated fatty acid interventions during pregnancy on obesity measures in the offspring: an up-to-date systematic review and meta-analysis.
BACKGROUND: The potential role of ω-3 long chain polyunsaturated fatty acid (LCPUFA) supplementation during pregnancy on subsequent risk of obesity outcomes in the offspring is not clear and there is a need to synthesise this evidence. OBJECTIVE: A systematic review and meta-analysis of randomised controlled trials (RCTs), including the most recent studies, was conducted to assess the effectiveness of ω-3 LCPUFA interventions during pregnancy on obesity measures, e.g. BMI, body weight, fat mass in offspring. METHODS: Included RCTs had a minimum of 1-month follow-up post-partum. The search included CENTRAL, MEDLINE, SCOPUS, WHO's International Clinical Trials Reg., E-theses and Web of Science databases. Study quality was evaluated using the Cochrane Collaboration's risk of bias tool. RESULTS: Eleven RCTs, from ten unique trials, (3644 children) examined the effectiveness of ω-3 LCPUFA maternal supplementation during pregnancy on the development of obesity outcomes in offspring. There were heterogeneities between the trials in terms of their sample, type and duration of intervention and follow-up. Pooled estimates did not show an association between prenatal intake of fatty acids and obesity measures in offspring. CONCLUSION: These results indicate that maternal supplementation with ω-3 LCPUFA during pregnancy does not have a beneficial effect on obesity risk. Due to the high heterogeneity between studies along with small sample sizes and high rates of attrition, the effects of ω-3 LCPUFA supplementation during pregnancy for prevention of childhood obesity in the long-term remain unclear. Large high-quality RCTs are needed that are designed specifically to examine the effect of prenatal intake of fatty acids for prevention of childhood obesity. There is also a need to determine specific sub-groups in the population that might get a greater benefit and whether different ω-3 LCPUFA, i.e. eicosapentaenoic (EPA) vs. docosahexaenoic (DHA) acids, might have different effects.
Minimal Holocene retreat of large tidewater glaciers in Køge Bugt, southeast Greenland
Køge Bugt, in southeast Greenland, hosts three of the largest glaciers of the Greenland Ice Sheet; these have been major contributors to ice loss in the last two decades. Despite its importance, the Holocene history of this area has not been investigated. We present a 9100 year sediment core record of glaciological and oceanographic changes from analysis of foraminiferal assemblages, the abundance of ice-rafted debris, and sortable silt grain size data. Results show that ice-rafted debris accumulated constantly throughout the core; this demonstrates that glaciers in Køge Bugt remained in tidewater settings throughout the last 9100 years. This observation constrains maximum Holocene glacier retreat here to less than 6 km from present-day positions. Retreat was minimal despite oceanic and climatic conditions during the early-Holocene that were at least as warm as the present-day. The limited Holocene retreat of glaciers in Køge Bugt was controlled by the subglacial topography of the area; the steeply sloping bed allowed glaciers here to stabilise during retreat. These findings underscore the need to account for individual glacier geometry when predicting future behaviour. We anticipate that glaciers in Køge Bugt will remain in stable configurations in the near-future, despite the predicted continuation of atmospheric and oceanic warming.
