Document Retrieval on Repetitive Collections
Document retrieval aims at finding the most important documents where a
pattern appears in a collection of strings. Traditional pattern-matching
techniques yield brute-force document retrieval solutions, which has motivated
the research on tailored indexes that offer near-optimal performance. However,
an experimental study establishing which alternatives are actually better than
brute force, and which perform best depending on the collection
characteristics, has not been carried out. In this paper we address this
shortcoming by exploring the relationship between the nature of the underlying
collection and the performance of current methods. Via extensive experiments we
show that established solutions are often beaten in practice by brute-force
alternatives. We also design new methods that offer superior time/space
trade-offs, particularly on repetitive collections.
Comment: Accepted to ESA 2014. Implementation and experiments at
http://www.cs.helsinki.fi/group/suds/rlcsa
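The brute-force baseline the abstract refers to can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes that "importance" means the number of occurrences of the pattern in a document (term frequency), and the function name `brute_force_top_k` is hypothetical.

```python
from collections import Counter


def brute_force_top_k(docs, pattern, k):
    """Return up to k (doc_id, count) pairs, ranked by pattern occurrences.

    Assumption: a document's importance is its occurrence count
    (term frequency); the abstract does not fix a ranking measure.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    counts = Counter()
    for doc_id, text in enumerate(docs):
        # Scan each document, counting overlapping occurrences too.
        pos = text.find(pattern)
        while pos != -1:
            counts[doc_id] += 1
            pos = text.find(pattern, pos + 1)
    # Highest count first; ties broken by smaller document id.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]
```

Every query scans the whole collection, so the cost grows with the total text length. Tailored indexes aim to avoid that cost.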
Space current division in small pentodes
Thesis (B.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering, 1948. By James Darr, Jr. and Solomon Manber.
Scheduling Jobs in Flowshops with the Introduction of Additional Machines in the Future
This is the author's peer-reviewed final manuscript, as accepted by the publisher. The published article is copyrighted by Elsevier and can be found at: http://www.journals.elsevier.com/expert-systems-with-applications/.
The problem of scheduling jobs to minimize total weighted tardiness in flowshops,
with the possibility of evolving into hybrid flowshops in the future, is investigated in
this paper. As this research is guided by a real problem in industry, the flowshop
considered has considerable flexibility, which stimulated the development of an
innovative methodology for this research. Each stage of the flowshop currently has
one or several identical machines. However, the manufacturing company is planning
to introduce additional machines with different capabilities in different stages in the
near future. Thus, the algorithm proposed and developed for the problem is not only
capable of solving the current flow line configuration but also the potential new
configurations that may result in the future. A meta-heuristic search algorithm based
on Tabu search is developed to solve this NP-hard, industry-guided problem. Six
different initial solution finding mechanisms are proposed. A carefully planned
nested split-plot design is performed to test the significance of different factors and
their impact on the performance of the different algorithms. To the best of our
knowledge, this research is the first of its kind that attempts to solve an industry-guided
problem with the concern for future developments.
Wave Energy: a Pacific Perspective
This is the author's peer-reviewed final manuscript, as accepted by the publisher. The published article is copyrighted by The Royal Society and can be found at: http://rsta.royalsocietypublishing.org/.
This paper illustrates the status of wave energy development in Pacific Rim countries by characterizing the available resource and introducing the region's current and potential future leaders in wave energy converter development. It also describes the existing licensing and permitting process as well as potential environmental concerns. Capabilities of Pacific Ocean testing facilities are described in addition to the region's vision of the future of wave energy.
Tracing diagnosis trajectories over millions of patients reveal an unexpected risk in schizophrenia.
The identification of novel disease associations using big data for patient care has had limited success. In this study, we created a longitudinal disease network of traced readmissions (disease trajectories), merging data from over 10.4 million inpatients through the Healthcare Cost and Utilization Project, which allowed us to map disease progression across more than 300 diseases. From these disease trajectories, we discovered an interesting association between schizophrenia and rhabdomyolysis, a rare muscle disease (incidence < 1E-04) (relative risk, 2.21 [1.80-2.71, confidence interval = 0.95], P-value 9.54E-15). We validated this association by using independent electronic medical records from over 830,000 patients at the University of California, San Francisco (UCSF) medical center. A case review of 29 rhabdomyolysis incidents in schizophrenia patients at UCSF demonstrated that 62% were idiopathic, without the use of any drug known to lead to this adverse event, suggesting that physicians should watch for this unexpected risk of schizophrenia. Large-scale analysis of disease trajectories can help physicians understand potential sequential events in their patients.
Comparative evaluation of image reconstruction methods for the Siemens PET-MR scanner using the STIR library
With the introduction of Positron Emission Tomography - Magnetic Resonance (PET-MR) scanners, it has become relevant to develop new algorithms and to compare the performance of different iterative reconstruction algorithms and the characteristics of the reconstructed image data. In this work, we perform a quantitative assessment of the currently used ordered subset (OS) algorithms for low-count PET-MR data taken from a Siemens Biograph mMR scanner using the Software for Tomographic Image Reconstruction (STIR, stir.sf.net). The comparison is made in terms of bias and coefficient of variation (CoV). Within the STIR library different algorithms are available, such as Ordered Subsets Expectation Maximization (OSEM), OS Maximum A Posteriori One Step Late (OSMAPOSL) with Quadratic Prior (QP) and with Median Root Prior (MRP), OS Separable Paraboloidal Surrogate (OSSPS) with QP, and Filtered Back-Projection (FBP). In addition, List Mode (LM) reconstruction is available. Corrections for attenuation, scatter and random events are performed using STIR instead of using the scanner. Data from the Hoffman brain phantom are acquired, processed and reconstructed. Clinical data from the thorax of a patient have also been reconstructed with the same algorithms. The number of subsets affects neither the bias nor the coefficient of variation (CoV = 11%) appreciably at a fixed sub-iteration number. The maximum percentage relative bias and CoV for OSMAPOSL-MRP are 10% and 15% for the 360 s acquisition and 12% and 15% for the 36 s acquisition; for OSMAPOSL-QP they are 6% and 16% for 360 s and 11% and 23% for 36 s; and for OSEM they are 6% and 11% for 360 s and 10% and 15% for 36 s. Our findings demonstrate that at low counts, noise and bias become significant. The methodology for reconstructing Siemens mMR data with STIR is available on the CCP-PET-MR website.
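The two figures of merit named above can be sketched as follows. The abstract does not give their formulas, so this assumes the common definitions: percentage relative bias as the deviation of the region mean from a known true value, and CoV as the standard deviation divided by the mean. The function names, and the choice of population rather than sample standard deviation, are assumptions for illustration only.

```python
from statistics import mean, pstdev


def percent_relative_bias(roi_values, true_value):
    """Percentage deviation of the ROI mean from the known true value.

    Assumed definition: 100 * (mean - truth) / truth.
    """
    return 100.0 * (mean(roi_values) - true_value) / true_value


def percent_cov(roi_values):
    """Coefficient of variation in percent.

    Assumed definition: 100 * population standard deviation / mean.
    """
    return 100.0 * pstdev(roi_values) / mean(roi_values)
```

With a phantom such as the Hoffman brain phantom the true activity in a region is known, which is what makes a bias figure computable at all.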
Automated user modeling for personalized digital libraries
Digital libraries (DL) have become one of the most typical ways of accessing any kind of digitized information. Because of this key role, users welcome any improvement in the services they receive from digital libraries. One trend for improving digital services is personalization. Up to now, the most common approach to personalization in digital libraries has been user-driven. Nevertheless, the design of efficient personalized services has to be done, at least in part, automatically. In this context, machine learning techniques automate the process of constructing user models. This paper proposes a new approach to constructing digital libraries that satisfy users' information needs: Adaptive Digital Libraries, libraries that automatically learn user preferences and goals and personalize their interaction using this information.
Faster algorithms for 1-mappability of a sequence
In the k-mappability problem, we are given a string x of length n and
integers m and k, and we are asked to count, for each length-m factor y of x,
the number of other factors of length m of x that are at Hamming distance at
most k from y. We focus here on the version of the problem where k = 1. The
fastest known algorithm for k = 1 requires time O(mn log n/ log log n) and
space O(n). We present two algorithms that require worst-case time O(mn) and
O(n log^2 n), respectively, and space O(n), thus greatly improving the state of
the art. Moreover, we present an algorithm that requires average-case time and
space O(n) for integer alphabets if m = Ω(log n / log σ), where
σ is the alphabet size.
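The problem statement can be made concrete with a naive reference implementation. This sketch is not one of the paper's algorithms: it compares every pair of length-m factors directly, taking O(n^2 m) time, and the function name `k_mappability_naive` is hypothetical.

```python
def k_mappability_naive(x, m, k=1):
    """For each length-m factor of x, count the other length-m factors
    (at different starting positions) within Hamming distance k.

    Naive O(n^2 * m) baseline, useful only for checking faster methods.
    """
    starts = range(len(x) - m + 1)
    result = []
    for i in starts:
        count = 0
        for j in starts:
            if i == j:
                continue  # a factor is not compared with itself
            mismatches = 0
            for a, b in zip(x[i:i + m], x[j:j + m]):
                if a != b:
                    mismatches += 1
                    if mismatches > k:
                        break  # already too far apart
            if mismatches <= k:
                count += 1
        result.append(count)
    return result
```

For example, with x = "aaab" and m = 2 the factors are "aa", "aa" and "ab". Every pair differs in at most one position, so each factor has two neighbours when k = 1.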
