19,315 research outputs found
Using relevance feedback in expert search
In enterprise settings, expert search is considered an important task. In this search task, the user has a need for expertise; for instance, they require assistance from someone about a topic of interest. An expert search system assists users with their "expertise need" by suggesting people with relevant expertise to the topic of interest. In this work, we apply an expert search approach that does not explicitly rank candidates in response to a query, but instead implicitly ranks candidates by taking into account a ranking of documents with respect to the query topic. Pseudo-relevance feedback, also known as query expansion, has been shown to improve retrieval performance in ad hoc search tasks. In this work, we investigate to what extent query expansion can be applied in an expert search task to improve the accuracy of the generated ranking of candidates. We define two approaches for query expansion: the first is based on the initial ranking of documents for the query topic, and the second is based on the final ranking of candidates. The aims of this paper are two-fold: firstly, to determine if query expansion can be successfully applied in the expert search task, and secondly, to ascertain if either of the two forms of query expansion can provide robust, improved retrieval performance. We perform a thorough evaluation contrasting the two query expansion approaches in the context of the TREC 2005 and 2006 Enterprise tracks.
Voting for candidates: adapting data fusion techniques for an expert search task
In an expert search task, the users' need is to identify people who have relevant expertise to a topic of interest. An expert search system predicts and ranks the expertise of a set of candidate persons with respect to the users' query. In this paper, we propose a novel approach for predicting and ranking candidate expertise with respect to a query. We see the problem of ranking experts as a voting problem, which we model by adapting eleven data fusion techniques. We investigate the effectiveness of the voting approach and the associated data fusion techniques across a range of document weighting models, in the context of the TREC 2005 Enterprise track. The evaluation results show that the voting paradigm is very effective, without using any collection-specific heuristics. Moreover, we show that improving the quality of the underlying document representation can significantly improve the retrieval performance of the data fusion techniques on an expert search task. In particular, we demonstrate that applying field-based weighting models improves the ranking of candidates. Finally, we demonstrate that the relative performance of the adapted data fusion techniques for the proposed approach is stable regardless of the weighting models used.
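The voting idea can be illustrated with a minimal sketch. It assumes a hypothetical scored document ranking and a hypothetical mapping from documents to the candidates associated with them. The two aggregation rules shown (counting votes, and summing document scores in the style of CombSUM) are common data fusion rules chosen for illustration; the abstract does not name the eleven techniques, so these are not necessarily among them.

```python
from collections import defaultdict


def rank_candidates(doc_scores, doc_candidates, method="votes"):
    """Rank candidates from a document ranking.

    Every retrieved document acts as a vote for each candidate
    associated with it. With method="votes" each vote counts 1;
    otherwise the document's retrieval score is summed (CombSUM-like).
    """
    totals = defaultdict(float)
    for doc, score in doc_scores.items():
        for cand in doc_candidates.get(doc, ()):
            totals[cand] += 1.0 if method == "votes" else score
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical data: three retrieved documents and their associated people.
doc_scores = {"d1": 3.0, "d2": 2.0, "d3": 0.5}
doc_candidates = {"d1": ["alice"], "d2": ["bob"], "d3": ["bob", "carol"]}

by_votes = rank_candidates(doc_scores, doc_candidates, "votes")
by_sum = rank_candidates(doc_scores, doc_candidates, "combsum")
```

In this toy data the two rules disagree on the top candidate, which is the kind of difference a comparison of fusion techniques would measure.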
Combining fields in known-item email search
Emails are examples of structured documents with various fields. These fields can be exploited to enhance the retrieval effectiveness of an Information Retrieval (IR) system that searches mailing list archives. In recent experiments of the TREC 2005 Enterprise track, various fields were applied to varying degrees of success by the participants. In this work, using a field-based weighting model, we investigate the retrieval performance attainable by each field, and examine when field evidence should be combined or not.
Hatching Strategies in Monogenean (Platyhelminth) Parasites that Facilitate Host Infection
In parasites, environmental cues may influence hatching of eggs and enhance the success of infections. The two major endoparasitic groups of parasitic platyhelminths, cestodes (tapeworms) and digeneans (flukes), typically have high fecundity, infect more than one host species, and transmit trophically. Monogeneans are parasitic flatworms that are among the most host specific of all parasites. Most are ectoparasites with relatively low fecundity and direct life cycles tied to water. They infect a single host species, usually a fish, although some are endoparasites of amphibians and aquatic chelonian reptiles. Monogenean eggs have strong shells and mostly release ciliated larvae, which, against all odds, must find, identify, and infect a suitable specific host. Some monogeneans increase their chances of finding a host by greatly extending the hatching period (possible bet-hedging). Others respond to cues for hatching such as shadows, chemicals, mechanical disturbance, and osmotic changes, most of which may be generated by the host. Hatching may be rhythmical, larvae emerging at times when the host is more vulnerable to invasion, and this may be combined with responses to other environmental cues. Different monogenean species that infect the same host species may adopt different strategies of hatching, indicating that tactics may be more complex than first thought. Control of egg assembly and egg-laying, possibly by host hormones, has permitted colonization of frogs and toads by polystomatid monogeneans. Some monogeneans further improve the chances of infection by attaching eggs to the host or by retaining eggs on, or in, the body of the parasite. The latter adaptation has led ultimately to viviparity in gyrodactylid monogeneans
Effective order strong stability preserving Runge–Kutta methods
We apply the concept of effective order to strong stability preserving (SSP) explicit Runge–Kutta methods. Relative to classical Runge–Kutta methods, effective order methods are designed to satisfy a relaxed set of order conditions, but yield higher order accuracy when composed with special starting and stopping methods. The relaxed order conditions allow for greater freedom in the design of effective order methods. We show that this allows the construction of four-stage SSP methods with effective order four (such methods cannot have classical order four). However, we also prove that effective order five methods—like classical order five methods—require the use of non-positive weights and so cannot be SSP. By numerical optimization, we construct explicit SSP Runge–Kutta methods up to effective order four and establish the optimality of many of them. Numerical experiments demonstrate the validity of these methods in practice
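For readers unfamiliar with SSP methods, the following minimal sketch shows the classical three-stage, third-order SSP Runge–Kutta method of Shu and Osher, written as a convex combination of forward Euler steps. It is not one of the effective order methods constructed in the paper; the function name `ssprk3_step` and the test problem u' = -u are illustrative assumptions.

```python
import math


def ssprk3_step(f, u, dt):
    """One step of the classical SSPRK(3,3) method (Shu-Osher form).

    Each stage is a convex combination of forward Euler steps, which
    is what gives the method its strong stability preserving property.
    """
    u1 = u + dt * f(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * f(u1))
    return u / 3.0 + (2.0 / 3.0) * (u2 + dt * f(u2))


# Illustrative test problem (an assumption): u' = -u, u(0) = 1.
u_final = 1.0
for _ in range(100):  # 100 steps of size 0.01, integrating to t = 1
    u_final = ssprk3_step(lambda u: -u, u_final, 0.01)

error = abs(u_final - math.exp(-1.0))
```

All weights in the convex combinations are non-negative, which is the property the abstract shows cannot be kept at effective order five.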
Spatially partitioned embedded Runge-Kutta Methods
We study spatially partitioned embedded Runge–Kutta (SPERK) schemes for partial differential equations (PDEs), in which each of the component schemes is applied over a different part of the spatial domain. Such methods may be convenient for problems in which the smoothness of the solution or the magnitudes of the PDE coefficients vary strongly in space. We focus on embedded partitioned methods as they offer greater efficiency and avoid the order reduction that may occur in non-embedded schemes. We demonstrate that the lack of conservation in partitioned schemes can lead to non-physical effects and propose conservative additive schemes based on partitioning the fluxes rather than the ordinary differential equations. A variety of SPERK schemes are presented, including an embedded pair suitable for the time evolution of fifth-order weighted essentially non-oscillatory (WENO) spatial discretizations. Numerical experiments are provided to support the theory.
University of Glasgow at WebCLEF 2005: experiments in per-field normalisation and language specific stemming
We participated in the WebCLEF 2005 monolingual task. In this task, a search system aims to retrieve relevant documents from a multilingual corpus of Web documents from Web sites of European governments. Both the documents and the queries are written in a wide range of European languages. A challenge in this setting is to detect the language of documents and topics, and to process them appropriately. We develop a language specific technique for applying the correct stemming approach, as well as for removing the correct stopwords from the queries. We represent documents using three fields, namely content, title, and anchor text of incoming hyperlinks. We use a technique called per-field normalisation, which extends the Divergence From Randomness (DFR) framework, to normalise the term frequencies, and to combine them across the three fields. We also employ the length of the URL path of Web documents. The ranking is based on combinations of both the language specific stemming, if applied, and the per-field normalisation. We use our Terrier platform for all our experiments. The overall performance of our techniques is outstanding, achieving the overall top four performing runs, as well as the top performing run without metadata in the monolingual task. The best run only uses per-field normalisation, without applying stemming
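A rough sketch of how per-field normalisation can combine term frequencies follows. It assumes that each field is normalised with the DFR Normalisation 2 formula, tfn = tf · log2(1 + c · avg_len / len), and that the results are combined as a weighted sum. The field weights, the `c` values and the function name are hypothetical and are not taken from the paper.

```python
import math


def per_field_tfn(fields):
    """Combine normalised term frequencies across document fields.

    `fields` maps a field name to a tuple
    (tf, field_len, avg_field_len, c, weight). Each field is
    normalised separately (Normalisation 2 style), then weighted
    and summed into a single normalised term frequency.
    """
    total = 0.0
    for tf, length, avg_length, c, weight in fields.values():
        if tf == 0 or length == 0:
            continue  # the term does not occur in this field
        total += weight * tf * math.log2(1.0 + c * avg_length / length)
    return total


# Hypothetical statistics for one term in one document.
tfn = per_field_tfn({
    "content": (2, 100, 100.0, 1.0, 1.0),
    "title": (1, 5, 5.0, 1.0, 2.0),
    "anchor": (0, 0, 12.0, 1.0, 1.5),
})
```

Normalising each field against its own average length keeps a short field such as the title from being penalised by the statistics of the much longer content field.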
Modelling the usefulness of document collections for query expansion in patient search
Dealing with medical terminology is a challenge when searching for patients based on the relevance of their medical records to a given query. Existing work used query expansion (QE) to extract expansion terms from different document collections to improve query representation. However, the usefulness of particular document collections for QE was not measured and taken into account during retrieval. In this work, we investigate two automatic approaches that measure and leverage the usefulness of document collections when exploiting multiple document collections to improve query representation. These two approaches are based on resource selection and learning to rank techniques, respectively. We evaluate our approaches using the TREC Medical Records track's test collection. Our results show the potential of the proposed approaches, since they can effectively exploit 14 different document collections, including both domain-specific (e.g. MEDLINE abstracts) and generic (e.g. blogs and webpages) collections, and significantly outperform existing effective baselines, including the best systems participating at the TREC Medical Records track. Our analysis shows that the different collections are not equally useful for QE, while our two approaches can automatically weight the usefulness of expansion terms extracted from different document collections effectively. This is the author accepted manuscript. The final version is available from ACM via http://dx.doi.org/10.1145/2806416.280661
Sparse spatial selection for novelty-based search result diversification
Novelty-based diversification approaches aim to produce a diverse ranking by directly comparing the retrieved documents. However, since such approaches are typically greedy, they require O(n^2) document-document comparisons in order to diversify a ranking of n documents. In this work, we propose to model novelty-based diversification as a similarity search in a sparse metric space. In particular, we exploit the triangle inequality property of metric spaces in order to drastically reduce the number of required document-document comparisons. Thorough experiments using three TREC test collections show that our approach is at least as effective as existing novelty-based diversification approaches, while improving their efficiency by an order of magnitude.
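The role of the triangle inequality can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes a single pivot (the top-ranked document), a hypothetical distance threshold for deciding novelty, and a toy one-dimensional metric. For any pivot p, dist(d, k) >= |dist(d, p) - dist(k, p)|, so a direct comparison can be skipped whenever this lower bound already reaches the threshold.

```python
def select_diverse(docs, dist, threshold):
    """Keep a document only if it lies at least `threshold` away from
    every document kept so far, using a pivot to skip comparisons.

    Returns the kept documents and the number of direct
    document-document comparisons made inside the selection loop.
    """
    pivot = docs[0]
    # Distances to the pivot are computed once, up front.
    to_pivot = {d: dist(d, pivot) for d in docs}
    kept, direct = [pivot], 0
    for d in docs[1:]:
        novel = True
        for k in kept:
            # Triangle inequality lower bound on dist(d, k).
            if abs(to_pivot[d] - to_pivot[k]) >= threshold:
                continue  # certainly far enough, no comparison needed
            direct += 1
            if dist(d, k) < threshold:
                novel = False
                break
        if novel:
            kept.append(d)
    return kept, direct


# Toy example (an assumption): documents as points on a line.
docs = [0.0, 0.1, 5.0, 5.2, 9.0]
kept, direct = select_diverse(docs, lambda a, b: abs(a - b), 1.0)
```

Here only the near-duplicate pairs trigger a direct comparison; every other pair is resolved from the precomputed pivot distances.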
