5,352 research outputs found
TypEx : a type based approach to XML stream querying
We consider the topic of query evaluation over semistructured information streams, and XML data streams in particular. Streaming evaluation methods are necessarily event-driven, which is in tension with high-level query models; in general, the more expressive the query language, the harder it is to translate queries into an event-based implementation with finite resource bounds
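As a rough illustration of the event-driven style mentioned above (not the TypEx approach itself), the following Python sketch evaluates a very simple query over an XML stream with the standard library's `xml.etree.ElementTree.iterparse`. The function name `stream_values`, the tag names, and the sample document are assumptions made for illustration only.

```python
import io
import xml.etree.ElementTree as ET

def stream_values(source, tag):
    """Yield the text of each <tag> element as soon as its end event arrives."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem.text
        # Clear the finished element so its content can be freed.
        # The root still keeps empty child shells, so this reduces memory
        # use but is not a strict resource bound.
        elem.clear()

# Hypothetical sample stream.
doc = b"<feed><item><title>a</title></item><item><title>b</title></item></feed>"
for value in stream_values(io.BytesIO(doc), "title"):
    print(value)
```

A query this simple maps directly onto parser events; the tension described in the abstract arises with more expressive queries, which may need to buffer an unbounded amount of the stream before a result can be emitted.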
Vertical axis non-linearities in wavelength scanning interferometry
The uncertainty of measurements made on an areal surface topography instrument is directly influenced by its metrological characteristics. In this work, the vertical axis deviation from linearity of a wavelength scanning interferometer is evaluated. The vertical axis non-linearities are caused by the spectral leakage resulting from the Fourier transform algorithm for phase slope estimation. These non-linearities are simulated and the results are compared with experimental measurements. In order to reduce the observed non-linearities, a modification of the algorithm is proposed. The application of a Hamming window and the exclusion of edge points in the extracted phase are shown to increase the accuracy over the whole instrument range
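The effect of a Hamming window on spectral leakage can be sketched in a few lines of Python. This is a generic signal-processing illustration, not the authors' phase slope algorithm: the signal length, the non-integer number of cycles, the `guard` width, and all function names are assumptions chosen for the example.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of the first n/2 bins of a direct (slow) DFT."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n // 2)
    ]

def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)) for t in range(n)]

def leakage_ratio(magnitudes, guard=3):
    """Fraction of spectral energy lying more than `guard` bins from the peak."""
    peak = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    total = sum(m * m for m in magnitudes)
    outside = sum(m * m for k, m in enumerate(magnitudes) if abs(k - peak) > guard)
    return outside / total

def compare_leakage(n=64, cycles=10.5):
    """Return (rectangular, hamming) leakage ratios for a single cosine.

    A non-integer number of cycles means the tone falls between DFT bins,
    which is the situation in which leakage appears.
    """
    signal = [math.cos(2 * math.pi * cycles * t / n) for t in range(n)]
    windowed = [w * x for w, x in zip(hamming(n), signal)]
    return leakage_ratio(dft_magnitudes(signal)), leakage_ratio(dft_magnitudes(windowed))

rect, ham = compare_leakage()
print(f"rectangular: {rect:.4f}  hamming: {ham:.6f}")
```

With the window applied, far less energy spreads away from the main peak, at the cost of a wider main lobe.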
Active architecture for pervasive contextual services
Pervasive services may be defined as services that are available to any client (anytime, anywhere). Here we focus on the software and network infrastructure required to support pervasive contextual services operating over a wide area. One of the key requirements is a matching service capable of assimilating and filtering information from various sources and determining matches relevant to those services. We consider some of the challenges in engineering a globally distributed matching service that is scalable, manageable, and able to evolve incrementally as usage patterns, data formats, services, network topologies and deployment technologies change. We outline an approach based on the use of a peer-to-peer architecture to distribute user events and data, and to support the deployment and evolution of the infrastructure itself
Retrieval methods for ground-based millimeter-wave measurements for the network for the detection of stratospheric change
The fundamental objective is to determine the information available in ground-based millimeter-wave measurements of stratospheric constituent profiles, to identify the optimum method of retrieving this profile information, and to characterize the errors in the final result. A secondary objective is to produce retrieval software for operational use with Network for the Detection of Stratospheric Change (NDSC) measurements of O3, H2O, ClO, and perhaps N2O. Tests were performed on existing ozone retrieval programs in support of ongoing NDSC field measurements. The results show that if random spectral errors and retrieval bias errors are considered, accuracy of the retrieved profile is about 5 percent from 20-50 km, and about 10 percent from 50-60 km
Quantifying the specificity of near-duplicate image classification functions
There are many published methods for detecting similar and near-duplicate images. Here, we consider their use in the context of unsupervised near-duplicate detection, where the task is to find a (relatively small) near-duplicate intersection of two large candidate sets. Such scenarios are of particular importance in forensic near-duplicate detection. The essential properties of such a function are: performance, sensitivity, and specificity. We show that, as collection sizes increase, specificity becomes the most important of these, as without very high specificity huge numbers of false positive matches will be identified. This makes even very fast, highly sensitive methods completely useless. Until now, to our knowledge, no attempt has been made to measure the specificity of near-duplicate finders, or even to compare them with each other. Recently, a benchmark set of near-duplicate images has been established which allows such assessment by giving a near-duplicate ground truth over a large general image collection. Using this we establish a methodology for calculating specificity. A number of the most likely candidate functions are compared with each other and accurate measurements of sensitivity vs. specificity are given. We believe these are the first such figures to be calculated for any such function
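The argument that specificity dominates as collections grow can be made concrete with a short calculation. The sketch below assumes that every image in one set is compared with every image in the other (a pairwise model that the abstract does not state explicitly); the function name and all numbers are hypothetical.

```python
def expected_matches(size_a, size_b, true_pairs, sensitivity, specificity):
    """Expected true positives, false positives and precision for a pairwise comparison.

    sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    total_pairs = size_a * size_b
    negative_pairs = total_pairs - true_pairs
    true_positives = sensitivity * true_pairs
    false_positives = (1.0 - specificity) * negative_pairs
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision

# Hypothetical scenario: two sets of one million images sharing 1,000 near-duplicates.
tp, fp, precision = expected_matches(10**6, 10**6, 1_000, 0.99, 0.9999)
print(f"true positives: {tp:.0f}, false positives: {fp:.0f}, precision: {precision:.2e}")
```

Because the number of non-matching pairs grows with the product of the set sizes while the number of true near-duplicates stays small, even a specificity very close to 1 can leave false positives vastly outnumbering true matches.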
Value from free-text maintenance records : converting wind farm work orders into quantifiable, actionable information using text mining
The aim of this project is to demonstrate how text mining can help wind farm operators extract unique, quantifiable maintenance information from historic work orders. A good overview of past maintenance efforts can help develop a reliability-centred maintenance strategy for the future in terms of labour intensity, budgeting and spare parts logistics [1, 2]. However, work orders, where significant information is entered by a human in the form of free text, do not provide any straightforward means for automated analysis [3, 4]. Our approach introduces a novel combination of machine learning techniques supported by expert judgement. Significant focus is on the vocabulary: spelling error correction and semantic matching of synonyms and abbreviations. This allows tasks to be grouped by their underlying meaning, not only the characters they contain. The principal output is a frequency distribution of all groups of equivalent tasks. Further categorical analysis allows a focus on specific plant systems or components, as well as failure modes. Data from an industrial partner’s major onshore wind farms in Scotland was used to test our approach against manual analysis. Potential savings were identified in weeks of effort, or £2-9k in labour cost per site, in addition to an improved maintenance strategy. The remaining challenges mainly lie in increasing accuracy and reducing operator input. These are being addressed by our continued research, but also highlight opportunities for collaboration and standardisation across the industry to maximise the value of data
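The idea of grouping tasks by meaning and then counting them can be sketched as follows. This is a deliberately simple stand-in that uses string similarity from Python's `difflib` rather than the machine learning techniques the abstract refers to; the vocabulary, the synonym table, the similarity cutoff, and the sample work orders are all hypothetical.

```python
from collections import Counter
import difflib

# Hypothetical expert-supplied vocabulary and synonym/abbreviation table.
VOCABULARY = ["replace", "inspect", "gearbox", "generator", "filter", "oil"]
SYNONYMS = {"gbx": "gearbox", "gen": "generator", "change": "replace", "check": "inspect"}

def normalise_token(token):
    """Map a raw token to a canonical term via synonyms, then fuzzy spelling match."""
    token = token.lower().strip(".,;:")
    if token in SYNONYMS:
        return SYNONYMS[token]
    close = difflib.get_close_matches(token, VOCABULARY, n=1, cutoff=0.8)
    return close[0] if close else token

def task_key(work_order):
    """Order-insensitive key, so that equivalent tasks fall into the same group."""
    return " ".join(sorted(normalise_token(t) for t in work_order.split()))

def task_frequencies(work_orders):
    """Frequency distribution over groups of equivalent tasks."""
    return Counter(task_key(order) for order in work_orders)

orders = ["Replace gearbox oil", "change gbx oil", "replce gearbox oil", "inspect generator"]
for key, count in task_frequencies(orders).most_common():
    print(count, key)
```

Sorting the tokens discards word order, which is a crude approximation of grouping by meaning; a production system would need a much richer vocabulary and some review of the fuzzy matches by an operator.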
The Web as an Adaptive Network: Coevolution of Web Behavior and Web Structure
Much is known about the complex network structure of the Web, and about behavioral dynamics on the Web. A number of studies address how behaviors on the Web are affected by different network topologies, whilst others address how the behavior of users on the Web alters network topology. These represent complementary directions of influence, but they are generally not combined within any one study. In network science, the study of the coupled interaction between topology and behavior, or state-topology coevolution, is known as 'adaptive networks', and is a rapidly developing area of research. In this paper, we review the case for considering the Web as an adaptive network and several examples of state-topology coevolution on the Web. We also review some abstract results from recent literature in adaptive networks and discuss their implications for Web Science. We conclude that adaptive networks provide a formal framework for characterizing processes acting 'on' and 'of' the Web, and offer potential for identifying general organizing principles that seem otherwise elusive in Web Science
Projector - a partially typed language for querying XML
We describe Projector, a language that can be used to perform a mixture of typed and untyped computation against data represented in XML. For some problems, notably when the data is unstructured or semistructured, the most desirable programming model is against the tree structure underlying the document. When this tree structure has been used to model regular data structures, then these regular structures themselves are a more desirable programming model. The language Projector, described here in outline, gives both models within a single partially typed algebra and is well suited for hybrid applications, for example when fragments of a known structure are embedded in a document whose overall structure is unknown. Projector is an extension of ECMA-262 (aka JavaScript), and therefore inherits an untyped DOM interface. To this has been added some static typing and a dynamic projection primitive, which can be used to assert the presence of a regular structure modelled within the XML. If this structure does exist, the data is extracted and presented as a typed value within the programming language
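The notion of a projection, which asserts that a regular structure is present and yields a typed value if so, can be loosely illustrated in Python. Projector itself extends ECMA-262, so this is only an analogy in a different language; the `Point` type, the `<point>` element shape, and the function name `project_point` are invented for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

@dataclass
class Point:
    x: float
    y: float

def project_point(elem) -> Optional[Point]:
    """Return a typed Point if elem has the shape <point><x>..</x><y>..</y></point>, else None."""
    if elem.tag != "point":
        return None
    x, y = elem.find("x"), elem.find("y")
    if x is None or y is None:
        return None
    try:
        return Point(float(x.text), float(y.text))
    except (TypeError, ValueError):
        # Empty or non-numeric content: the expected structure is not present.
        return None

# A document of unknown overall structure with one embedded known fragment.
root = ET.fromstring("<doc><note>hi</note><point><x>1</x><y>2.5</y></point></doc>")
print([project_point(child) for child in root])
```

Untyped tree navigation (`root`, `child`) and typed values (`Point`) coexist here, which mirrors the hybrid use case the abstract describes, although the check is performed entirely at run time in this sketch.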
