On requirements for federated data integration as a compilation process
Data integration problems are commonly viewed as interoperability issues, where the burden of reaching a common ground for exchanging data is distributed across the peers involved in the process. While apparently an effective approach towards standardization and interoperability, this places a constraint on data providers who, for a variety of reasons, require backwards compatibility with proprietary or non-standard mechanisms. Publishing a holistic data API is one such use case, where a single peer performs most of the integration work in a many-to-one scenario. Incidentally, this is also the base setting of software compilers, whose operational model comprises phases that perform analysis, linkage and assembly of source code, and generation of intermediate code. There are several analogies with a data integration process, more so with data that live in the Semantic Web, but what requirements would a data provider need to satisfy for an integrator to be able to query and transform its data effectively, with no further obligations imposed on the provider? With this paper, we inquire into what practices and essential prerequisites could turn this intuition into a concrete and exploitable vision, within Linked Data and beyond.
SPARQL Query Recommendations by Example
In this demo paper, a SPARQL Query Recommendation Tool (called SQUIRE) based on query reformulation is presented. Based on three steps, Generalization, Specialization and Evaluation, SQUIRE implements the logic of reformulating a SPARQL query that is satisfiable w.r.t. a source RDF dataset into others that are satisfiable w.r.t. a target RDF dataset. In contrast with existing approaches, SQUIRE aims at recommending queries whose reformulations: i) reflect as much as possible the same intended meaning, structure, type of results and result size as the original query and ii) do not require a mapping between the two datasets. Based on a set of criteria to measure the similarity between the initial query and the recommended ones, SQUIRE demonstrates the feasibility of the underlying query reformulation process, ranks the recommended queries appropriately, and offers valuable support for query recommendations over an unknown and unmapped target RDF dataset, not only assisting the user in learning the data model and content of an RDF dataset, but also supporting its use without requiring the user to have intrinsic knowledge of the data.
SPARQL Query Recommendation by Example: Assessing the Impact of Structural Analysis on Star-Shaped Queries
One of the existing query recommendation strategies for unknown datasets is "by example", i.e. based on a query that the user already knows how to formulate on another dataset within a similar domain. In this paper we measure what contribution a structural analysis of the query and the datasets can bring to a recommendation strategy, to go alongside approaches that provide a semantic analysis. Here we concentrate on the case of star-shaped SPARQL queries over RDF datasets.
The illustrated strategy performs a least general generalization on the given query, computes those specializations of it that are satisfiable over the target dataset, and organizes them into a graph. It then visits the graph to recommend first the reformulated queries that reflect the original query as closely as possible. This approach does not rely upon a semantic mapping between the two datasets. An implementation as part of the SQUIRE query recommendation library is discussed.
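As a rough illustration of the generalization, specialization and evaluation steps described in these two abstracts, the following minimal Python sketch (using rdflib, with hypothetical IRIs and file names) generalizes a star-shaped query by turning its constants into variables, re-instantiates subsets of those constants from most to least specific, and keeps only the reformulations the target dataset can answer. It deliberately omits SQUIRE's similarity scoring and graph traversal.

```python
# Illustrative sketch only, not the SQUIRE implementation.
from itertools import combinations
from rdflib import Graph

# A hypothetical star-shaped query over the source dataset: one subject variable,
# constant predicates and objects (the IRIs below are purely illustrative).
ORIGINAL = [
    ("?s", "<http://example.org/genre>", "<http://example.org/Jazz>"),
    ("?s", "<http://example.org/composer>", "<http://example.org/Ellington>"),
]

def generalise(triples):
    """Generalization step: replace every constant term with a fresh variable."""
    out = []
    for i, (s, p, o) in enumerate(triples):
        out.append((s,
                    p if p.startswith("?") else f"?p{i}",
                    o if o.startswith("?") else f"?o{i}"))
    return out

def specialisations(general, original):
    """Specialization step: re-instantiate subsets of the original constants,
    yielding the most specific reformulations first."""
    n = len(original)
    for k in range(n, -1, -1):
        for keep in combinations(range(n), k):
            yield [original[i] if i in keep else general[i] for i in range(n)]

def satisfiable(triples, target: Graph) -> bool:
    """Evaluation step: keep a reformulation only if the target dataset can answer it."""
    pattern = " . ".join(f"{s} {p} {o}" for s, p, o in triples)
    return bool(target.query(f"ASK {{ {pattern} }}").askAnswer)

target = Graph().parse("target.ttl")   # the unknown, unmapped target dataset
general = generalise(ORIGINAL)
recommended = [q for q in specialisations(general, ORIGINAL) if satisfiable(q, target)]
```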
Crowdsourcing Linked Data on listening experiences through reuse and enhancement of library data
Research has approached the practice of musical reception in a multitude of ways, such as the analysis of professional critique, sales figures and psychological processes activated by the act of listening. Studies in the Humanities, on the other hand, have been hindered by the lack of structured evidence of actual experiences of listening as reported by the listeners themselves, a concern that has been voiced since the early Web era. It was however assumed that such evidence existed, albeit in pure textual form, but could not be leveraged until it was digitised and aggregated. The Listening Experience Database (LED) responds to this research need by providing a centralised hub for evidence of listening in the literature. Not only does LED support search and reuse across nearly 10,000 records, but it also provides machine-readable structured data of the knowledge around the contexts of listening. To take advantage of the mass of formal knowledge that already exists on the Web concerning these contexts, the entire framework adopts Linked Data principles and technologies. This also allows LED to directly reuse open data from the British Library for the source documentation that is already published. Reused data are re-published as open data with enhancements obtained by expanding over the model of the original data, such as the partitioning of published books and collections into individual stand-alone documents. The database was populated through crowdsourcing and seamlessly incorporates data reuse from the very early data entry phases. As the sources of the evidence often contain vague, fragmentary or uncertain information, facilities were put in place to generate structured data out of such fuzziness. Alongside elaborating on these functionalities, this article provides insights into the most recent features of the latest instalment of the dataset and portal, such as the interlinking with the MusicBrainz database, the relaxation of geographical input constraints through text mining, and the plotting of key locations in an interactive geographical browser.
The Transnational and the Text-Searchable: Digitized Sources and the Shadows They Cast
This working paper explores the consequences for historians' research practice of the twinned transnational and digital turns. The accelerating digitization of historians' sources (scholarly, periodical, and archival) and the radical shift in the granularity of access to information within them have profoundly changed historians' research practice. Yet this has incited remarkably little reflection regarding the consequences for individual projects or collective knowledge generation. What are the implications for international research in particular? This essay heralds the new kinds of historical knowledge-generation made possible by web access to digitized, text-searchable sources. It also attempts an accounting of all that we formerly, unwittingly, gained from the frictions inherent to international research in an analog world. What are the intellectual and political consequences of that which has been lost?
Tourism Specialization, Absorptive Capacity and Economic Growth
This paper investigates the relationship between tourism specialization and economic growth whilst accounting for the absorptive capacity of host (tourism destination) countries, defined in terms of financial system development. We use the system generalized method-of-moments (SYS-GMM) estimation methodology to investigate this relationship for 129 countries over the period 1995-2011. The results support the hypothesis that the positive effect of tourism specialization on growth is contingent on the level of economic development as well as the financial system absorptive capacity of recipient economies. Consistent with the law of diminishing returns, we also find that for countries with a developed financial system, at exponential levels of tourism specialization, its effect on growth turns negative. Significant policy implications flow from these findings.
Shout LOUD on a road trip to FAIRness: experience with integrating open research data at the Bibliotheca Hertziana
Modern-day research in digital humanities is an inherently intersectional activity that borrows from, and in turn contributes to, a multitude of domains previously seen as having little bearing on the discipline at hand. Art history, for instance, operates today at the crossroads of social studies, digital libraries, geographical information systems, data modelling, and cognitive computing, yet its problems inform research questions within all of these fields, which veer towards making the output of prior research readily available to humanists in their interaction with digital resources. This is reflected in the way data are represented, stored and published: with various intra- and inter-institutional research endeavours relying upon output that could and should be shared, the notion of ‘leaving the data silo’ with a view to interoperability acquires even greater significance. Scholars and policymakers are supporting this view with guidelines, such as the FAIR principles, and with standards that implement them, such as Linked Open Data, built on technologies whose coverage, complexity and lifespans vary. A point is being approached, however, where the technological opportunities permit a continuous interoperability between established and concluded data-intensive projects, and current projects whose underlying datasets evolve. This enables the data production of one institution to be viewed as one harmonically interlinked knowledge graph, which can be queried through a global understanding of the ontological models that dominate the fields involved. This paper is an overview of my past and present efforts in the creation of digital humanities knowledge graphs over the past decade, from music history to the societal ramifications of the history of architecture. This contribution highlights the variability of concurrent research environments at the Bibliotheca Hertziana, not only in the state of their activities, but also in the ways they manage their data life-cycles, and exemplifies possible combinations of FAIR data management platforms and integration techniques, suitable for different scenarios resulting from such variability. The paper concludes with an example of how feedback from the art history domain called for novel directions for data science and Semantic Web scholars to follow, by proposing that the Linked Open Data paradigm adopt a notion of usability in the very morphology of published data, thus becoming Linked Open Usable Data.
Addressing exploitability of Smart City data
Central to a number of emerging Smart Cities are online platforms for data sharing and reuse: Data Hubs and Data Catalogues. These systems support the use of data by developers through enabling data discoverability and access. As such, the effectiveness of a Data Catalogue can be seen as the way in which it supports "data exploitability": the ability to assess whether the provided data are appropriate to the given task. Beyond technical compatibility, this also involves validating the policies attached to the data. Here, we present a methodology to enable Smart City Data Hubs to better address exploitability by considering the way policies propagate across the data flows applied in the system.
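As an illustration of the underlying idea, the following minimal sketch (hypothetical node names and policy labels, not the vocabulary of any real Data Hub or of the methodology itself) propagates policies downstream along data flows until a fixed point is reached, so that a derived output inherits the policies of everything that feeds into it.

```python
# Illustrative sketch of policy propagation over a data-flow graph.
from collections import defaultdict

# Directed data flows within the hub: upstream dataset -> downstream datasets/APIs.
flows = {
    "sensor-feed":     ["traffic-dataset"],
    "traffic-dataset": ["congestion-api"],
    "open-poi-data":   ["congestion-api"],
}

# Policies attached to data where they enter the hub (hypothetical labels).
policies = {
    "sensor-feed":   {"no-commercial-use"},
    "open-poi-data": {"attribution-required"},
}

def propagate(flows, policies):
    """Each downstream node accumulates the union of the policies of its
    upstream sources; iterate until no set changes (fixed point)."""
    effective = defaultdict(set, {k: set(v) for k, v in policies.items()})
    changed = True
    while changed:
        changed = False
        for src, targets in flows.items():
            for tgt in targets:
                before = len(effective[tgt])
                effective[tgt] |= effective[src]
                changed = changed or len(effective[tgt]) != before
    return dict(effective)

print(propagate(flows, policies)["congestion-api"])
# {'no-commercial-use', 'attribution-required'}  (set order may vary)
```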
LED: curated and crowdsourced linked data on music listening experiences
We present the Listening Experience Database (LED), a structured knowledge base of accounts of listening to music in documented sources. LED aggregates scholarly and crowdsourced contributions and is heavily focused on data reuse. To that end, both the storage system and the governance model are natively implemented as Linked Data. Reuse of data from datasets such as the BNB and DBpedia is integrated with the data lifecycle from the entry phase onwards, and several content management functionalities are implemented using semantic technologies. Imported data are enhanced through curation and specialisation with degrees of granularity not provided by the original datasets.
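A minimal sketch of the kind of reuse described here, enriching a local record with data pulled from DBpedia at entry time (the resource, the properties queried and the helper function are illustrative assumptions, not LED's actual ingestion code):

```python
# Illustrative sketch only: look up an English label and abstract on DBpedia
# and merge them into a locally captured record.
from SPARQLWrapper import SPARQLWrapper, JSON

def dbpedia_lookup(resource_iri: str) -> dict:
    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setReturnFormat(JSON)
    sparql.setQuery(f"""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        PREFIX dbo:  <http://dbpedia.org/ontology/>
        SELECT ?label ?abstract WHERE {{
            <{resource_iri}> rdfs:label ?label ;
                             dbo:abstract ?abstract .
            FILTER(lang(?label) = "en" && lang(?abstract) = "en")
        }} LIMIT 1
    """)
    rows = sparql.queryAndConvert()["results"]["bindings"]
    return {var: binding["value"] for var, binding in rows[0].items()} if rows else {}

# Hypothetical use during crowdsourced data entry: enrich the record in place.
record = {"work": "http://dbpedia.org/resource/Symphony_No._9_(Beethoven)"}
record.update(dbpedia_lookup(record["work"]))
```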
