
    Design Challenges for GDPR RegTech

    The Accountability Principle of the GDPR requires that an organisation can demonstrate compliance with the regulation. A survey of GDPR compliance software solutions shows significant gaps in their ability to demonstrate compliance. In contrast, RegTech has recently brought great success to financial compliance, resulting in reduced risk, cost savings and enhanced financial regulatory compliance. The survey shows that many GDPR solutions lack interoperability features such as standard APIs, metadata or reports, and that they are not supported by published methodologies or by evidence of their validity or even utility. A proof-of-concept prototype built around a regulator-based self-assessment checklist was used to establish whether RegTech best practice could improve the demonstration of GDPR compliance. Applying a RegTech approach provides opportunities for demonstrable and validated GDPR compliance, in addition to the risk reductions and cost savings that RegTech can deliver. This paper demonstrates that a RegTech approach to GDPR compliance can help an organisation meet its accountability obligations.
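The checklist-driven demonstration of compliance described above could be sketched as a weighted self-assessment score. This is a minimal illustrative sketch only; the items, weights and scoring scheme are hypothetical and not taken from the paper's prototype.

```python
from dataclasses import dataclass

@dataclass
class ChecklistItem:
    # One question from a regulator-style self-assessment checklist.
    question: str
    weight: float     # relative importance of the item (illustrative)
    satisfied: bool   # whether supporting evidence exists

def compliance_score(items: list[ChecklistItem]) -> float:
    """Return a 0..1 score: the weighted fraction of satisfied items."""
    total = sum(i.weight for i in items)
    met = sum(i.weight for i in items if i.satisfied)
    return met / total if total else 0.0

items = [
    ChecklistItem("Is a record of processing activities maintained?", 2.0, True),
    ChecklistItem("Are data-breach response procedures documented?", 1.0, False),
]
print(round(compliance_score(items), 2))  # 0.67
```

A RegTech-style tool would additionally attach machine-readable evidence to each item so that the score is auditable, not merely asserted.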

    Towards an automatic data value analysis method for relational databases

    Data is becoming one of the world’s most valuable resources and it has been suggested that those who own the data will own the future. However, despite data being an important asset, data owners struggle to assess its value. Some recent pioneering works have raised awareness of the need to measure data value, and have put forward simple but engaging survey-based methods to support first-level data assessment in an organisation. However, these methods are manual and depend on the costly input of domain experts. In this paper, we propose to extend the manual survey-based approaches with additional metrics and dimensions derived from the evolving literature on data value dimensions and tailored specifically to our case study. We also developed an automatic, metric-based data value assessment approach that (i) automatically quantifies the business value of data in Relational Databases (RDB), and (ii) provides a scoring method that facilitates the ranking and extraction of the most valuable RDB tables. We evaluate our proposed approach on a real-world relational database from a small online retailer (MyVolts) and show in our experimental study that the data value assessments made by our automated system match those expressed by the domain expert approach.
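Metric-based scoring and ranking of tables, as described above, could be sketched along the following lines. The metric names (`usage`, `completeness`), weights and normalisation are purely illustrative assumptions, not the dimensions used in the paper.

```python
import math

def table_value(rows: int, usage: float, completeness: float) -> float:
    """Combine normalised per-table metrics (each in 0..1) into one score."""
    # Illustrative size metric: log-scale row count, saturating near 10^6 rows.
    size = min(math.log10(rows + 1) / 6.0, 1.0)
    # Illustrative weights; a real method would derive these from the
    # data-value dimensions chosen for the use case.
    return 0.4 * usage + 0.3 * completeness + 0.3 * size

tables = {
    "orders":    table_value(120_000, usage=0.9, completeness=0.95),
    "audit_log": table_value(2_000_000, usage=0.1, completeness=1.0),
}
# Rank tables by descending value score to surface the most valuable ones.
ranked = sorted(tables, key=tables.get, reverse=True)
print(ranked)  # ['orders', 'audit_log']
```

The point of such a scheme is that once the per-table metrics are computed automatically from the database, ranking requires no further expert input.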

    An intelligent linked data quality dashboard

    This paper describes a new intelligent, data-driven dashboard for linked data quality assessment. The development goal was to assist data quality engineers in interpreting the data quality problems found when evaluating a dataset using a metrics-based data quality assessment. This required construction of a graph linking the problematic things identified in the data, the assessment metrics and the source data. This context and the supporting user interfaces help the user to understand data quality problems. An analysis widget also helped the user identify the root cause of multiple problems. This supported the user in identifying and prioritising the problems that need to be fixed in order to improve data quality. The dashboard was shown to be useful for users cleaning data. A user evaluation was performed with both expert and novice data quality engineers.

    Semantic data ingestion for intelligent, value-driven big data analytics

    In this position paper we describe a conceptual model for intelligent Big Data analytics based on both semantic and machine learning AI techniques (called AI ensembles). These processes are linked to business outcomes by explicitly modelling data value and by using semantic technologies as the underlying medium for communication between the diverse processes and organisations creating AI ensembles. Furthermore, we show how data governance can direct and enhance these ensembles by providing recommendations and insights that ensure the generated output produces the highest possible value for the organisation.

    Saffron: a data value assessment tool for quantifying the value of data assets

    Data has become an indispensable commodity and it is the basis for many products and services. It has become increasingly important to understand the value of this data in order to be able to exploit it and reap the full benefits. Yet many businesses and entities are simply hoarding data without understanding its true potential. We here present Saffron, a Data Value Assessment Tool that enables the quantification of the value of data assets based on a number of different data value dimensions. Based on the Data Value Vocabulary (DaVe), Saffron enables the extensible representation of the calculated value of data assets, whilst also catering for the subjective and contextual nature of data value. The tool exploits semantic technologies in order to provide traceable explanations of the calculated data value. Saffron therefore provides a first step towards the efficient and effective exploitation of data assets.

    Understanding information professionals: a survey on the quality of Linked Data sources for digital libraries

    In this paper we provide an in-depth analysis of a survey of Information Professionals’ (IPs’) experiences with Linked Data quality. We discuss and highlight shortcomings in linked data sources based on the quality issues IPs report when using such sources for their daily tasks, such as metadata creation.

    Milan: automatic generation of R2RML mappings

    Milan automatically generates R2RML mappings between a source relational database and a target ontology, using a novel multi-level algorithm. It addresses the real-world inter-model semantic gap by resolving naming conflicts and structural and semantic heterogeneity, thus enabling high-fidelity mapping generation for realistic databases. Although mappings are important for interoperability across relational databases and ontologies, creating them is a labour- and expertise-intensive task, and the current state of the art has achieved only limited automation. The paper describes an experimental evaluation of Milan against state-of-the-art systems using the RODI benchmarking tool, which shows that Milan outperforms all systems in all categories.
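One ingredient of resolving the naming conflicts mentioned above is matching database identifiers to ontology class names despite differing conventions. A minimal sketch of such matching is shown below using simple string similarity; this is an illustrative assumption about the general technique, not Milan's actual multi-level algorithm.

```python
from difflib import SequenceMatcher

def normalise(name: str) -> str:
    # Normalise identifiers: lowercase and drop underscores,
    # e.g. "Emp_Details" -> "empdetails".
    return name.lower().replace("_", "")

def best_class(table: str, classes: list[str]) -> str:
    """Pick the ontology class whose name best matches the table name."""
    return max(
        classes,
        key=lambda c: SequenceMatcher(None, normalise(table), normalise(c)).ratio(),
    )

print(best_class("emp_details", ["Employee", "Department", "Project"]))
# Employee
```

A full mapping generator would combine such lexical evidence with structural cues (keys, foreign-key paths) before emitting the R2RML triples maps.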

    DELTA-R: a change detection approach for RDF datasets

    This paper presents the DELTA-R approach, which detects and classifies the changes between two versions of a linked dataset. It contributes to the state of the art, first, by proposing a more granular classification of resource-level changes and, second, by automatically selecting the appropriate resource properties to identify the same resources in different versions of a linked dataset that have different URIs but similar representations. The paper also presents the DELTA-R change model to represent the changes detected by the DELTA-R approach. This model bridges the gap between resource-centric and triple-centric views of changes in linked datasets. As a result, a single change detection mechanism can support use cases such as interlink maintenance and dataset or replica synchronisation. Additionally, the paper describes an experiment conducted to examine the accuracy of the DELTA-R approach in detecting the changes between two versions of a linked dataset. The results indicate that the DELTA-R approach outperforms the state-of-the-art approaches in accuracy by up to 4%. It is demonstrated that the proposed more granular classification of changes helped to identify up to 1529 additional updated resources compared to X. By means of a case study, we demonstrate the support of the DELTA-R approach and change model for an interlink maintenance use case. The results show that 100% of the broken interlinks between the DBpedia person snapshot 3.7 and Freebase were repaired.
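The resource-level view of change detection described above can be sketched as a diff over two dataset versions, classifying each resource as added, removed or updated. This is a simplified illustration under assumed inputs; DELTA-R's actual classification is more granular and also handles resources whose URIs change between versions.

```python
def classify_changes(v1: dict, v2: dict) -> dict:
    """v1/v2 map resource URI -> set of (predicate, object) pairs.

    Returns resource-level changes between the two dataset versions.
    """
    return {
        "added":   sorted(set(v2) - set(v1)),
        "removed": sorted(set(v1) - set(v2)),
        # Same URI in both versions, but with a changed description.
        "updated": sorted(r for r in set(v1) & set(v2) if v1[r] != v2[r]),
    }

v1 = {"ex:alice": {("ex:age", "30")}, "ex:bob": {("ex:age", "41")}}
v2 = {"ex:alice": {("ex:age", "31")}, "ex:carol": {("ex:age", "25")}}
print(classify_changes(v1, v2))
# {'added': ['ex:carol'], 'removed': ['ex:bob'], 'updated': ['ex:alice']}
```

An interlink-maintenance tool can then revisit only the links that point at removed or updated resources instead of re-checking the whole dataset.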

    Assessing the quality of geospatial linked data – experiences from Ordnance Survey Ireland (OSi)

    Ordnance Survey Ireland (OSi) is Ireland’s national mapping agency, responsible for digitising the island’s infrastructure in terms of mapping. Generating data from various sensors (e.g. spatial sensors), OSi builds its knowledge in the Prime2 framework, a subset of which is transformed into geo-Linked Data. In this paper we discuss how the quality of the generated semantic data fares against datasets in the LOD cloud. We set up Luzzu, a scalable Linked Data quality assessment framework, in the OSi pipeline to continuously assess produced data in order to tackle any quality problems prior to publishing.

    The genetic basis for adaptation of model-designed syntrophic co-cultures.

    Understanding the fundamental characteristics of microbial communities could have far-reaching implications for human health and applied biotechnology. Despite this, much is still unknown regarding the genetic basis and evolutionary strategies underlying the formation of viable synthetic communities. By pairing auxotrophic mutants in co-culture, it has been demonstrated that viable nascent E. coli communities can be established where the mutant strains are metabolically coupled. A novel algorithm, OptAux, was constructed to design 61 unique multi-knockout E. coli auxotrophic strains that require significant metabolite uptake to grow. These predicted knockouts included a diverse set of novel non-specific auxotrophs that result from inhibition of major biosynthetic subsystems. Three OptAux-predicted non-specific auxotrophic strains, with diverse metabolic deficiencies, were co-cultured with an L-histidine auxotroph and optimized via adaptive laboratory evolution (ALE). Time-course sequencing revealed the genetic changes employed by each strain to achieve higher community growth rates and provided insight into mechanisms for adapting to the syntrophic niche. A community model of metabolism and gene expression was utilized to predict the relative community composition and fundamental characteristics of the evolved communities. This work presents new insight into the genetic strategies underlying viable nascent community formation and a cutting-edge computational method to elucidate metabolic changes that empower the creation of cooperative communities.