Redundancy, Deduction Schemes, and Minimum-Size Bases for Association Rules
Association rules are among the most widely employed data analysis methods in
the field of Data Mining. An association rule is a form of partial implication
between two sets of binary variables. In the most common approach, association
rules are parameterized by a lower bound on their confidence, which is the
empirical conditional probability of their consequent given the antecedent,
and/or by some other parameter bounds such as "support" or deviation from
independence. We study here notions of redundancy among association rules from
a fundamental perspective. We see each transaction in a dataset as an
interpretation (or model) in the propositional logic sense, and consider
existing notions of redundancy, that is, of logical entailment, among
association rules, of the form "any dataset in which this first rule holds must
obey also that second rule, therefore the second is redundant". We discuss
several existing alternative definitions of redundancy between association
rules and provide new characterizations and relationships among them. We show
that the main alternatives we discuss actually correspond to just two variants,
which differ in the treatment of full-confidence implications. For each of
these two notions of redundancy, we provide a sound and complete deduction
calculus, and we show how to construct complete bases (that is,
axiomatizations) of absolutely minimum size in terms of the number of rules. We
Finally, we explore an approach to redundancy with respect to several association
rules, and fully characterize its simplest case of two partial premises. Comment: LMCS accepted paper
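The confidence measure and the redundancy notion described above can be illustrated with a small sketch (the toy transactions and attribute names below are invented for illustration). The key observation is that conf(XY → Z) = supp(XYZ)/supp(XY) ≥ supp(XYZ)/supp(X) = conf(X → YZ), since supp(XY) ≤ supp(X); so a rule with a larger antecedent and smaller consequent is redundant with respect to the stronger rule.

```python
def support(itemset, transactions):
    """Fraction of transactions containing every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Empirical conditional probability of the consequent given the antecedent."""
    a = frozenset(antecedent)
    return support(a | frozenset(consequent), transactions) / support(a, transactions)

# Toy dataset: each transaction is the set of binary attributes that are true.
transactions = [frozenset(t) for t in
                [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"b", "c"}, {"a"}]]

c_strong = confidence({"a"}, {"b", "c"}, transactions)   # a -> bc : 0.5
c_weak   = confidence({"a", "b"}, {"c"}, transactions)   # ab -> c : 2/3

# ab -> c holds with at least the confidence of a -> bc in any dataset,
# hence it is redundant given the stronger rule.
assert c_weak >= c_strong
```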
Algebraic Comparison of Partial Lists in Bioinformatics
The outcome of a functional genomics pipeline is usually a partial list of
genomic features, ranked by their relevance in modelling biological phenotype
in terms of a classification or regression model. Due to resampling protocols,
or simply within a meta-analysis comparison, it is often the case that sets of
alternative feature lists (possibly of different lengths) are obtained instead
of a single list. Here we introduce a method, based on the algebraic theory of
symmetric groups, for studying the variability between lists ("list stability")
in the case of lists of unequal length. We provide algorithms evaluating
stability for lists embedded in the full feature set or just limited to the
features occurring in the partial lists. The method is demonstrated first on
synthetic data in a gene filtering task and then for finding gene profiles on a
recent prostate cancer dataset.
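One simple way to quantify stability between ranked lists is a distance between their rank vectors; the sketch below uses the Canberra distance, with features absent from a partial list assigned the midpoint of the unused rank positions. The tie rule and the feature names are illustrative assumptions, not necessarily the paper's exact construction.

```python
def canberra(r1, r2):
    """Canberra distance between two rank vectors of equal length."""
    return sum(abs(a - b) / (a + b) for a, b in zip(r1, r2))

def ranks(ordered_features, universe):
    """Rank of each feature of `universe` under a partial ordered list;
    features absent from the list get the average of the unused ranks."""
    pos = {f: i + 1 for i, f in enumerate(ordered_features)}
    k, n = len(ordered_features), len(universe)
    tail = (k + 1 + n) / 2.0  # average of ranks k+1 .. n
    return [pos.get(f, tail) for f in sorted(universe)]

universe = {"g1", "g2", "g3", "g4", "g5"}
list_a = ["g1", "g2", "g3"]
list_b = ["g2", "g1", "g3"]  # same genes, top two swapped

d = canberra(ranks(list_a, universe), ranks(list_b, universe))  # 2/3
```

A set of lists can then be summarized by the mean pairwise distance, a scalar indicator of list stability.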
Space-Time Structure of Loop Quantum Black Hole
In this paper we have improved the semiclassical analysis of loop quantum
black hole (LQBH) in the conservative approach of constant polymeric parameter.
In particular we have focused our attention on the space-time structure. We
have introduced a very simple modification of the spherically symmetric
Hamiltonian constraint in its holonomic version. The new quantum constraint
reduces to the classical constraint when the polymeric parameter goes to
zero. Using this modification, we have obtained a large class of semiclassical
solutions parametrized by a generic function of the polymeric parameter. We
have found that only a particular choice of this function reproduces the black
hole solution with the correct asymptotically flat limit. At r = 0 the semiclassical
metric is regular and the Kretschmann invariant has a maximum peaked at
the Planck length. The radial position of the peak depends neither on the black hole
mass nor on the polymeric parameter. The semiclassical solution is very similar to
the Reissner-Nordstrom metric. We have constructed the Carter-Penrose diagrams
explicitly, giving a causal description of the space-time and its maximal
extension. The LQBH metric interpolates between two asymptotically flat
regions: the r → ∞ region and the r → 0 region. We have studied the
thermodynamics of the semiclassical solution. The temperature, entropy and the
evaporation process are regular and can be defined independently of the
polymeric parameter. We have studied the particular metric obtained when the polymeric
parameter goes to zero. This metric is regular at r = 0 and has only one
event horizon at r = 2m. The maximum of the Kretschmann invariant depends only on
the Planck length. The polymeric parameter does not play any role in the black hole
singularity resolution. The thermodynamics is the same. Comment: 17 pages, 19 figures
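For context, the classical divergence that the semiclassical solution regularises can be stated as follows; the bound on the semiclassical invariant is written on dimensional grounds only, as a schematic reading of the abstract, not as the paper's exact result.

```latex
% Classical Schwarzschild curvature blows up at the centre (G = c = 1):
K_{\mathrm{class}} \;=\; R_{\mu\nu\rho\sigma}R^{\mu\nu\rho\sigma}
  \;=\; \frac{48\, m^{2}}{r^{6}}
  \;\xrightarrow[\; r \to 0 \;]{}\; \infty .
% Semiclassically, the abstract states K stays finite, with a maximum of
% Planckian height at a radius independent of the mass m and of the
% polymeric parameter; dimensionally,
K_{\mathrm{LQBH}}(r) \;\le\; K_{\max} \sim \ell_{P}^{-4},
  \qquad r_{\max} = O(\ell_{P}).
```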
Active children through individual vouchers – evaluation (ACTIVE): protocol for a mixed method randomised control trial to increase physical activity levels in teenagers
Background: Many teenagers are insufficiently active despite the health benefits of physical activity (PA). There is strong evidence that inactivity and low fitness levels increase the risk of non-communicable diseases such as coronary heart disease (CHD), type 2 diabetes and breast and colon cancers (Lee et al. Lancet 380:219–29, 2012). A major barrier facing adolescents is accessibility (e.g. cost and lack of local facilities). The ACTIVE project aims to tackle this barrier through a multi-faceted intervention, giving teenagers vouchers to spend on activities of their choice and empowering young people to improve their fitness and PA levels.
Design: ACTIVE is a mixed methods randomised control trial in 7 secondary schools in Swansea, South Wales. Quantitative and qualitative measures, including PA (Cooper run test (CRT), accelerometry over 7 days), cardiovascular (CV) measures (blood pressure, pulse wave analysis) and focus groups, will be undertaken at 4 separate time points (baseline, 6 months, 12 months and follow-up at 18 months). Intervention schools will receive a multi-component intervention involving 12 months of £20 vouchers to spend on physical activities of their choice, a peer mentor scheme and opportunities to attend advocacy meetings. Control schools are encouraged to continue usual practice. The primary aim is to examine the effect of the intervention on improving cardiovascular fitness.
Discussion: This paper describes the protocol for the ACTIVE randomised control trial, which aims to increase the fitness, physical activity and socialisation of teenagers in Swansea, UK via a voucher scheme combined with peer mentoring. Results can contribute to the evidence base on teenage physical activity and, if effective, the intervention has the potential to inform future physical activity interventions and policy.
Limitations of estimating branch volume from terrestrial laser scanning
Quantitative structural models (QSMs) are frequently used to simplify single tree point clouds obtained by terrestrial laser scanning (TLS). QSMs use geometric primitives to derive topological and volumetric information about trees. Previous studies have shown that TLS and QSM total volume estimates agree well with field-measured data for whole trees. Although already broadly applied, the uncertainties of combining TLS and QSM modelling are still largely unexplored. In our study, we investigated the effect of scanning distance on length and volume estimates of branches when deriving QSMs from TLS data. We scanned ten European beech (Fagus sylvatica L.) branches with an average length of 2.6 m. The branches were scanned from distances ranging from 5 to 45 m at step intervals of 5 m, from three scan positions each. Twelve close-range scans were performed as a benchmark. For each distance and branch, QSMs were derived. We found that with increasing distance, the point cloud density and the cumulative length of the reconstructed branches decreased, whereas individual volumes increased. Depending on the QSM hyperparameters, at a scanning distance of 45 m, cumulative branch length was on average underestimated by −75%, while branch volume was overestimated by up to +539%. We assume that the high deviations are related to point cloud quality. As the scanning distance increases, the size of the individual laser footprints and the distances between them increase, making it more difficult to fully capture small branches and to fit suitable QSMs.
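The footprint argument in the last sentence can be made concrete with a back-of-the-envelope sketch, assuming a linearly diverging beam. The default beam parameters below are illustrative values loosely in the range of common TLS instruments, not those of the instrument used in the study.

```python
def footprint_diameter(distance_m, exit_diameter_mm=3.5, divergence_mrad=0.35):
    """Approximate laser footprint diameter (mm) at a given range,
    for a beam with linear full-angle divergence.
    Note: 1 mrad of divergence adds 1 mm of diameter per metre of range."""
    return exit_diameter_mm + distance_m * divergence_mrad

near = footprint_diameter(5.0)   # 5.25 mm at the closest scan distance
far = footprint_diameter(45.0)   # 19.25 mm at the farthest scan distance
```

At 45 m the footprint is several times larger than at 5 m, comparable to the diameter of fine branches, which is consistent with small branches being smeared or missed at long range.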
Assessment of Bias in Pan-Tropical Biomass Predictions
Above-ground biomass (AGB) is an essential descriptor of forests, of use in ecological and climate-related research. At tree and stand scale, destructive but direct measurements of AGB are replaced with predictions from allometric models characterizing the correlational relationship between AGB and predictor variables including stem diameter, tree height and wood density. These models are constructed from harvested calibration data, usually via linear regression. Here, we assess systematic error in out-of-sample predictions of AGB introduced during measurement, compilation and modeling of in-sample calibration data. Various conventional bivariate and multivariate models are constructed from open access data of tropical forests. Metadata analysis, fit diagnostics and cross-validation results suggest several model misspecifications: chiefly, unaccounted-for inconsistent measurement error in predictor variables between in- and out-of-sample data. Simulations demonstrate that even conservative inconsistencies can introduce significant bias into tree- and stand-scale AGB predictions. When tree height and wood density are included as predictors, models should be modified to correct for bias. Finally, we explore a fundamental assumption of conventional allometry: that model parameters are independent of tree size, i.e. that the same model can provide predictions of consistent trueness irrespective of size class. Most observations in current calibration datasets are from smaller trees, meaning the existence of a size dependency would bias predictions for larger trees. We determine that detecting the absence or presence of a size dependency is currently prevented by model misspecifications and calibration data imbalances. We call for the collection of additional harvest data, specifically of under-represented larger trees.
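The conventional construction examined here, a log-log linear allometric fit with the standard lognormal back-transformation (Baskerville) correction, can be sketched as follows. This is a generic bivariate illustration, not the authors' exact models.

```python
import math

def fit_loglog(diams_cm, agb_kg):
    """OLS fit of ln(AGB) = a + b ln(D), returning (a, b, CF) where
    CF = exp(s^2 / 2) is the lognormal back-transformation correction."""
    x = [math.log(d) for d in diams_cm]
    y = [math.log(m) for m in agb_kg]
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
         / sum((xi - xbar) ** 2 for xi in x))
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)  # residual variance
    return a, b, math.exp(s2 / 2)

def predict_agb(d_cm, a, b, cf):
    """Back-transformed, bias-corrected prediction of AGB (kg)."""
    return cf * math.exp(a) * d_cm ** b

# Hypothetical calibration data drawn from an exact power law AGB = 0.1 D^2.5,
# so the fit recovers the exponent and CF is ~1 (zero residual variance).
diams = [10.0, 20.0, 30.0, 40.0]
agb = [0.1 * d ** 2.5 for d in diams]
a, b, cf = fit_loglog(diams, agb)
```

The measurement-error issue the abstract raises enters exactly through the predictor values fed to `predict_agb`: if out-of-sample diameters or heights are measured differently from the calibration data, the correction above does not remove the resulting bias.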
Benchmarking airborne laser scanning tree segmentation algorithms in broadleaf forests shows high accuracy only for canopy trees
Individual tree segmentation from airborne laser scanning data is a longstanding and important challenge in forest remote sensing. Tree segmentation algorithms are widely available, but robust intercomparison studies are rare due to the difficulty of obtaining reliable reference data. Here we provide a benchmark data set for temperate and tropical broadleaf forests, generated from labelled terrestrial laser scanning data. We compared the performance of four widely used tree segmentation algorithms against this benchmark data set. All algorithms performed reasonably well on the canopy trees. The point-cloud-based algorithm AMS3D (Adaptive Mean Shift 3D) had the highest overall accuracy, closely followed by the 2D raster-based region-growing algorithm Dalponte2016+. However, all algorithms failed to accurately segment the understory trees. This result was consistent across both forest types. This study emphasises the need to assess tree segmentation algorithms directly against benchmark data, rather than by comparison with forest indices such as biomass or the number and size distribution of trees. We provide the first openly available benchmark data set for tropical forests, and we hope future studies will extend this work to other regions.
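A minimal sketch of how segmentation output can be scored against such a benchmark: greedily match predicted trees to reference trees by point-set IoU, then report precision, recall and F1. This is a simplified stand-in for whatever matching scheme the study actually used; tree ids and thresholds are illustrative.

```python
def evaluate_segmentation(reference, predicted, iou_threshold=0.5):
    """Greedy matching of predicted to reference trees by point-set IoU.
    `reference` and `predicted` map tree ids to sets of point indices."""
    matches = 0
    unused = dict(predicted)
    for ref_pts in reference.values():
        best_id, best_iou = None, 0.0
        for pid, pred_pts in unused.items():
            iou = len(ref_pts & pred_pts) / len(ref_pts | pred_pts)
            if iou > best_iou:
                best_id, best_iou = pid, iou
        if best_id is not None and best_iou >= iou_threshold:
            matches += 1
            del unused[best_id]  # each prediction matches at most one tree
    precision = matches / len(predicted)
    recall = matches / len(reference)
    f1 = 2 * precision * recall / (precision + recall) if matches else 0.0
    return precision, recall, f1

# Two reference trees; two good predictions plus one spurious cluster.
reference = {"t1": set(range(10)), "t2": set(range(10, 20))}
predicted = {"p1": set(range(9)), "p2": set(range(12, 20)), "p3": {100, 101}}
precision, recall, f1 = evaluate_segmentation(reference, predicted)
```

Scoring per size class (canopy vs. understory reference trees) with the same routine would surface exactly the understory failure mode the study reports.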
Leaf and wood classification framework for terrestrial LiDAR point clouds
Published in Methods in Ecology and Evolution by John Wiley & Sons Ltd on behalf of the British Ecological Society. Leaf and wood separation is a key step to enable a new range of estimates from terrestrial LiDAR data, such as quantifying above-ground biomass, leaf and wood area, and their 3D spatial distributions. We present a new method to separate leaf and wood from single tree point clouds automatically. Our approach combines unsupervised classification of geometric features and shortest-path analysis. The automated separation algorithm and its intermediate steps are presented and validated. Validation used a testing framework with synthetic point clouds, simulated using ray-tracing and 3D tree models, and 10 field-scanned tree point clouds. To evaluate results we calculated accuracy, the kappa coefficient and the F-score. Validation using simulated data resulted in an overall accuracy of 0.83, ranging from 0.71 to 0.94. Per-tree average accuracy from synthetic data ranged from 0.77 to 0.89. Field data results presented an overall average accuracy of 0.89. Analysis of each step showed accuracy ranging from 0.75 to 0.98. F-scores from both simulated and field data were similar, with scores for leaf usually higher than those for wood. Our separation method showed results similar to others in the literature, albeit from a completely automated workflow. Analysis of each separation step suggests that the addition of path analysis improved the robustness of our algorithm. Accuracy can be improved with per-tree parameter optimization. The library containing our separation script can be easily installed and applied to single tree point clouds. Average processing times are below 10 min per tree.
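The geometric features underlying the unsupervised classification step are commonly eigenvalue-based shape descriptors of local point neighborhoods; wood tends to score high on linearity or planarity, leaves on scattering. The sketch below shows this common construction, which may differ from the exact feature set the method uses.

```python
import numpy as np

def geometric_features(neighborhood):
    """Eigenvalue-based shape descriptors of a local Nx3 point neighborhood:
    (linearity, planarity, scattering), from the sorted eigenvalues
    l1 >= l2 >= l3 of the neighborhood's covariance matrix."""
    pts = np.asarray(neighborhood, dtype=float)
    cov = np.cov(pts.T)                             # 3x3 covariance matrix
    l1, l2, l3 = np.sort(np.linalg.eigvalsh(cov))[::-1]
    return (l1 - l2) / l1, (l2 - l3) / l1, l3 / l1

# A perfectly linear neighborhood (e.g. a thin branch segment)
# scores ~1 on linearity and ~0 on the other two descriptors.
linearity, planarity, scattering = geometric_features(
    [[t, 0.0, 0.0] for t in range(10)])
```

These per-point features can then feed any unsupervised classifier (e.g. k-means with k = 2) to produce the initial leaf/wood labeling that the shortest-path analysis refines.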