1,008 research outputs found

Space and Time Efficient Parallel Graph Decomposition, Clustering, and Diameter Approximation

Authors: Ceccarello Matteo; Pietracaprina Andrea; Pucci Geppino; Upfal Eli
Publication date: 01/01/2015

We develop a novel parallel decomposition strategy for unweighted, undirected graphs, based on growing disjoint connected clusters from batches of centers progressively selected from yet uncovered nodes. With respect to similar previous decompositions, our strategy exercises a tighter control on both the number of clusters and their maximum radius. We present two important applications of our parallel graph decomposition: (1) $k$-center clustering approximation; and (2) diameter approximation. In both cases, we obtain algorithms which feature a polylogarithmic approximation factor and are amenable to a distributed implementation that is geared for massive (long-diameter) graphs. The total space needed for the computation is linear in the problem size, and the parallel depth is substantially sublinear in the diameter for graphs with low doubling dimension. To the best of our knowledge, ours are the first parallel approximations for these problems which achieve sub-diameter parallel time, for a relevant class of graphs, using only linear space. Besides the theoretical guarantees, our algorithms allow for a very simple implementation on clustered architectures: we report on extensive experiments which demonstrate their effectiveness and efficiency on large graphs as compared to alternative known approaches.
Comment: 14 page
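The cluster-growing idea in this abstract can be illustrated with a minimal sequential sketch. It is not the authors' algorithm: the function name `decompose`, the fixed `batch_size`, and the fixed number of `growth_steps` per batch are assumptions made for illustration, and the abstract does not specify how batches are sized or how long clusters grow.

```python
import random

def decompose(adj, batch_size, growth_steps, seed=0):
    """Grow disjoint connected clusters from batches of centers.

    adj maps each node to a list of neighbours (unweighted, undirected).
    Returns a dict mapping every node to the center of its cluster.
    The batch size and growth schedule are illustrative assumptions.
    """
    rng = random.Random(seed)
    owner = {}      # node -> center of the cluster that covers it
    frontier = []   # nodes covered in the most recent growth step
    while len(owner) < len(adj):
        # Select a new batch of centers among the still uncovered nodes.
        uncovered = sorted(v for v in adj if v not in owner)
        new_centers = rng.sample(uncovered, min(batch_size, len(uncovered)))
        for c in new_centers:
            owner[c] = c
        frontier = frontier + new_centers
        # Grow all clusters (old and new) by a few breadth-first steps.
        for _ in range(growth_steps):
            next_frontier = []
            for u in frontier:
                for v in adj[u]:
                    if v not in owner:
                        owner[v] = owner[u]
                        next_frontier.append(v)
            frontier = next_frontier
    return owner
```

Each node is claimed exactly once, by a neighbour that already belongs to a cluster, so the clusters are disjoint and connected. In a distributed setting each growth step would correspond to one parallel round; this sketch runs them sequentially.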

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Padova

A Practical Parallel Algorithm for Diameter Approximation of Massive Weighted Graphs

Authors: Ceccarello Matteo; Pietracaprina Andrea; Pucci Geppino; Upfal Eli
Publication date: 09/11/2015

We present a space and time efficient practical parallel algorithm for approximating the diameter of massive weighted undirected graphs on distributed platforms supporting a MapReduce-like abstraction. The core of the algorithm is a weighted graph decomposition strategy generating disjoint clusters of bounded weighted radius. Theoretically, our algorithm uses linear space and yields a polylogarithmic approximation guarantee; moreover, for important practical classes of graphs, it runs in a number of rounds asymptotically smaller than those required by the natural approximation provided by the state-of-the-art $\Delta$-stepping SSSP algorithm, which is its only practical linear-space competitor in the aforementioned computational scenario. We complement our theoretical findings with an extensive experimental analysis on large benchmark graphs, which demonstrates that our algorithm attains substantial improvements on a number of key performance indicators with respect to the aforementioned competitor, while featuring a similar approximation ratio (a small constant less than 1.4, as opposed to the polylogarithmic theoretical bound).

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Padova

An Efficient Rigorous Approach for Identifying Statistically Significant Frequent Itemsets

Authors: Kirsch Adam; Mitzenmacher Michael; Pietracaprina Andrea; Pucci Geppino; Upfal Eli; Vandin Fabio
Publication date: 01/01/2009

As advances in technology allow for the collection, storage, and analysis of vast amounts of data, the task of screening and assessing the significance of discovered patterns is becoming a major challenge in data mining applications. In this work, we address significance in the context of frequent itemset mining. Specifically, we develop a novel methodology to identify a meaningful support threshold s* for a dataset, such that the number of itemsets with support at least s* represents a substantial deviation from what would be expected in a random dataset with the same number of transactions and the same individual item frequencies. These itemsets can then be flagged as statistically significant with a small false discovery rate. We present extensive experimental results to substantiate the effectiveness of our methodology.
Comment: A preliminary version of this work was presented in ACM PODS 2009. 20 pages, 0 figure
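To make the comparison against a random dataset concrete, the sketch below computes the expected number of itemsets of a given size whose support reaches a threshold. It assumes a null model in which each item appears in each transaction independently with probability equal to its observed frequency; the abstract does not spell out the null model, so this is an illustrative assumption. The function names are hypothetical, and the paper's procedure for choosing s* is not reproduced here.

```python
from itertools import combinations
from math import comb, prod

def binomial_tail(t, p, s):
    """P[X >= s] for X ~ Binomial(t, p)."""
    return sum(comb(t, j) * p**j * (1 - p) ** (t - j) for j in range(s, t + 1))

def expected_frequent_itemsets(freqs, t, k, s):
    """Expected number of k-itemsets with support >= s in a random dataset.

    freqs[i] is the frequency of item i, and t is the number of transactions.
    Under the assumed independence model, the support of an itemset is
    Binomial(t, product of its item frequencies).
    """
    return sum(
        binomial_tail(t, prod(freqs[i] for i in items), s)
        for items in combinations(range(len(freqs)), k)
    )
```

An observed count of k-itemsets with support at least s that is far above this expectation is the kind of deviation the abstract refers to. The enumeration is exponential in k, so the sketch is only practical for small inputs.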

arXiv.org e-Print Archive

CiteSeerX

Crossref

Archivio istituzionale della ricerca - Università di Padova

MapReduce and Streaming Algorithms for Diversity Maximization in Metric Spaces of Bounded Doubling Dimension

Authors: Ceccarello Matteo; Pietracaprina Andrea; Pucci Geppino; Upfal Eli
Publication date: 01/01/2017

Given a dataset of points in a metric space and an integer $k$, a diversity maximization problem requires determining a subset of $k$ points maximizing some diversity objective measure, e.g., the minimum or the average distance between two points in the subset. Diversity maximization is computationally hard, hence only approximate solutions can be hoped for. Although its applications are mainly in massive data analysis, most of the past research on diversity maximization focused on the sequential setting. In this work we present space and pass/round-efficient diversity maximization algorithms for the Streaming and MapReduce models and analyze their approximation guarantees for the relevant class of metric spaces of bounded doubling dimension. Like other approaches in the literature, our algorithms rely on the determination of high-quality core-sets, i.e., (much) smaller subsets of the input which contain good approximations to the optimal solution for the whole input. For a variety of diversity objective functions, our algorithms attain an $(\alpha+\epsilon)$-approximation ratio, for any constant $\epsilon>0$, where $\alpha$ is the best approximation ratio achieved by a polynomial-time, linear-space sequential algorithm for the same diversity objective. This improves substantially over the approximation ratios attainable in Streaming and MapReduce by state-of-the-art algorithms for general metric spaces. We provide extensive experimental evidence of the effectiveness of our algorithms on both real world and synthetic datasets, scaling up to over a billion points.
Comment: Extended version of http://www.vldb.org/pvldb/vol10/p469-ceccarello.pdf, PVLDB Volume 10, No. 5, January 201
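The two diversity objectives named in this abstract (minimum and average pairwise distance) can be written down directly. The sketch below also includes the well-known greedy farthest-first traversal as a simple sequential heuristic for the minimum-distance objective. It uses Euclidean points for concreteness, and all function names are illustrative; it is not the Streaming or MapReduce core-set construction of the paper.

```python
import math
from itertools import combinations

def min_pairwise_distance(points):
    """Diversity as the minimum distance between two selected points."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))

def avg_pairwise_distance(points):
    """Diversity as the average distance between two selected points."""
    pairs = list(combinations(points, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def farthest_first(points, k):
    """Greedily pick k points, each one farthest from those already chosen."""
    chosen = [points[0]]
    # Distance from every point to its nearest chosen point.
    nearest = [math.dist(p, points[0]) for p in points]
    while len(chosen) < k:
        i = max(range(len(points)), key=nearest.__getitem__)
        chosen.append(points[i])
        nearest = [min(d, math.dist(p, points[i])) for d, p in zip(nearest, points)]
    return chosen
```

For example, `farthest_first([(0, 0), (1, 0), (10, 0), (0, 10), (0.5, 0.5)], 3)` skips the points near the origin and selects the three mutually distant ones.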

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Università di Padova

A new polystyrene-based ionomer/MWCNT nanocomposite for wearable skin temperature sensors

Authors: Alessio Giuliani; Andrea Pucci; Fabio Di Francesco; Massimo Placidi
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

The present work outlines the fabrication and testing of a novel skin temperature sensor based on exfoliated and undamaged multi-walled carbon nanotubes (MWCNTs) dispersed in a poly(vinylbenzyl chloride) derivative with triethylamine (PVBC_Et3N). The dispersions were prepared by sonicating MWCNT/PVBC_Et3N mixtures in dimethylformamide for 5 min, and the quantity of dispersed MWCNTs was evaluated by UV–vis spectroscopy and thermogravimetric analyses. The investigations demonstrated the realization of MWCNT/PVBC_Et3N sensors with a resistance sensitivity to temperature close to 0.004 K⁻¹, an absolute value that is comparable to the highest values found in metals. The temperature dependence of the resistance was also found to be highly reproducible in the range 20–40 °C, thus suggesting the possibility of using the MWCNT/PVBC_Et3N system for the fabrication of small wearable temperature sensors for the monitoring of chronic wounds.

Crossref

Archivio della Ricerca - Università di Pisa

Diabetes mellitus and ischemic heart disease: the role of ion channels

Authors: De Marchis Marialaura; D’Amato Andrea; Fedele Francesco; Mancone Massimo; Netti Lucrezia; Palmirotta Raffaele; Pucci Mariateresa; Severino Paolo; Volterrani Maurizio
Publication venue: 'MDPI AG'
Publication date: 01/01/2018

Diabetes mellitus is one of the strongest risk factors for cardiovascular disease and, in particular, for ischemic heart disease (IHD). The pathophysiology of myocardial ischemia in diabetic patients is complex and not fully understood: some diabetic patients have mainly coronary stenosis obstructing blood flow to the myocardium; others present with coronary microvascular disease with an absence of plaques in the epicardial vessels. Ion channels acting in the cross-talk between the myocardial energy state and coronary blood flow may play a role in the pathophysiology of IHD in diabetic patients. In particular, some genetic variants for ATP-dependent potassium channels seem to be involved in the determinism of IH

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

Archivio della ricerca - Università di Roma La Sapienza

Accurate MapReduce Algorithms for k-Median and k-Means in General Metric Spaces

Authors: Mazzetto Alessio; Pietracaprina Andrea; Pucci Geppino
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th International Symposium on Algorithms and Computation (ISAAC 2019)
Publication date: 01/01/2019

Center-based clustering is a fundamental primitive for data analysis and becomes very challenging for large datasets. In this paper, we focus on the popular k-median and k-means variants which, given a set P of points from a metric space and a parameter k<|P|, require identifying a set S of k centers minimizing, respectively, the sum of the distances and of the squared distances of all points in P from their closest centers. Our specific focus is on general metric spaces, for which it is reasonable to require that the centers belong to the input set (i.e., $S \subseteq P$). We present coreset-based 3-round distributed approximation algorithms for the above problems using the MapReduce computational model. The algorithms are rather simple and obliviously adapt to the intrinsic complexity of the dataset, captured by the doubling dimension D of the metric space. Remarkably, the algorithms attain approximation ratios that can be made arbitrarily close to those achievable by the best known polynomial-time sequential approximations, and they are very space efficient for small D, requiring local memory sizes substantially sublinear in the input size. To the best of our knowledge, no previous distributed approaches were able to attain similar quality-performance guarantees in general metric spaces.
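The k-median and k-means objectives defined in this abstract can be stated as a short sketch for an arbitrary distance function, with centers restricted to the input set. The function names and the exhaustive search are illustrative assumptions, useful only for checking tiny instances; they are not the coreset-based MapReduce algorithms of the paper.

```python
from itertools import combinations

def clustering_costs(points, centers, dist):
    """Return (k-median cost, k-means cost) of a set of centers.

    The k-median cost sums each point's distance to its closest center;
    the k-means cost sums the squares of those distances.
    """
    closest = [min(dist(p, c) for c in centers) for p in points]
    return sum(closest), sum(d * d for d in closest)

def best_kmedian_centers(points, k, dist):
    """Exhaustively search all k-subsets of the input (tiny inputs only)."""
    return min(
        combinations(points, k),
        key=lambda centers: clustering_costs(points, centers, dist)[0],
    )
```

On the one-dimensional points 0, 1, 2, 10, 11 with absolute difference as the distance and k = 2, the centers (1, 10) give a k-median cost of 3 and a k-means cost of 3.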

arXiv.org e-Print Archive

DROPS Dagstuhl Research Online Publication Server

Archivio istituzionale della ricerca - Università di Padova

Indole-3-acetic acid improves Escherichia coli's defences to stress.

Authors: AMORESANO ANGELA; BIANCO C; CALOGERO R; CARPENTIERI ANDREA; DEFEZ R.; IMPERLINI E; PUCCI PIETRO; SENATORE B
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006

Archivio della ricerca - Università degli studi di Napoli Federico II