
253 research outputs found

Efficiently Clustering Very Large Attributed Graphs

Author: Akoglu L.
Boldi P.
Combe D.
Deza M.M.
Diestel R.
Duong K.-C.
Protter M. H.
Villa-Vialaneix N.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

Attributed graphs model real networks by enriching their nodes with attributes accounting for properties. Several techniques have been proposed for partitioning these graphs into clusters that are homogeneous with respect to both the semantic attributes and the structure of the graph. However, time and space complexities of state-of-the-art algorithms limit their scalability to medium-sized graphs. We propose SToC (for Semantic-Topological Clustering), a fast and scalable algorithm for partitioning large attributed graphs. The approach is robust, being compatible both with categorical and with quantitative attributes, and it is tailorable, allowing the user to weight the semantic and topological components. Further, the approach does not require the user to guess in advance the number of clusters. SToC relies on well-known approximation techniques such as bottom-k sketches, traditional graph-theoretic concepts, and a new perspective on the composition of heterogeneous distance measures. Experimental results demonstrate its ability to efficiently compute high-quality partitions of large-scale attributed graphs. Comment: This work has been published in ASONAM 2017. This version includes an appendix with validation of our attribute model and distance function, omitted in the conference version for lack of space. Please refer to the published version.

arXiv.org e-Print Archive

Crossref

Archivio della Ricerca - Università di Pisa

Archivio della Ricerca - Università di Roma 3

Outlier Edge Detection Using Random Graph Generation Models and Applications

Author: A Lancichinetti
AK Jain
DJ Watts
G Karypis
H Zhang
J Leskovec
J Shi
J Yang
L Akoglu
L Danon
L Liu
L Lu
L Waltman
LC Freeman
M De Choudhury
M Coscia
M Newman
M Rosvall
ME Newman
MEJ Newman
MR Brito
R Yu
S Fortunato
S Lloyd
S Papadopoulos
SE Schaeffer
VD Blondel
VJ Hodge
X Dong
Publication date: 21/06/2016

Outliers are samples that are generated by different mechanisms from other normal data samples. Graphs, in particular social network graphs, may contain nodes and edges that are made by scammers, malicious programs or mistakenly by normal users. Detecting outlier nodes and edges is important for data mining and graph analytics. However, previous research in the field has focused almost exclusively on detecting outlier nodes. In this article, we study the properties of edges and propose outlier edge detection algorithms using two random graph generation models. We found that the edge-ego-network, which can be defined as the induced graph that contains the two end nodes of an edge, their neighboring nodes and the edges that link these nodes, contains critical information for detecting outlier edges. We evaluated the proposed algorithms by injecting outlier edges into some real-world graph data. Experimental results show that the proposed algorithms can effectively detect outlier edges. In particular, the algorithm based on the Preferential Attachment Random Graph Generation model consistently gives good performance regardless of the test graph data. Furthermore, the proposed algorithms are not limited to outlier edge detection. We demonstrate three different applications that benefit from the proposed algorithms: 1) a preprocessing tool that improves the performance of graph clustering algorithms; 2) an outlier node detection algorithm; and 3) a novel noisy data clustering algorithm. These applications show the great potential of the proposed outlier edge detection techniques. Comment: 14 pages, 5 figures, journal paper

arXiv.org e-Print Archive

Qatar University Institutional Repository

Crossref

Directory of Open Access Journals

Trepo - Institutional Repository of Tampere University


Recalcitrant Nicolau syndrome following repeated intramuscular diclofenac injections

Author: Akoglu Gulsen
Simsek Gulcin
Unal Ismail H.
Publication venue: eScholarship, University of California
Publication date: 01/01/2025

eScholarship - University of California

Reducing Controversy by Connecting Opposing Views

Author: Akoglu L.
Conover M.
Golub G. H.
Guerra P. H. C.
Guo G.
Mejova Y.
Munson S. A.
Pariser E.
Publication venue: International Joint Conference on Artificial Intelligence, Inc
Publication date: 24/05/2018

Peer reviewed

arXiv.org e-Print Archive

Crossref

Helsingin yliopiston digitaalinen arkisto

The iPlant Collaborative: Cyberinfrastructure for Plant Biology

Author: Akoglu A.
Andrews G.
Ane C.
Boyle B.
Brutnell T.
Cazes J.
Cranston K.
Donoghue M. J.
Dooley R.
Enquist B. J.
Feng X.
Gendler K.
Gessler D.
Goff S. A.
Gonzales M.
Grene R.
Hanlon M.
Helmke M.
Hilgert U.
Hopkins N.
Jordan C.
Kim S. J.
Kleibenstein D. J.
Koesterke L.
Kubach A.
Kvilekval K.
Leebens-Mack J.
Lenards A.
Lent M.
Lowenthal D.
Lowry S.
Lu Z.
Lyons E.
Manjunath B.S.
Matasci N.
McKay S.
McLay R.
Merchant N.
Micklos D.
Mock S.
Muir A.
Myers C. R.
Narro M.
Noutsos C.
O'Meara B.
Pasternak S.
Piel W. H.
Ram S.
Sanderson M. J.
Skidmore E.
Soltis D.
Soltis P.
Spalding E. P.
Stamatakis A.
Stanzione D.
Stapleton A. E.
Stein L.
Tang C.
Tannen V.
Vaughn M.
Vision T. J.
Wang L.
Ware D.
Welch S. M.
White J. W.
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2011

The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF) funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise plant biology to address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, to promote biology and computer science research interactions, and to train the next generation of scientists on the use of cyberinfrastructure in research and education. Meeting humanity's projected demands for agricultural and forest products and the expectation that natural ecosystems be managed sustainably will require synergies from the application of information technologies. The iPlant cyberinfrastructure design is based on an unprecedented period of research community input, and leverages developments in high-performance computing, data storage, and cyberinfrastructure for the physical sciences. iPlant is an open-source project with application programming interfaces that allow the community to extend the infrastructure to meet its needs. iPlant is sponsoring community-driven workshops addressing specific scientific questions via analysis tool integration and hypothesis testing. These workshops teach researchers how to add bioinformatics tools and/or datasets into the iPlant cyberinfrastructure, enabling plant scientists to perform complex analyses on large datasets without the need to master the command line or high-performance computational services.

Cold Spring Harbor Laboratory Institutional Repository

On defining rules for cancer data fabrication

Author: A Adir
A Silvina
CE Roffman
DB Rubin
E Bilgory
E Tsang
G Caiola
H Akoglu
H-M Adorf
JP Reiter
L de Moura
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020

Funding: This research is partially funded by the Data Lab, and the EU H2020 project Serums: Securing Medical Data in Smart Patient-Centric Healthcare Systems (grant 826278). Data is essential for machine learning projects, and data accuracy is crucial for being able to trust the results obtained from the associated machine learning models. Previously, we have developed machine learning models for predicting the treatment outcome for breast cancer patients who have undergone chemotherapy, and developed a monitoring system for their treatment timeline that interactively shows the options and associated predictions. Available cancer datasets, such as the one used earlier, are often too small to obtain significant results, and make it difficult to explore ways to improve the predictive capability of the models further. In this paper, we explore an alternative to enhance our datasets through synthetic data generation. From our original dataset, we extract rules to generate fabricated data that capture the different characteristics inherent in the dataset. Additional rules can be used to capture general medical knowledge. We show how to formulate rules for our cancer treatment data, and use the IBM solver to obtain a corresponding synthetic dataset. We discuss challenges for future work. Postprint

Crossref

University of St. Andrews - Pure

St Andrews Research Repository

A model-driven framework for developing android-based classic multiplayer 2D board games

Author: H Akoglu
G Albaum
J Bézivin
M Brambilla
F Budinsky
JP Hinebaugh
C Kelly
A Kleppe
M Lachgar
R Meier
J Novak
M Núñez
ER Núñez-Valdez
RS Pressman
EM Reyno
C Rieger
A Rollings
P Schober
S Sendall
D Steinberg
FT Tschang
ERN Valdez
S Vaupel
C Wohlin
Publication venue: Springer
Publication date: 01/01/2021

Mobile applications and game development are attractive fields in software engineering. Despite the advancement of programming languages and integrated development environments, there have always been many challenges for software and mobile game developers. Model-Driven Engineering (MDE) is a software engineering methodology that applies software modeling languages for modeling the problem domain. In this paradigm, the code is to be automatically generated from the models by applying different model transformations. Besides, manipulating models instead of code facilitates the discovery and resolution of errors due to the high level of abstraction. This study presents an approach and framework, called MAndroid, that generates Android-based classic multiplayer 2D board games in a fully automated fashion, relying on the concepts of MDE. Structural and behavioral dimensions of the game are first modeled in MAndroid. Models are then automatically transformed to code that can be run on any mobile phone and tablet running Android 4.4 or higher. In order to evaluate the proposed approach, three board games are fully implemented. Additionally, applicability, developer performance, simplicity and attractiveness of MAndroid are evaluated through a set of questionnaires. MAndroid is also evaluated technically by comparing it to other Android game-development frameworks. Results demonstrate the benefits of using MAndroid. PGC2018-094905-B-I00 US-1264651 RTI2018-101204-B-C21 P18-FR-289

Crossref

Repositorio Institucional Universidad de Málaga

King's Research Portal

Improving shared decision-making about cancer treatment through design-based data-driven decision-support tools and redesigning care paths: an overview of the 4D PICTURE project

Background: Patients with cancer often have to make complex decisions about treatment, with the options varying in risk profiles and effects on survival and quality of life. Moreover, inefficient care paths make it hard for patients to participate in shared decision-making. Data-driven decision-support tools have the potential to empower patients, support personalized care, improve health outcomes and promote health equity. However, decision-support tools currently seldom consider quality of life or individual preferences, and their use in clinical practice remains limited, partly because they are not well integrated in patients' care paths. Aim and objectives: The central aim of the 4D PICTURE project is to redesign patients' care paths and develop and integrate evidence-based decision-support tools to improve decision-making processes in cancer care delivery. This article presents an overview of this international, interdisciplinary project. Design, methods and analysis: In co-creation with patients and other stakeholders, we will develop data-driven decision-support tools for patients with breast cancer, prostate cancer and melanoma. We will support treatment decisions by using large, high-quality datasets with state-of-the-art prognostic algorithms. We will further develop a conversation tool, the Metaphor Menu, using text mining combined with citizen science techniques and linguistics, incorporating large datasets of patient experiences, values and preferences. We will further develop a promising methodology, MetroMapping, to redesign care paths. We will evaluate MetroMapping and these integrated decision-support tools, and ensure their sustainability using the Nonadoption, Abandonment, Scale-Up, Spread, and Sustainability (NASSS) framework. We will explore the generalizability of MetroMapping and the decision-support tools for other types of cancer and across other EU member states. Ethics: Through an embedded ethics approach, we will address social and ethical issues. Discussion: Improved care paths integrating comprehensive decision-support tools have the potential to empower patients, their significant others and healthcare providers in decision-making and improve outcomes. This project will strengthen health care at the system level by improving its resilience and efficiency.
Plain language summary: Improving the cancer patient journey and respecting personal preferences: an overview of the 4D PICTURE project. The 4D PICTURE project aims to help cancer patients, their families and healthcare providers better understand their options. It supports their treatment and care choices, at each stage of disease, by drawing on large amounts of evidence from different types of European data. The project involves experts from many different specialist areas who are based in nine European countries. The overall aim is to improve the cancer patient journey and ensure personal preferences are respected.

EUR Research Repository

Ecological indicators to capture the effects of fishing on biodiversity and conservation status of marine ecosystems

Author: Akoglu A.G.
Banaru D.
Boldt J.L.
Borges M.F.
Bundy A.
Coll M.
Cook A.
Diallo I.
Fox C.
Fu C.
Gascuel D.
Gurney L.J.
Hattab T.
Heymans J.J.
Jouffre D.
Juan-Jordá M.J.
Kleisner K.M.
Knight B.R.
Kucukavsar S.
Large S.I.
Lynam C.
Machias A.
Marshall K.N.
Masski H.
Ojaveer H.
Piroddi C.
Shannon L. J.
Shin Y.-J.
Tam J.
Thiao D.
Thiaw M.
Torres M.A.
Travers-Trolet M.
Tsagarakis K.
Tuck I.
van der Meeren G.I.
Yemane D.
Zador S.G.
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2016

IndiSeas (“Indicators for the Seas”) is a collaborative international working group that was established in 2005 to evaluate the status of exploited marine ecosystems using a suite of indicators in a comparative framework. An initial shortlist of seven ecological indicators was selected to quantify the effects of fishing on the broader ecosystem using several criteria (i.e., ecological meaning, sensitivity to fishing, data availability, management objectives and public awareness). The suite comprised: (i) the inverse coefficient of variation of total biomass of surveyed species, (ii) mean fish length in the surveyed community, (iii) mean maximum life span of surveyed fish species, (iv) proportion of predatory fish in the surveyed community, (v) proportion of under- and moderately exploited stocks, (vi) total biomass of surveyed species, and (vii) mean trophic level of the landed catch. In line with the Nagoya Strategic Plan of the Convention on Biological Diversity (2011–2020), we extended this suite to emphasize the broader biodiversity and conservation risks in exploited marine ecosystems. We selected a subset of indicators from a list of empirically based candidate biodiversity indicators initially established based on ecological significance to complement the original IndiSeas indicators. The additional selected indicators were: (viii) mean intrinsic vulnerability index of the fish landed catch, (ix) proportion of non-declining exploited species in the surveyed community, (x) catch-based marine trophic index, and (xi) mean trophic level of the surveyed community. Despite the lack of data in some ecosystems, we also selected (xii) mean trophic level of the modelled community, and (xiii) proportion of discards in the fishery as extra indicators.
These additional indicators were examined, along with the initial set of IndiSeas ecological indicators, to evaluate whether adding new biodiversity indicators provided useful additional information to refine our understanding of the status evaluation of 29 exploited marine ecosystems. We used state and trend analyses, and we performed correlation, redundancy and multivariate tests. Existing developments in ecosystem-based fisheries management have largely focused on exploited species. Our study, using mostly fisheries-independent survey-based indicators, highlights that biodiversity and conservation-based indicators are complementary to ecological indicators of fishing pressure. Thus, they should be used to provide additional information to evaluate the overall impact of fishing on exploited marine ecosystems.

Effectiveness of septoplasty versus non-surgical management for nasal obstruction due to a deviated nasal septum in adults: study protocol for a randomized controlled trial

Author: AY Korkut
C Hopkins
C. T. M. Hendriks
D Fayter
D Manestar
DA Nunez
DG Roblin
DR Taves
E Akoglu
EH van den Akker
G Berger
J Sipila
JF Piccirillo
JR Buckland
K Robinson
LF Grymer
M Jessen
M Moore
M Zwarenstein
M. M. H. T. van Egmond
M. M. Rovers
MG Stewart
ML Hytonen
MM Elahi
N. van Heerbeek
OK Kahveci
P Illum
PD Manoukian
S Joniau
SJ Pocock
TF Bezerra
Publication venue: 'Springer Science and Business Media LLC'

Crossref