
1,217 research outputs found

Comparing the hierarchy of author given tags and repository given tags in a large document archive

Author: Palla Gergely
Pollner Péter
Tibély Gergely
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/06/2015

Folksonomies - large databases arising from collaborative tagging of items by independent users - are becoming an increasingly important way of categorizing information. In these systems users can tag items with free words, resulting in a tripartite item-tag-user network. Although there are no prescribed relations between tags, the way users think about the different categories presumably has some built-in hierarchy, in which more specific concepts are descendants of more general categories. Several applications would benefit from the knowledge of this hierarchy. Here we apply a recent method to check the differences and similarities of hierarchies resulting from tags given by independent individuals and from tags given by a centrally managed repository system. The results from our method showed substantial differences between the lower parts of the hierarchies and, in contrast, a relatively high similarity at the top of the hierarchies.

Comment: 10 pages

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Parallel clustering with CFinder

Author: Palla Gergely
Pollner Peter
Vicsek Tamas
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/03/2012

The amount of available data about complex systems is increasing every year, as measurements of larger and larger systems are collected and recorded. A natural representation of such data is given by networks, whose size follows the size of the original system. The current trend of multiple cores in computing infrastructures calls for a parallel reimplementation of earlier methods. Here we present the grid version of CFinder, which can locate overlapping communities in directed, weighted or undirected networks based on the clique percolation method (CPM). We show that the computation of the communities can be distributed among several CPUs or computers. Although switching to the parallel version does not necessarily lead to a gain in computing time, it definitely makes the community structure of extremely large networks accessible.

Comment: Electronic version of an article published as http://www.worldscinet.com/ppl/22/2201/S0129626412400014.html copyright World Scientific Publishing Company
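As an illustration of the clique percolation idea mentioned in the abstract above, here is a minimal serial sketch in Python. It is not CFinder's implementation and is not parallel: it assumes an undirected, unweighted graph given as an edge list, enumerates k-cliques by brute force (suitable only for tiny graphs), and merges k-cliques that share k-1 nodes. The function name `k_clique_communities` and the example graph are hypothetical.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Brute-force clique percolation on a small undirected graph.

    Two k-cliques are adjacent if they share k-1 nodes; a community is the
    union of all k-cliques reachable from one another through adjacency.
    """
    # Build an adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate every k-subset of nodes and keep those that are cliques.
    nodes = sorted(adj)
    cliques = [
        frozenset(c)
        for c in combinations(nodes, k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: merge cliques that share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each community is the union of the nodes of its merged cliques.
    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return sorted((frozenset(c) for c in communities.values()), key=sorted)


# Hypothetical example: triangles {1,2,3} and {2,3,4} share two nodes and
# merge; triangle {4,5,6} shares only node 4, so it forms its own community.
example_edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
result = k_clique_communities(example_edges, 3)
print(result)
```

In this example node 4 belongs to both communities, which is the kind of overlap the method is designed to expose.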

arXiv.org e-Print Archive

Crossref

Designing the payout phase of funded pension pillars in central and eastern European countries

Author: Pollner John
Rudolph Heinz
Vittas Dimitri

Over the past decade or so, most Central and Eastern European countries have reformed their pension systems, significantly downsizing their public pillars and creating private pillars based on capitalization accounts. Early policy attention was focused on the accumulation phase but several countries are now reaching the stage where they need to address the design of the payout phase. This paper reviews the complex policy issues that will confront policymakers in this effort and summarizes recent plans and developments in four countries (Poland, Hungary, Estonia, and Lithuania). The paper concludes by highlighting a number of options that merit detailed consideration.

Debt Markets, Pensions & Retirement Systems, Financial Literacy, Insurance & Risk Mitigation, Investment and Investment Climate

Research Papers in Economics

Extracting tag hierarchies

Author: Palla Gergely
Pollner Péter
Tibély Gergely
Vicsek Tamás
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Tagging items with descriptive annotations or keywords is a very natural way to compress and highlight information about the properties of the given entity. Over the years several methods have been proposed for extracting a hierarchy between the tags for systems with a "flat", egalitarian organization of the tags, which is very common when the tags correspond to free words given by numerous independent people. Here we present a complete framework for automated tag hierarchy extraction based on tag occurrence statistics. Along with proposing new algorithms, we also introduce different quality measures enabling the detailed comparison of competing approaches from different aspects. Furthermore, we set up a synthetic, computer-generated benchmark providing a versatile tool for testing, with a couple of tunable parameters capable of generating a wide range of test beds. Besides the computer-generated input we also use real data in our studies, including a biological example with a pre-defined hierarchy between the tags. The encouraging similarity between the pre-defined and reconstructed hierarchy, as well as the seemingly meaningful hierarchies obtained for other real systems, indicate that tag hierarchy extraction is a very promising direction for further research with great potential for practical applications.

Comment: 25 pages with 21 pages of supporting information, 25 figures

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

PubMed Central

Repository of the Academy's Library

ELTE Digital Institutional Repository (EDIT)

The Francis Crick Institute

New Query Lower Bounds for Submodular Function Minimization

Author: Graur Andrei
Pollner Tristan
Ramaswamy Vidhya
Weinberg S. Matthew
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 11th Innovations in Theoretical Computer Science Conference (ITCS 2020)
Publication date: 15/11/2019

We consider submodular function minimization in the oracle model: given black-box access to a submodular set function $f:2^{[n]}\rightarrow \mathbb{R}$, find an element of $\arg\min_S \{f(S)\}$ using as few queries to $f(\cdot)$ as possible. State-of-the-art algorithms succeed with $\tilde{O}(n^2)$ queries [LeeSW15], yet the best-known lower bound has never been improved beyond $n$ [Harvey08]. We provide a query lower bound of $2n$ for submodular function minimization, a $3n/2-2$ query lower bound for the non-trivial minimizer of a symmetric submodular function, and a $\binom{n}{2}$ query lower bound for the non-trivial minimizer of an asymmetric submodular function. Our $3n/2-2$ lower bound results from a connection between SFM lower bounds and a novel concept we term the cut dimension of a graph. Interestingly, this yields a $3n/2-2$ cut-query lower bound for finding the global mincut in an undirected, weighted graph, but we also prove it cannot yield a lower bound better than $n+1$ for $s$-$t$ mincut, even in a directed, weighted graph.
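To make the oracle model in the abstract above concrete, the following Python sketch wraps a set function in a query counter and finds a non-trivial minimizer by exhaustive search. It is only an illustration under stated assumptions: the graph, its weights, and all names (`CountingOracle`, `make_cut_function`, `brute_force_nontrivial_min`) are hypothetical, and the exhaustive search uses exponentially many queries, so it is not one of the algorithms or bounds discussed in the paper. The cut function of an undirected weighted graph is used because it is a standard example of a symmetric submodular function.

```python
from itertools import combinations


class CountingOracle:
    """Black-box access to a set function, counting every query."""

    def __init__(self, f):
        self._f = f
        self.queries = 0

    def __call__(self, subset):
        self.queries += 1
        return self._f(frozenset(subset))


def make_cut_function(weights):
    """Return f(S) = total weight of edges with exactly one endpoint in S."""
    def f(S):
        return sum(w for (u, v), w in weights.items() if (u in S) != (v in S))
    return f


def is_submodular(f, ground):
    """Exhaustively check f(A) + f(B) >= f(A | B) + f(A & B) for all A, B."""
    subsets = [
        frozenset(c)
        for r in range(len(ground) + 1)
        for c in combinations(ground, r)
    ]
    return all(
        f(a) + f(b) >= f(a | b) + f(a & b) for a in subsets for b in subsets
    )


def brute_force_nontrivial_min(oracle, ground):
    """Query every subset except the empty set and the full ground set."""
    best_set, best_val = None, float("inf")
    for r in range(1, len(ground)):
        for c in combinations(ground, r):
            val = oracle(c)
            if val < best_val:
                best_set, best_val = frozenset(c), val
    return best_set, best_val


# Hypothetical 4-node undirected weighted graph.
ground = [0, 1, 2, 3]
weights = {(0, 1): 3, (1, 2): 1, (2, 3): 3, (0, 3): 1, (0, 2): 1}
cut = make_cut_function(weights)

oracle = CountingOracle(cut)
best_set, best_val = brute_force_nontrivial_min(oracle, ground)
print(sorted(best_set), best_val, oracle.queries)
```

Here the exhaustive search spends 2^n - 2 queries for n = 4; the bounds quoted in the abstract concern how few queries any algorithm can get away with.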

arXiv.org e-Print Archive

Princeton University Open Access Repository

DROPS Dagstuhl Research Online Publication Server

Scientometrics: Untangling the topics

Author: Farkas Illes J.
Pollner Peter
Szanto-Varnagy Adam
Vicsek Tamas
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2014

Measuring science is based on comparing articles to similar others. However, keyword-based groups of thematically similar articles are predominantly small. These small sizes keep the statistical errors of comparisons high. With the growing availability of bibliographic data, such statistical errors can be reduced by merging methods of thematic grouping, citation networks and keyword co-usage.

Comment: 2 pages, 2 figures

arXiv.org e-Print Archive

ELTE Digital Institutional Repository (EDIT)