The evolution of interdisciplinarity in physics research
Science, being a social enterprise, is subject to fragmentation into groups
that focus on specialized areas or topics. Often new advances occur through
cross-fertilization of ideas between sub-fields that otherwise have little
overlap as they study dissimilar phenomena using different techniques. Thus to
explore the nature and dynamics of scientific progress one needs to consider
the large-scale organization and interactions between different subject areas.
Here, we study the relationships between the sub-fields of Physics using the
Physics and Astronomy Classification Scheme (PACS) codes employed for
self-categorization of articles published over the past 25 years (1985-2009).
We observe a clear trend towards increasing interactions between the different
sub-fields. The network of sub-fields also exhibits core-periphery
organization, the nucleus being dominated by Condensed Matter and General
Physics. However, over time Interdisciplinary Physics is steadily increasing
its share in the network core, reflecting a shift in the overall trend of
Physics research.
Comment: Published version, 10 pages, 8 figures + Supplementary Information
Navigability is a Robust Property
The Small World phenomenon has inspired researchers across a number of
fields. A breakthrough in its understanding was made by Kleinberg who
introduced Rank Based Augmentation (RBA): add to each vertex independently an
arc to a random destination selected from a carefully crafted probability
distribution. Kleinberg proved that RBA makes many networks navigable, i.e., it
allows greedy routing to successfully deliver messages between any two vertices
in a polylogarithmic number of steps. We prove that navigability is an inherent
property of many random networks, arising without coordination, or even
independence assumptions.
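Greedy routing can be illustrated with a minimal sketch. The code below does not reproduce the construction studied in the paper: it assumes the classic one-dimensional setting, a ring of n nodes where each node gets one extra arc to a target drawn with probability proportional to 1/distance. The function names and the ring topology are illustrative assumptions.

```python
import random


def ring_distance(u, v, n):
    """Shortest distance between nodes u and v on a ring of n nodes."""
    d = abs(u - v)
    return min(d, n - d)


def build_augmented_ring(n, rng):
    """Give every node one long-range arc, chosen with probability ~ 1/distance.

    Assumption: this is the textbook distance-based augmentation on a ring,
    used here only to illustrate the idea of augmenting a network.
    """
    long_link = []
    for u in range(n):
        others = [v for v in range(n) if v != u]
        weights = [1.0 / ring_distance(u, v, n) for v in others]
        long_link.append(rng.choices(others, weights=weights, k=1)[0])
    return long_link


def greedy_route(source, target, n, long_link):
    """Forward to the neighbour closest to the target; return the hop count."""
    current, hops = source, 0
    while current != target:
        candidates = [(current - 1) % n, (current + 1) % n, long_link[current]]
        # A ring neighbour always gets strictly closer, so the loop terminates.
        current = min(candidates, key=lambda v: ring_distance(v, target, n))
        hops += 1
    return hops


if __name__ == "__main__":
    rng = random.Random(0)
    n = 256
    links = build_augmented_ring(n, rng)
    pairs = [(rng.randrange(n), rng.randrange(n)) for _ in range(200)]
    mean_hops = sum(greedy_route(s, t, n, links) for s, t in pairs) / len(pairs)
    print(f"mean greedy hops on n={n}: {mean_hops:.2f}")
```

Because a ring neighbour always reduces the distance to the target by one, greedy routing here never needs more hops than the plain ring distance; the long-range arcs can only shorten the route.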
Fast matrix computations for pair-wise and column-wise commute times and Katz scores
We first explore methods for approximating the commute time and Katz score
between a pair of nodes. These methods are based on the approach of matrices,
moments, and quadrature developed in the numerical linear algebra community.
They rely on the Lanczos process and provide upper and lower bounds on an
estimate of the pair-wise scores. We also explore methods to approximate the
commute times and Katz scores from a node to all other nodes in the graph.
Here, our approach for the commute times is based on a variation of the
conjugate gradient algorithm, and it provides an estimate of all the diagonals
of the inverse of a matrix. Our technique for the Katz scores is based on
exploiting an empirical localization property of the Katz matrix. We adapt
algorithms used for personalized PageRank computing to these Katz scores and
theoretically show that this approach is convergent. We evaluate these methods
on 17 real-world graphs ranging in size from 1,000 to 1,000,000 nodes. Our
results show that our pair-wise commute time method and column-wise Katz
algorithm both have attractive theoretical properties and empirical
performance.
Comment: 35 pages, journal version of
http://dx.doi.org/10.1007/978-3-642-18009-5_13 which has been submitted for
publication. Please see
http://www.cs.purdue.edu/homes/dgleich/publications/2011/codes/fast-katz/ for
supplemental code
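For orientation, the column-wise Katz scores can be computed on a small graph directly from their definition. The sketch below is not the Lanczos-based or localized algorithm described in the abstract: it assumes the standard definition K = sum over l >= 1 of (alpha A)^l, sums that series by fixed-point iteration, and uses a hypothetical function name and adjacency-list input.

```python
def katz_column(adj, source, alpha, tol=1e-12, max_iter=10_000):
    """Approximate the Katz scores between `source` and every node.

    adj[u] lists the neighbours of u in an undirected graph. The iteration
    x <- alpha * A x + e_source sums the series sum_{l>=0} (alpha A)^l e_source,
    which converges only when alpha < 1 / spectral_radius(A).
    """
    n = len(adj)
    x = [0.0] * n
    x[source] = 1.0
    for _ in range(max_iter):
        new = [alpha * sum(x[v] for v in adj[u]) for u in range(n)]
        new[source] += 1.0
        converged = max(abs(a - b) for a, b in zip(new, x)) < tol
        x = new
        if converged:
            break
    x[source] -= 1.0  # remove the zero-length walk so the sum starts at l = 1
    return x


if __name__ == "__main__":
    path = [[1], [0, 2], [1]]  # path graph 0 - 1 - 2
    print(katz_column(path, source=0, alpha=0.1))
```

Each iteration costs one pass over the edges, so this brute-force version touches the whole graph repeatedly; the methods in the abstract aim to avoid exactly that on large graphs.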
Risk-Averse Matchings over Uncertain Graph Databases
A large number of applications, such as querying sensor networks and
analyzing protein-protein interaction (PPI) networks, rely on mining uncertain
graph and hypergraph databases. In this work we study the following problem:
given an uncertain, weighted (hyper)graph, how can we efficiently find a
(hyper)matching with high expected reward, and low risk?
This problem naturally arises in the context of several important
applications, such as online dating, kidney exchanges, and team formation. We
introduce a novel formulation for finding matchings with maximum expected
reward and bounded risk under a general model of uncertain weighted
(hyper)graphs that we introduce in this work. Our model generalizes
probabilistic models used in prior work, and captures both continuous and
discrete probability distributions, thus allowing it to handle privacy-related
applications that inject appropriately distributed noise to (hyper)edge
weights. Given that our optimization problem is NP-hard, we turn our attention
to designing efficient approximation algorithms. For the case of uncertain
weighted graphs, we provide two approximation algorithms, one of which has
near-optimal run time. For the case of uncertain weighted hypergraphs, we
provide an approximation algorithm whose guarantee depends on the rank of the
hypergraph (i.e., the maximum number of nodes in any hyperedge) and that runs in
almost (modulo log factors) linear time.
We complement our theoretical results by testing our approximation algorithms
on a wide variety of synthetic experiments, where we observe in a controlled
setting interesting findings on the trade-off between reward and risk. We also
apply our formulation to recommending
teams that are likely to collaborate and have high impact.
Comment: 25 pages
Theories for influencer identification in complex networks
In social and biological systems, the structural heterogeneity of interaction
networks gives rise to the emergence of a small set of influential nodes, or
influencers, in a series of dynamical processes. Although much smaller than the
entire network, these influencers were observed to be able to shape the
collective dynamics of large populations in different contexts. As such, the
successful identification of influencers should have profound implications in
various real-world spreading dynamics such as viral marketing, epidemic
outbreaks and cascading failure. In this chapter, we first summarize the
centrality-based approach in finding single influencers in complex networks,
and then discuss the more complicated problem of locating multiple influencers
from a collective point of view. Progress rooted in collective influence
theory, belief-propagation and computer science will be presented. Finally, we
present some applications of influencer identification in diverse real-world
systems, including online social platforms, scientific publication, brain
networks and socioeconomic systems.
Comment: 24 pages, 6 figures
World citation and collaboration networks: uncovering the role of geography in science
Modern information and communication technologies, especially the Internet,
have diminished the role of spatial distances and territorial boundaries on the
access and transmissibility of information. This has enabled closer
collaboration among scientists and greater internationalization. Nevertheless, geography remains
an important factor affecting the dynamics of science. Here we present a
systematic analysis of citation and collaboration networks between cities and
countries, by assigning papers to the geographic locations of their authors'
affiliations. The citation flows as well as the collaboration strengths between
cities decrease with the distance between them and follow gravity laws. In
addition, the total research impact of a country grows linearly with the amount
of national funding for research & development. However, the average impact
reveals a peculiar threshold effect: the scientific output of a country may
reach an impact larger than the world average only if the country invests more
than about 100,000 USD per researcher annually.
Comment: Published version. 9 pages, 5 figures + Appendix. The world citation
and collaboration networks at both city and country level are available at
http://becs.aalto.fi/~rajkp/datasets.htm
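The abstract does not state the functional form of the gravity laws it reports. A common generic form, given here only as an assumption, relates the flow F_ij between cities i and j to their sizes m_i and m_j and the distance d_ij between them, with an exponent alpha to be fitted from data (all symbols are illustrative):

```latex
F_{ij} \;\propto\; \frac{m_i \, m_j}{d_{ij}^{\alpha}}, \qquad \alpha > 0
```

Under this form, doubling the distance between two cities of fixed size reduces the expected flow by a factor of 2^alpha.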
Individualization as driving force of clustering phenomena in humans
One of the most intriguing dynamics in biological systems is the emergence of
clustering, the self-organization into separated agglomerations of individuals.
Several theories have been developed to explain clustering in, for instance,
multi-cellular organisms, ant colonies, bee hives, flocks of birds, schools of
fish, and animal herds. A persistent puzzle, however, is clustering of opinions
in human populations. The puzzle is particularly pressing if opinions vary
continuously, such as the degree to which citizens are in favor of or against a
vaccination program. Existing opinion formation models suggest that
"monoculture" is unavoidable in the long run, unless subsets of the population
are perfectly separated from each other. Yet, social diversity is a robust
empirical phenomenon, although perfect separation is hardly possible in an
increasingly connected world. So far, accounting for randomness has not overcome
these theoretical shortcomings. Small perturbations of individual opinions
trigger social influence cascades that inevitably lead to monoculture, while
larger noise disrupts opinion clusters and results in rampant individualism
without any social structure. Our solution to the puzzle builds on recent
empirical research, combining the integrative tendencies of social influence
with the disintegrative effects of individualization. A key element of the new
computational model is an adaptive kind of noise. We conduct simulation
experiments to demonstrate that with this kind of noise, a third phase besides
individualism and monoculture becomes possible, characterized by the formation
of metastable clusters with diversity between and consensus within clusters.
When clusters are small, individualization tendencies are too weak to prohibit
a fusion of clusters. When clusters grow too large, however, individualization
increases in strength, which promotes their splitting.
Comment: 12 pages, 4 figures
Computational fact checking from knowledge networks
Traditional fact checking by expert journalists cannot keep up with the
enormous volume of information that is now generated online. Computational fact
checking may significantly enhance our ability to evaluate the veracity of
dubious information. Here we show that the complexities of human fact checking
can be approximated quite well by finding the shortest path between concept
nodes under properly defined semantic proximity metrics on knowledge graphs.
Framed as a network problem this approach is feasible with efficient
computational techniques. We evaluate this approach by examining tens of
thousands of claims related to history, entertainment, geography, and
biographical information using a public knowledge graph extracted from
Wikipedia. Statements independently known to be true consistently receive
higher support via our method than do false ones. These findings represent a
significant step toward scalable computational fact-checking methods that may
one day mitigate the spread of harmful misinformation.
Geographic constraints on social network groups
Social groups are fundamental building blocks of human societies. While our
social interactions have always been constrained by geography, it has been
impossible, due to practical difficulties, to evaluate the nature of this
restriction on social group structure. We construct a social network of
individuals whose most frequent geographical locations are also known. We also
classify the individuals into groups according to a community detection
algorithm. We study the variation of geographical span for social groups of
varying sizes, and explore the relationship between topological positions and
geographic positions of their members. We find that small social groups are
geographically very tight, but become much more clumped when the group size
exceeds about 30 members. Also, we find no correlation between the topological
positions and geographic positions of individuals within network communities.
These results suggest that spreading processes face distinct structural and
spatial constraints.
Comment: 10 pages, 5 figures
Temporal rainfall trend analysis in different agro-ecological regions of southern Africa
Rainfall is a major driver of food production in rainfed smallholder farming systems. This study was conducted to assess linear trends in (i) different daily rainfall amounts (<5, 5–10, 11–20, 21–40 and >40 mm∙day⁻¹), and (ii) monthly and seasonal rainfall amounts. Drought was determined using the rainfall variability index. Daily rainfall data were derived from 18 meteorological stations in southern Africa. Daily rainfall was dominated by <5 mm∙day⁻¹ events, followed by 5–10 mm∙day⁻¹ events. Three locations experienced increasing linear trends in <5 mm∙day⁻¹ amounts, and two others in the sub-humid region had increases in the >40 mm∙day⁻¹ category. Under semi-arid conditions, increasing trends occurred in <5 and 5–10 mm∙day⁻¹ events. A significant linear trend in seasonal rainfall occurred at two locations, both with decreasing rainfall (1.24 and 3 mm∙season⁻¹). The 3 mm∙season⁻¹ decrease in seasonal rainfall was experienced under semi-arid conditions. There were no apparent linear trends in monthly and seasonal rainfall at 15 of the 18 locations studied. Drought frequencies varied with location and were 50% or higher during the November–March growing season. Rainfall trends were location and agro-ecology specific, but most of the locations studied did not experience significant changes between the 1900s and 2000s.
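A linear trend of the kind reported above can be estimated as the slope of a least-squares line through seasonal totals. The abstract does not say which estimator the study used, so the sketch below assumes ordinary least squares; the function name and the sample values are synthetic and purely illustrative.

```python
def linear_trend(years, totals):
    """Ordinary least-squares slope of rainfall totals against year.

    Returns the change in rainfall per year (e.g. mm per season per year).
    Assumes at least two distinct years.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(totals) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, totals))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic seasonal totals (mm), not data from the study.
    years = [2000, 2001, 2002, 2003, 2004]
    totals = [512.0, 498.5, 505.0, 490.0, 494.5]
    print(f"trend: {linear_trend(years, totals):.2f} mm per season per year")
```

A negative slope indicates decreasing seasonal rainfall; whether a slope counts as a significant trend requires a separate statistical test, which this sketch does not perform.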
