De-anonymizing Social Networks
Operators of online social networks are increasingly sharing potentially
sensitive information about users and their relationships with advertisers,
application developers, and data-mining researchers. Privacy is typically
protected by anonymization, i.e., removing names, addresses, etc.
We present a framework for analyzing privacy and anonymity in social networks
and develop a new re-identification algorithm targeting anonymized
social-network graphs. To demonstrate its effectiveness on real-world networks,
we show that a third of the users who can be verified to have accounts on both
Twitter, a popular microblogging service, and Flickr, an online photo-sharing
site, can be re-identified in the anonymous Twitter graph with only a 12% error
rate.
Our de-anonymization algorithm is based purely on the network topology, does
not require creation of a large number of dummy "sybil" nodes, is robust to
noise and all existing defenses, and works even when the overlap between the
target network and the adversary's auxiliary information is small.
Comment: Published in the 30th IEEE Symposium on Security and Privacy, 2009.
The definitive version is available at:
http://www.cs.utexas.edu/~shmat/shmat_oak09.pdf
Frequently Asked Questions are answered at:
http://www.cs.utexas.edu/~shmat/socialnetworks-faq.htm
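As a rough illustration of what "based purely on the network topology" can mean, the sketch below extends a small seed mapping between an anonymized graph and an auxiliary graph by letting already-mapped neighbours vote for candidates. It is a deliberately simplified toy, not the algorithm from the paper: the function name `propagate`, the tie-rejection rule, and the example graphs are all assumptions made for illustration.

```python
from collections import Counter

def propagate(anon, aux, seeds):
    """Greedily extend a seed mapping (anon node -> aux node) using edges only."""
    mapping = dict(seeds)
    used = set(mapping.values())
    changed = True
    while changed:
        changed = False
        for u in sorted(anon):
            if u in mapping:
                continue
            # Each mapped neighbour of u votes for the unused aux nodes
            # adjacent to its own image in the auxiliary graph.
            votes = Counter()
            for n in anon[u]:
                if n in mapping:
                    for cand in aux[mapping[n]]:
                        if cand not in used:
                            votes[cand] += 1
            ranked = votes.most_common(2)
            # Accept only an unambiguous winner; ties are left unmapped.
            if ranked and (len(ranked) == 1 or ranked[0][1] > ranked[1][1]):
                mapping[u] = ranked[0][0]
                used.add(ranked[0][0])
                changed = True
    return mapping

# Hypothetical toy graphs with identical structure (adjacency lists).
anon = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
aux = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
       "d": ["c", "e"], "e": ["d"]}
result = propagate(anon, aux, {1: "a", 2: "b"})
print(result)
```

Rejecting ties trades coverage for accuracy: a node is left unmapped rather than guessed, which is one simple way to keep the error rate low on noisy graphs.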
A Critical Look at Decentralized Personal Data Architectures
While the Internet was conceived as a decentralized network, the most widely
used web applications today tend toward centralization. Control increasingly
rests with centralized service providers who, as a consequence, have also
amassed unprecedented amounts of data about the behaviors and personalities of
individuals.
Developers, regulators, and consumer advocates have looked to alternative
decentralized architectures as the natural response to threats posed by these
centralized services. The result has been a great variety of solutions that
include personal data stores (PDS), infomediaries, Vendor Relationship
Management (VRM) systems, and federated and distributed social networks. And
yet, for all these efforts, decentralized personal data architectures have seen
little adoption.
This position paper attempts to account for these failures, challenging the
accepted wisdom in the web community on the feasibility and desirability of
these approaches. We start with a historical discussion of the development of
various categories of decentralized personal data architectures. Then we survey
the main ideas to illustrate the common themes among these efforts. We tease
apart the design characteristics of these systems from the social values that
they (are intended to) promote. We use this understanding to point out numerous
drawbacks of the decentralization paradigm, some inherent and others
incidental. We end with recommendations for designers of these systems for
working towards goals that are achievable, but perhaps more limited in scope
and ambition.
Link Prediction by De-anonymization: How We Won the Kaggle Social Network Challenge
This paper describes the winning entry to the IJCNN 2011 Social Network
Challenge run by Kaggle.com. The goal of the contest was to promote research on
real-world link prediction, and the dataset was a graph obtained by crawling
the popular Flickr social photo sharing website, with user identities scrubbed.
By de-anonymizing much of the competition test set using our own Flickr crawl,
we were able to effectively game the competition. Our attack represents a new
application of de-anonymization to gaming machine learning contests, suggesting
changes in how future competitions should be run.
We introduce a new simulated annealing-based weighted graph matching
algorithm for the seeding step of de-anonymization. We also show how to combine
de-anonymization with link prediction---the latter is required to achieve good
performance on the portion of the test set not de-anonymized---for example by
training the predictor on the de-anonymized portion of the test set, and
combining probabilistic predictions from de-anonymization and link prediction.
Comment: 11 pages, 13 figures; submitted to IJCNN'201
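The seeding step can be pictured as matching a few nodes of one weighted graph to nodes of another so that corresponding edge weights agree. The following is a minimal sketch of simulated annealing over permutations under that reading. The cost function (sum of absolute weight differences), the swap move, the cooling schedule, and all names and parameter values are assumptions for illustration and are not taken from the paper.

```python
import math
import random

def mismatch(w_a, w_b, perm):
    """Total edge-weight disagreement when node i of A maps to perm[i] of B."""
    n = len(perm)
    return sum(abs(w_a[i][j] - w_b[perm[i]][perm[j]])
               for i in range(n) for j in range(i + 1, n))

def anneal_match(w_a, w_b, steps=20000, t0=1.0, cooling=0.9995, seed=0):
    """Search for a low-mismatch permutation by simulated annealing."""
    rng = random.Random(seed)
    n = len(w_a)
    perm = list(range(n))
    rng.shuffle(perm)
    cost = mismatch(w_a, w_b, perm)
    best, best_cost = perm[:], cost
    t = t0
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]          # propose a swap
        new_cost = mismatch(w_a, w_b, perm)
        # Always accept improvements; accept worse moves with
        # probability exp(-increase / temperature).
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]      # undo the swap
        t *= cooling
    return best, best_cost

# Hypothetical example: B is A with its nodes relabelled by `hidden`.
w_a = [[0.0, 0.9, 0.1, 0.4, 0.7],
       [0.9, 0.0, 0.5, 0.2, 0.3],
       [0.1, 0.5, 0.0, 0.8, 0.6],
       [0.4, 0.2, 0.8, 0.0, 0.05],
       [0.7, 0.3, 0.6, 0.05, 0.0]]
hidden = [2, 0, 4, 1, 3]
w_b = [[0.0] * 5 for _ in range(5)]
for i in range(5):
    for j in range(5):
        w_b[hidden[i]][hidden[j]] = w_a[i][j]

best, best_cost = anneal_match(w_a, w_b)
print(best, best_cost)
```

A mismatch of zero means the recovered permutation reproduces every edge weight; annealing is a heuristic, so the search is not guaranteed to reach that optimum on every run or input.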
