The privacy bounds of human mobility

We used 15 months of data from 1.5 million people to show that 4 points--approximate places and times--are enough to identify 95% of individuals in a mobility database. Our work shows that human behavior places fundamental natural constraints on the privacy of individuals, and that these constraints hold even when the resolution of the dataset is low: even coarse datasets provide little anonymity. We further developed a formula to estimate the uniqueness of human mobility traces. These findings have important implications for the design of frameworks and institutions dedicated to protecting the privacy of individuals.
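To make the notion of uniqueness concrete, here is a minimal Python sketch of how it could be estimated on a toy dataset: draw p points from one person's trace and count how many traces in the database contain all of them. The function name, the data layout (a dict of sets of (place, time) pairs), and the sampling scheme are assumptions made for illustration, not the procedure or formula used in the study.

```python
import random


def uniqueness(traces, p, trials=100, seed=0):
    """Estimate how often p known points single out exactly one trace.

    traces: dict mapping a user id to a set of (place, time) points.
    All names and the sampling scheme are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Only users with at least p points can be sampled from.
    users = sorted(u for u, pts in traces.items() if len(pts) >= p)
    if not users:
        raise ValueError("no trace has at least p points")
    unique = 0
    for _ in range(trials):
        user = rng.choice(users)
        # The p points an observer is assumed to know about this user.
        known = set(rng.sample(sorted(traces[user]), p))
        # Count every trace that is compatible with those points.
        matches = sum(1 for pts in traces.values() if known <= pts)
        if matches == 1:
            unique += 1
    return unique / trials


# Toy data: users "a" and "b" share two of their three points.
toy_traces = {
    "a": {("cell_1", 8), ("cell_2", 9), ("cell_3", 18)},
    "b": {("cell_1", 8), ("cell_2", 9), ("cell_4", 18)},
    "c": {("cell_5", 8), ("cell_6", 12), ("cell_7", 18)},
}
```

On this toy data, knowing all three points of a trace always identifies its owner, while a single point may be shared by two people.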

In collaboration with César Hidalgo, Vincent Blondel, and Michel Verleysen

openPDS/SafeAnswers: Protecting the Privacy of Metadata

In a world where sensors, data storage, and processing power are too cheap to meter, how do you ensure that users can realize the full value of their data while protecting their privacy? openPDS is a field-tested personal metadata management framework which allows individuals to collect and store their metadata and to give third parties fine-grained access to it. SafeAnswers is a new and practical way of protecting the privacy of metadata at the individual level. SafeAnswers turns a hard anonymization problem into a more tractable security one. It allows services to ask questions whose answers are calculated against the metadata instead of trying to anonymize individuals' metadata. Together, openPDS and SafeAnswers provide a new way of dynamically protecting personal metadata.
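The question-and-answer idea can be sketched in a few lines of Python. The class, method names, and record layout below are hypothetical and are not the openPDS API; the sketch only illustrates the principle that a service receives a computed answer while the raw metadata stays inside the store.

```python
class PersonalDataStore:
    """Hypothetical stand-in for a personal data store (not the openPDS API)."""

    def __init__(self, call_records):
        # Raw metadata is kept private to the store.
        self._call_records = list(call_records)
        self._questions = {}

    def register_question(self, name, func):
        """Approve a question that services are allowed to ask."""
        self._questions[name] = func

    def answer(self, name):
        """Run an approved question against the metadata; return only the answer."""
        if name not in self._questions:
            raise PermissionError(f"question not approved: {name}")
        return self._questions[name](self._call_records)


def is_night_owl(records):
    """True if more than half of the calls happen between 22:00 and 06:00."""
    if not records:
        return False
    night = sum(1 for r in records if r["hour"] >= 22 or r["hour"] < 6)
    return night / len(records) > 0.5


store = PersonalDataStore([{"hour": 23}, {"hour": 1}, {"hour": 14}])
store.register_question("is_night_owl", is_night_owl)
result = store.answer("is_night_owl")  # the service sees a boolean, not the records
```

In this sketch the service never handles individual call records, so the problem becomes one of securing the store and vetting the questions rather than anonymizing the data.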

In collaboration with Samuel Wang, Erez Shmueli, Sandy Pentland, and the Harvard Berkman Center

What Can Your Phone Metadata Tell About You?

How much can others learn about your personality just by looking at the way you use your phone? We provide the first evidence that personality types (for example, neuroticism, extraversion, openness) can be predicted from standard mobile phone metadata. We have developed a set of novel psychology-informed indicators that can be computed from any set of mobile phone metadata. These fall into five categories and include the time it takes you to answer a text, the entropy of your contacts, your daily distance traveled, and the percentage of text conversations you started. Using these 36 indicators, we were able to predict people's personalities correctly up to 63% of the time, 1.7 times better than random, using only metadata.
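As an illustration of what such indicators might look like in code, the Python sketch below computes two of them from a list of records: the entropy of contacts and the percentage of texts sent by the user. The record fields ("contact", "type", "direction") are assumed for the example, and counting outgoing texts is a simplification of "conversations you started"; neither is taken from the study's actual definitions.

```python
import math
from collections import Counter


def contact_entropy(records):
    """Shannon entropy (in bits) of how interactions are spread over contacts."""
    counts = Counter(r["contact"] for r in records)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def percent_texts_initiated(records):
    """Percentage of text records sent by the user (assumed proxy for initiation)."""
    texts = [r for r in records if r["type"] == "text"]
    if not texts:
        return 0.0
    outgoing = sum(1 for r in texts if r["direction"] == "out")
    return 100.0 * outgoing / len(texts)


# Toy records with hypothetical field names.
records = [
    {"contact": "alice", "type": "text", "direction": "out"},
    {"contact": "alice", "type": "text", "direction": "in"},
    {"contact": "bob", "type": "text", "direction": "out"},
    {"contact": "bob", "type": "call", "direction": "in"},
]
```

A person who splits their interactions evenly between two contacts has an entropy of one bit; concentrating on a single contact drives it to zero.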

In collaboration with Jordi Quoidbach, Florent Robic, and Sandy Pentland

The limits of community detection in networks

What can really be inferred from communities uncovered by modularity-based algorithms? A broad and systematic characterization of the theoretical and practical performance of modularity contradicts the widely held assumption that the modularity function typically exhibits a clear global optimum. This implies that (i) modules identified via modularity maximization are not unique and should therefore be interpreted with extreme caution, and (ii) even moderate differences in modularity scores are meaningless.
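A small Python sketch can show how different partitions end up with the same modularity score. It uses the standard definition of modularity for an undirected, unweighted graph; the function name and the six-node ring are a toy illustration chosen for this page, not the analysis carried out in the work.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each edge listed once.
    community_of: dict mapping each node to a community label.
    """
    m = len(edges)
    internal = {}  # edges with both endpoints in the community
    degree = {}    # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())


# A ring of six nodes: 0-1-2-3-4-5-0.
ring = [(i, (i + 1) % 6) for i in range(6)]

halves = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
shifted = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 0: "B"}
pairs = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "C"}

q_halves = modularity(ring, halves)
q_shifted = modularity(ring, shifted)
q_pairs = modularity(ring, pairs)
```

All three partitions score Q = 1/6 on this ring even though they group the nodes differently, so the score alone cannot say which grouping is the "real" one.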

In collaboration with Aaron Clauset and Ben Good

Quantifying the Stability of Society

Is there such a thing as a 'poverty trap'? Logistic classifiers applied to communication and census data point to a new mechanism for poverty that relates to the persistence of relationships. This analysis shows that economic exchanges flow primarily through these persistent edges, and that the inability to maintain these ties can prevent upward economic mobility.

In collaboration with Nathan Eagle and Aaron Clauset


    • Stability in society: Parameters for the persistence of social networks, Master’s thesis, Université catholique de Louvain, 2009.

Modeling the Dynamics of Urbanization on Social Support Networks

What is attracting migrants to urban areas within the developing world? Using 4 years of movement and communication data, it is possible to model the reinforcing social mechanisms that could explain the recent rapid growth of urban areas.

In collaboration with Nathan Eagle and Luís M.A. Bettencourt