This page links to the Twitter data used in the paper “Determinants of Meme Popularity” by James P. Gleeson, Kevin P. O’Sullivan, Raquel A. Baños and Yamir Moreno; please cite this paper if you use the data. All data processing was performed by Raquel A. Baños at Instituto de Biocomputación y Física de Sistemas Complejos (BIFI), Universidad de Zaragoza.

The zip file twitter15M_data.zip (16MB) contains the following files:

· 1year_hashtags.txt

Text file with one line per tweet, covering tweets gathered from March 2011 to March 2012. Each line is structured as follows: the first column gives the number of fields in the line minus 1 (NF-1), the second column gives the date on which the tweet was posted (in units of days), and columns 3 to NF list the hashtags used in the tweet.
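As an illustration of the line structure described above, here is a minimal Python sketch of a parser for one line of 1year_hashtags.txt. It rests on assumptions not stated on this page: that columns are separated by whitespace, that the date can be read as a floating-point number, and that NF counts all columns including the first. The names TweetRecord and parse_hashtag_line are hypothetical, and the sample line is made up for the example.

```python
from typing import List, NamedTuple


class TweetRecord(NamedTuple):
    day: float           # posting date in days (column 2)
    hashtags: List[str]  # hashtags used in the tweet (columns 3 to NF)


def parse_hashtag_line(line: str, check_count: bool = True) -> TweetRecord:
    """Parse one line of 1year_hashtags.txt into a TweetRecord."""
    fields = line.split()  # assumption: whitespace-separated columns
    if len(fields) < 2:
        raise ValueError("expected at least a count column and a date column")
    # Column 1 should equal NF-1, i.e. the number of columns that follow it
    # (assumption: NF counts every column, including the first).
    if check_count and int(fields[0]) != len(fields) - 1:
        raise ValueError("first column does not match the number of fields")
    return TweetRecord(day=float(fields[1]), hashtags=fields[2:])


# Hypothetical sample line (not taken from the data set).
record = parse_hashtag_line("3 12.5 tagA tagB")
print(record.day, record.hashtags)
```

If the count check turns out not to match the actual file, it can be disabled with check_count=False while the remaining columns are still read.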

 

· ccdf_1year_S200

Folder containing multiple text files. The file ccdf_axx.txt gives the complementary cumulative distribution function (CCDF) for hashtags of age xx days, as used in Figure 3 of the paper.
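The naming scheme above can be used to index the files in the folder by hashtag age. The following Python sketch only looks at file names and assumes that xx is written as a plain decimal integer; it makes no assumption about the contents of the files, which are not described on this page. The function names age_from_filename and ccdf_files_by_age are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, Optional

# Assumption: the age "xx" in ccdf_axx.txt is a run of decimal digits.
_CCDF_NAME = re.compile(r"ccdf_a(\d+)\.txt")


def age_from_filename(name: str) -> Optional[int]:
    """Return the hashtag age in days encoded in a file name, or None."""
    match = _CCDF_NAME.fullmatch(name)
    return int(match.group(1)) if match else None


def ccdf_files_by_age(folder: str) -> Dict[int, Path]:
    """Map each age (in days) to the matching file inside the folder."""
    files = {}
    for path in Path(folder).iterdir():
        age = age_from_filename(path.name)
        if age is not None:
            files[age] = path
    return files
```

For example, ccdf_files_by_age("ccdf_1year_S200") would return a dictionary keyed by age once the zip file has been extracted to the current directory.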

 

· pjk_final.txt

Text file built from a random sample of 8.2E5 Twitter users (sampled in October 2013). The file structure is as follows:

first column: number of followers (out-degree)

second column: number of friends (following, or in-degree)

last column: frequency of this (followers, friends) pair in the sample
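The column layout above can be read as in the following Python sketch, which also computes the frequency-weighted mean number of followers and friends. It assumes whitespace-separated columns and integer degree values, and it treats the last column as a non-negative weight without assuming whether it is a raw count or a normalized fraction. The names read_pjk and mean_degrees are hypothetical, and the sample rows are made up for the example.

```python
from typing import Iterable, List, Tuple

Row = Tuple[int, int, float]  # (followers, friends, frequency)


def read_pjk(lines: Iterable[str]) -> List[Row]:
    """Parse lines of pjk_final.txt into (followers, friends, frequency)."""
    rows = []
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated columns
        if not fields:
            continue  # skip blank lines
        followers = int(fields[0])   # first column: followers (out-degree)
        friends = int(fields[1])     # second column: friends (in-degree)
        frequency = float(fields[-1])  # last column: frequency
        rows.append((followers, friends, frequency))
    return rows


def mean_degrees(rows: List[Row]) -> Tuple[float, float]:
    """Return the frequency-weighted mean (followers, friends)."""
    total = sum(freq for _, _, freq in rows)
    mean_followers = sum(k * freq for k, _, freq in rows) / total
    mean_friends = sum(j * freq for _, j, freq in rows) / total
    return mean_followers, mean_friends


# Hypothetical sample rows (not taken from the data set).
sample = ["0 1 2", "3 1 2"]
print(mean_degrees(read_pjk(sample)))
```

To use the real data, pass an open file handle instead of the sample list, for instance read_pjk(open("pjk_final.txt")).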