Datasets on coronavirus and misinformation available for the 1st Open Call
The 1st Open Call of MediaFutures will support artists and startups in creating innovative, inclusive and participatory applications of data that address critical societal challenges, most of them related to the coronavirus infodemic. To help participants prepare their proposals, Eurecat – Technology Centre of Catalonia, technical partner in MediaFutures, has deployed a technological infrastructure with an extensive catalogue of datasets containing relevant information about the coronavirus and misinformation. These datasets have been generated by institutions all over the world (universities, research centres, foundations and public institutions) that have granted open access to allow data re-use, a core principle in MediaFutures.
The catalogue will allow participants to discover data from a wide range of sources (fact-checking and news, online social media, scientific articles, and repositories of statistics, interventions and behavioural traces), together with free, open-source tools to collect, clean, analyse and visualise data using approaches from diverse research disciplines such as machine learning, social network analysis and natural language processing.
Category | Dataset | Provider |
Fact-checking and news | COVID-19 Fact-checkers Dataset | Social Media Lab – Ryerson University |
Fact-checking and news | The CoronaVirusFacts / DatosCoronaVirus Alliance Database | Poynter Institute |
Fact-checking and news | CoAID | The Pennsylvania State University |
Fact-checking and news | COVID19FN | Sardar Vallabhbhai National Institute of Technology |
Fact-checking and news | GDELT | Google Jigsaw |
Fact-checking and news | Webhose’s free datasets | Webhose |
Social media | COVID19 Infodemics Observatory | CoMuNe Lab – Fondazione Bruno Kessler |
Social media | CMU-MisCov19 | Carnegie Mellon University |
Social media | COVID-19-TweetIDs | University of Southern California |
Social media | Coronavirus (COVID-19) Tweets Dataset | Jawaharlal Nehru University |
Social media | Institutional and news media tweet dataset for COVID-19 social science research | Universitat Autònoma de Barcelona |
Social media | COVID-19 Reddit Algo-Tracker | Cornell University |
Social media | WMF COVID-19 | Wikimedia Foundation |
Social media | Coronavirus en YouTube | Universitat Politècnica de València |
Scientific articles | CORD-19: The Covid-19 Open Research Dataset | Allen Institute for AI |
Statistics | COVID-19 Data Repository | Johns Hopkins University |
Statistics | Data on COVID-19 | Our World in Data |
Statistics | COVID-19 World Survey Data API | University of Maryland |
Statistics | Data for the Open COVID-19 Data Working Group | University of Washington |
Interventions | CCCSL: CSH Covid-19 Control Strategies List | Complexity Science Hub |
Interventions | Health Intervention Tracking for COVID-19 (HIT-COVID) Data | Boston University and Johns Hopkins University |
Behavioural traces | Mozilla COVID dataset | Mozilla Foundation |
Behavioural traces | The CoVidAffect dataset | CoVidAffect project |
Behavioural traces | COVID-19 Mobility Monitoring project | ISI Foundation and Cuebiq |
Miscellanea | #Data4COVID19 | The GovLab |
Please note that participants in the open call are free to use these or other datasets in their proposals, and that selected projects will receive support from our team of mentors. In particular, this support will cover training sessions on data management technologies and initial technical advice to help participants navigate these resources.
Explore the catalogue, download the Application KIT and apply to our 1st Open Call!