Datasets on coronavirus and misinformation available for the 1st Open Call

The 1st Open Call of MediaFutures will support artists and startups in creating innovative, inclusive and participatory applications of data that address critical societal challenges, most of them related to the coronavirus infodemic. To help participants prepare their proposals, Eurecat – Technology Centre of Catalonia, technical partner in MediaFutures, has deployed a technological infrastructure with an extensive catalogue of datasets containing relevant information about the coronavirus and misinformation. These datasets have been generated by institutions all over the world (universities, research centres, foundations and public institutions) that have granted open access to allow data re-use, a core principle in MediaFutures.

The catalogue will allow participants to discover data from very different sources (fact-checking and news, online social media, scientific articles, and repositories of statistics, interventions and behavioural traces). It also lists free, open-source tools to collect, clean, analyse and visualise data using approaches from diverse research disciplines such as machine learning, social network analysis and natural language processing.

| Category | Dataset | Provider |
|---|---|---|
| Fact-checking and news | COVID-19 Fact-checkers Dataset | Social Media Lab – Ryerson University |
| | The CoronaVirusFacts / DatosCoronaVirus Alliance Database | Poynter Institute |
| | CoAID | The Pennsylvania State University |
| | COVID19FN | Sardar Vallabhbhai National Institute of Technology |
| | GDELT | Google Jigsaw |
| | Webhose’s free datasets | Webhose |
| Social media | COVID19 Infodemics Observatory | CoMuNe Lab – Fondazione Bruno Kessler |
| | CMU-MisCov19 | Carnegie Mellon University |
| | COVID-19-TweetIDs | University of Southern California |
| | Coronavirus (COVID-19) Tweets Dataset | Jawaharlal Nehru University |
| | Institutional and news media tweet dataset for COVID-19 social science research | Universitat Autònoma de Barcelona |
| | COVID-19 Reddit Algo-Tracker | Cornell University |
| | WMF COVID-19 | Wikimedia Foundation |
| | Coronavirus en YouTube | Universitat Politècnica de València |
| Scientific articles | CORD-19: The Covid-19 Open Research Dataset | Allen Institute for AI |
| Statistics | COVID-19 Data Repository | Johns Hopkins University |
| | Data on COVID-19 | Our World in Data |
| | COVID-19 World Survey Data API | University of Maryland |
| | Data for the Open COVID-19 Data Working Group | University of Washington |
| Interventions | CCCSL: CSH Covid-19 Control Strategies List | Complexity Science Hub |
| | Health Intervention Tracking for COVID-19 (HIT-COVID) Data | Boston University and Johns Hopkins University |
| Behavioural traces | Mozilla COVID dataset | Mozilla Foundation |
| | The CoVidAffect dataset | CoVidAffect project |
| | COVID-19 Mobility Monitoring project | ISI Foundation and Cuebiq |
| Miscellanea | #Data4COVID19 | The GovLab |

We remind participants of the open call that they are free to use these or other datasets in their proposals, and that selected projects will receive support from our team of mentors. In particular, this support will cover training sessions on data management technologies and initial technical advice to help participants find their way around these resources.

Explore the catalogue, download the Application KIT and apply to our 1st Open Call!