Computational Linguistics and the COVID-19 Outbreak

This page is maintained by AILC (the Italian Association for Computational Linguistics). It groups some of the initiatives that the Computational Linguistics community is carrying out to contribute to the fight against COVID-19. Everyone is invited to collaborate by reporting new initiatives. Please do so through our contact form.

Datasets

CORD-19 – The Allen Institute COVID-19 Open Research Dataset, a collection of COVID-19 scientific papers, updated weekly (March 2020)
Processed CORD-19 – The Allen Institute corpus processed with Sketch Engine (March 2020)
40wita – A dataset of tweets in Italian collected daily by the University of Turin
Corona Corpus – A corpus of texts from online newspapers and magazines in 20 different English-speaking countries, part of the English-Corpora.org suite of corpora

Tools

COVID-19 Semantic Browser – A semantic search tool for COVID-19 scientific papers, developed by Gabriele Sarti and hosted by Area Science Park (April 2020)
COVID19 Infodemics Observatory – A platform to monitor fake news on COVID-19, developed at FBK (March 2020)

Shared Tasks and Events

CLEF 2020: CheckThat! Lab Task 1 Tweet Check-Worthiness – The task asks participants to rank a stream of tweets on a number of topics, including COVID-19, according to their check-worthiness (March 2020)
Kaggle Tasks – Several tasks on COVID-19 (March 2020)
NLP COVID-19 Workshop – An emergency workshop at ACL 2020. Authors are invited to submit papers on NLP applied to combating the COVID-19 pandemic (July 2020)
TREC-COVID program – Launched by NIST and OSTP, the challenge follows the TREC assessment process to evaluate search systems on the CORD-19 documents

Publications

Björn W. Schuller, Dagmar M. Schuller, Kun Qian, Juan Liu, Huaiyuan Zheng, Xiao Li. COVID-19 and Computer Audition: An Overview on What Speech & Sound Analysis Could Contribute in the SARS-CoV-2 Corona Crisis. arXiv.org.