Computational Linguistics and the COVID-19 Outbreak

This page is maintained by AILC (the Italian Association for Computational Linguistics). It groups some of the initiatives that the Computational Linguistics community is carrying out to contribute to the fight against COVID-19. Everyone is invited to collaborate by reporting new initiatives. Please do so through our contact form.

Datasets


  • CORD-19 – The Allen Institute COVID-19 Open Research Dataset, a collection of COVID-19 scientific papers, updated weekly (March 2020)
  • Processed CORD-19 – The Allen Institute corpus processed with Sketch Engine (March 2020)
  • 40wita – A dataset of tweets in Italian collected daily by the University of Turin
  • Corona Corpus – A corpus of texts from online newspapers and magazines in 20 English-speaking countries, part of the English-Corpora.org suite of corpora

Tools


Shared Tasks and Events


  • CLEF 2020: CheckThat! Lab Task 1 Tweet Check-Worthiness – The task asks participants to rank a stream of tweets on a number of topics, including COVID-19, according to their check-worthiness (March 2020)
  • Kaggle Tasks – Several tasks on COVID-19 (March 2020)
  • NLP COVID-19 Workshop, an emergency workshop at ACL 2020 – Authors are invited to submit papers related to NLP applied to combating the COVID-19 pandemic (July 2020)
  • TREC-COVID program – Launched by NIST and OSTP, the challenge will follow the TREC assessment process to evaluate search systems on the CORD-19 documents

Publications