COVID-19 Browser: Using Natural Language Processing to Fight the Pandemic

Our society is facing an unprecedented crisis due to the recent COVID-19 outbreak, which is putting healthcare systems under strain all around the world. Recently, dozens of countries announced the shutdown of all non-essential activities for the foreseeable future, and scientists worldwide are striving to find cures and vaccines able to stop the ongoing pandemic.

In these hard times, everyone should put their expertise to use in the fight against the virus. For Gabriele Sarti, a Data Science student at the University of Trieste and a young member of the Italian Association for Computational Linguistics (AILC), this meant applying his expertise in Natural Language Processing (NLP) to develop the COVID-19 Browser, a system that leverages state-of-the-art NLP techniques to extract meaningful information and guide scientists towards a better understanding of COVID-19.

As of today, more than 32,000 scientific papers have been published by research laboratories worldwide on the new coronavirus SARS-CoV-2 and the disease it causes, COVID-19. In such a large quantity of text, a lot of useful information is very likely to go unnoticed, leaving our knowledge of the subject too scattered to be exploited to its full potential. COVID-19 Browser allows users to browse a large collection of those articles directly in their console, matching article abstracts with user queries formulated in natural language to delve deeper into our current knowledge of the subject.

The model underlying COVID-19 Browser is SciBERT-NLI. It builds on SciBERT, a cutting-edge language model trained by the American nonprofit AI2 on a corpus of 1.14M scientific papers, which Gabriele subsequently adapted for the retrieval task.
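To give a rough idea of how matching abstracts against a natural-language query can work, here is a minimal sketch that ranks a few abstracts by cosine similarity to a query. It is not the COVID-19 Browser code: the `embed` function is a toy bag-of-words stand-in for the sentence embeddings a model like SciBERT-NLI would produce, and the function names and sample abstracts are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a neural sentence encoder: a bag-of-words count vector.

    Assumption: a real system would replace this with embeddings from a
    language model; word counts are used here only to keep the sketch runnable.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_abstracts(query, abstracts, top_k=3):
    """Return the top_k (score, abstract) pairs, most similar to the query first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(abstract)), abstract) for abstract in abstracts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical abstracts, invented for this example.
ABSTRACTS = [
    "We estimate the incubation period of the virus from confirmed cases.",
    "Protein structure of the viral spike and receptor binding.",
    "Economic impact of lockdown measures on small businesses.",
]

if __name__ == "__main__":
    for score, abstract in rank_abstracts("incubation period of the virus", ABSTRACTS):
        print(f"{score:.2f}  {abstract}")
```

The ranking step stays the same whatever encoder is used: only `embed` would change if the word counts were swapped for dense vectors from a language model.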

Gabriele Sarti is a master's student in Data Science at the University of Trieste (https://dssc.units.it/), and is affiliated with SISSA (https://www.sissa.it) and the CNR ItaliaNLP Lab in Pisa (http://www.italianlp.it). He is a member of the Italian Association for Computational Linguistics (https://www.ai-lc.it/en/) and plays an active role in its Dissemination Team.

24 Mar, 2020 | BLOG, RESEARCH

Affective lexica and other resources for Italian

2 Oct, 2017 | BLOG, RESOURCES

The usefulness of research for companies

Innovation and research in Italian computational linguistics companies.

At the beginning of the 90s, when the young people of my generation were studying Computational Linguistics (or Natural Language Processing) at university, the Center for the Study of Language and Information at Stanford University was one of the most coveted places we dreamed of. Many of us were in love with Head-Driven Phrase Structure Grammar (HPSG), invented by Carl Pollard and Ivan A. Sag in California. It seemed that HPSG could be the definitive word on formal grammars of natural languages, because it combined some universal principles of language (inspired by Noam Chomsky's linguistics) with a powerful computational framework. The approach, however, had two problems: it was difficult to create and manage all the rather complex rules, and parsing was not as fast as we would have liked. We devoted ourselves to research but could not build effective commercial services based on this or any other computational linguistic framework.

Some years have passed since then. In October 2016 I read an interview with Andrew Ng given on the occasion of the release, by the Chinese company Baidu, of a chatbot for making medical diagnoses: “As Melody has more conversations, it will also learn and keep getting better. This is just the start of a much larger, AI-driven transformation of the healthcare industry.” In 1990, Andrew Ng was 14 years old. After a couple of degrees and a doctorate, in 2002 he began working at Stanford University. In 2011 he founded the Google Brain project at Google. Also in 2011 he taught an online Machine Learning course at Stanford University, which was followed by about 100,000 students around the world. In 2012 he founded Coursera. In 2014 Ng joined Baidu as chief scientist, and he has remained with the company so far. This exceptional man is a brilliant example of how the worlds of research, education and business nourish each other through continuous exchanges.

The world of Computational Linguistics, and of Artificial Intelligence in general, is experiencing a period of incredible acceleration, with rapid transfers from research to the application of research results in practical services and, vice versa, from the issues raised by real cases back to research, where they become a subject of study.
This lively exchange also takes place in Italian companies doing computational linguistics. Just as Italian researchers in this field have always been at the forefront globally, Italian companies in the sector have also established themselves at an international level. For example, Expert System, a public limited company with offices in Modena, Naples and Rovereto, entered the United States market some years ago and has grown in Europe. CELI, an SME with offices in Turin and Milan, provides Natural Language Processing technologies and consulting to international companies, from Korea to California. Euregio, based in Bolzano, uses NLP to provide media intelligence services. Interactive Media SpA, with offices in Rome, Trento and Brazil, specializes in speech solutions. The Apulian startup QuestionCube focuses on question answering and uses the main machine learning tools.
Almawave, part of the Almaviva Group, has also been integrating NLP technologies for some years. Other companies, both smaller and larger, are integrating these technologies to provide their services, combining machine learning with standard NLP technologies.

What services do they offer to customers? The main service is “Natural Language Understanding”, that is, the automatic analysis and understanding of written texts and speech.

This understanding is obviously partial compared to human understanding, but it is much faster, which makes it possible to do things that would otherwise not be feasible, or to greatly simplify complex activities.
In the next post of this blog we will describe in more detail the issues and problems of Computational Linguistics addressed in universities and companies.
One of the purposes of AILC is to facilitate exchanges between universities, research centers and companies in this sector. This blog will therefore be a place to report some of the findings, the results obtained, the ongoing projects, and the problems encountered in the various areas of this discipline.

CELI, Expert System, Euregio and QuestionCube are already members of the Italian Association for Computational Linguistics. We hope that in the coming months other companies will join and contribute to the creation of an Italian ecosystem for Computational Linguistics and Artificial Intelligence.

12 Dec, 2016 | BLOG, INDUSTRY
