Evening Speech

Multilingual Natural Language Understanding: Instructions for (Present and Future) Use

Roberto Navigli
University of Rome La Sapienza, Italy

Short Bio: Roberto Navigli is Professor of Computer Science at the Sapienza University of Rome, where he leads the Sapienza NLP Group. He has received two prestigious ERC grants, one on multilingual word sense disambiguation (2011-2016) and one on multilingual language- and syntax-independent open-text unified representations (2017-2022). In 2015 he received the META prize for groundbreaking work in overcoming language barriers with BabelNet, a project that was also highlighted in The Guardian and Time magazine and won the Artificial Intelligence Journal Prominent Paper Award 2017. He is the co-founder of Babelscape, a successful company that enables Natural Language Understanding in dozens of languages. He is a Program Chair of ACL-IJCNLP 2021.

A topological view of polysemy

Milica Gašić
Heinrich Heine University Düsseldorf, Germany

Mail: gasic@uni-duesseldorf.de

Abstract: You have all worked closely with word vectors and witnessed first-hand how they can encode meaning and aid tasks across the NLP spectrum. Your favourite algorithm provides you with these high-dimensional vectors. What kind of space do they live in?
The manifold hypothesis suggests that word vectors live on a submanifold within their ambient vector space. We argue that we should, more accurately, expect them to live on a “pinched manifold”: a space obtained from a manifold by gluing together certain points. The gluing points correspond to polysemous words, i.e. words with multiple meanings.
Our point of view suggests that monosemous and polysemous words can be distinguished based on the topology of their neighbourhoods. We present two kinds of empirical evidence to support this point of view:
(1) We introduce a measure of polysemy, based on tools from topological data analysis, that correlates well with the actual number of meanings of a word.
(2) We propose a simple, topologically motivated solution to the SemEval-2010 task on Word Sense Induction & Disambiguation that produces competitive results.
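
The intuition behind a "pinched manifold" can be illustrated with a toy sketch. This is not the speakers' polysemy measure or their SemEval system; it is a minimal illustration under stated assumptions. The data are synthetic (two flat 2D sheets in a 4D space, glued at the origin), and the function name `punctured_components` and all radii are hypothetical choices. The sketch removes a small ball around a point and counts how many connected pieces its neighbourhood falls into: a glued point should show one piece per sheet, a regular point a single piece.

```python
import numpy as np


def punctured_components(points, center, r_inner, r_outer, link):
    """Count connected pieces of the neighbourhood of `center` once a small
    ball around it is removed (hypothetical helper, illustration only)."""
    dist = np.linalg.norm(points - center, axis=1)
    ring = points[(dist > r_inner) & (dist <= r_outer)]
    parent = list(range(len(ring)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of ring points closer than `link`.
    pair = np.linalg.norm(ring[:, None, :] - ring[None, :, :], axis=2)
    for i, j in zip(*np.nonzero(np.triu(pair <= link, k=1))):
        parent[find(i)] = find(j)
    return len({find(i) for i in range(len(ring))})


# Synthetic data: two 2D grids embedded in 4D that meet only at the origin.
ticks = np.arange(-10, 11) / 10.0
xx, yy = np.meshgrid(ticks, ticks)
flat = np.stack([xx.ravel(), yy.ravel()], axis=1)
zeros = np.zeros_like(flat)
sheet_a = np.hstack([flat, zeros])   # points of the form (x, y, 0, 0)
sheet_b = np.hstack([zeros, flat])   # points of the form (0, 0, z, w)
points = np.vstack([sheet_a, sheet_b])

# The origin plays the role of a "polysemous" gluing point;
# (0.5, 0.5, 0, 0) is an ordinary point on a single sheet.
glued = punctured_components(points, np.zeros(4), 0.15, 0.45, 0.15)
regular = punctured_components(
    points, np.array([0.5, 0.5, 0.0, 0.0]), 0.15, 0.45, 0.15
)
print("pieces around glued point:", glued)
print("pieces around regular point:", regular)
```

On real word vectors the neighbourhoods are noisy and unevenly sampled, which is where tools from topological data analysis, such as persistent homology, replace the fixed thresholds used in this toy.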

Short Bio: Milica Gašić is a Professor of Dialogue Systems and Machine Learning at Heinrich Heine University Düsseldorf. Prior to her current position she was a Lecturer in Spoken Dialog Systems at the Department of Engineering, University of Cambridge where she was leading the Dialogue Systems Group. She completed her PhD under the supervision of Professor Steve Young and the topic of her thesis was Statistical Dialogue Modelling. She holds an MPhil degree in Computer Speech, Text and Internet Technology from the University of Cambridge and a Diploma in Mathematics and Computer Science from the University of Belgrade. She is a member of ACL, a member of ELLIS and a senior member of IEEE. She is a recipient of a European Research Council Starting Grant and an Alexander von Humboldt Sofja Kovalevskaja Award.
