Evening Lectures

LiLa Now: Lessons Learned and the Road Ahead for a Large-Scale Linguistic Linked Data Project
Francesco Mambrini
Università Cattolica del Sacro Cuore, Milan
Abstract: This talk draws on the experience of the LiLa: Linking Latin project to discuss both the challenges and the importance of building large-scale knowledge bases and networks of linguistic resources using the Linked Data paradigm. The LiLa project ran from 2018 to 2023. During this period, its initial goal of applying Linked Data to foster interoperability among heterogeneous linguistic resources for Latin was tested in several important ways. Completing the project required labor-intensive work on developing conceptual models, or adapting existing ones, and on representing all available textual and lexical data. This talk will present key lessons learned from the project, ranging from our original vision for usability and interoperability to ontology design and community engagement. Looking ahead, it will outline the opportunities and open questions that remain for sustaining, expanding, and reusing LiLa in a landscape increasingly shaped by Large Language Models. Ultimately, we will consider how close a large-scale initiative like LiLa brings us to realizing a “network effect” for language resources, one that could benefit researchers, practitioners, and everyday users alike.
Short Bio: Francesco Mambrini holds a PhD in Classical Philology from the University of Trento and EHESS, Paris. He is currently a Researcher at the Università Cattolica del Sacro Cuore, Milan. Previously, he worked as a Research Assistant at the Deutsches Archäologisches Institut, Berlin, and at the University of Leipzig. He was appointed Joint Fellow of the Center for Hellenic Studies and the Deutsches Archäologisches Institut for the academic year 2012-13. He has collaborated with some of the most important projects in the Digital Humanities, including the Perseus Project (where he was a Visiting Scholar in 2009 and 2011), Arachne, and the Index Thomisticus Treebank. He has been a collaborator of the Ancient Greek and Latin Dependency Treebank (Perseus Project) since its foundation in 2009, and has curated the annotated versions of the tragedies of Aeschylus and Sophocles. From 2018 to 2023 he worked on the Linked Open Data project LiLa: Linking Latin.

Memorization or Generalization? Exploring Transformer-based Large Language Models and, possibly, novel approaches
Fabio Massimo Zanzotto
University of Rome Tor Vergata
Abstract: Transformer-based Large Language Models demonstrate extraordinary capabilities and are thus changing how the ML/NLP/NN communities conduct research. Entire lines of research are being set aside, as Transformer-based LLMs seem to be the ultimate solution. However, it is already emerging that a large part of the capabilities of LLMs depends on their ability to memorize. Moreover, the observation that deep neural networks need to memorize long-tailed data to achieve near-optimal generalization error has attracted considerable discussion. In this talk, we report on our experience in the flourishing research area of LLMs, exploring how these models memorize and how they generalize from training data.
Short Bio: Prof. Fabio Massimo Zanzotto is an associate professor at the University of Rome Tor Vergata, where he coordinates the Human-centric ART group. He specializes in artificial intelligence (AI) and natural language processing (NLP). Since 1998, he has been actively engaged in AI research, focusing on ethical considerations, AI applications in tourism and healthcare, and fundamental AI concepts. He has coordinated many research projects, including the European H2020 KATY project and the national projects Social Tourism e-Platform (STEP), Class-tAIs, and SfidaNow. Zanzotto also oversees collaborations with companies in the natural language processing domain, demonstrating his significant impact on both academia and industry. His contributions include work on automatic textual entailment recognition, syntactic parsing, and the application of distributed and distributional models to language syntax and semantics. With over 150 publications in international and national venues, Zanzotto is a prominent figure in the field. He plays an active role in major conference committees (ACL, NAACL, EACL, EMNLP, COLING, LREC, IJCAI, ECAI, CLEF) and serves as a reviewer for esteemed international journals. He is a member of the Association for Computational Linguistics (ACL) and the Italian Association for Artificial Intelligence, and is a founding member of the Italian Association for Computational Linguistics. He also contributed to the creation of two spin-offs, Reveal SRL and DevIt SRL.