Tutorials

Jorge García

Fundamentals of Linguistic Linked Open Data

Jorge García
University of Zaragoza – Spain

Abstract:

Short Bio: Jorge García currently works as a senior research fellow (“Ramón y Cajal” postdoctoral position) at the Department of Computer Science and Systems Engineering of the University of Zaragoza, Spain, as a member of the Aragon Institute of Engineering Research (I3A) and of the Distributed Information Systems research group. His main research interests include the multilingual Semantic Web, ontology matching, linguistic linked data, and neuro-symbolic artificial intelligence. He chaired NexusLinguarum, the “European network for Web-centred linguistic data science”, a COST Action that joined the efforts of researchers from 42 countries, and currently acts as Vice-Chair of Goblin, the Global Network on Large-Scale, Cross-domain and Multilingual Open Knowledge Graphs. He has also been involved in six other EU projects related to the Semantic Web, multilingualism, and language technologies, acting as Principal Investigator in two of them.

Max Ionov

Advanced Topics of Linguistic Linked Open Data

Max Ionov
University of Cologne – Germany

Abstract:

Bio:

Matteo Palmonari

Beyond Naive RAG: How Entities and Graphs Enhance Retrieval-Augmented Generation

Matteo Palmonari
University of Milano-Bicocca – Italy

Abstract: Retrieval-Augmented Generation (RAG) has become a powerful paradigm for improving the accuracy and reliability of language models and for integrating external knowledge into responses to users’ prompts. However, naive RAG implementations often struggle with limitations such as irrelevant retrieval, lack of contextual awareness, and inefficient knowledge utilization. After a brief introduction to entity-aware knowledge representation structures and techniques for extracting entity-related knowledge from text, we will explore the role of entities and graph-based structures in enhancing RAG systems. We will discuss a broad spectrum of solutions — from lightweight entity-centric enhancements to full-fledged GraphRAG approaches — highlighting trade-offs in complexity, efficiency, and performance. During the presentation we will discuss examples from the literature, from ongoing projects on vertical domains in the Italian language, and from commercial solutions that exploit entity-centric approaches to ground users’ prompts.

Short Bio: Matteo Palmonari is an Associate Professor in the Department of Informatics, Systems, and Communication at the University of Milano-Bicocca. His research spans data management and artificial intelligence, with a focus on semantic matching, knowledge graph profiling and exploration, natural language processing, and data enrichment. Recently, his interests have concentrated on the integration of symbolic and neural approaches, particularly in the context of applications in the legal domain. He has played key roles in numerous innovation and research projects, serving as coordinator, scientific manager, or partner.

Zheng Yuan

From Corpora to Capabilities: Rethinking Language Resources in the LLM Era

Zheng Yuan
University of Sheffield – UK

Abstract: The rise of Large Language Models (LLMs) has revolutionised the development and deployment of language technologies. Yet, language resources remain at the heart of these advancements. This talk reexamines the evolving role of language resources in the LLM era – spanning their application in pretraining, fine-tuning, evaluation, and integration with retrieval-augmented generation (RAG) systems. We will explore how traditional resources such as annotated corpora and lexicons are being reimagined, the growing emphasis on data quality and documentation, and the persistent challenges in supporting multilingual and low-resource languages. Through concrete examples and case studies, the talk will highlight how curated, transparent, and inclusive language resources can drive the development of responsible and capable AI systems.

Short Bio: Zheng Yuan is an Associate Professor in Natural Language Processing at the University of Sheffield and an Affiliated Researcher at the University of Cambridge, where she is also a Fellow in Computer Science at Trinity College. Her research sits at the intersection of machine learning and NLP, with a strong focus on real-world applications in education, creativity, healthcare, social media, and finance. Zheng’s work draws on insights from computer science, linguistics, education, and psychology. Previously, she served as Vice President of Data Science at Chatterbox Labs, Assistant Professor at King’s College London, and Research Associate at the University of Cambridge. She holds a PhD and MPhil from the University of Cambridge and a BSc(Eng) from Queen Mary University of London.

Supported by the Future Artificial Intelligence Research (FAIR) project.