Bernardo Magnini, Fondazione Bruno Kessler
February 27 – 5:00 PM (CEST)
Title: Rethinking NLP Evaluation in the Age of LLMs: Lessons from Benchmarking Italian
Abstract: Large Language Models (LLMs) are now at the core of most NLP applications, mainly because of their strong performance and their adaptability to different tasks and languages. However, despite their widespread use, evaluating LLMs is still an active area of research, and a debate about methodologies is ongoing. Several issues are under discussion, including contrasting competence-oriented and task-oriented approaches; balancing prompt naturalness and effectiveness; investigating the role of multiple prompts in evaluation; considering both multiple-choice and generative tasks, along with the most appropriate metrics for each; and comparing zero-shot and few-shot settings while taking execution performance into consideration. To be more concrete, I will report examples and lessons learned from developing an LLM benchmark for the Italian language.
Bio: Bernardo Magnini is a senior researcher at FBK (Trento, Italy), where he is responsible for the NLP research group. His interests are in the field of Computational Linguistics, particularly lexical semantics and lexical resources, question answering, textual entailment, and conversational agents, areas in which he has published more than 300 scientific papers. He has co-chaired several events, including EVALITA, the evaluation campaign for both NLP and speech tools for the Italian language; CLIC-it 2014 (the first Italian conference on Computational Linguistics); AI*IA 2018 (the 17th International Conference of the Italian Association for Artificial Intelligence); and ACL 2022, the 60th Annual Meeting of the Association for Computational Linguistics. He has been a contract professor at the Universities of Trento, Bolzano, and Pavia, and was President of the Italian Association for Computational Linguistics (AILC) from 2015 to 2022.
Live stream and recording: https://www.youtube.com/watch?v=-EIvRMMw2UQ