Symposium Programme

All timings are in BST (Aberdeen, United Kingdom). This is programme version 1.4.3.

The Symposium takes place on the 7th floor of the Sir Duncan Rice Library at the University of Aberdeen.

Monday, 1st of June, 2026

Time Event Presenter(s) Session Chair
09:00 09:30 ⚪️ Registration 🪪    
09:30 09:35 ⚪️ RetroEval 2026 Symposium Welcome 👋 Saad Mahamood  
09:35 09:45 ⚪️ University of Aberdeen Introductory Remarks Nir Oren  
09:45 10:30 🔴 Keynote: NLP Evaluation in the Face of Deceptively Fluent Models Sina Zarrieß Saad Mahamood
10:30 10:45 🟤 Coffee Break ☕️    
10:45 12:15 🟠 Oral Session 1: Evaluating NLG Architectures   Mengxuan Sun
10:45 11:15 🟠 Oral 1-1: A Comparative Evaluation of End-to-End and Pipeline Approaches for Summarisation Fahime Same, Saad Mahamood, Srinivas Ramesh Kamath  
11:15 11:45 🟠 Oral 1-2: RAG as a collapsed NLG pipeline Adarsa Sivaprasad, Barkavi Sundararajan, David M. Howcroft  
11:45 12:15 🟠 Oral 1-3: Decomposition Does Not Help: Evidence from Semantic Clustering in LLM-based Causal Graph Discovery Nikolay Babakov, Alberto Bugarín-Diz  
12:15 13:30 🟤 Lunch 🍴    
13:30 15:00 🟠 Oral Session 2: Rethinking Evaluation   Adarsa Sivaprasad
13:30 14:00 🟠 Oral 2-1: Never Truly Out of Fashion: A Retrospective Look at Evaluation in NLG Patrícia Schmidtová, Saad Mahamood, Ondřej Dušek  
14:00 14:30 🟠 Oral 2-2: NLG Evaluation: Past, Present, Future Ehud Reiter  
14:30 15:00 🟠 Oral 2-3: Solving the Task but Not the Problem: A Customer Support Case Study on Why Extrinsic Evaluation Matters Daniel Braun  
15:00 15:30 ⚪️ Get to the bus!    
15:30 16:30 ⚪️ Bus to the castle 🚌    
16:30 17:45 ⚪️ Dunnottar Castle Excursion 🏰    
17:45 18:15 ⚪️ Return to the bus!    
18:15 19:00 ⚪️ Bus back to Aberdeen 🚌    
19:30 21:30 ⚪️ Celebratory Dinner at Chaophraya 🍴    

Tuesday, 2nd of June, 2026

Time Event Presenter(s) Session Chair
09:00 09:30 ⚪️ Registration 🪪    
09:30 10:15 🔴 Keynote: From Benchmark to Bedside: Lessons learned in Clinical Natural Language Processing Beatrice Alex David M. Howcroft
10:15 10:45 🟤 Coffee Break ☕️    
10:45 11:45 🟠 Food for Thought   Saad Mahamood
10:45 11:15 🟠 FT-1-1: Evaluation and Assessment as Complementary Frameworks Elie Antoine  
11:15 11:45 🟠 FT-1-2: Ehud Reiter and the University of Santiago de Compostela: some notes and memories Alejandro Ramos Soto, Nikolay Babakov, Javier González Corbelle, Jose Maria Alonso-Moral, Alberto Bugarín-Diz, Senén Barro  
11:45 13:00 🟤 Lunch 🍴    
13:00 13:45 🔴 Keynote: “It’s cheaper if you don’t involve people” Albert Gatt Simone Balloccu
13:45 14:00 🟤 Get your tea 🫖 and coffee ☕️ for the poster session!    
13:45 14:45 🟢 Poster Presentations   Barkavi Sundararajan
—– —– 🟢 Poster 1-1: Towards Grounded Evaluation of Multimodal Machine Translation Systems* Sami Ul Haq, Sheila Castilho  
—– —– 🟢 Poster 1-2: Checking for implicit assumptions in data-to-text generation Kristýna Onderková, Ondřej Dušek  
—– —– 🟢 Poster 1-3: The Arabic Bible as an Evaluation tool: The Case Study of the Khalili Arabic Dialect Jakub Zbrzezny, Ehud Reiter, Wei Zhao  
—– —– 🟢 Poster 1-4: The NL4XAI program: A retrospective Jose Maria Alonso-Moral  
14:45 15:45 🟣 Panel Discussion I   Simone Balloccu
14:45 15:15 🟣 P-1-1: BabyTalk Albert Gatt, Yaji Sripada, Saad Mahamood  
15:15 15:45 🟣 P-1-2: Commercial NLG - Data2Text Ian Davy, Yaji Sripada, Ehud Reiter  
15:45 16:00 🟤 Coffee Break ☕️    
16:00 16:45 🟣 Panel Discussion II   Kees van Deemter
16:00 16:45 🟣 P-2-1: Retrospective on Referring Expression Generation Ehud Reiter, Albert Gatt, Sina Zarrieß, Fahime Same  
16:45 17:30 🔴 Closing Keynote: “Let’s Improve Research Culture” Ehud Reiter Saad Mahamood
17:30 18:00 ⚪️ Closing surprises 🤫    

* Remote presentation