Academic & Research talks:
Selected Academic Talks
“Modern LLMs - Part 1 (Architectures)”
MBZUAI, Abu Dhabi on 15.10.2025
A summary of the main innovations in LLM architectures, focusing on the attention and feed-forward components. Qwen3, Qwen3-Next, Kimi K2, DeepSeek V3, GPT-OSS, and LongCat-Chat are each analysed in more detail. Slides
“GSPO - RL Training of LLMs”
MBZUAI, Abu Dhabi on 17.09.2025
Summary and history of RL algorithms for LLMs, up to and including GSPO. Slides
“GRPO, DeepSeekMath & DeepSeek-R1”
Imperial College London, London on 03.06.2025
Introduction to GRPO, DeepSeekMath and DeepSeek-R1. Slides
“Test time scaling.”
Imperial College London, London on 03.12.2024
How does test-time scaling work, and what are the ingredients… Slides
“Not all chain-of-thought are equal…”
Imperial College London, London on 09.07.2024
Not all chain-of-thought are equal: an exploration of how chain-of-thought works and which variants of it exist. Slides
“Fantastic Prompts and where to find them (and a recent & brief history of LLMs)”
Imperial College London, London on 05.03.2024
Introduction to LLMs and their history, from multi-task NLP to text-to-text models and, finally, massively instruction-tuned models. How can one create great prompts? What prompt tuning techniques are there? Slides