But what is grokking? – transformers

The video explains “grokking” as the sudden emergence of deep understanding in AI models. It demonstrates this with a single-layer transformer that learns modular arithmetic by representing its inputs internally as sine and cosine waves and performing addition via trigonometric identities. It also highlights advances in mechanistic interpretability, showing how complex AI behaviors can sometimes be traced to understandable […]
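
To make the trigonometric idea concrete, here is a minimal Python sketch of modular addition done only with sines, cosines, and the angle-addition identities. It is not code from the video or from any trained model: the function name `mod_add_via_trig`, the modulus, and the choice of frequencies are all illustrative assumptions. Each input is encoded as a point on a circle, the two encodings are combined with the identities for cos(x + y) and sin(x + y), and every candidate answer c is scored by cos(w(a + b - c)), which reaches its maximum only when c equals (a + b) mod p.

```python
import math


def mod_add_via_trig(a, b, p, freqs=(1, 2, 3)):
    """Return (a + b) mod p using only sine/cosine encodings of a and b.

    Hypothetical illustration: `freqs` is an arbitrary choice of
    frequencies, not values read off a real network.
    """
    scores = []
    for c in range(p):
        total = 0.0
        for k in freqs:
            w = 2 * math.pi * k / p
            # Encode each input as a point on the unit circle.
            cos_a, sin_a = math.cos(w * a), math.sin(w * a)
            cos_b, sin_b = math.cos(w * b), math.sin(w * b)
            # Angle-addition identities give the encoding of a + b.
            cos_ab = cos_a * cos_b - sin_a * sin_b
            sin_ab = sin_a * cos_b + cos_a * sin_b
            # cos(w(a+b-c)) = cos(w(a+b))cos(wc) + sin(w(a+b))sin(wc)
            total += cos_ab * math.cos(w * c) + sin_ab * math.sin(w * c)
        scores.append(total)
    # The best-scoring candidate is the one where every cosine equals 1.
    return max(range(p), key=scores.__getitem__)


if __name__ == "__main__":
    p = 113  # assumed prime modulus, chosen for illustration
    print(mod_add_via_trig(100, 50, p), (100 + 50) % p)
```

Because the angles wrap around the circle, the reduction mod p never has to be computed explicitly: it falls out of the periodicity of sine and cosine.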

FunctionGemma – Function Calling at the Edge – transformers

The video introduces FunctionGemma, a new open model release from the Gemma team that brings customizable function calling to small language models able to run efficiently on edge devices such as mobile phones. Unlike the more research-focused T5 Gemma 2, FunctionGemma is specialized for practical applications such as games or apps where […]