Agents, Context, and Openness
A week of coding AIs, massive context windows, and transparent speech models
Happy Friday, everyone! Now that the week's work is done, let's have some fun!
Don't have a plan for tonight or this weekend? I got you (... well, for 5 minutes)!
Here's a recap of all the news I covered in the daily videos I now do on YouTube (and all other socials that have shorts).
You can watch them directly on YouTube, Instagram, or TikTok almost every day, or keep reading for the quick weekly recap.
TL;DR: My goal with these videos is to cover what's new and highlight why it's relevant, with the side quest of figuring out whether it's just hype (and for that, I need your help: let me know in the comments or replies whether you agree!).
So, here's what happened this week, and why it matters (or does not):
1️⃣ Alibaba’s Qwen3-Next-80B-A3B sets new efficiency bar
Alibaba has dropped an MoE model that runs at a fraction of the cost of much smaller dense LLMs while competing with giant reasoning models.
Key details:
80B MoE model with only 3B active parameters at inference.
Runs at roughly 10% of the cost of smaller dense models like Qwen3-32B.
Uses 512 experts; released in both Instruct and Thinking variants.
Hybrid attention with gated mechanisms boosts speed.
Multi-Token Prediction accelerates speculative decoding.
Trained on 15T tokens at a fraction of Qwen3-32B’s cost.
“Thinking” variant outperforms Gemini-2.5-Flash-Thinking.
256K native context, scalable to 1M.
Why it matters (or doesn't): This model pushes past Mixtral and DeepSeek’s efficiency benchmarks and rivals Qwen3-235B in reasoning—all while using just 3B active parameters. If the gains hold up, it could redefine open LLM efficiency standards.
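If you want to kick the tires, here's a minimal sketch of loading the Instruct variant with Hugging Face transformers. The model id (Qwen/Qwen3-Next-80B-A3B-Instruct) and the generation settings are my assumptions rather than an official quick-start, and even with only 3B active parameters you still need enough GPU memory for all 80B weights.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face id; check the official release if it differs.
model_id = "Qwen/Qwen3-Next-80B-A3B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # spread the full 80B weights across available GPUs
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))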
2️⃣ ByteDance releases Seed-OSS-36B with record context window
ByteDance is making ultra-long context and controllable reasoning a standard feature in open models.
Key details:
36B parameter model with a 512K-token native context—the longest in any open model.
“Thinking budget” dial: from instant replies (0 tokens) to deep reasoning (2k–8k tokens).
Mid-generation self-reflection improves efficiency.
Licensed under Apache 2.0 for full commercial use.
Three flavors: Instruct, Base trained with synthetic data, and Base without synthetic data.
Why it matters (or doesn't): An Apache 2.0 model with a 512K context and a tunable thinking budget puts long-context, controllable reasoning within everyone's reach. Expect this to fuel innovation across tools and agent frameworks.
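Here's a rough sketch of how that thinking budget could be used in practice. I'm assuming the Instruct checkpoint is published as ByteDance-Seed/Seed-OSS-36B-Instruct and that the budget is passed as a thinking_budget argument to the chat template, as the release describes, so double-check the model card before copying this.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face id and chat-template argument; verify against the official model card.
model_id = "ByteDance-Seed/Seed-OSS-36B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many prime numbers are there below 50?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    thinking_budget=512,  # assumed knob: 0 means answer instantly, larger budgets mean deeper reasoning
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))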
3️⃣ OpenAI’s GPT-5-Codex redefines coding assistants
OpenAI’s newest coding model aims to handle full dev cycles instead of just autocomplete.
Key details:
Goes beyond autocomplete: builds full projects, refactors codebases, and catches bugs.
Adjusts compute: minimal for trivial fixes, more for complex reasoning.
Available via the CLI (npm i -g @openai/codex) and as a VS Code extension.
Cloud runs are 90% faster thanks to caching.
Integrates with image sharing, task tracking, and approval modes.
Why it matters (or doesn't): This is one of the first coding AIs that feels end-to-end capable, ushering in true agentic coding. We'll see how it holds up in real scenarios!
4️⃣ Meta launches MobileLLM-R1: reasoning at sub-1B scale
Meta is showing that small models, with the right data and training, can punch way above their weight.
Key details:
Family of <1B parameter models trained on 4.2T tokens.
Outperforms other fully open models of similar size by 2x–5x on reasoning benchmarks.
Optimized for math, Python and C++ coding, and scientific reasoning.
Data mix: FineWeb-Edu, StarCoder, mid-training from Llama-3.1-8B, 6.2M reasoning samples.
Open weights, full training recipes, 32K context.
Non-commercial license.
Why it matters (or doesn't): Shows that smart data efficiency can beat brute-force scaling. Big win for reproducibility, even if not ready for production chatbots.
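Since the weights are open (though non-commercial), a sub-1B model like this is easy to try locally. Here's a minimal sketch using the transformers text-generation pipeline, assuming the largest variant is published as facebook/MobileLLM-R1-950M (check the release for the exact id):

from transformers import pipeline

# Assumed Hugging Face id for the 950M variant; small enough to run on a single GPU or even CPU.
generator = pipeline("text-generation", model="facebook/MobileLLM-R1-950M", device_map="auto")

messages = [{"role": "user", "content": "Write a Python function that returns the n-th Fibonacci number."}]
result = generator(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])  # the last message is the model's reply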
5️⃣ Ai2 drops OLMoASR: open alternative to Whisper
The Allen Institute for AI has released an open-source ASR model to rival Whisper, with full transparency on training data and pipelines.
Key details:
Fully open Whisper-style ASR model.
Trained on 440K hours of audio.
Matches Whisper in zero-shot transcription across diverse domains.
Releases 3M hours of training data + pipelines for reproducibility.
Commercial use approved.
Why it matters (or doesn't): While Whisper may still edge it out at the largest scale, OLMoASR sets a new standard for openness and transparency in speech AI.
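For a sense of how you'd use it, here's a tiny sketch. I'm assuming an OLMoASR checkpoint (hypothetical id allenai/OLMoASR-medium.en) can be loaded through the standard transformers speech-recognition pipeline the way Whisper checkpoints are; Ai2's release may ship its own loader instead, so check their repo for the official usage.

from transformers import pipeline

# Assumed model id and loading path; OLMoASR may require Ai2's own package rather than transformers.
asr = pipeline("automatic-speech-recognition", model="allenai/OLMoASR-medium.en")
print(asr("meeting_recording.wav")["text"])  # transcribe any local audio file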
Catch future updates as they come out next week on YouTube, Instagram, and TikTok!
And that's it for this iteration! I'm incredibly grateful that the What's AI newsletter is now read by over 30,000 incredible human beings. Click here to share this iteration with a friend if you learned something new!
Looking for more cool AI stuff? 👇
Looking for AI news, code, learning resources, papers, memes, and more? Follow our weekly newsletter at Towards AI!
Looking to connect with other AI enthusiasts? Join the Discord community: Learn AI Together!
Want to share a product, event or course with my AI community? Reply directly to this email, or visit my Passionfroot profile to see my offers.
Thank you for reading, and I wish you a fantastic week! Be sure to get enough sleep and physical activity next week!
Louis-François Bouchard