Media Summary: In this AI Research Roundup episode, Alex discusses the paper: 'Sharing is Caring: Efficient LM ... Are you ready for a paradigm shift in how we train AI, especially Large Language Models (LLMs)? Discover Swarm Sampling ... This paper addresses rollout generation as a major bottleneck in ...

SAPO Decentralized RL Post-Training - Detailed Analysis & Overview

SAPO: Decentralized RL Post-Training for LMs

In this AI Research Roundup episode, Alex discusses the paper: 'Sharing is Caring: Efficient LM

[Daily Podcast] SAPO: Decentralized RL Boosts LM Post-Training Efficiency

This paper introduces

đŸ”„ AI Breakthrough: 94% FASTER LLM Training with Decentralized Swarm Learning (SAPO!)

Are you ready for a paradigm shift in how we train AI, especially Large Language Models (LLMs)? Discover Swarm Sampling ...

Gentle Introduction to LLM Post Training!

... to do a

Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Sharing is Caring: Efficient LM

2 - Deep RL and RL post-training intro

Then gets into

Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding

This paper addresses rollout generation as a major bottleneck in

SAPO: Stable RL Policy Optimization for LLMs

In this AI Research Roundup episode, Alex discusses the paper: 'Soft Adaptive Policy Optimization(2511.20347v1)' This work ...

Fully Decentralized RL in Complex Multi-Agent Settings

Title: Fully

The "secret sauce" of recent AI breakthroughs: Post-training with RLVR (and RLHF) | Lex Fridman

Lex Fridman Podcast full episode: https://www.youtube.com/watch?v=EV7WhVT270Q Thank you for listening ❀ Check out our ...

Post Training Reasoning Models

Post

Sharing is Caring: Efficient LM Post-Training with Collective RL Experience Sharing

Paper: https://arxiv.org/abs/2509.08721v1 Sharing is Caring: Efficient LM

NSDI '26 - RollPacker: Taming Long-Tail Rollouts for RL Post-Training with Tail Batching

RollPacker: Taming Long-Tail Rollouts for

Advanced LLM Post-Training: SFT, DPO, Reinforcement Learning w/ Maxime Labonne (Liquid AI)

In this exclusive guest lecture for the Youth AI Initiative, we hosted Maxime Labonne (Head of

INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning

We introduce INTELLECT-2, the first globally distributed reinforcement learning (

Scaling LLM Post-Training at Character.AI | Ray Summit 2025

At Ray Summit 2025, Haoran Li from Character AI shares how the company powers its massive AI entertainment ...

Dexbotic × RLinf, Built Like Lego: Run Embodied RL Post-Training with One Command

Instead of jumping between two repositories, developers can stay inside Dexbotic and start the complete

RL & Post-Training: Why Training Loops Reshape AI - Frontier AI Brief

Curated AI research intelligence covering May 2025 to May 2026. This video examines the most impactful research on ...

Decentralized training with OpenDiLoCo

We introduce OpenDiLoCo, an open-source implementation and replication of DeepMind's Distributed Low-Communication ...

Related Video Content

SAPO – Breaking News and Today's News, Updated by the Minute

Contacts, Editorial Statute, Credits, SAPO App, Terms of Use, Privacy Policy, Advertising...

SAPO - Facebook

SAPO spoke with Bebe Rexha, who makes her debut in Portugal at Rock in Rio Lisboa 2026.

SAPO - Apps on Google Play

The SAPO app gives you access to a universe of credible content from a range of leading news sources...

Sapo Profile | About Sapo

Sapo is the most used omnichannel sales management platform in Viet Nam. Pioneering Headless Commerce and AI...

SAPO

Sign in to SAPO to manage your account or create a new one.