LLM Optimization DPO PPO GRPO Slide - Search Videos

DeepSeekMath 7B: Open-Source Math Model Surpasses GPT-4 | Byte Goose AI posted on the topic | LinkedIn

115 views · 3 months ago

LLM Reinforcement Learning Fine-Tuning DeepSeek Method GRPO

40 views · Apr 10, 2025

Rethinking Trust Region in LLM Reinforcement Learning PPO Limitations and DPPO for Stable FineTuning

3 views · 2 months ago

Why Direct Preference Optimization ! Your LLM is Secretly a Reward Model. #ai #llm #researchpaper

857 views · 1 month ago

YouTube · Tamil AI Hub

Turn-PPO: Multi-Turn Reinforcement Learning Optimization for LLM Agents and a Comparative Analysis with GRPO

2 views · 4 months ago

S02E05 — Four Models to Teach One to Behave — PPO

YouTube · AI X-Rayed

Dr. GRPO vs GSPO: The Bias-Variance Tradeoff

4 views · 1 month ago

YouTube · Deep Learning with Yacine

SFT vs DPO vs GRPO vs PPO (In 30 Seconds) #LLM #ML #AI

35 views · 2 months ago

YouTube · Neurons Decoded

Is DPO Actually Better? The Shocking Truth About LLM Alignm…

YouTube · mind shift

BandPO: Probability-Aware Bounds for LLM RL

16 views · 1 month ago

YouTube · AI Research Roundup

[RL Fine-Tuning] From RLHF to GRPO: The Evolution and Optimiz…

275 views · 3 months ago

YouTube · AI Podcast Series. Byte Goose AI.

[DPO] Direct Preference Optimization: Detailed Derivation of the Principles and Quick Hands-On Practice

6.5K views · 2 months ago

bilibili · 东川路第一可爱猫猫虫

Advanced Concepts in Large Language Models. RL / SFT / MHA …

Proximal Policy Optimization Explained

78.2K views · May 20, 2021

YouTube · Edan Meyer

Introduction to Proximal Policy Optimization algorithm (PPO)

12.8K views · Mar 31, 2020

YouTube · Python Lessons

Lec-1 Introduction to Linear Programming Formulations

1.1M views · Aug 28, 2009

YouTube · nptelhrd

Lecture 2 - Optimization Techniques | Linear Programming Problem | G…

45.3K views · Jun 29, 2018

YouTube · SukantaNayak edu

An Introduction to Proximal Policy Optimization (PPO) in Deep Reinfo…

18K views · Jun 3, 2019

YouTube · Udacity-DeepRL

Proximal Policy Optimization (PPO) is Easy With PyTorch | Full PPO T…

85.1K views · Dec 24, 2020

YouTube · Machine Learning with Phil

PPO | Proximal Policy Optimization (PPO) architecture | PPO Explained

857 views · Jan 29, 2025

YouTube · AILinkDeepTech

RLHF Explained (and DPO!)

17.6K views · Jun 12, 2024

YouTube · Mark Hennings

MaPPO: New LLM Preference Optimization

136 views · 9 months ago

YouTube · AI Research Roundup

Direct Preference Optimization (DPO)

8.7K views · Nov 13, 2023

YouTube · Trelis Research

Proximal Policy Optimization (PPO) Tutorial - Master Roboschool!!!

18.5K views · Nov 12, 2018

YouTube · Skowster the Geek

Introduction to Trajectory Optimization

101.7K views · May 2, 2016

YouTube · Matthew Kelly

[UCLA RL-LLM] Chapter 1.4: Deep policy gradient methods (PPO, GR…

2.1K views · 9 months ago

YouTube · Ernest Ryu

GRPO Reinforcement Learning Explained (DeepSeekMath Paper)

5.4K views · Apr 10, 2025

YouTube · AI Papers Academy

DPO : Direct Preference Optimization

331 views · Jun 20, 2024

YouTube · Dhiraj Madan

Direct Preference Optimization: Forget RLHF (PPO)

16.1K views · Jun 6, 2023

YouTube · Discover AI

GRPO: The Reinforcement Learning Trick That Changed Everything

156 views · 4 months ago

YouTube · mathtartic
