ppo-implementation-details

The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"

OTHER License

Stars

626

Ecosystems: Python

Issue Statistics

Total Pull Requests

Merged Pull Requests

Total Issues

Time to Close Issues: N/A (Past Year), 10 days (All Time)

Related Projects

cleanba

CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL

12 Feb 2023 105

PointNav-VO

[ICCV 2021] Official implementation of "The Surprising Effectiveness of Visual Odometry Technique...

22 Aug 2021 55

chain-of-hindsight

Chain-of-Hindsight, A Scalable RLHF Method

20 Feb 2023 211

pypose

A library for differentiable robotics.

11 Nov 2021 1,185

RL4LMs

A modular RL library to fine-tune language models to human preferences

18 Aug 2022 2,183

cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-f...

07 Jun 2019 5,379

deep-marl-toolkit

MARLToolkit: The Multi-Agent Reinforcement Learning Toolkit. Include implementation of MAPPO, MAD...

08 Aug 2022 70

clip-jax

Train vision models using JAX and 🤗 transformers

05 Aug 2022 75

abcdrl

Modular Single-file Reinforcement Learning Algorithms Library

12 Nov 2022 37

on-policy

This is the official implementation of Multi-Agent PPO (MAPPO).

23 Feb 2021 1,272

ProteinDT

05 Feb 2023 41

open-instruct

09 Jun 2023 1,214

invalid-action-masking

Source code for "A Closer Look at Invalid Action Masking in Policy Gradient Algorithms"

18 Jun 2020 135

softlearning

Softlearning is a reinforcement learning framework for training maximum entropy policies in conti...

03 Dec 2018 1,200

OmniBenchmark

[ECCV2022] New benchmark for evaluating pre-trained model; New supervised contrastive learning fr...

12 Jul 2022 105