Examples of published reinforcement learning algorithms in recent literature implemented in TensorFlow
MIT License
Most of my research is in the continuous domain, and I haven't spent much time testing these in discrete domains such as Atari.
BipedalWalker-v2 solved using DPPO with an LSTM layer. CarRacing-v0 solved using PPO with a joined actor-critic network.
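For readers new to the "joined actor-critic" idea, here is a minimal sketch of the kind of combined loss such a network is typically trained on: the PPO clipped surrogate objective plus a value-function term and an entropy bonus, summed into one scalar. It is written in plain Python for readability and is not taken from ppo_joined.py; the function name and the coefficient defaults (clip_eps, value_coef, entropy_coef) are assumptions chosen for illustration.

```python
def ppo_joined_loss(ratios, advantages, values, returns, entropies,
                    clip_eps=0.2, value_coef=0.5, entropy_coef=0.01):
    """Combined PPO loss for a network with shared actor and critic layers.

    ratios:     pi_new(a|s) / pi_old(a|s) for each sample
    advantages: advantage estimate for each sample
    values:     critic prediction for each sample
    returns:    target return for each sample
    entropies:  policy entropy for each sample
    All defaults are illustrative assumptions, not the repository's settings.
    """
    n = len(ratios)

    # Clipped surrogate objective: take the more pessimistic of the
    # unclipped and clipped terms for every sample.
    surrogate = 0.0
    for ratio, adv in zip(ratios, advantages):
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        surrogate += min(ratio * adv, clipped_ratio * adv)
    policy_loss = -surrogate / n

    # Mean squared error between critic predictions and target returns.
    value_loss = sum((v - ret) ** 2 for v, ret in zip(values, returns)) / n

    # Mean entropy, subtracted so that minimising the loss encourages exploration.
    entropy = sum(entropies) / n

    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```

Because the actor and critic share layers, a single optimizer step on this one scalar updates both heads at once, which is the main practical difference from training two separate networks.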
Thanks to DeepMind and OpenAI for making their research openly available. Big thanks also to the TensorFlow community.
ppo_joined.py
All the Python scripts are written as standalone scripts (but share some common functions in utils.py).
Just run them directly in your IDE, or in a terminal using the -m flag:
rl-examples$ python3 -m ppo.ppo_joined
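Why the -m flag matters: when a script is started with python3 -m, Python puts the current working directory at the front of sys.path, so a script inside a sub-package can still import a shared module that sits at the repository root. The sketch below reproduces that in a throwaway directory. The layout (utils.py at the root, the script under ppo/) is an assumption about the repository, and the greet function is purely hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def demo_run_as_module() -> str:
    """Build a tiny fake repo and run a nested script with `python -m`."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)

        # Shared helper at the repository root (assumed layout).
        (root / "utils.py").write_text(
            "def greet():\n    return 'hello from utils'\n"
        )

        # Script inside a sub-package that imports the shared helper.
        package = root / "ppo"
        package.mkdir()
        (package / "__init__.py").write_text("")
        (package / "ppo_joined.py").write_text(
            "from utils import greet\nprint(greet())\n"
        )

        # Running with -m from the root puts the root on sys.path,
        # so `from utils import greet` resolves.
        result = subprocess.run(
            [sys.executable, "-m", "ppo.ppo_joined"],
            cwd=root,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
```

Running the same file by path from the root (python3 ppo/ppo_joined.py) would instead put ppo/ at the front of sys.path, which is why an import of a root-level module can fail in that mode.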
The models and TensorBoard summaries are saved in the same directory as the script. DPPO has a helper script to set off the worker threads:
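One common way to keep outputs next to the script, regardless of where it is launched from, is to derive the paths from the script's own location rather than the working directory. The following is only a sketch of that pattern: the function name and the models/summaries sub-folder names are assumptions, not the repository's actual layout.

```python
from pathlib import Path


def output_paths(script_file: str, env_name: str) -> dict:
    """Return model and summary directories located beside the given script.

    The "models" and "summaries" folder names are hypothetical.
    """
    script_dir = Path(script_file).resolve().parent
    return {
        "model": script_dir / "models" / env_name,
        "summary": script_dir / "summaries" / env_name,
    }
```

Inside a script this would typically be called as output_paths(__file__, "CarRacing-v0"), and the summary directory can then be passed to TensorBoard with its --logdir option.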
rl-examples$ sh dppo/start_dppo.sh
DPPO was tested on a 16-core machine using CPU only, so the helper script will need to be updated for your particular setup. On my setup, there was usually no speed advantage to training BipedalWalker on the GPU (GTX 1080) rather than the CPU, but CarRacing did get a performance boost from the GPU because of its CNN layers.
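When adapting the helper script to a machine with a different core count, it can help to derive the number of workers from the hardware instead of hard-coding it. The sketch below builds one launch command per worker. It does not reflect the contents of start_dppo.sh: the module name dppo.dppo_lstm and the --job_name / --task_index flags are placeholders, so substitute whatever arguments the script actually takes.

```python
import os
import sys


def build_worker_commands(num_workers=None, module="dppo.dppo_lstm"):
    """Return one argv list per worker process.

    If num_workers is None, use all cores but one (at least one worker),
    leaving a core free for a coordinating process. The module name and
    the flag names below are hypothetical placeholders.
    """
    if num_workers is None:
        num_workers = max(1, (os.cpu_count() or 1) - 1)

    commands = []
    for index in range(num_workers):
        commands.append([
            sys.executable, "-m", module,
            "--job_name", "worker",       # placeholder flag
            "--task_index", str(index),   # placeholder flag
        ])
    return commands
```

Each returned list can be handed to subprocess.Popen from the repository root to start the workers in parallel.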
ppo_lstm.py for the correct implementation)
dppo_lstm.py) is sometimes a bit unstable,