RLlib: Scalable Reinforcement Learning

RLlib is an open-source library for reinforcement learning that offers both high scalability and a unified API for a variety of applications.

[Figure: the RLlib stack (_images/rllib-stack.svg)]

Learn more about RLlib’s design by reading the ICML paper. To get started, take a look at the custom env example and the API documentation.

Installation

RLlib has extra dependencies on top of Ray. First, install either PyTorch or TensorFlow. Then, install the RLlib module:

pip install tensorflow  # or tensorflow-gpu, or torch for PyTorch
pip install "ray[rllib]"  # quoted so the shell does not expand the brackets; also recommended: "ray[debug]"
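To confirm that the packages named above ended up in the active Python environment, a quick check along the following lines may help. This is a minimal sketch and not part of RLlib: the helper names is_installed and check_rllib_requirements are hypothetical, and the check only looks for the top-level ray, tensorflow, and torch packages, so it cannot tell whether the optional rllib extras were installed correctly.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def check_rllib_requirements():
    """Report which of the packages named in the install steps are importable."""
    status = {name: is_installed(name) for name in ("ray", "tensorflow", "torch")}
    # Assumption taken from the text above: Ray plus at least one of the
    # two deep learning frameworks is needed.
    status["ready"] = status["ray"] and (status["tensorflow"] or status["torch"])
    return status


if __name__ == "__main__":
    for name, ok in check_rllib_requirements().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If "ready" is reported as missing, rerunning the pip commands above inside the same environment is the first thing to try.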

You might also want to clone the Ray repo for convenient access to RLlib helper scripts:

git clone https://github.com/ray-project/ray
cd ray/python/ray/rllib

Troubleshooting

If you encounter errors like "blas_thread_init: pthread_create: Resource temporarily unavailable" when using many workers, try setting OMP_NUM_THREADS=1. For other resource limit errors, check the configured system limits with ulimit -a.
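Both suggestions can also be applied from inside a Python driver script. The sketch below sets OMP_NUM_THREADS for the current process (and for any child processes it spawns, since they inherit the environment) and prints a few of the limits that ulimit -a reports, using the standard-library resource module. The helper name describe_limits and the choice of three limits are illustrative assumptions, and the resource module is only available on Unix-like systems.

```python
import os

# Set this before importing NumPy or anything else that loads BLAS/OpenMP;
# such libraries typically read the variable when they are first loaded.
os.environ["OMP_NUM_THREADS"] = "1"


def describe_limits():
    """Return {name: (soft, hard)} for a few limits also shown by `ulimit -a`."""
    try:
        import resource  # standard library, Unix only
    except ImportError:
        return {}
    limits = {}
    # Not every platform defines every constant, so probe with hasattr.
    for name in ("RLIMIT_NPROC", "RLIMIT_NOFILE", "RLIMIT_STACK"):
        if hasattr(resource, name):
            limits[name] = resource.getrlimit(getattr(resource, name))
    return limits


for name, (soft, hard) in describe_limits().items():
    # A value equal to resource.RLIM_INFINITY means "unlimited".
    print(f"{name}: soft={soft} hard={hard}")
```

A low soft limit on processes or open files is a plausible cause of pthread_create failures when many workers start at once.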

If you encounter out-of-memory errors, consider setting redis_max_memory and object_store_memory in ray.init() to reduce memory usage.
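As a rough illustration, the two settings can be prepared as keyword arguments for ray.init(). This is a sketch under stated assumptions: the helper ray_memory_kwargs is hypothetical, the sizes are arbitrary examples rather than recommendations, and the values are assumed to be given in bytes. Whether both arguments are accepted depends on the installed Ray version, so the actual call is left as a comment.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def ray_memory_kwargs(object_store_gib=1.0, redis_gib=0.5):
    """Build keyword arguments for ray.init() with byte-valued memory caps."""
    if object_store_gib <= 0 or redis_gib <= 0:
        raise ValueError("memory caps must be positive")
    return {
        "object_store_memory": int(object_store_gib * GIB),
        "redis_max_memory": int(redis_gib * GIB),
    }


kwargs = ray_memory_kwargs(object_store_gib=2, redis_gib=1)
print(kwargs)

# With Ray installed, the caps would then be applied at startup:
#   import ray
#   ray.init(**kwargs)
```

Starting with small caps and raising them only if tasks begin to fail is one way to find a setting that fits the machine.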

For debugging unexpected hangs or performance problems, you can run ray stack to dump the stack traces of all Ray workers on the current node, and ray timeline to dump a timeline visualization of tasks to a file.