multiprocessing.Pool API (Experimental)¶
Support for the multiprocessing.Pool API on Ray is an experimental feature, so it may be changed at any time without warning. If you encounter any bugs/shortcomings/incompatibilities, please file an issue on GitHub. Contributions are always welcome!
Ray supports running distributed python programs with the multiprocessing.Pool API
using Ray Actors instead of local processes. This makes it easy
to scale existing applications that use
multiprocessing.Pool from a single node
to a cluster.
To get started, first install Ray, then use
ray.experimental.multiprocessing.Pool in place of
This will start a local Ray cluster the first time you create a
distribute your tasks across it. See the Run on a Cluster section below for
instructions to run on a multi-node Ray cluster instead.
from ray.experimental.multiprocessing import Pool def f(index): return index pool = Pool() for result in pool.map(f, range(100)): print(result)
multiprocessing.Pool API is currently supported. Please see the
multiprocessing documentation for details.
Run on a Cluster¶
This section assumes that you have a running Ray cluster. To start a Ray cluster, please refer to the cluster setup instructions.
To connect a
Pool to a running Ray cluster, you can specify the address of the
head node in one of two ways:
- By setting the
- By passing the
ray_addresskeyword argument to the
from ray.experimental.multiprocessing import Pool # Starts a new local Ray cluster. pool = Pool() # Connects to a running Ray cluster, with the current node as the head node. # Alternatively, set the environment variable RAY_ADDRESS="auto". pool = Pool(ray_address="auto") # Connects to a running Ray cluster, with a remote node as the head node. # Alternatively, set the environment variable RAY_ADDRESS="<ip_address>:<port>". pool = Pool(ray_address="<ip_address>:<port>")
You can also start Ray manually by calling
ray.init() (with any of its supported
configuration options) before creating a