Tune: A Scalable Hyperparameter Tuning Library

Tune is a Python library for hyperparameter tuning at any scale.

Quick Start

Note

To run this example, you will need to install the following:

$ pip install ray torch torchvision filelock

This example runs a small grid search to train a CNN using PyTorch and Tune.

import torch.optim as optim
from ray import tune
from ray.tune.examples.mnist_pytorch import get_data_loaders, ConvNet, train, test


def train_mnist(config):
    # Each trial trains a small CNN on MNIST with the sampled learning rate.
    train_loader, test_loader = get_data_loaders()
    model = ConvNet()
    optimizer = optim.SGD(model.parameters(), lr=config["lr"])
    for i in range(10):
        train(model, optimizer, train_loader)
        acc = test(model, test_loader)
        # Report this iteration's accuracy back to Tune.
        tune.track.log(mean_accuracy=acc)


analysis = tune.run(
    train_mnist, config={"lr": tune.grid_search([0.001, 0.01, 0.1])})

print("Best config: ", analysis.get_best_config(metric="mean_accuracy"))

# Get a dataframe for analyzing trial results.
df = analysis.dataframe()
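
The returned object is a pandas DataFrame, so standard pandas operations apply. As a minimal sketch (assuming the mean_accuracy metric logged above shows up as a column of the same name):

# Sketch: inspect trial results with pandas.
# Assumes a "mean_accuracy" column from the metric logged in train_mnist.
print(df.sort_values("mean_accuracy", ascending=False).head())
print(df["mean_accuracy"].describe())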

If TensorBoard is installed, you can automatically visualize all trial results:

tensorboard --logdir ~/ray_results

Distributed Quick Start

  1. Import and initialize Ray by appending the following to your example script.
# Append to top of your script
import ray
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--ray-address")
args = parser.parse_args()
ray.init(address=args.ray_address)

Alternatively, download a full example script here: mnist_pytorch.py

  2. Download the following example Ray cluster configuration as tune-local-default.yaml and replace the appropriate fields:
cluster_name: local-default
provider:
    type: local
    head_ip: YOUR_HEAD_NODE_HOSTNAME
    worker_ips: [WORKER_NODE_1_HOSTNAME, WORKER_NODE_2_HOSTNAME, ... ]
auth: {ssh_user: YOUR_USERNAME, ssh_private_key: ~/.ssh/id_rsa}
## Typically for local clusters, min_workers == max_workers.
min_workers: 3
max_workers: 3
setup_commands:  # Set up each node.
    - pip install ray torch torchvision tabulate tensorboard

Alternatively, download it here: tune-local-default.yaml. See the Ray cluster documentation for details.

  3. Run ray submit as follows.
ray submit tune-local-default.yaml mnist_pytorch.py --args="--ray-address=localhost:6379" --start

This will start Ray on all of your machines and run a distributed hyperparameter search across them.
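
Because every node started by ray submit joins the same Ray cluster, the tune.run call in the script can fan trials out across all of them. As a rough sketch of how you might scale the Quick Start search (the num_samples and resources_per_trial values here are illustrative, not part of the example script):

# Sketch: repeat the grid search and let Tune place trials cluster-wide.
analysis = tune.run(
    train_mnist,
    num_samples=10,                    # run the grid search 10 times
    resources_per_trial={"cpu": 1},    # each trial reserves one CPU anywhere in the cluster
    config={"lr": tune.grid_search([0.001, 0.01, 0.1])})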

To summarize, here is the full set of commands:

wget https://raw.githubusercontent.com/ray-project/ray/master/python/ray/tune/examples/mnist_pytorch.py
wget https://raw.githubusercontent.com/ray-project/ray/master/python/ray/tune/tune-local-default.yaml
ray submit tune-local-default.yaml mnist_pytorch.py --args="--ray-address=localhost:6379" --start

Take a look at the Distributed Experiments documentation for more details, including:

  1. Setting up distributed experiments on your local cluster
  2. Using AWS and GCP
  3. Using spot and preemptible instances, and more.

Getting Started

  • Code: GitHub repository for Tune.
  • User Guide: A comprehensive overview on how to use Tune’s features.
  • Tutorial Notebooks: Our tutorial notebooks for using Tune with Keras or PyTorch.

Contribute to Tune

Take a look at our Contributor Guide for guidelines on contributing.

Citing Tune

If Tune helps you in your academic research, you are encouraged to cite our paper. Here is an example BibTeX entry:

@article{liaw2018tune,
    title={Tune: A Research Platform for Distributed Model Selection and Training},
    author={Liaw, Richard and Liang, Eric and Nishihara, Robert
            and Moritz, Philipp and Gonzalez, Joseph E and Stoica, Ion},
    journal={arXiv preprint arXiv:1807.05118},
    year={2018}
}