Tune Trial Schedulers¶
By default, Tune schedules trials in serial order with the FIFOScheduler class. However, you can also specify a custom scheduling algorithm that can stop trials early or perturb parameters.
tune.run( ... , scheduler=AsyncHyperBandScheduler())
Tune includes distributed implementations of early stopping algorithms such as Median Stopping Rule, HyperBand, and an asynchronous version of HyperBand. These algorithms are very resource efficient and can outperform Bayesian Optimization methods in many cases. All schedulers take in a metric, which is a value returned in the result dict of your Trainable and is maximized or minimized according to mode.
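Currently available trial schedulers, each described below: Population Based Training (PBT), Asynchronous HyperBand, HyperBand, HyperBand (BOHB), and Median Stopping Rule.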
Population Based Training (PBT)¶
Tune includes a distributed implementation of Population Based Training (PBT). This can be enabled by setting the scheduler parameter of tune.run, e.g.
import random
from ray import tune
from ray.tune.schedulers import PopulationBasedTraining

pbt_scheduler = PopulationBasedTraining(
    time_attr='time_total_s',
    metric='mean_accuracy',
    mode='max',
    perturbation_interval=600.0,
    hyperparam_mutations={
        "lr": [1e-3, 5e-4, 1e-4, 5e-5, 1e-5],
        "alpha": lambda: random.uniform(0.0, 1.0),
        ...
    })
tune.run( ... , scheduler=pbt_scheduler)
When the PBT scheduler is enabled, each trial variant is treated as a member of the population. Periodically, top-performing trials are checkpointed (this requires your Trainable to support save and restore). Low-performing trials clone the checkpoints of top performers and perturb the configurations in the hope of discovering an even better variation.
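The split between top and bottom performers can be pictured with a small, self-contained sketch. It is not Tune's implementation: `split_quantiles` and the trial ids are hypothetical names, and the rounding rule (ceiling of the population size times the fraction) is an assumption made for illustration.

```python
import math


def split_quantiles(scores, quantile_fraction=0.25, mode="max"):
    """Return (bottom, top) trial ids, each holding the given fraction of trials.

    `scores` maps a hypothetical trial id to its latest metric value.
    """
    if not 0 <= quantile_fraction <= 0.5:
        raise ValueError("quantile_fraction must be between 0 and 0.5")
    # Rank from worst to best; for mode="min" a high value is the worst.
    ranked = sorted(scores, key=scores.get, reverse=(mode == "min"))
    k = int(math.ceil(len(ranked) * quantile_fraction))
    if k == 0:
        return [], []  # no exploitation at all
    return ranked[:k], ranked[-k:]


scores = {"t0": 0.91, "t1": 0.42, "t2": 0.77, "t3": 0.55,
          "t4": 0.60, "t5": 0.88, "t6": 0.30, "t7": 0.70}
bottom, top = split_quantiles(scores)
# Trials in `bottom` would clone a checkpoint from a trial in `top`
# and then perturb the cloned configuration.
```

With eight trials and a fraction of 0.25, two trials land in each group.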
You can run this toy PBT example to get an idea of how PBT operates. When training in PBT mode, a single trial may see many different hyperparameters over its lifetime, which are recorded in its result.json file. The following figure generated by the example shows PBT optimizing a learning rate (LR) schedule over the course of a single experiment:

class ray.tune.schedulers.PopulationBasedTraining(time_attr='time_total_s', reward_attr=None, metric='episode_reward_mean', mode='max', perturbation_interval=60.0, hyperparam_mutations={}, quantile_fraction=0.25, resample_probability=0.25, custom_explore_fn=None, log_config=True)
Implements the Population Based Training (PBT) algorithm.
https://deepmind.com/blog/population-based-training-neural-networks
PBT trains a group of models (or agents) in parallel. Periodically, poorly performing models clone the state of the top performers, and a random mutation is applied to their hyperparameters in the hopes of outperforming the current top models.
Unlike other hyperparameter search algorithms, PBT mutates hyperparameters during training time. This enables very fast hyperparameter discovery and also automatically discovers good annealing schedules.
This Tune PBT implementation considers all trials added as part of the PBT population. If the number of trials exceeds the cluster capacity, they will be time-multiplexed so as to balance training progress across the population. To run multiple trials, use tune.run(num_samples=<int>).
In {LOG_DIR}/{MY_EXPERIMENT_NAME}/, all mutations are logged in pbt_global.txt and individual policy perturbations are recorded in pbt_policy_{i}.txt. Tune logs: [target trial tag, clone trial tag, target trial iteration, clone trial iteration, old config, new config] on each perturbation step.
 Parameters
time_attr (str) – The training result attr to use for comparing time. Note that you can pass in something non-temporal such as training_iteration as a measure of progress; the only requirement is that the attribute should increase monotonically.
metric (str) – The training result objective value attribute. Stopping procedures will use this attribute.
mode (str) – One of {min, max}. Determines whether objective is minimizing or maximizing the metric attribute.
perturbation_interval (float) – Models will be considered for perturbation at this interval of time_attr. Note that perturbation incurs checkpoint overhead, so you shouldn’t set this to be too frequent.
hyperparam_mutations (dict) – Hyperparams to mutate. The format is as follows: for each key, either a list or function can be provided. A list specifies an allowed set of categorical values. A function specifies the distribution of a continuous parameter. You must specify at least one of hyperparam_mutations or custom_explore_fn.
quantile_fraction (float) – Parameters are transferred from the top quantile_fraction fraction of trials to the bottom quantile_fraction fraction. Needs to be between 0 and 0.5. Setting it to 0 essentially implies doing no exploitation at all.
resample_probability (float) – The probability of resampling from the original distribution when applying hyperparam_mutations. If not resampled, the value will be perturbed by a factor of 1.2 or 0.8 if continuous, or changed to an adjacent value if discrete.
custom_explore_fn (func) – You can also specify a custom exploration function. This function is invoked as f(config) after built-in perturbations from hyperparam_mutations are applied, and should return config updated as needed. You must specify at least one of hyperparam_mutations or custom_explore_fn.
log_config (bool) – Whether to log the ray config of each model to local_dir at each exploit. Allows config schedule to be reconstructed.
import random

from ray import tune
from ray.tune.schedulers import PopulationBasedTraining

pbt = PopulationBasedTraining(
    time_attr="training_iteration",
    metric="episode_reward_mean",
    mode="max",
    perturbation_interval=10,  # every 10 `time_attr` units
                               # (training_iterations in this case)
    hyperparam_mutations={
        # Perturb factor1 by scaling it by 0.8 or 1.2. Resampling
        # resets it to a value sampled from the lambda function.
        "factor_1": lambda: random.uniform(0.0, 20.0),
        # Perturb factor2 by changing it to an adjacent value, e.g.
        # 10 -> 1 or 10 -> 100. Resampling will choose at random.
        "factor_2": [1, 10, 100, 1000, 10000],
    })
tune.run({...}, num_samples=8, scheduler=pbt)
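The perturbation rule described for hyperparam_mutations and resample_probability can be sketched in plain Python. This is an illustration only, not the library's code: `explore` is a hypothetical helper, and clamping at the ends of a list (so the first or last value may stay unchanged) is an assumption.

```python
import random


def explore(config, mutations, resample_probability=0.25):
    """Return a perturbed copy of `config` (hypothetical helper).

    Lists are treated as categorical values, callables as samplers
    for continuous values.
    """
    new_config = dict(config)
    for key, spec in mutations.items():
        resample = random.random() < resample_probability
        if isinstance(spec, list):
            if resample or config[key] not in spec:
                # Resample uniformly from the allowed values.
                new_config[key] = random.choice(spec)
            else:
                # Move to an adjacent value, clamped to the list bounds.
                index = spec.index(config[key]) + random.choice([-1, 1])
                index = max(0, min(len(spec) - 1, index))
                new_config[key] = spec[index]
        elif resample:
            # Resample from the original distribution.
            new_config[key] = spec()
        else:
            # Scale the continuous value by 0.8 or 1.2.
            new_config[key] = config[key] * random.choice([0.8, 1.2])
    return new_config


mutations = {
    "factor_1": lambda: random.uniform(0.0, 20.0),
    "factor_2": [1, 10, 100, 1000, 10000],
}
child = explore({"factor_1": 5.0, "factor_2": 10}, mutations)
```

A custom_explore_fn, if given, would then be applied to the returned config.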
Asynchronous HyperBand¶
The asynchronous version of the HyperBand scheduler can be used by setting the scheduler parameter of tune.run, e.g.
async_hb_scheduler = AsyncHyperBandScheduler(
    time_attr='training_iteration',
    metric='episode_reward_mean',
    mode='max',
    max_t=100,
    grace_period=10,
    reduction_factor=3,
    brackets=3)
tune.run( ... , scheduler=async_hb_scheduler)
Compared to the original version of HyperBand, this implementation provides better parallelism and avoids straggler issues during eliminations. An example of this can be found in async_hyperband_example.py. We recommend using this over the standard HyperBand scheduler.

class ray.tune.schedulers.AsyncHyperBandScheduler(time_attr='training_iteration', reward_attr=None, metric='episode_reward_mean', mode='max', max_t=100, grace_period=1, reduction_factor=4, brackets=1)
Implements the Async Successive Halving algorithm.
This should provide similar theoretical performance to HyperBand but avoid the straggler issues that HyperBand faces. One implementation detail is that, when using multiple brackets, trials are allocated to brackets randomly according to a softmax probability.
See https://arxiv.org/abs/1810.05934
 Parameters
time_attr (str) – A training result attr to use for comparing time. Note that you can pass in something non-temporal such as training_iteration as a measure of progress; the only requirement is that the attribute should increase monotonically.
metric (str) – The training result objective value attribute. Stopping procedures will use this attribute.
mode (str) – One of {min, max}. Determines whether objective is minimizing or maximizing the metric attribute.
max_t (float) – max time units per trial. Trials will be stopped after max_t time units (determined by time_attr) have passed.
grace_period (float) – Only stop trials at least this old in time. The units are the same as the attribute named by time_attr.
reduction_factor (float) – Used to set halving rate and amount. This is simply a unitless scalar.
brackets (int) – Number of brackets. Each bracket has a different halving rate, specified by the reduction factor.
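To see how grace_period, max_t, and reduction_factor interact, here is a minimal single-bracket sketch of asynchronous successive halving. It is not the scheduler's actual code: `rung_milestones` and `should_continue` are hypothetical helpers, and the rule "keep roughly the top 1/reduction_factor of the values seen so far at a rung" is an assumption based on the general description of the algorithm.

```python
def rung_milestones(grace_period, max_t, reduction_factor):
    """Time values at which trials are compared, smallest first."""
    milestones = []
    t = grace_period
    while t <= max_t:
        milestones.append(t)
        t *= reduction_factor
    return milestones


def should_continue(value, rung_values, reduction_factor, mode="max"):
    """Decide whether a trial reaching a rung keeps running.

    `rung_values` holds the metric values other trials recorded at this rung.
    """
    sign = 1 if mode == "max" else -1
    ranked = sorted((sign * v for v in list(rung_values) + [value]), reverse=True)
    # Keep about the top 1/reduction_factor fraction, and always at least one.
    keep = max(1, int(len(ranked) / reduction_factor))
    return sign * value >= ranked[keep - 1]


milestones = rung_milestones(grace_period=10, max_t=100, reduction_factor=3)
seen_at_rung = [0.1, 0.2, 0.3, 0.4, 0.5]
strong = should_continue(0.9, seen_at_rung, reduction_factor=3)
weak = should_continue(0.3, seen_at_rung, reduction_factor=3)
```

Because each decision uses only the values recorded so far, no trial has to wait for the rest of its bracket to reach the same rung, which is where the straggler avoidance comes from.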
HyperBand¶
Note
Note that the HyperBand scheduler requires your trainable to support saving and restoring, which is described in Tune User Guide. Checkpointing enables the scheduler to multiplex many concurrent trials onto a limited size cluster.
Tune also implements the standard version of HyperBand. You can use it as such:
tune.run( ... , scheduler=HyperBandScheduler())
An example of this can be found in hyperband_example.py. The progress of one such HyperBand run is shown below.
== Status ==
Using HyperBand: num_stopped=0 total_brackets=5
Round #0:
Bracket(n=5, r=100, completed=80%): {'PAUSED': 4, 'PENDING': 1}
Bracket(n=8, r=33, completed=23%): {'PAUSED': 4, 'PENDING': 4}
Bracket(n=15, r=11, completed=4%): {'RUNNING': 2, 'PAUSED': 2, 'PENDING': 11}
Bracket(n=34, r=3, completed=0%): {'RUNNING': 2, 'PENDING': 32}
Bracket(n=81, r=1, completed=0%): {'PENDING': 38}
Resources used: 4/4 CPUs, 0/0 GPUs
Result logdir: ~/ray_results/hyperband_test
PAUSED trials:
 my_class_0_height=99,width=43: PAUSED [pid=11664], 0 s, 100 ts, 97.1 rew
 my_class_11_height=85,width=81: PAUSED [pid=11771], 0 s, 33 ts, 32.8 rew
 my_class_12_height=0,width=52: PAUSED [pid=11785], 0 s, 33 ts, 0 rew
 my_class_19_height=44,width=88: PAUSED [pid=11811], 0 s, 11 ts, 5.47 rew
 my_class_27_height=96,width=84: PAUSED [pid=11840], 0 s, 11 ts, 12.5 rew
... 5 more not shown
PENDING trials:
 my_class_10_height=12,width=25: PENDING
 my_class_13_height=90,width=45: PENDING
 my_class_14_height=69,width=45: PENDING
 my_class_15_height=41,width=11: PENDING
 my_class_16_height=57,width=69: PENDING
... 81 more not shown
RUNNING trials:
 my_class_23_height=75,width=51: RUNNING [pid=11843], 0 s, 1 ts, 1.47 rew
 my_class_26_height=16,width=48: RUNNING
 my_class_31_height=40,width=10: RUNNING
 my_class_53_height=28,width=96: RUNNING

class ray.tune.schedulers.HyperBandScheduler(time_attr='training_iteration', reward_attr=None, metric='episode_reward_mean', mode='max', max_t=81, reduction_factor=3)
Implements the HyperBand early stopping algorithm.
HyperBandScheduler early stops trials using the HyperBand optimization algorithm. It divides trials into brackets of varying sizes, and periodically early stops low-performing trials within each bracket.
To use this implementation of HyperBand with Tune, all you need to do is specify the maximum length of time a trial can run (max_t), the time units (time_attr), the name of the reported objective value (metric), and whether metric is to be maximized or minimized (mode). We automatically determine reasonable values for the other HyperBand parameters based on the given values.
For example, to limit trials to 10 minutes and early stop based on the episode_reward_mean attr, construct:
HyperBandScheduler(time_attr='time_total_s', metric='episode_reward_mean', mode='max', max_t=600)
Note that Tune’s stopping criteria will be applied in conjunction with HyperBand’s early stopping mechanisms.
See also: https://people.eecs.berkeley.edu/~kjamieson/hyperband.html
 Parameters
time_attr (str) – The training result attr to use for comparing time. Note that you can pass in something non-temporal such as training_iteration as a measure of progress; the only requirement is that the attribute should increase monotonically.
metric (str) – The training result objective value attribute. Stopping procedures will use this attribute.
mode (str) – One of {min, max}. Determines whether objective is minimizing or maximizing the metric attribute.
max_t (int) – Maximum time units per trial. The scheduler will terminate trials after max_t time units (determined by time_attr) have passed. Note that this is different from the semantics of max_t as mentioned in the original HyperBand paper.
reduction_factor (float) – Same as eta. Determines how sharp the difference is between bracket space-time allocation ratios.
HyperBand Implementation Details¶
Implementation details may deviate slightly from theory but are focused on increasing usability. Note: R, s_max, and eta are parameters of HyperBand given by the paper. See this post for context.
Both s_max (representing the number of brackets - 1) and eta, representing the downsampling rate, are fixed. In many practical settings, R, which represents some resource unit and often the number of training iterations, can be set reasonably large, like R >= 200. For simplicity, assume eta = 3. Varying R between R = 200 and R = 1000 creates a huge range of the number of trials needed to fill up all brackets.
On the other hand, holding R constant at R = 300 and varying eta also leads to HyperBand configurations that are not very intuitive:
The implementation takes the same configuration as the example given in the paper and exposes max_t, which is not a parameter in the paper.
The example in the post to calculate n_0 is actually a little different than the algorithm given in the paper. In this implementation, we implement n_0 according to the paper (which is n in the below example):
There are also implementation-specific details, like how trials are placed into brackets, which are not covered in the paper. This implementation places trials within brackets according to smaller bracket first, meaning that with a low number of trials, there will be less early stopping.
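The effect of R and eta on bracket sizes can be checked with a short sketch of the bracket formula from the HyperBand paper, where n = ceil((s_max + 1) * eta^s / (s + 1)) and r = R * eta^(-s). The function name `hyperband_brackets` is hypothetical, the sketch assumes an integer eta, and it does not reproduce how Tune maps max_t onto these quantities.

```python
def hyperband_brackets(R, eta=3):
    """Return a list of (n, r) pairs, one per bracket, largest n first.

    n is the initial number of trials in the bracket and r the initial
    resource given to each trial.
    """
    # s_max is the largest s with eta ** s <= R (integer arithmetic only).
    s_max = 0
    while eta ** (s_max + 1) <= R:
        s_max += 1
    brackets = []
    for s in range(s_max, -1, -1):
        numerator = (s_max + 1) * eta ** s
        n = -(-numerator // (s + 1))  # integer ceiling division
        r = R / eta ** s
        brackets.append((n, r))
    return brackets


for R in (81, 200, 1000):
    brackets = hyperband_brackets(R)
    total_trials = sum(n for n, _ in brackets)
    print(R, total_trials, brackets)
```

With eta = 3, R = 81 gives bracket sizes 81, 34, 15, 8, and 5, the same n values that appear in the status output shown earlier. The total number of trials needed to fill all brackets is 143 for R = 200 but 1214 for R = 1000.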
HyperBand (BOHB)¶
Tip
This implementation is still experimental. Please report issues on https://github.com/ray-project/ray/issues/. Thanks!
This class is a variant of HyperBand that enables the BOHB Algorithm. This implementation is true to the original HyperBand implementation and implements neither pipelining nor straggler mitigation.
This is to be used in conjunction with the Tune BOHB search algorithm. See TuneBOHB for package requirements, examples, and details.
An example of this in use can be found in bohb_example.py.

class ray.tune.schedulers.HyperBandForBOHB(time_attr='training_iteration', reward_attr=None, metric='episode_reward_mean', mode='max', max_t=81, reduction_factor=3)
Extends HyperBand early stopping algorithm for BOHB.
This implementation removes the HyperBandScheduler pipelining. This class introduces key changes:
1. Trials are now placed so that the bracket with the largest size is filled first.
2. Trials will be paused even if the bracket is not filled. This allows BOHB to insert new trials into the training.
See ray.tune.schedulers.HyperBandScheduler for parameter docstring.
Median Stopping Rule¶
The Median Stopping Rule implements the simple strategy of stopping a trial if its performance falls below the median of other trials at similar points in time. You can set the scheduler parameter as such:
tune.run( ... , scheduler=MedianStoppingRule())

class ray.tune.schedulers.MedianStoppingRule(time_attr='time_total_s', reward_attr=None, metric='episode_reward_mean', mode='max', grace_period=60.0, min_samples_required=3, min_time_slice=0, hard_stop=True)
Implements the median stopping rule as described in the Vizier paper:
https://research.google.com/pubs/pub46180.html
 Parameters
time_attr (str) – The training result attr to use for comparing time. Note that you can pass in something non-temporal such as training_iteration as a measure of progress; the only requirement is that the attribute should increase monotonically.
metric (str) – The training result objective value attribute. Stopping procedures will use this attribute.
mode (str) – One of {min, max}. Determines whether objective is minimizing or maximizing the metric attribute.
grace_period (float) – Only stop trials at least this old in time. The mean will only be computed from this time onwards. The units are the same as the attribute named by time_attr.
min_samples_required (int) – Minimum number of trials to compute median over.
min_time_slice (float) – Each trial runs at least this long before yielding (assuming it isn’t stopped). Note: trials ONLY yield if there are not enough samples to evaluate performance for the current result AND there are other trials waiting to run. The units are the same as the attribute named by time_attr.
hard_stop (bool) – If False, pauses trials instead of stopping them. When all other trials are complete, paused trials will be resumed and allowed to run FIFO.
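The stopping decision can be sketched as follows. This is a simplified illustration under stated assumptions, not the scheduler's code: `running_mean` and `should_stop` are hypothetical helpers, each history is a list of (time, value) pairs, and the comparison of a trial's best value against the median of the other trials' running means follows the general description of the rule.

```python
from statistics import median


def running_mean(history, t, grace_period=0.0):
    """Mean metric over results with grace_period <= time <= t, or None."""
    window = [value for (time, value) in history if grace_period <= time <= t]
    return sum(window) / len(window) if window else None


def should_stop(trial_history, other_histories, t, grace_period=60.0,
                min_samples_required=3, mode="max"):
    """Return True if the trial looks worse than the median of its peers at time t."""
    if t < grace_period:
        return False  # never stop trials younger than the grace period
    sign = 1 if mode == "max" else -1
    means = [running_mean(h, t, grace_period) for h in other_histories]
    means = [sign * m for m in means if m is not None]
    if len(means) < min_samples_required:
        return False  # not enough trials to compute a meaningful median
    own = [sign * value for (time, value) in trial_history if time <= t]
    if not own:
        return False
    return max(own) < median(means)


others = [
    [(60, 0.5), (120, 0.7)],
    [(60, 0.6), (120, 0.8)],
    [(60, 0.4), (120, 0.6)],
]
stop_bad = should_stop([(60, 0.1), (120, 0.2)], others, t=120)
stop_good = should_stop([(60, 0.9), (120, 0.95)], others, t=120)
```

With hard_stop=False, a trial flagged this way would be paused rather than stopped.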