Schedulers (tune.schedulers)

FIFOScheduler

class ray.tune.schedulers.FIFOScheduler[source]

Simple scheduler that just runs trials in submission order.

HyperBandScheduler

class ray.tune.schedulers.HyperBandScheduler(time_attr='training_iteration', reward_attr=None, metric='episode_reward_mean', mode='max', max_t=81, reduction_factor=3)[source]

Implements the HyperBand early stopping algorithm.

HyperBandScheduler early stops trials using the HyperBand optimization algorithm. It divides trials into brackets of varying sizes, and periodically early stops low-performing trials within each bracket.

To use this implementation of HyperBand with Tune, all you need to do is specify the max length of time a trial can run max_t, the time units time_attr, the name of the reported objective value metric, and whether metric is to be maximized or minimized (mode). We automatically determine reasonable values for the other HyperBand parameters based on the given values.

For example, to limit trials to 10 minutes and early stop based on the episode_reward_mean attribute, construct:

HyperBandScheduler(time_attr='time_total_s', metric='episode_reward_mean', mode='max', max_t=600)

Note that Tune’s stopping criteria will be applied in conjunction with HyperBand’s early stopping mechanisms.

See also: https://people.eecs.berkeley.edu/~kjamieson/hyperband.html

Parameters
  • time_attr (str) – The training result attr to use for comparing time. Note that you can pass in something non-temporal such as training_iteration as a measure of progress; the only requirement is that the attribute increases monotonically.

  • metric (str) – The training result objective value attribute. Stopping procedures will use this attribute.

  • mode (str) – One of {min, max}. Determines whether objective is minimizing or maximizing the metric attribute.

  • max_t (int) – Max time units per trial. Trials will be stopped after max_t time units (determined by time_attr) have passed. Note that this differs from the semantics of max_t in the original HyperBand paper.

  • reduction_factor (float) – Same as eta. Determines how sharp the difference is between bracket space-time allocation ratios.
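For intuition about the parameters Tune derives automatically, the bracket geometry described in the HyperBand paper can be sketched in pure Python. This is a simplified illustration of the paper's schedule, not Tune's internal code; the function name and return shape are made up for this example:

```python
import math

def hyperband_brackets(max_t=81, eta=3):
    """Sketch of the HyperBand bracket schedule from the paper:
    each bracket s starts n trials at a small budget and repeatedly
    keeps the top 1/eta fraction, multiplying the budget by eta.
    Returns, per bracket, a list of (trials_alive, budget_per_trial)
    rounds."""
    # Number of brackets is determined by how many times eta divides max_t.
    s_max, t = 0, max_t
    while t >= eta:
        t //= eta
        s_max += 1
    brackets = []
    for s in range(s_max, -1, -1):
        # Initial number of trials in bracket s.
        n = math.ceil((s_max + 1) * eta**s / (s + 1))
        rounds = []
        for i in range(s + 1):
            n_i = n // eta**i            # trials still alive in round i
            r_i = max_t // eta**(s - i)  # budget per trial in round i
            rounds.append((n_i, r_i))
        brackets.append(rounds)
    return brackets
```

With the defaults max_t=81 and eta=3 this yields 5 brackets; the most aggressive one starts 81 trials at budget 1 and finishes with a single trial at budget 81, while the most conservative bracket runs 5 trials at the full budget from the start.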

ASHAScheduler/AsyncHyperBandScheduler

class ray.tune.schedulers.AsyncHyperBandScheduler(time_attr='training_iteration', reward_attr=None, metric='episode_reward_mean', mode='max', max_t=100, grace_period=1, reduction_factor=4, brackets=1)[source]

Implements the Asynchronous Successive Halving (ASHA) algorithm.

This should provide similar theoretical performance to HyperBand while avoiding the straggler issues that HyperBand faces. One implementation detail: when using multiple brackets, trials are allocated to a bracket randomly according to a softmax probability.

See https://arxiv.org/abs/1810.05934

Parameters
  • time_attr (str) – A training result attr to use for comparing time. Note that you can pass in something non-temporal such as training_iteration as a measure of progress; the only requirement is that the attribute increases monotonically.

  • metric (str) – The training result objective value attribute. Stopping procedures will use this attribute.

  • mode (str) – One of {min, max}. Determines whether objective is minimizing or maximizing the metric attribute.

  • max_t (float) – max time units per trial. Trials will be stopped after max_t time units (determined by time_attr) have passed.

  • grace_period (float) – Only stop trials at least this old in time. The units are the same as the attribute named by time_attr.

  • reduction_factor (float) – Used to set halving rate and amount. This is simply a unit-less scalar.

  • brackets (int) – Number of brackets. Each bracket has a different halving rate, specified by the reduction factor.
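The core promotion rule of asynchronous successive halving can be sketched in a few lines of pure Python. This is a simplified illustration of the rule from the paper linked above, not Tune's implementation, and the function name is made up for this example: a trial reaching a rung continues only if its metric is within the top 1/eta of all results recorded at that rung so far.

```python
def asha_decision(rung_results, trial_metric, eta=4, mode="max"):
    """Decide whether a trial reaching a rung should continue.

    rung_results: metrics previously recorded at this rung.
    Returns "CONTINUE" if trial_metric falls in the top 1/eta of
    all results seen at the rung (including itself), else "STOP".
    """
    results = sorted(rung_results + [trial_metric],
                     reverse=(mode == "max"))
    # Keep at least one trial even when fewer than eta results exist.
    cutoff = max(1, len(results) // eta)
    threshold = results[cutoff - 1]
    if mode == "max":
        return "CONTINUE" if trial_metric >= threshold else "STOP"
    return "CONTINUE" if trial_metric <= threshold else "STOP"
```

Because the decision uses only results already recorded at the rung, no trial ever waits for stragglers to catch up, which is the asynchrony the docstring refers to.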

ray.tune.schedulers.ASHAScheduler

alias of ray.tune.schedulers.async_hyperband.AsyncHyperBandScheduler

MedianStoppingRule

class ray.tune.schedulers.MedianStoppingRule(time_attr='time_total_s', reward_attr=None, metric='episode_reward_mean', mode='max', grace_period=60.0, min_samples_required=3, min_time_slice=0, hard_stop=True)[source]

Implements the median stopping rule as described in the Vizier paper:

https://research.google.com/pubs/pub46180.html

Parameters
  • time_attr (str) – The training result attr to use for comparing time. Note that you can pass in something non-temporal such as training_iteration as a measure of progress; the only requirement is that the attribute increases monotonically.

  • metric (str) – The training result objective value attribute. Stopping procedures will use this attribute.

  • mode (str) – One of {min, max}. Determines whether objective is minimizing or maximizing the metric attribute.

  • grace_period (float) – Only stop trials at least this old in time. The mean will only be computed from this time onwards. The units are the same as the attribute named by time_attr.

  • min_samples_required (int) – Minimum number of trials to compute median over.

  • min_time_slice (float) – Each trial runs at least this long before yielding (assuming it isn’t stopped). Note: trials ONLY yield if there are not enough samples to evaluate performance for the current result AND there are other trials waiting to run. The units are the same as the attribute named by time_attr.

  • hard_stop (bool) – If False, pauses trials instead of stopping them. When all other trials are complete, paused trials will be resumed and allowed to run FIFO.
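The rule itself is simple enough to sketch in pure Python. This is an illustration of the median stopping rule from the Vizier paper, not Tune's implementation, and the function name and argument shape are made up for this example: a trial is stopped when its running average of the objective is worse than the median of the other trials' running averages, with a minimum trial count before any stopping happens.

```python
from statistics import median

def median_stop(trial_avg, other_trial_avgs,
                min_samples_required=3, mode="max"):
    """Sketch of the median stopping rule: stop a trial whose
    running average of the objective is strictly worse than the
    median over the other trials' running averages. With fewer
    than min_samples_required other trials, always continue."""
    if len(other_trial_avgs) < min_samples_required:
        return "CONTINUE"
    med = median(other_trial_avgs)
    if mode == "max":
        return "STOP" if trial_avg < med else "CONTINUE"
    return "STOP" if trial_avg > med else "CONTINUE"
```

The grace_period and min_time_slice parameters above gate when this comparison is first applied and how often a trial yields; they do not change the comparison itself.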

PopulationBasedTraining

class ray.tune.schedulers.PopulationBasedTraining(time_attr='time_total_s', reward_attr=None, metric='episode_reward_mean', mode='max', perturbation_interval=60.0, hyperparam_mutations={}, quantile_fraction=0.25, resample_probability=0.25, custom_explore_fn=None, log_config=True)[source]

Implements the Population Based Training (PBT) algorithm.

https://deepmind.com/blog/population-based-training-neural-networks

PBT trains a group of models (or agents) in parallel. Periodically, poorly performing models clone the state of the top performers, and a random mutation is applied to their hyperparameters in the hopes of outperforming the current top models.

Unlike other hyperparameter search algorithms, PBT mutates hyperparameters during training time. This enables very fast hyperparameter discovery and also automatically discovers good annealing schedules.

This Tune PBT implementation considers all trials added as part of the PBT population. If the number of trials exceeds the cluster capacity, they will be time-multiplexed so as to balance training progress across the population. To run multiple trials, use tune.run(num_samples=<int>).

In {LOG_DIR}/{MY_EXPERIMENT_NAME}/, all mutations are logged in pbt_global.txt and individual policy perturbations are recorded in pbt_policy_{i}.txt. Tune logs: [target trial tag, clone trial tag, target trial iteration, clone trial iteration, old config, new config] on each perturbation step.

Parameters
  • time_attr (str) – The training result attr to use for comparing time. Note that you can pass in something non-temporal such as training_iteration as a measure of progress; the only requirement is that the attribute increases monotonically.

  • metric (str) – The training result objective value attribute. Stopping procedures will use this attribute.

  • mode (str) – One of {min, max}. Determines whether objective is minimizing or maximizing the metric attribute.

  • perturbation_interval (float) – Models will be considered for perturbation at this interval of time_attr. Note that perturbation incurs checkpoint overhead, so you shouldn’t set this to be too frequent.

  • hyperparam_mutations (dict) – Hyperparams to mutate. The format is as follows: for each key, either a list or function can be provided. A list specifies an allowed set of categorical values. A function specifies the distribution of a continuous parameter. You must specify at least one of hyperparam_mutations or custom_explore_fn.

  • quantile_fraction (float) – Parameters are transferred from the top quantile_fraction fraction of trials to the bottom quantile_fraction fraction. Needs to be between 0 and 0.5. Setting it to 0 essentially implies doing no exploitation at all.

  • resample_probability (float) – The probability of resampling from the original distribution when applying hyperparam_mutations. If not resampled, the value will be perturbed by a factor of 1.2 or 0.8 if continuous, or changed to an adjacent value if discrete.

  • custom_explore_fn (func) – You can also specify a custom exploration function. This function is invoked as f(config) after built-in perturbations from hyperparam_mutations are applied, and should return config updated as needed. You must specify at least one of hyperparam_mutations or custom_explore_fn.

  • log_config (bool) – Whether to log the ray config of each model to local_dir at each exploit. Allows config schedule to be reconstructed.
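The perturbation behavior described for resample_probability can be sketched in pure Python. This is an illustration of the rule stated above (resample with some probability; otherwise scale a continuous value by 1.2 or 0.8, or move a discrete value to an adjacent list entry), not Tune's internal code, and the function name is made up for this example:

```python
import random

def perturb(value, distribution, resample_probability=0.25):
    """Sketch of PBT's mutation rule. `distribution` is either a
    list (categorical values) or a zero-argument function
    (continuous distribution), matching hyperparam_mutations."""
    if random.random() < resample_probability:
        # Resample from the original distribution.
        if isinstance(distribution, list):
            return random.choice(distribution)
        return distribution()
    if isinstance(distribution, list):
        # Discrete: step to an adjacent value, clamped to the ends.
        i = distribution.index(value)
        j = min(max(i + random.choice([-1, 1]), 0),
                len(distribution) - 1)
        return distribution[j]
    # Continuous: perturb by a factor of 1.2 or 0.8.
    return value * random.choice([1.2, 0.8])
```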

import random
from ray import tune
from ray.tune.schedulers import PopulationBasedTraining

pbt = PopulationBasedTraining(
    time_attr="training_iteration",
    metric="episode_reward_mean",
    mode="max",
    perturbation_interval=10,  # every 10 `time_attr` units
                               # (training_iterations in this case)
    hyperparam_mutations={
        # Perturb factor_1 by scaling it by 0.8 or 1.2. Resampling
        # resets it to a value sampled from the lambda function.
        "factor_1": lambda: random.uniform(0.0, 20.0),
        # Perturb factor_2 by changing it to an adjacent value, e.g.
        # 10 -> 1 or 10 -> 100. Resampling will choose at random.
        "factor_2": [1, 10, 100, 1000, 10000],
    })
tune.run({...}, num_samples=8, scheduler=pbt)

TrialScheduler

class ray.tune.schedulers.TrialScheduler[source]

Interface for implementing a Trial Scheduler class.

CONTINUE = 'CONTINUE'

Status for continuing trial execution

PAUSE = 'PAUSE'

Status for pausing trial execution

STOP = 'STOP'

Status for stopping trial execution

on_trial_add(trial_runner, trial)[source]

Called when a new trial is added to the trial runner.

on_trial_error(trial_runner, trial)[source]

Notification that a trial has errored.

This will only be called when the trial is in the RUNNING state.

on_trial_result(trial_runner, trial, result)[source]

Called on each intermediate result returned by a trial.

At this point, the trial scheduler can make a decision by returning one of CONTINUE, PAUSE, and STOP. This will only be called when the trial is in the RUNNING state.

on_trial_complete(trial_runner, trial, result)[source]

Notification that a trial has completed.

This will only be called when the trial is in the RUNNING state and either completes naturally or by manual termination.

on_trial_remove(trial_runner, trial)[source]

Called to remove a trial.

This is called when the trial is in PAUSED or PENDING state. Otherwise, call on_trial_complete.

choose_trial_to_run(trial_runner)[source]

Called to choose a new trial to run.

This should return one of the trials in trial_runner that is in the PENDING or PAUSED state. This function must be idempotent.

If no trial is ready, return None.

debug_string()[source]

Returns a human readable message for printing to the console.
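As a schematic sketch of the interface above, here is a minimal scheduler that runs trials FIFO and stops any trial once its time_attr value exceeds a fixed budget. It is written standalone for illustration (it does not import ray; a real implementation would subclass ray.tune.schedulers.TrialScheduler), and the class name and budget logic are made up for this example:

```python
class SimpleTimeBudgetScheduler:
    """Illustrative stand-in for a TrialScheduler subclass: FIFO
    ordering plus a hard per-trial time budget."""

    CONTINUE = "CONTINUE"
    PAUSE = "PAUSE"
    STOP = "STOP"

    def __init__(self, time_attr="training_iteration", max_t=100):
        self._time_attr = time_attr
        self._max_t = max_t

    def on_trial_add(self, trial_runner, trial):
        pass

    def on_trial_error(self, trial_runner, trial):
        pass

    def on_trial_result(self, trial_runner, trial, result):
        # The scheduling decision: stop once the budget is spent,
        # otherwise let the trial keep running.
        if result[self._time_attr] >= self._max_t:
            return self.STOP
        return self.CONTINUE

    def on_trial_complete(self, trial_runner, trial, result):
        pass

    def on_trial_remove(self, trial_runner, trial):
        pass

    def choose_trial_to_run(self, trial_runner):
        # FIFO: pick the first pending trial; None if nothing is ready.
        for trial in trial_runner.get_trials():
            if trial.status == "PENDING":
                return trial
        return None

    def debug_string(self):
        return "SimpleTimeBudgetScheduler: max_t={}".format(self._max_t)
```

Note that on_trial_result is the only hook here that returns one of CONTINUE/PAUSE/STOP; the other hooks are notifications.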