Action Noise

class stable_baselines3.common.noise.ActionNoise[source]

The base class for action noise.

reset()[source]

Reset the noise at the end of an episode.

Return type:

None

class stable_baselines3.common.noise.NormalActionNoise(mean, sigma, dtype=<class 'numpy.float32'>)[source]

A Gaussian action noise.

Parameters:
  • mean (ndarray) – Mean value of the noise

  • sigma (ndarray) – Scale of the noise (standard deviation)

  • dtype (DTypeLike) – Type of the output noise
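The parameters above can be illustrated with a minimal NumPy-only sketch. It is not the library's code: `sample_gaussian_noise` and its `rng` argument are hypothetical names introduced here, and the sketch only assumes that each call draws an element-wise Gaussian sample with the given mean and standard deviation and casts it to `dtype`.

```python
import numpy as np


def sample_gaussian_noise(mean, sigma, dtype=np.float32, rng=None):
    """Hypothetical helper: draw one noise vector from N(mean, sigma**2), cast to dtype."""
    rng = np.random.default_rng() if rng is None else rng
    # mean and sigma broadcast element-wise, so the sample has the action's shape.
    return rng.normal(mean, sigma).astype(dtype)


# Two-dimensional action space, zero-mean noise with standard deviation 0.1.
mean = np.zeros(2)
sigma = 0.1 * np.ones(2)
noise = sample_gaussian_noise(mean, sigma, rng=np.random.default_rng(0))

# Typical use: perturb a deterministic action, then clip to the action bounds.
policy_action = np.array([0.5, -0.9], dtype=np.float32)
action = np.clip(policy_action + noise, -1.0, 1.0)
```

Because each sample is independent of the previous one, this kind of noise carries no state between calls, which is why no dedicated reset behaviour is documented for it.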

class stable_baselines3.common.noise.OrnsteinUhlenbeckActionNoise(mean, sigma, theta=0.15, dt=0.01, initial_noise=None, dtype=<class 'numpy.float32'>)[source]

An Ornstein-Uhlenbeck action noise, designed to approximate Brownian motion with friction.

Based on http://math.stackexchange.com/questions/1287634/implementing-ornstein-uhlenbeck-in-matlab

Parameters:
  • mean (ndarray) – Mean of the noise

  • sigma (ndarray) – Scale of the noise

  • theta (float) – Rate of mean reversion

  • dt (float) – Timestep for the noise

  • initial_noise (ndarray | None) – Initial value of the noise output (zero if None)

  • dtype (DTypeLike) – Type of the output noise

reset()[source]

Reset the Ornstein-Uhlenbeck noise to its initial position.

Return type:

None
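To make the roles of theta, dt, and initial_noise concrete, here is a minimal NumPy-only sketch of an Ornstein-Uhlenbeck process using the usual Euler-Maruyama discretization. The class name `OUNoiseSketch` and its `seed` argument are hypothetical, and the update rule is an assumption about a typical implementation rather than a copy of the library's source.

```python
import numpy as np


class OUNoiseSketch:
    """Hypothetical stand-in for a mean-reverting (Ornstein-Uhlenbeck) noise process."""

    def __init__(self, mean, sigma, theta=0.15, dt=0.01,
                 initial_noise=None, dtype=np.float32, seed=None):
        self.mean = np.asarray(mean, dtype=np.float64)
        self.sigma = np.asarray(sigma, dtype=np.float64)
        self.theta = theta
        self.dt = dt
        self.initial_noise = initial_noise
        self.dtype = dtype
        self.rng = np.random.default_rng(seed)  # assumption: private generator
        self.reset()

    def reset(self):
        # Return to the initial position: zeros when initial_noise is None.
        if self.initial_noise is None:
            self.prev = np.zeros_like(self.mean)
        else:
            self.prev = np.array(self.initial_noise, dtype=np.float64)

    def __call__(self):
        # Drift pulls the state toward the mean at rate theta.
        drift = self.theta * (self.mean - self.prev) * self.dt
        # Diffusion adds Gaussian noise scaled by sigma * sqrt(dt).
        diffusion = (self.sigma * np.sqrt(self.dt)
                     * self.rng.standard_normal(self.mean.shape))
        self.prev = self.prev + drift + diffusion
        return self.prev.astype(self.dtype)


ou = OUNoiseSketch(mean=np.zeros(2), sigma=0.2 * np.ones(2), seed=0)
samples = np.stack([ou() for _ in range(5)])  # consecutive samples are correlated
ou.reset()  # call at the end of an episode
```

Unlike Gaussian noise, each sample depends on the previous one, so resetting between episodes keeps the state accumulated in one episode from leaking into the next.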

class stable_baselines3.common.noise.VectorizedActionNoise(base_noise, n_envs)[source]

A vectorized action noise for parallel environments.

Parameters:
  • base_noise (ActionNoise) – Noise generator to use

  • n_envs (int) – Number of parallel environments

reset(indices=None)[source]

Reset all the noise processes, or those listed in indices.

Parameters:

indices (Iterable[int] | None) – The indices to reset. Default: None. If the parameter is None, then all processes are reset to their initial position.

Return type:

None
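The reset semantics described above can be sketched as follows, using NumPy only. Both classes are hypothetical illustrations: `RandomWalkNoise` is a toy stateful noise that exists only to make resets visible, and `VectorizedNoiseSketch` assumes the wrapper keeps one independent copy of the base noise per environment and stacks their samples into an array of shape (n_envs, action_dim).

```python
import copy

import numpy as np


class RandomWalkNoise:
    """Hypothetical stateful noise; its state drifts until reset() is called."""

    def __init__(self, dim, step=0.1):
        self.dim = dim
        self.step = step
        self.reset()

    def reset(self):
        self.state = np.zeros(self.dim, dtype=np.float32)

    def __call__(self):
        increment = self.step * np.random.standard_normal(self.dim)
        self.state = (self.state + increment).astype(np.float32)
        return self.state.copy()


class VectorizedNoiseSketch:
    """Hypothetical wrapper holding one noise process per parallel environment."""

    def __init__(self, base_noise, n_envs):
        if n_envs <= 0:
            raise ValueError("n_envs must be a positive integer")
        # Each environment gets its own copy so the states evolve independently.
        self.noises = [copy.deepcopy(base_noise) for _ in range(n_envs)]

    def reset(self, indices=None):
        # None means "reset every process"; otherwise reset only the listed ones.
        if indices is None:
            indices = range(len(self.noises))
        for index in indices:
            self.noises[index].reset()

    def __call__(self):
        return np.stack([noise() for noise in self.noises])


vec = VectorizedNoiseSketch(RandomWalkNoise(dim=2), n_envs=3)
batch = vec()              # one row of noise per environment
vec.reset(indices=[0, 2])  # only environments 0 and 2 finished their episodes
```

Resetting by index matters because parallel environments rarely end their episodes on the same step, so only the finished ones should return to their initial noise state.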