(distributions)=

# Probability Distributions

Probability distributions used for the different action spaces:

- `CategoricalDistribution` -> Discrete
- `DiagGaussianDistribution` -> Box (continuous actions)
- `StateDependentNoiseDistribution` -> Box (continuous actions) when `use_sde=True`

% - ``MultiCategoricalDistribution`` -> MultiDiscrete

% - ``BernoulliDistribution`` -> MultiBinary

The policy networks output parameters for the distributions (named `flat` in the methods).
Actions are then sampled from those distributions.

For instance, in the case of discrete actions. The policy network outputs probability
of taking each action. The `CategoricalDistribution` allows sampling from it,
computes the entropy, the log probability (`log_prob`) and backpropagate the gradient.

In the case of continuous actions, a Gaussian distribution is used. The policy network outputs
mean and (log) std of the distribution (assumed to be a `DiagGaussianDistribution`).

```{eval-rst}
.. automodule:: stable_baselines3.common.distributions
  :members:
```