Evaluation Helper
stable_baselines3.common.evaluation.evaluate_policy(model, env, n_eval_episodes=10, deterministic=True, render=False, callback=None, reward_threshold=None, return_episode_rewards=False)
Runs the policy for n_eval_episodes episodes and returns the average reward. This function works only with a single environment.
- Parameters
model – (BaseAlgorithm) The RL agent you want to evaluate.
env – (gym.Env or VecEnv) The gym environment. In the case of a VecEnv, this must contain only one environment.
n_eval_episodes – (int) Number of episodes over which to evaluate the agent
deterministic – (bool) Whether to use deterministic or stochastic actions
render – (bool) Whether to render the environment or not
callback – (callable) Callback function for additional checks, called after each step.
reward_threshold – (float) Minimum expected reward per episode; an error is raised if the performance is not met.
return_episode_rewards – (bool) If True, a list of rewards per episode is returned instead of the mean.
- Returns
(float, float) Mean reward per episode, std of reward per episode. Returns ([float], [int]), the per-episode rewards and episode lengths, when return_episode_rewards is True.
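
The behaviour described above can be illustrated with a small, self-contained sketch. It is not the library's source: CountdownEnv, ConstantModel and evaluate_policy_sketch are hypothetical stand-ins that use only the standard library. The sketch assumes an environment whose step returns (obs, reward, done, info) and a model whose predict returns (action, state). It also assumes the callback is called with the local and global variables, and that the reported std is the population standard deviation. Check the library source for the exact callback signature and error type.

```python
import statistics


class CountdownEnv:
    """Hypothetical toy environment: every episode lasts `horizon` steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        reward = float(action)           # the reward is simply the action value
        done = self.t >= self.horizon    # the episode ends after `horizon` steps
        return self.t, reward, done, {}


class ConstantModel:
    """Hypothetical stand-in for a trained agent: always chooses action 1."""

    def predict(self, obs, state=None, deterministic=True):
        return 1, state


def evaluate_policy_sketch(model, env, n_eval_episodes=10, deterministic=True,
                           callback=None, reward_threshold=None,
                           return_episode_rewards=False):
    """Simplified illustration of the documented behaviour (single env only)."""
    episode_rewards, episode_lengths = [], []
    for _ in range(n_eval_episodes):
        obs = env.reset()
        done, state = False, None
        episode_reward, episode_length = 0.0, 0
        while not done:
            action, state = model.predict(obs, state=state,
                                          deterministic=deterministic)
            obs, reward, done, _info = env.step(action)
            episode_reward += reward
            episode_length += 1
            if callback is not None:
                # Assumed signature: the callback sees local and global variables.
                callback(locals(), globals())
        episode_rewards.append(episode_reward)
        episode_lengths.append(episode_length)

    mean_reward = statistics.mean(episode_rewards)
    std_reward = statistics.pstdev(episode_rewards)  # population std
    if reward_threshold is not None and mean_reward <= reward_threshold:
        raise ValueError(
            f"Mean reward below threshold: {mean_reward:.2f} <= {reward_threshold:.2f}"
        )
    if return_episode_rewards:
        return episode_rewards, episode_lengths
    return mean_reward, std_reward


# Default return value: (mean reward, std of reward) over all episodes.
mean_reward, std_reward = evaluate_policy_sketch(
    ConstantModel(), CountdownEnv(horizon=5), n_eval_episodes=3
)

# With return_episode_rewards=True: per-episode rewards and episode lengths.
rewards, lengths = evaluate_policy_sketch(
    ConstantModel(), CountdownEnv(horizon=5), n_eval_episodes=3,
    return_episode_rewards=True,
)
```

With the library itself, the corresponding call would follow the signature documented above, for example `mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)`, where model is a trained agent and env is a single environment.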