Atari Wrappers
- class stable_baselines3.common.atari_wrappers.AtariWrapper(env, noop_max=30, frame_skip=4, screen_size=84, terminal_on_life_loss=True, clip_reward=True)
Atari 2600 preprocessing.
Specifically:
NoopReset: obtain the initial state by taking a random number of no-ops on reset.
Frame skipping: 4 by default
Max-pooling: most recent two observations
Termination signal when a life is lost.
Resize to a square image: 84x84 by default
Grayscale observation
Clip reward to {-1, 0, 1}
- Parameters:
env (Env) – gym environment
noop_max (int) – max number of no-ops
frame_skip (int) – the frequency at which the agent experiences the game.
screen_size (int) – resize Atari frame
terminal_on_life_loss (bool) – if True, then step() returns done=True whenever a life is lost.
clip_reward (bool) – if True (default), the reward is clipped to {-1, 0, 1} depending on its sign.
- class stable_baselines3.common.atari_wrappers.ClipRewardEnv(env)
Clips the reward to {+1, 0, -1} by its sign.
- Parameters:
env (Env) – the environment
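As a rough illustration of clipping by sign, the sketch below maps raw rewards to {-1, 0, +1}. The helper name `clip_reward` is hypothetical and is not part of the library; the wrapper itself applies the same idea to the reward returned by each step.

```python
def clip_reward(reward: float) -> int:
    """Map a raw reward to -1, 0, or +1 according to its sign."""
    # Booleans subtract as integers: True - False == 1, False - True == -1.
    return (reward > 0) - (reward < 0)


raw_rewards = [400.0, 0.0, -2.5, 1.0]
clipped = [clip_reward(r) for r in raw_rewards]
print(clipped)  # [1, 0, -1, 1]
```

Clipping discards reward magnitude, so a 400-point bonus and a 1-point pickup look identical to the learner.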
- class stable_baselines3.common.atari_wrappers.EpisodicLifeEnv(env)
Make end-of-life == end-of-episode, but only reset on true game over. DeepMind did this for DQN and related agents because it helps value estimation.
- Parameters:
env (Env) – the environment to wrap
- reset(**kwargs)
Calls the Gym environment reset only when lives are exhausted. This way all states are still reachable even though lives are episodic, and the learner need not know about any of this behind the scenes.
- Parameters:
kwargs – Extra keywords passed to env.reset() call
- Return type:
ndarray
- Returns:
the first observation of the environment
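The following is a minimal sketch of the episodic-life idea, not the library's implementation. `ToyLivesEnv` and `EpisodicLifeSketch` are hypothetical names, the toy step() returns a simplified (observation, reward, done) tuple rather than the full Gym signature, and advancing with a no-op (action 0) after a lost life is an assumption of this sketch.

```python
class ToyLivesEnv:
    """Hypothetical stand-in for a game: action 1 costs a life, action 0 does nothing."""

    def __init__(self, lives=3):
        self.start_lives = lives
        self.lives = lives
        self.full_resets = 0  # counts true resets of the underlying game

    def reset(self):
        self.lives = self.start_lives
        self.full_resets += 1
        return 0  # dummy observation

    def step(self, action):
        if action == 1:
            self.lives -= 1
        game_over = self.lives == 0
        return 0, 0.0, game_over  # observation, reward, done


class EpisodicLifeSketch:
    """Report done on every lost life, but reset the game only on true game over."""

    def __init__(self, env):
        self.env = env
        self.lives = 0
        self.was_real_done = True

    def step(self, action):
        obs, reward, done = self.env.step(action)
        self.was_real_done = done
        lives = self.env.lives
        # A life was lost but the game continues: end the episode for the learner.
        if 0 < lives < self.lives:
            done = True
        self.lives = lives
        return obs, reward, done

    def reset(self):
        if self.was_real_done:
            obs = self.env.reset()
        else:
            # Assumption: a no-op step moves the game on from the lost-life state.
            obs, _, done = self.env.step(0)
            if done:
                obs = self.env.reset()
        self.lives = self.env.lives
        return obs


game = ToyLivesEnv(lives=2)
env = EpisodicLifeSketch(game)
env.reset()                      # true reset
_, _, done_first = env.step(1)   # life lost, game continues
env.reset()                      # no true reset here
_, _, done_second = env.step(1)  # last life lost: true game over
env.reset()                      # true reset again
print(done_first, done_second, game.full_resets)
```

Both steps report the end of an episode, yet the underlying game is reset only twice: once at the start and once after the true game over.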
- class stable_baselines3.common.atari_wrappers.FireResetEnv(env)
Take action on reset for environments that are fixed until firing.
- Parameters:
env (Env) – the environment to wrap
- class stable_baselines3.common.atari_wrappers.MaxAndSkipEnv(env, skip=4)
Return only every skip-th frame (frameskipping)
- Parameters:
env (Env) – the environment
skip (int) – the frame interval: only every skip-th frame is returned
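Frame skipping combined with max-pooling over the two most recent observations can be sketched roughly as follows. `CountingEnv` and `max_and_skip_step` are hypothetical names, frames are plain lists of pixel values, and summing the rewards of the skipped frames is an assumption of this sketch rather than something stated above.

```python
class CountingEnv:
    """Hypothetical environment: the t-th frame is [t, 10 - t], reward 1 per frame."""

    def __init__(self):
        self.t = 0

    def step(self, action):
        self.t += 1
        frame = [self.t, 10 - self.t]
        done = self.t >= 10
        return frame, 1.0, done


def max_and_skip_step(env, action, skip=4):
    """Repeat `action` for `skip` frames and max-pool the last two frames."""
    total_reward = 0.0
    last_two = []
    done = False
    for _ in range(skip):
        frame, reward, done = env.step(action)
        last_two = (last_two + [frame])[-2:]  # keep only the two newest frames
        total_reward += reward                # assumption: rewards are summed
        if done:
            break
    # Pixel-wise maximum over the retained frames.
    pooled = [max(pixels) for pixels in zip(*last_two)]
    return pooled, total_reward, done


env = CountingEnv()
obs, reward, done = max_and_skip_step(env, action=0, skip=4)
print(obs, reward, done)  # [4, 7] 4.0 False
```

Here the last two frames are [3, 7] and [4, 6], so the pooled observation takes the larger value at each pixel position.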
- class stable_baselines3.common.atari_wrappers.NoopResetEnv(env, noop_max=30)
Sample initial states by taking a random number of no-ops on reset. No-op is assumed to be action 0.
- Parameters:
env (Env) – the environment to wrap
noop_max (int) – the maximum number of no-ops to run
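A minimal sketch of no-op resets, under stated assumptions: `StepCounterEnv` and `noop_reset` are hypothetical names, the toy step() returns a simplified (observation, reward, done) tuple, and drawing at least one no-op (a count between 1 and noop_max inclusive) is an assumption of this sketch.

```python
import random


class StepCounterEnv:
    """Hypothetical environment whose observation is the number of steps taken."""

    def __init__(self):
        self.steps = 0

    def reset(self):
        self.steps = 0
        return self.steps

    def step(self, action):
        self.steps += 1
        return self.steps, 0.0, False  # observation, reward, done


def noop_reset(env, noop_max=30, rng=random):
    """Reset, then take a random number of no-op actions (action 0)."""
    obs = env.reset()
    noops = rng.randint(1, noop_max)  # inclusive on both ends
    for _ in range(noops):
        obs, _, done = env.step(0)
        if done:
            # If the game ends during the no-ops, start over from a fresh reset.
            obs = env.reset()
    return obs


rng = random.Random(0)  # seeded only to make the sketch repeatable
starts = [noop_reset(StepCounterEnv(), noop_max=30, rng=rng) for _ in range(5)]
print(starts)
```

Each entry in `starts` is the number of no-ops taken, so episodes begin from different states instead of always starting from the same frame.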