ding.envs.env.base_env

BaseEnv

Bases: Env, ABC

Overview

Basic environment class, extended from gym.Env

Interface: __init__, reset, close, step, random_action, create_collector_env_cfg, create_evaluator_env_cfg, enable_save_replay

__init__(cfg) abstractmethod

Overview

Lazy initialization: the __init__ method only stores the relevant arguments, and the concrete env is created the first time the reset method is called.

Arguments: - cfg (:obj:dict): Environment configuration in dict type.
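The lazy-init pattern described above can be sketched as follows. This is a minimal illustration and not DI-engine code: `ToyCounterEnv`, the `_init_flag` attribute, and the `max_steps` config key are hypothetical, and the class deliberately does not subclass BaseEnv so that the sketch runs without DI-engine or gym installed. The namedtuple mirrors the `BaseEnvTimestep` definition in the source listing at the end of this page.

```python
from collections import namedtuple

# Mirrors the BaseEnvTimestep namedtuple defined in base_env.py.
BaseEnvTimestep = namedtuple('BaseEnvTimestep', ['obs', 'reward', 'done', 'info'])


class ToyCounterEnv:
    """Hypothetical stand-in; a real env would subclass BaseEnv."""

    def __init__(self, cfg: dict) -> None:
        # Lazy init: only store the arguments, build nothing yet.
        self._cfg = cfg
        self._init_flag = False
        self._state = None

    def reset(self) -> int:
        if not self._init_flag:
            # The "concrete env" is set up on the first reset call.
            self._max_steps = self._cfg.get('max_steps', 3)
            self._init_flag = True
        self._state = 0
        return self._state

    def step(self, action: int) -> BaseEnvTimestep:
        self._state += 1
        done = self._state >= self._max_steps
        return BaseEnvTimestep(self._state, float(action), done, {})

    def close(self) -> None:
        # Release resources; a later reset would set the env up again.
        self._init_flag = False
```

Deferring the expensive setup to reset keeps construction cheap, which may help when an env manager creates many env instances before any of them is used.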

reset() abstractmethod

Overview

Reset the env to an initial state and return an initial observation.

Returns: - obs (:obj:Any): Initial observation after reset.

close() abstractmethod

Overview

Close the env and release all related resources. It should be called once the env instance is no longer in use.

step(action) abstractmethod

Overview

Run one timestep of the environment's dynamics/simulation.

Arguments: - action (:obj:Any): The action input to step with.

Returns: - timestep (:obj:BaseEnv.timestep): The result timestep of env executing one step.
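As a rough illustration of how a returned timestep might be consumed, the sketch below assumes the timestep is a `BaseEnvTimestep` namedtuple with the fields `obs`, `reward`, `done`, and `info`, as defined in the source listing at the end of this page. The `fake_step` function is hypothetical and stands in for a real env's step method.

```python
from collections import namedtuple

# Same field layout as the BaseEnvTimestep namedtuple in base_env.py.
BaseEnvTimestep = namedtuple('BaseEnvTimestep', ['obs', 'reward', 'done', 'info'])


def fake_step(obs: int, action: int) -> BaseEnvTimestep:
    """Hypothetical transition: obs advances by action, episode ends at obs >= 3."""
    next_obs = obs + action
    return BaseEnvTimestep(obs=next_obs, reward=1.0, done=next_obs >= 3, info={})


obs, episode_return = 0, 0.0
while True:
    timestep = fake_step(obs, action=1)
    # A namedtuple unpacks positionally and also supports field access.
    obs, reward, done, info = timestep
    episode_return += reward
    if done:
        break
```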

seed(seed) abstractmethod

Overview

Set the seed for this env's random number generator(s).

Arguments: - seed (:obj:int): Random seed.

__repr__() abstractmethod

Overview

Return the information string of this env instance.

Returns: - info (:obj:str): Information of this env instance, like type and arguments.

create_collector_env_cfg(cfg) staticmethod

Overview

Return a list of env configs derived from the input config, one per env instance. The env manager (a series of vectorized envs) uses this list to create the envs that collect data.

Arguments: - cfg (:obj:dict): Original input env config, which is transformed into the form actually used to create env instances and replicated into the corresponding number of configs.

Returns: - env_cfg_list (:obj:List[dict]): List of cfg, one for each collector env.

Note: Elements (env configs) in collector_env_cfg/evaluator_env_cfg can differ from one another, for example in server IP and port.
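The default behavior can be reproduced in isolation. The function body below copies the two statements of the default implementation from the source listing at the end of this page; the `env_id` key is a hypothetical example value. The sketch shows two consequences that are easy to miss: the input config is mutated by `pop`, and every list element is the same dict object rather than a copy.

```python
from typing import List


def create_collector_env_cfg(cfg: dict) -> List[dict]:
    # Same two statements as the default implementation in base_env.py.
    collector_env_num = cfg.pop('collector_env_num')
    return [cfg for _ in range(collector_env_num)]


cfg = {'collector_env_num': 3, 'env_id': 'toy'}  # 'env_id' is a made-up key
env_cfgs = create_collector_env_cfg(cfg)

# The key was popped from the caller's dict as a side effect.
popped = 'collector_env_num' not in cfg
# All entries reference the very same dict object.
shared = all(c is cfg for c in env_cfgs)
```

Because the entries are shared, an env that needs its own config (such as a distinct port) would have to override this method and return separate dicts.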

create_evaluator_env_cfg(cfg) staticmethod

Overview

Return a list of env configs derived from the input config, one per env instance. The env manager (a series of vectorized envs) uses this list to create the envs that evaluate performance.

Arguments: - cfg (:obj:dict): Original input env config, which is transformed into the form actually used to create env instances and replicated into the corresponding number of configs.

Returns: - env_cfg_list (:obj:List[dict]): List of cfg, one for each evaluator env.
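A subclass may override this static method when each env needs a different config. The sketch below is one possible shape for such an override, under stated assumptions: the class `RemoteEnvCfgExample` and the keys `base_port`, `port`, and `host` are hypothetical, and in practice the method would be defined on a BaseEnv subclass. Unlike the default implementation, it deep-copies the config so the caller's dict is left untouched and each env receives an independent dict.

```python
import copy
from typing import List


class RemoteEnvCfgExample:
    """Hypothetical holder; in practice this would be a BaseEnv subclass."""

    @staticmethod
    def create_evaluator_env_cfg(cfg: dict) -> List[dict]:
        cfg = copy.deepcopy(cfg)  # do not mutate the caller's config
        evaluator_env_num = cfg.pop('evaluator_env_num')
        base_port = cfg.pop('base_port')  # hypothetical config key
        env_cfgs = []
        for i in range(evaluator_env_num):
            env_cfg = copy.deepcopy(cfg)  # independent dict per env
            env_cfg['port'] = base_port + i
            env_cfgs.append(env_cfg)
        return env_cfgs


user_cfg = {'evaluator_env_num': 2, 'base_port': 8000, 'host': 'localhost'}
evaluator_cfgs = RemoteEnvCfgExample.create_evaluator_env_cfg(user_cfg)
ports = [c['port'] for c in evaluator_cfgs]
```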

enable_save_replay(replay_path)

Overview

Save replay files to the given path. Each env class needs to implement this method itself.

Arguments: - replay_path (:obj:str): The path to save replay file.

random_action()

Overview

Return a random action generated from the original action space, which is usually convenient for testing.

Returns: - random_action (:obj:Any): Action generated randomly.
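In the source listing at the end of this page, the base-class body of random_action is just `pass`, so it returns None unless a subclass overrides it. A minimal sketch of such an override is shown below. It assumes a discrete action space; `ToyDiscreteEnv` and the `action_num` key are hypothetical, and the class does not subclass BaseEnv so that it runs without DI-engine installed. Tying random_action to the generator set by seed is a design choice of this sketch, made so that sampled actions are reproducible.

```python
import random


class ToyDiscreteEnv:
    """Hypothetical stand-in for a BaseEnv subclass with discrete actions."""

    def __init__(self, cfg: dict) -> None:
        self._action_num = cfg.get('action_num', 4)  # hypothetical key
        self._rng = random.Random()

    def seed(self, seed: int) -> None:
        # Seed this env's own generator rather than the global one.
        self._rng.seed(seed)

    def random_action(self) -> int:
        # Sample uniformly from {0, ..., action_num - 1}.
        return self._rng.randrange(self._action_num)


env = ToyDiscreteEnv({'action_num': 4})
env.seed(0)
first_run = [env.random_action() for _ in range(5)]
env.seed(0)
second_run = [env.random_action() for _ in range(5)]
```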

get_vec_env_setting(cfg, collect=True, eval_=True)

Overview

Get vectorized env setting (env_fn, collector_env_cfg, evaluator_env_cfg).

Arguments: - cfg (:obj:dict): Original input env config in user config, such as cfg.env.

Returns:
- env_fn (:obj:type): Callable object; call it with proper arguments to get a new env instance.
- collector_env_cfg (:obj:List[dict]): A list containing the configs of the data-collecting envs.
- evaluator_env_cfg (:obj:List[dict]): A list containing the configs of the evaluation envs.

Note: Elements (env configs) in collector_env_cfg/evaluator_env_cfg can differ from one another, for example in server IP and port.
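The control flow of this function can be traced with stand-ins. The sketch below follows the statements in the source listing at the end of this page, but it is not the library function: `TOY_REGISTRY` replaces `ENV_REGISTRY`, `AttrDict` replaces EasyDict, the `import_module` call is omitted, and `ToyEnv` is hypothetical. It illustrates that a disabled branch yields None rather than an empty list, and that only the requested `*_env_num` key is popped from the config.

```python
from typing import List, Optional, Tuple


class AttrDict(dict):
    """Tiny stand-in for EasyDict: a dict that also allows attribute access."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as exc:
            raise AttributeError(name) from exc


class ToyEnv:
    """Hypothetical env class using the default cfg-replication logic."""

    @staticmethod
    def create_collector_env_cfg(cfg: dict) -> List[dict]:
        n = cfg.pop('collector_env_num')
        return [cfg for _ in range(n)]

    @staticmethod
    def create_evaluator_env_cfg(cfg: dict) -> List[dict]:
        n = cfg.pop('evaluator_env_num')
        return [cfg for _ in range(n)]


TOY_REGISTRY = {'toy': ToyEnv}  # stand-in for ding.utils.ENV_REGISTRY


def get_vec_env_setting_sketch(
        cfg: AttrDict, collect: bool = True, eval_: bool = True
) -> Tuple[type, Optional[List[dict]], Optional[List[dict]]]:
    # The real function first calls import_module(cfg.get('import_names', [])).
    env_fn = TOY_REGISTRY[cfg.type]
    collector_env_cfg = env_fn.create_collector_env_cfg(cfg) if collect else None
    evaluator_env_cfg = env_fn.create_evaluator_env_cfg(cfg) if eval_ else None
    return env_fn, collector_env_cfg, evaluator_env_cfg


cfg = AttrDict(type='toy', collector_env_num=2, evaluator_env_num=1)
env_fn, collector_cfgs, evaluator_cfgs = get_vec_env_setting_sketch(cfg, eval_=False)
```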

get_env_cls(cfg)

Overview

Get the env class from the corresponding module specified in cfg and return it as a callable class.

Arguments: - cfg (:obj:dict): Original input env config in user config, such as cfg.env.

Returns: - env_cls_type (:obj:type): Env module as the corresponding callable class type.

create_model_env(cfg)

Overview

Create model env, which is used in model-based RL.
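Based on the source listing at the end of this page, the function deep-copies the config, looks up the class, removes the `import_names` and `type` keys, and passes the remaining entries as keyword arguments. The sketch below reproduces those steps with stand-ins: `TOY_REGISTRY` replaces the registry lookup done by get_env_cls, and `ToyModelEnv` with its `horizon` and `hidden_size` parameters is hypothetical. Two details follow from the listing: the caller's config is not modified, and a config without `import_names` raises KeyError because `pop` is called without a default.

```python
import copy
from typing import Any


class ToyModelEnv:
    """Hypothetical model env that takes keyword arguments, not a cfg dict."""

    def __init__(self, horizon: int, hidden_size: int = 8) -> None:
        self.horizon = horizon
        self.hidden_size = hidden_size


TOY_REGISTRY = {'toy_model_env': ToyModelEnv}  # stand-in for the env registry


def create_model_env_sketch(cfg: dict) -> Any:
    cfg = copy.deepcopy(cfg)  # the caller's config stays intact
    model_env_fn = TOY_REGISTRY[cfg['type']]
    cfg.pop('import_names')  # no default: raises KeyError if the key is absent
    cfg.pop('type')
    return model_env_fn(**cfg)  # remaining keys become keyword arguments


user_cfg = {'type': 'toy_model_env', 'import_names': [], 'horizon': 5}
model_env = create_model_env_sketch(user_cfg)
```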

Full Source Code

../ding/envs/env/base_env.py

from abc import ABC, abstractmethod
from typing import Any, List, Tuple
import gym
import copy
from easydict import EasyDict
from collections import namedtuple
from ding.utils import import_module, ENV_REGISTRY

BaseEnvTimestep = namedtuple('BaseEnvTimestep', ['obs', 'reward', 'done', 'info'])


# for solving multiple inheritance metaclass conflict between gym and ABC
class FinalMeta(type(ABC), type(gym.Env)):
    pass


class BaseEnv(gym.Env, ABC, metaclass=FinalMeta):
    """
    Overview:
        Basic environment class, extended from ``gym.Env``
    Interface:
        ``__init__``, ``reset``, ``close``, ``step``, ``random_action``, ``create_collector_env_cfg``, \
        ``create_evaluator_env_cfg``, ``enable_save_replay``
    """

    @abstractmethod
    def __init__(self, cfg: dict) -> None:
        """
        Overview:
            Lazy init, only related arguments will be initialized in ``__init__`` method, and the concrete \
            env will be initialized the first time ``reset`` method is called.
        Arguments:
            - cfg (:obj:`dict`): Environment configuration in dict type.
        """
        raise NotImplementedError

    @abstractmethod
    def reset(self) -> Any:
        """
        Overview:
            Reset the env to an initial state and returns an initial observation.
        Returns:
            - obs (:obj:`Any`): Initial observation after reset.
        """
        raise NotImplementedError

    @abstractmethod
    def close(self) -> None:
        """
        Overview:
            Close env and all the related resources, it should be called after the usage of env instance.
        """
        raise NotImplementedError

    @abstractmethod
    def step(self, action: Any) -> 'BaseEnv.timestep':
        """
        Overview:
            Run one timestep of the environment's dynamics/simulation.
        Arguments:
            - action (:obj:`Any`): The ``action`` input to step with.
        Returns:
            - timestep (:obj:`BaseEnv.timestep`): The result timestep of env executing one step.
        """
        raise NotImplementedError

    @abstractmethod
    def seed(self, seed: int) -> None:
        """
        Overview:
            Set the seed for this env's random number generator(s).
        Arguments:
            - seed (:obj:`Any`): Random seed.
        """
        raise NotImplementedError

    @abstractmethod
    def __repr__(self) -> str:
        """
        Overview:
            Return the information string of this env instance.
        Returns:
            - info (:obj:`str`): Information of this env instance, like type and arguments.
        """
        raise NotImplementedError

    @staticmethod
    def create_collector_env_cfg(cfg: dict) -> List[dict]:
        """
        Overview:
            Return a list of all of the environment from input config, used in env manager \
            (a series of vectorized env), and this method is mainly responsible for envs collecting data.
        Arguments:
            - cfg (:obj:`dict`): Original input env config, which needs to be transformed into the type of creating \
                env instance actually and generated the corresponding number of configurations.
        Returns:
            - env_cfg_list (:obj:`List[dict]`): List of ``cfg`` including all the config collector envs.

        .. note::
            Elements(env config) in collector_env_cfg/evaluator_env_cfg can be different, such as server ip and port.
        """
        collector_env_num = cfg.pop('collector_env_num')
        return [cfg for _ in range(collector_env_num)]

    @staticmethod
    def create_evaluator_env_cfg(cfg: dict) -> List[dict]:
        """
        Overview:
            Return a list of all of the environment from input config, used in env manager \
            (a series of vectorized env), and this method is mainly responsible for envs evaluating performance.
        Arguments:
            - cfg (:obj:`dict`): Original input env config, which needs to be transformed into the type of creating \
                env instance actually and generated the corresponding number of configurations.
        Returns:
            - env_cfg_list (:obj:`List[dict]`): List of ``cfg`` including all the config evaluator envs.
        """
        evaluator_env_num = cfg.pop('evaluator_env_num')
        return [cfg for _ in range(evaluator_env_num)]

    # optional method
    def enable_save_replay(self, replay_path: str) -> None:
        """
        Overview:
            Save replay file in the given path, and this method need to be self-implemented by each env class.
        Arguments:
            - replay_path (:obj:`str`): The path to save replay file.
        """
        raise NotImplementedError

    # optional method
    def random_action(self) -> Any:
        """
        Overview:
            Return random action generated from the original action space, usually it is convenient for test.
        Returns:
            - random_action (:obj:`Any`): Action generated randomly.
        """
        pass


def get_vec_env_setting(cfg: dict, collect: bool = True, eval_: bool = True) -> Tuple[type, List[dict], List[dict]]:
    """
    Overview:
        Get vectorized env setting (env_fn, collector_env_cfg, evaluator_env_cfg).
    Arguments:
        - cfg (:obj:`dict`): Original input env config in user config, such as ``cfg.env``.
    Returns:
        - env_fn (:obj:`type`): Callable object, call it with proper arguments and then get a new env instance.
        - collector_env_cfg (:obj:`List[dict]`): A list contains the config of collecting data envs.
        - evaluator_env_cfg (:obj:`List[dict]`): A list contains the config of evaluation envs.

    .. note::
        Elements (env config) in collector_env_cfg/evaluator_env_cfg can be different, such as server ip and port.

    """
    import_module(cfg.get('import_names', []))
    env_fn = ENV_REGISTRY.get(cfg.type)
    collector_env_cfg = env_fn.create_collector_env_cfg(cfg) if collect else None
    evaluator_env_cfg = env_fn.create_evaluator_env_cfg(cfg) if eval_ else None
    return env_fn, collector_env_cfg, evaluator_env_cfg


def get_env_cls(cfg: EasyDict) -> type:
    """
    Overview:
        Get the env class by correspondng module of ``cfg`` and return the callable class.
    Arguments:
        - cfg (:obj:`dict`): Original input env config in user config, such as ``cfg.env``.
    Returns:
        - env_cls_type (:obj:`type`): Env module as the corresponding callable class type.
    """
    import_module(cfg.get('import_names', []))
    return ENV_REGISTRY.get(cfg.type)


def create_model_env(cfg: EasyDict) -> Any:
    """
    Overview:
        Create model env, which is used in model-based RL.
    """
    cfg = copy.deepcopy(cfg)
    model_env_fn = get_env_cls(cfg)
    cfg.pop('import_names')
    cfg.pop('type')
    return model_env_fn(**cfg)