ding.rl_utils.exploration
BaseNoise
Bases: ABC
Overview
Base class for action noise
Interface: __init__, __call__
Examples:
>>> noise_generator = OUNoise()  # init one type of noise
>>> noise = noise_generator(action.shape, action.device)  # generate noise
__init__()
Overview
Initialization method.
__call__(shape, device)
abstractmethod
Overview
Generate noise according to the action tensor's shape and device.
Arguments:
- shape (:obj:tuple): size of the action tensor; the output noise has the same size.
- device (:obj:str): device of the action tensor; the output noise is placed on the same device.
Returns:
- noise (:obj:torch.Tensor): generated action noise, with the same shape and device as the input action tensor.
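The interface described above (construct once, then call with a shape and a device) can be illustrated with a minimal, torch-free sketch. The names SketchNoise and UniformSketchNoise are hypothetical and not part of ding. Nested Python lists stand in for tensors, and the device argument is accepted but unused.

```python
import random
from abc import ABC, abstractmethod


class SketchNoise(ABC):
    """Hypothetical stand-in mirroring the __init__/__call__ interface."""

    @abstractmethod
    def __call__(self, shape: tuple, device: str) -> list:
        """Return noise with the requested shape."""
        raise NotImplementedError


class UniformSketchNoise(SketchNoise):
    """Hypothetical subclass: uniform noise in [low, high]."""

    def __init__(self, low: float = -1.0, high: float = 1.0) -> None:
        self.low = low
        self.high = high

    def __call__(self, shape: tuple, device: str = "cpu") -> list:
        # A 2-D shape (rows, cols) is assumed to keep the sketch short.
        rows, cols = shape
        return [
            [random.uniform(self.low, self.high) for _ in range(cols)]
            for _ in range(rows)
        ]


noise = UniformSketchNoise()((2, 3), "cpu")
```

Because __call__ is abstract, instantiating SketchNoise directly raises a TypeError; only subclasses that implement __call__ can be created.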
GaussianNoise
Bases: BaseNoise
Overview
Derived class for generating Gaussian noise, which satisfies :math:X \sim N(\mu, \sigma^2).
Interface: __init__, __call__
__init__(mu=0.0, sigma=1.0)
Overview
Initialize :math:\mu and :math:\sigma of the Gaussian distribution.
Arguments:
- mu (:obj:float): :math:\mu , mean value.
- sigma (:obj:float): :math:\sigma , standard deviation, should be positive.
__call__(shape, device)
Overview
Generate Gaussian noise according to the action tensor's shape and device.
Arguments:
- shape (:obj:tuple): size of the action tensor; the output noise has the same size.
- device (:obj:str): device of the action tensor; the output noise is placed on the same device.
Returns:
- noise (:obj:torch.Tensor): generated action noise, with the same shape and device as the input action tensor.
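As a rough illustration of sampling :math:X \sim N(\mu, \sigma^2), the sketch below scales and shifts standard normal draws. It is a plain-Python stand-in, not the library's torch implementation: the function name gaussian_noise_sketch and the rng parameter are hypothetical, and nested lists replace tensors.

```python
import random


def gaussian_noise_sketch(shape, mu=0.0, sigma=1.0, rng=None):
    """Return a rows x cols nested list of samples mu + sigma * z, z ~ N(0, 1)."""
    if rng is None:
        rng = random.Random()
    # A 2-D shape (rows, cols) is assumed to keep the sketch short.
    rows, cols = shape
    return [
        [mu + sigma * rng.gauss(0.0, 1.0) for _ in range(cols)]
        for _ in range(rows)
    ]


# With sigma = 0 every sample collapses to the mean.
constant = gaussian_noise_sketch((2, 2), mu=0.5, sigma=0.0)
sample = gaussian_noise_sketch((3, 4), mu=0.0, sigma=1.0, rng=random.Random(0))
```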
OUNoise
Bases: BaseNoise
Overview
Derived class for generating Ornstein-Uhlenbeck process noise.
Satisfies :math:dx_t=\theta(\mu-x_t)dt + \sigma dW_t,
where :math:W_t denotes a Wiener process, acting as a random perturbation term.
Interface: __init__, reset, __call__
x0
property
writable
Overview
Get self._x0.
__init__(mu=0.0, sigma=0.3, theta=0.15, dt=0.01, x0=0.0)
Overview
Initialize _alpha :math:= \theta * dt,
beta :math:= \sigma * \sqrt{dt}, in the Ornstein-Uhlenbeck process.
Arguments:
- mu (:obj:float): :math:\mu , mean value.
- sigma (:obj:float): :math:\sigma , standard deviation of the perturbation noise.
- theta (:obj:float): How strongly the noise is pulled back toward :math:\mu after a perturbation; a greater value means a stronger reaction.
- dt (:obj:float): The time increment :math:dt of the process.
- x0 (:obj:Union[float, torch.Tensor]): The initial state of the noise, should be a scalar or tensor with the same shape as the action tensor.
reset()
Overview
Reset _x to the initial state _x0.
__call__(shape, device, mu=None)
Overview
Generate Ornstein-Uhlenbeck process noise according to the action tensor's shape and device.
Arguments:
- shape (:obj:tuple): The size of the action tensor, output noise's size should be the same.
- device (:obj:str): The device of the action tensor, output noise's device should be the same as it.
- mu (:obj:float): The new mean value :math:\mu; leave it as None to keep the value given at initialization.
Returns:
- noise (:obj:torch.Tensor): generated action noise, with the same shape and device as the input action tensor.
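The update implied by the two coefficients above can be sketched as a one-step Euler discretization of the process: :math:x \leftarrow x + \theta dt (\mu - x) + \sigma \sqrt{dt} \, z with :math:z \sim N(0, 1). The class below is a scalar, torch-free illustration. The name ScalarOUSketch, the step method, and the seed parameter are hypothetical, and returning the updated state (rather than the increment) is an assumption of this sketch, not a statement about what OUNoise returns.

```python
import math
import random


class ScalarOUSketch:
    """Hypothetical scalar Ornstein-Uhlenbeck process, for illustration only."""

    def __init__(self, mu=0.0, sigma=0.3, theta=0.15, dt=0.01, x0=0.0, seed=None):
        self.mu = mu
        self.alpha = theta * dt             # pull toward the mean per step
        self.beta = sigma * math.sqrt(dt)   # scale of the random perturbation
        self.x0 = x0
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Return the process to its initial state.
        self.x = self.x0

    def step(self, mu=None):
        # An explicit mu overrides the stored mean for this step only.
        target = self.mu if mu is None else mu
        self.x += self.alpha * (target - self.x) + self.beta * self.rng.gauss(0.0, 1.0)
        return self.x


# With sigma = 0 the process decays deterministically toward mu.
process = ScalarOUSketch(mu=0.0, sigma=0.0, theta=0.15, dt=0.01, x0=1.0)
first = process.step()
process.reset()
```

Successive samples are correlated because each step starts from the previous state, which is why reset is typically called at episode boundaries.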
get_epsilon_greedy_fn(start, end, decay, type_='exp')
Overview
Generate an epsilon_greedy function with decay, which takes the current timestep and returns the current epsilon.
Arguments:
- start (:obj:float): Epsilon start value. For 'linear', it should be 1.0.
- end (:obj:float): Epsilon end value.
- decay (:obj:int): Controls how fast epsilon decreases from start to end. We recommend decaying epsilon according to env step rather than iteration.
- type_ (:obj:str): How epsilon decays; currently supports 'linear' and 'exp' (exponential).
Returns:
- eps_fn (:obj:function): The epsilon greedy function with decay.
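To make the two decay types concrete, here is a minimal sketch of a schedule factory with the same argument names. The formulas are assumptions chosen for illustration (a clipped straight line for 'linear', an exponential approach to end for 'exp'), and the function name epsilon_schedule_sketch is hypothetical; the actual formulas should be checked against the source file.

```python
import math


def epsilon_schedule_sketch(start, end, decay, type_="exp"):
    """Return a function mapping a step count to an epsilon value."""
    if type_ == "linear":
        def eps_fn(step):
            # Straight line from start toward end, clipped at end.
            return max(end, start - (start - end) * step / decay)
    elif type_ == "exp":
        def eps_fn(step):
            # Exponential approach to end with time constant decay.
            return end + (start - end) * math.exp(-step / decay)
    else:
        raise ValueError("unsupported decay type: {}".format(type_))
    return eps_fn


linear_eps = epsilon_schedule_sketch(1.0, 0.1, 100, "linear")
exp_eps = epsilon_schedule_sketch(0.95, 0.05, 1000, "exp")
```

Under these assumed formulas, the linear schedule reaches end after decay steps and stays there, while the exponential one only approaches end asymptotically.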
create_noise_generator(noise_type, noise_kwargs)
Overview
Given the key (noise_type), create a new noise generator instance if the key is present in noise_mapping,
or raise a KeyError otherwise. In other words, a derived noise generator must first be registered;
then call create_noise_generator to get the instance object.
Arguments:
- noise_type (:obj:str): the type of noise generator to be created.
- noise_kwargs (:obj:dict): keyword arguments used to construct the noise generator.
Returns:
- noise (:obj:BaseNoise): the created new noise generator, should be an instance of one of noise_mapping's values.
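The register-then-create pattern described above can be sketched with a plain dictionary. Everything here is hypothetical and independent of ding: noise_registry_sketch, register_noise_sketch, create_noise_sketch, and ConstantNoiseSketch are illustrative names, and unpacking noise_kwargs into the constructor is an assumption of the sketch.

```python
noise_registry_sketch = {}


def register_noise_sketch(name, cls):
    """Associate a string key with a noise generator class."""
    noise_registry_sketch[name] = cls


def create_noise_sketch(noise_type, noise_kwargs):
    """Instantiate the class registered under noise_type, or raise KeyError."""
    if noise_type not in noise_registry_sketch:
        raise KeyError("unregistered noise type: {}".format(noise_type))
    return noise_registry_sketch[noise_type](**noise_kwargs)


class ConstantNoiseSketch:
    """Hypothetical generator that always returns the same value."""

    def __init__(self, value=0.0):
        self.value = value

    def __call__(self, shape, device="cpu"):
        # A 2-D shape (rows, cols) is assumed to keep the sketch short.
        rows, cols = shape
        return [[self.value] * cols for _ in range(rows)]


register_noise_sketch("constant", ConstantNoiseSketch)
generator = create_noise_sketch("constant", {"value": 0.1})
```

Keeping the mapping in one place lets configuration files select a noise type by name without importing the class directly.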
Full Source Code
../ding/rl_utils/exploration.py