ding.rl_utils.value_rescale
value_transform(x, eps=0.01)
Overview
A function that reduces the scale of the action-value function by compressing large magnitudes.
:math: h(x) = sign(x)(\sqrt{|x|+1} - 1) + \epsilon x .
Arguments:
- x: (:obj:torch.Tensor) The input tensor to be normalized.
- eps: (:obj:float) The coefficient of the additive regularization term, which ensures that the inverse function is Lipschitz continuous.
Returns:
- (:obj:torch.Tensor) Normalized tensor.
.. note:: Observe and Look Further: Achieving Consistent Performance on Atari (https://arxiv.org/abs/1805.11593).
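The formula above can be sketched for a single number with the standard library only. This is an illustrative scalar version, not the tensor implementation in ding.rl_utils; the name value_transform_scalar is hypothetical.

```python
import math


def value_transform_scalar(x: float, eps: float = 0.01) -> float:
    """Scalar sketch of h(x) = sign(x)(sqrt(|x| + 1) - 1) + eps * x.

    Hypothetical helper for illustration; the library function operates
    on torch.Tensor inputs instead.
    """
    # copysign supplies sign(x); at x == 0 the square-root term is 0 anyway.
    return math.copysign(1.0, x) * (math.sqrt(abs(x) + 1.0) - 1.0) + eps * x


if __name__ == "__main__":
    # Large magnitudes are compressed; the function is odd in x.
    for x in (-100.0, -1.0, 0.0, 1.0, 100.0):
        print(x, value_transform_scalar(x))
```

For large |x| the square-root term dominates the compression, while the eps * x term keeps the slope bounded away from zero.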
value_inv_transform(x, eps=0.01)
Overview
The inverse form of value rescale.
:math: h^{-1}(x) = sign(x)\left(\left(\frac{\sqrt{1+4\epsilon(|x|+1+\epsilon)}-1}{2\epsilon}\right)^2-1\right) .
Arguments:
- x: (:obj:torch.Tensor) The input tensor to be unnormalized.
- eps: (:obj:float) The coefficient of the additive regularization term, which ensures that the inverse function is Lipschitz continuous.
Returns:
- (:obj:torch.Tensor) Unnormalized tensor.
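A quick way to sanity-check the closed-form inverse is a round trip on plain floats. The sketch below assumes scalar inputs and uses hypothetical helper names (h, h_inv); it mirrors the two formulas in this page rather than the library's tensor code.

```python
import math


def h(x: float, eps: float = 0.01) -> float:
    """Scalar sketch of the forward rescale h(x)."""
    return math.copysign(1.0, x) * (math.sqrt(abs(x) + 1.0) - 1.0) + eps * x


def h_inv(y: float, eps: float = 0.01) -> float:
    """Scalar sketch of the closed-form inverse h^{-1}(y)."""
    # Positive root of eps * u^2 + u - (|y| + 1 + eps) = 0, where u = sqrt(|x| + 1).
    u = (math.sqrt(1.0 + 4.0 * eps * (abs(y) + 1.0 + eps)) - 1.0) / (2.0 * eps)
    return math.copysign(1.0, y) * (u * u - 1.0)


if __name__ == "__main__":
    # Applying the inverse to the forward value should recover x up to rounding.
    for x in (-250.0, -1.5, 0.0, 0.3, 1000.0):
        print(x, h_inv(h(x)))
```

Both functions must be called with the same eps; a mismatched eps breaks the round trip.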
symlog(x)
Overview
A function that normalizes targets by compressing their magnitude symmetrically around zero.
:math: symlog(x) = sign(x)\ln(|x|+1) .
Arguments:
- x: (:obj:torch.Tensor) The input tensor to be normalized.
Returns:
- (:obj:torch.Tensor) Normalized tensor.
.. note:: Mastering Diverse Domains through World Models (https://arxiv.org/abs/2301.04104)
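As a minimal illustration of the formula, the following scalar sketch uses math.log1p, which computes ln(1 + t). The name symlog_scalar is hypothetical, and the library function works on torch.Tensor inputs rather than floats.

```python
import math


def symlog_scalar(x: float) -> float:
    """Scalar sketch of symlog(x) = sign(x) * ln(|x| + 1)."""
    # log1p(t) evaluates ln(1 + t) accurately even for small t.
    return math.copysign(1.0, x) * math.log1p(abs(x))


if __name__ == "__main__":
    # Values near zero stay close to x; large values grow only logarithmically.
    for x in (-1000.0, -0.01, 0.0, 0.01, 1000.0):
        print(x, symlog_scalar(x))
```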
inv_symlog(x)
Overview
The inverse form of symlog.
:math: symexp(x) = sign(x)(\exp(|x|)-1) .
Arguments:
- x: (:obj:torch.Tensor) The input tensor to be unnormalized.
Returns:
- (:obj:torch.Tensor) Unnormalized tensor.
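The inverse relationship can be checked on plain floats with a round trip. The helper names below (symlog_scalar, symexp_scalar) are hypothetical scalar stand-ins, assumed only for this sketch.

```python
import math


def symlog_scalar(x: float) -> float:
    """Scalar sketch of symlog(x) = sign(x) * ln(|x| + 1)."""
    return math.copysign(1.0, x) * math.log1p(abs(x))


def symexp_scalar(x: float) -> float:
    """Scalar sketch of symexp(x) = sign(x) * (exp(|x|) - 1)."""
    # expm1(t) evaluates exp(t) - 1 without cancellation for small t.
    return math.copysign(1.0, x) * math.expm1(abs(x))


if __name__ == "__main__":
    # symexp undoes symlog up to floating-point rounding.
    for x in (-500.0, -2.0, 0.0, 1e-3, 42.0):
        print(x, symexp_scalar(symlog_scalar(x)))
```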
Full Source Code
../ding/rl_utils/value_rescale.py