
ding.torch_utils


Swish

Bases: Module

Overview

Swish activation function, which is a smooth, non-monotonic activation function. For more details, please refer to Searching for Activation Functions.

Interfaces: __init__, forward.

__init__()

Overview

Initialize the Swish module.

forward(x)

Overview

Compute the Swish transformation of the input tensor.

Arguments:
- x (:obj:torch.Tensor): The input tensor.

Returns:
- x (:obj:torch.Tensor): The output tensor after the Swish transformation.
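As a reference for the math, Swish is simply x multiplied by its sigmoid; a minimal scalar sketch (an illustration of the formula, not the module's code):

```python
import math

def swish(x: float) -> float:
    """Swish activation: x * sigmoid(x), smooth and non-monotonic."""
    return x / (1.0 + math.exp(-x))
```

For example, swish(0.0) is exactly 0, and for large positive x the function approaches x itself.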

ResBlock

Bases: Module

Overview

Residual Block with 2D convolution layers, including 3 types:

basic block (input channel: C):
x -> 3*3*C -> norm -> act -> 3*3*C -> norm -> act -> out
\_______________________________________________/ +

bottleneck block:
x -> 1*1*(1/4*C) -> norm -> act -> 3*3*(1/4*C) -> norm -> act -> 1*1*C -> norm -> act -> out
\__________________________________________________________________________/ +

downsample block (used in EfficientZero; input channel: C):
x -> 3*3*C -> norm -> act -> 3*3*C -> norm -> act -> out
\_______________ 3*3*C ______________________________/ +

.. note:: You can refer to Deep Residual Learning for Image Recognition <https://arxiv.org/abs/1512.03385>_ for more details.

Interfaces

__init__, forward

__init__(in_channels, activation=nn.ReLU(), norm_type='BN', res_type='basic', bias=True, out_channels=None)

Overview

Init the 2D convolution residual block.

Arguments:
- in_channels (:obj:int): Number of channels in the input tensor.
- activation (:obj:nn.Module): The optional activation function.
- norm_type (:obj:str): Type of the normalization, default 'BN' (Batch Normalization); supports ['BN', 'LN', 'IN', 'GN', 'SyncBN', None].
- res_type (:obj:str): Type of residual block; supports ['basic', 'bottleneck', 'downsample'].
- bias (:obj:bool): Whether to add a learnable bias to the conv2d_block. Default is True.
- out_channels (:obj:int): Number of channels in the output tensor. Default is None, which means out_channels = in_channels.

forward(x)

Overview

Return the residual block output.

Arguments:
- x (:obj:torch.Tensor): The input tensor.

Returns:
- x (:obj:torch.Tensor): The residual block output tensor.
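The residual pattern itself is easy to see with plain functions standing in for the conv/norm layers; a toy sketch of the skip-connection idea (not the module's API):

```python
def residual_block(x, transform, activation=lambda v: max(v, 0.0)):
    """Apply a transform, add the skip connection, then activate (ReLU here)."""
    return activation(transform(x) + x)
```

For instance, residual_block(2.0, lambda v: -0.5 * v) computes relu(-1.0 + 2.0) = 1.0; even when the transform outputs zero, the input still flows through the skip path.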

ResFCBlock

Bases: Module

Overview

Residual Block with 2 fully connected layers:

x -> fc1 -> norm -> act -> fc2 -> norm -> act -> out
\___________________________________________/ +

Interfaces

__init__, forward

__init__(in_channels, activation=nn.ReLU(), norm_type='BN', dropout=None)

Overview

Init the fully connected layer residual block.

Arguments:
- in_channels (:obj:int): The number of channels in the input tensor.
- activation (:obj:nn.Module): The optional activation function.
- norm_type (:obj:str): The type of the normalization. Default is 'BN'.
- dropout (:obj:float): The dropout rate. Default is None.

forward(x)

Overview

Return the output of the residual block.

Arguments:
- x (:obj:torch.Tensor): The input tensor.

Returns:
- x (:obj:torch.Tensor): The residual block output tensor.

BilinearUpsample

Bases: Module

Overview

This module upsamples the input to the given scale_factor using the bilinear mode.

Interfaces: __init__, forward

__init__(scale_factor)

Overview

Initialize the BilinearUpsample class.

Arguments:
- scale_factor (:obj:Union[float, List[float]]): The multiplier for the spatial size.

forward(x)

Overview

Return the upsampled input tensor.

Arguments:
- x (:obj:torch.Tensor): The input tensor.

Returns:
- upsample (:obj:torch.Tensor): The upsampled input tensor.

NearestUpsample

Bases: Module

Overview

This module upsamples the input to the given scale_factor using the nearest mode.

Interfaces: __init__, forward

__init__(scale_factor)

Overview

Initialize the NearestUpsample class.

Arguments:
- scale_factor (:obj:Union[float, List[float]]): The multiplier for the spatial size.

forward(x)

Overview

Return the upsampled input tensor.

Arguments:
- x (:obj:torch.Tensor): The input tensor.

Returns:
- upsample (:obj:torch.Tensor): The upsampled input tensor.
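For intuition, nearest-mode upsampling just repeats each cell; a list-based sketch for an integer scale factor (the real module delegates to PyTorch's interpolation, so this is only the idea):

```python
def nearest_upsample(grid, scale):
    """Repeat every element `scale` times along both spatial axes."""
    return [
        [v for v in row for _ in range(scale)]  # widen each row
        for row in grid
        for _ in range(scale)                   # then repeat each widened row
    ]
```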

NoiseLinearLayer

Bases: Module

Overview

This is a linear layer with random noise.

Interfaces: __init__, reset_noise, reset_parameters, forward

__init__(in_channels, out_channels, sigma0=0.4)

Overview

Initialize the NoiseLinearLayer class. The enable_noise attribute enables external control over whether noise is applied:
- If enable_noise is True, the layer adds noise even if the module is in evaluation mode.
- If enable_noise is False, no noise is added regardless of self.training.

Arguments:
- in_channels (:obj:int): Number of channels in the input tensor.
- out_channels (:obj:int): Number of channels in the output tensor.
- sigma0 (:obj:float, optional): Default noise volume when initializing NoiseLinearLayer. Default is 0.4.

reset_noise()

Overview

Reset the noise settings in the layer.

reset_parameters()

Overview

Reset the parameters in the layer.

forward(x)

Overview

Perform the forward pass with noise.

Arguments:
- x (:obj:torch.Tensor): The input tensor.

Returns:
- output (:obj:torch.Tensor): The output tensor with noise.
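NoisyNet-style layers typically draw factorised Gaussian noise shaped by f(eps) = sign(eps) * sqrt(|eps|); a sketch of that transform (an assumption about this layer's internals, following the common NoisyNet recipe rather than this exact implementation):

```python
import math
import random

def factorised_noise(size: int):
    """Sample Gaussian noise and apply f(eps) = sign(eps) * sqrt(|eps|)."""
    eps = [random.gauss(0.0, 1.0) for _ in range(size)]
    return [math.copysign(math.sqrt(abs(e)), e) for e in eps]
```

In the factorised scheme, one such vector per input and one per output are combined via an outer product to form the weight-noise matrix cheaply.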

SoftArgmax

Bases: Module

Overview

A neural network module that computes the SoftArgmax operation (essentially a 2-dimensional spatial softmax), which is often used for location regression tasks. It converts a feature map (such as a heatmap) into precise coordinate locations.

Interfaces: __init__, forward

.. note:: For more information on SoftArgmax, you can refer to https://en.wikipedia.org/wiki/Softmax_function and the paper https://arxiv.org/pdf/1504.00702.pdf.

__init__()

Overview

Initialize the SoftArgmax module.

forward(x)

Overview

Perform the forward pass of the SoftArgmax operation.

Arguments:
- x (:obj:torch.Tensor): The input tensor, typically a heatmap representing predicted locations.

Returns:
- location (:obj:torch.Tensor): The predicted coordinates as a result of the SoftArgmax operation.

Shapes:
- x: :math:(B, C, H, W), where B is the batch size, C is the number of channels, and H and W represent height and width respectively.
- location: :math:(B, 2), where B is the batch size and 2 represents the coordinates (height, width).
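The spatial-softmax idea can be sketched in plain Python: softmax over all cells of a heatmap, then the probability-weighted average of the (y, x) indices. (A sketch of the math only, not the module's code.)

```python
import math

def soft_argmax(heatmap):
    """2-D spatial softmax followed by expected (y, x) coordinates."""
    flat = [v for row in heatmap for v in row]
    m = max(flat)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in flat]
    total = sum(exps)
    w = len(heatmap[0])
    ey = sum(p * (i // w) for i, p in enumerate(exps)) / total
    ex = sum(p * (i % w) for i, p in enumerate(exps)) / total
    return ey, ex
```

Unlike a hard argmax, the result is differentiable and can fall between grid cells.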

Transformer

Bases: Module

Overview

Implementation of the Transformer model.

.. note:: For more details, refer to "Attention is All You Need": http://arxiv.org/abs/1706.03762.

Interfaces

__init__, forward

__init__(input_dim, head_dim=128, hidden_dim=1024, output_dim=256, head_num=2, mlp_num=2, layer_num=3, dropout_ratio=0.0, activation=nn.ReLU())

Overview

Initialize the Transformer with the provided dimensions, dropout layer, activation function, and layer numbers.

Arguments:
- input_dim (:obj:int): The dimension of the input.
- head_dim (:obj:int): The dimension of each head in the multi-head attention mechanism.
- hidden_dim (:obj:int): The dimension of the hidden layer in the MLP (Multi-Layer Perceptron).
- output_dim (:obj:int): The dimension of the output.
- head_num (:obj:int): The number of heads in the multi-head attention mechanism.
- mlp_num (:obj:int): The number of layers in the MLP.
- layer_num (:obj:int): The number of Transformer layers.
- dropout_ratio (:obj:float): The dropout ratio for the dropout layer.
- activation (:obj:nn.Module): The activation function used in the MLP.

forward(x, mask=None)

Overview

Perform the forward pass through the Transformer.

Arguments:
- x (:obj:torch.Tensor): The input tensor, with shape (B, N, C), where B is the batch size, N is the number of entries, and C is the feature dimension.
- mask (:obj:Optional[torch.Tensor], optional): The boolean mask tensor, used to mask out invalid entries in attention. It has shape (B, N), where B is the batch size and N is the number of entries. Defaults to None.

Returns:
- x (:obj:torch.Tensor): The output tensor from the Transformer.

ScaledDotProductAttention

Bases: Module

Overview

Implementation of Scaled Dot Product Attention, a key component of Transformer models. This class performs the dot product of the query, key and value tensors, scales it with the square root of the dimension of the key vector (d_k) and applies dropout for regularization.

Interfaces: __init__, forward

__init__(d_k, dropout=0.0)

Overview

Initialize the ScaledDotProductAttention module with the dimension of the key vector and the dropout rate.

Arguments:
- d_k (:obj:int): The dimension of the key vector. This will be used to scale the dot product of the query and key.
- dropout (:obj:float, optional): The dropout rate to be applied after the softmax operation. Defaults to 0.0.

forward(q, k, v, mask=None)

Overview

Perform the Scaled Dot Product Attention operation on the query, key and value tensors.

Arguments:
- q (:obj:torch.Tensor): The query tensor.
- k (:obj:torch.Tensor): The key tensor.
- v (:obj:torch.Tensor): The value tensor.
- mask (:obj:Optional[torch.Tensor]): An optional mask tensor to be applied on the attention scores. Defaults to None.

Returns:
- output (:obj:torch.Tensor): The output tensor after the attention operation.
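The computation reduces to softmax(q·k / sqrt(d_k)) used as weights over the values; a single-query sketch in plain Python (dropout and masking omitted):

```python
import math

def scaled_dot_product_attention(q, K, V):
    """One query q against key rows K and value rows V."""
    d_k = len(q)
    # Scaled dot products between the query and each key.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
    # Numerically stable softmax over the scores.
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    weights = [w / z for w in weights]
    # Attention output: weighted sum of the value rows.
    return [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
```

When all keys match the query equally, the weights are uniform and the output is simply the mean of the values.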

ScatterConnection

Bases: Module

Overview

Scatter features to their corresponding locations. In AlphaStar, each entity is embedded into a tensor, and these tensors are scattered onto a feature map of a given spatial size.

Interfaces: __init__, forward, xy_forward

__init__(scatter_type)

Overview

Initialize the ScatterConnection object.

Arguments:
- scatter_type (:obj:str): The scatter type, which decides the behavior when two entities share the same location. It can be either 'add' or 'cover'. If 'add', the first one is added to the second one; if 'cover', the first one is covered by the second one.

forward(x, spatial_size, location)

Overview

Scatter input tensor 'x' into a spatial feature map.

Arguments:
- x (:obj:torch.Tensor): The input tensor of shape (B, M, N), where B is the batch size, M is the number of entities, and N is the dimension of entity attributes.
- spatial_size (:obj:Tuple[int, int]): The size (H, W) of the spatial feature map into which x will be scattered, where H is the height and W is the width.
- location (:obj:torch.Tensor): The tensor of locations of shape (B, M, 2). Each location should be (y, x).

Returns:
- output (:obj:torch.Tensor): The scattered feature map of shape (B, N, H, W).

.. note:: When some locations overlap, 'cover' mode results in a loss of information, so 'add' mode is used as a temporary substitute.
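The two collision policies are easy to see in a list-based sketch for a single batch element (an illustration of the semantics, not the tensorized implementation):

```python
def scatter(entities, locations, spatial_size, mode="add"):
    """Scatter N-dim entity vectors onto an (H, W) grid of N-dim cells."""
    h, w = spatial_size
    n = len(entities[0])
    out = [[[0.0] * n for _ in range(w)] for _ in range(h)]
    for vec, (y, x) in zip(entities, locations):
        for j, v in enumerate(vec):
            if mode == "add":
                out[y][x][j] += v   # colliding entities accumulate
            else:  # "cover"
                out[y][x][j] = v    # later entities overwrite earlier ones
    return out
```

With two entities at the same cell, 'add' keeps information from both, while 'cover' silently drops the first one.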

xy_forward(x, spatial_size, coord_x, coord_y)

Overview

Scatter input tensor 'x' into a spatial feature map using separate x and y coordinates.

Arguments:
- x (:obj:torch.Tensor): The input tensor of shape (B, M, N), where B is the batch size, M is the number of entities, and N is the dimension of entity attributes.
- spatial_size (:obj:Tuple[int, int]): The size (H, W) of the spatial feature map into which x will be scattered, where H is the height and W is the width.
- coord_x (:obj:torch.Tensor): The x-coordinates tensor of shape (B, M).
- coord_y (:obj:torch.Tensor): The y-coordinates tensor of shape (B, M).

Returns:
- output (:obj:torch.Tensor): The scattered feature map of shape (B, N, H, W).

.. note:: When some locations overlap, 'cover' mode results in a loss of information, so 'add' mode is used as a temporary substitute.

ResNet

Bases: Module

Overview

Implements ResNet, ResNeXt, SE-ResNeXt, and SENet models. This implementation supports various modifications based on the v1c, v1d, v1e, and v1s variants included in the MXNet Gluon ResNetV1b model. For more details about the variants and options, please refer to the 'Bag of Tricks' paper: https://arxiv.org/pdf/1812.01187.

Interfaces: __init__, init_weights, get_classifier, reset_classifier, forward_features, forward

__init__(block, layers, num_classes=1000, in_chans=3, cardinality=1, base_width=64, stem_width=64, stem_type='', replace_stem_pool=False, output_stride=32, block_reduce_first=1, down_kernel_size=1, avg_down=False, act_layer=nn.ReLU, norm_layer=nn.BatchNorm2d, aa_layer=None, drop_rate=0.0, drop_path_rate=0.0, drop_block_rate=0.0, global_pool='avg', zero_init_last_bn=True, block_args=None)

Overview

Initialize the ResNet model with given block, layers and other configuration options.

Arguments:
- block (:obj:nn.Module): Class for the residual block.
- layers (:obj:List[int]): Numbers of layers in each block.
- num_classes (:obj:int, optional): Number of classification classes. Default is 1000.
- in_chans (:obj:int, optional): Number of input (color) channels. Default is 3.
- cardinality (:obj:int, optional): Number of convolution groups for 3x3 conv in Bottleneck. Default is 1.
- base_width (:obj:int, optional): Factor determining bottleneck channels. Default is 64.
- stem_width (:obj:int, optional): Number of channels in stem convolutions. Default is 64.
- stem_type (:obj:str, optional): The type of stem. Default is ''.
- replace_stem_pool (:obj:bool, optional): Whether to replace stem pooling. Default is False.
- output_stride (:obj:int, optional): Output stride of the network. Default is 32.
- block_reduce_first (:obj:int, optional): Reduction factor for first convolution output width of residual blocks. Default is 1.
- down_kernel_size (:obj:int, optional): Kernel size of residual block downsampling path. Default is 1.
- avg_down (:obj:bool, optional): Whether to use average pooling for the projection skip connection between stages/downsample. Default is False.
- act_layer (:obj:nn.Module, optional): Activation layer. Default is nn.ReLU.
- norm_layer (:obj:nn.Module, optional): Normalization layer. Default is nn.BatchNorm2d.
- aa_layer (:obj:Optional[nn.Module], optional): Anti-aliasing layer. Default is None.
- drop_rate (:obj:float, optional): Dropout probability before classifier, for training. Default is 0.0.
- drop_path_rate (:obj:float, optional): Drop path rate. Default is 0.0.
- drop_block_rate (:obj:float, optional): Drop block rate. Default is 0.0.
- global_pool (:obj:str, optional): Global pooling type. Default is 'avg'.
- zero_init_last_bn (:obj:bool, optional): Whether to initialize the last batch normalization with zero. Default is True.
- block_args (:obj:Optional[dict], optional): Additional arguments for block. Default is None.

init_weights(zero_init_last_bn=True)

Overview

Initialize the weights in the model.

Arguments:
- zero_init_last_bn (:obj:bool, optional): Whether to initialize the last batch normalization with zero. Default is True.

get_classifier()

Overview

Get the classifier module from the model.

Returns:
- classifier (:obj:nn.Module): The classifier module in the model.

reset_classifier(num_classes, global_pool='avg')

Overview

Reset the classifier with a new number of classes and pooling type.

Arguments:
- num_classes (:obj:int): New number of classification classes.
- global_pool (:obj:str, optional): New global pooling type. Default is 'avg'.

forward_features(x)

Overview

Forward pass through the feature layers of the model.

Arguments:
- x (:obj:torch.Tensor): The input tensor.

Returns:
- x (:obj:torch.Tensor): The output tensor after passing through the feature layers.

forward(x)

Overview

Full forward pass through the model.

Arguments:
- x (:obj:torch.Tensor): The input tensor.

Returns:
- x (:obj:torch.Tensor): The output tensor after passing through the model.

GumbelSoftmax

Bases: Module

Overview

An nn.Module that computes GumbelSoftmax.

Interfaces: __init__, forward, gumbel_softmax_sample

.. note:: For more information on GumbelSoftmax, refer to the paper Categorical Reparameterization with Gumbel-Softmax.

__init__()

Overview

Initialize the GumbelSoftmax module.

gumbel_softmax_sample(x, temperature, eps=1e-08)

Overview

Draw a sample from the Gumbel-Softmax distribution.

Arguments:
- x (:obj:torch.Tensor): Input tensor.
- temperature (:obj:float): Non-negative scalar controlling the sharpness of the distribution.
- eps (:obj:float): Small number to prevent division by zero. Default is 1e-8.

Returns:
- output (:obj:torch.Tensor): Sample from the Gumbel-Softmax distribution.

forward(x, temperature=1.0, hard=False)

Overview

Forward pass for the GumbelSoftmax module.

Arguments:
- x (:obj:torch.Tensor): Unnormalized log-probabilities.
- temperature (:obj:float): Non-negative scalar controlling the sharpness of the distribution.
- hard (:obj:bool): If True, returns one-hot encoded labels. Default is False.

Returns:
- output (:obj:torch.Tensor): Sample from the Gumbel-Softmax distribution.

Shapes:
- x: :math:(B, N), where B is the batch size and N is the number of classes.
- output: :math:(B, N), where B is the batch size and N is the number of classes.
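The sampling step can be sketched in plain Python: perturb each logit with Gumbel(0, 1) noise (via -log(-log(u)) for uniform u), then apply a temperature softmax. A sketch of the standard trick, which this module's gumbel_softmax_sample presumably follows:

```python
import math
import random

def gumbel_softmax_sample(logits, temperature=1.0, eps=1e-8):
    """Perturb logits with Gumbel(0, 1) noise, then temperature softmax."""
    noisy = []
    for l in logits:
        # Clamp u away from 0 and 1 so both logs stay finite.
        u = min(max(random.random(), eps), 1.0 - eps)
        noisy.append((l - math.log(-math.log(u))) / temperature)
    m = max(noisy)  # stable softmax
    exps = [math.exp(v - m) for v in noisy]
    z = sum(exps)
    return [e / z for e in exps]
```

Lower temperatures push the sample toward a one-hot vector while keeping it differentiable with respect to the logits.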

GTrXL

Bases: Module

Overview

GTrXL Transformer implementation as described in "Stabilizing Transformer for Reinforcement Learning" (https://arxiv.org/abs/1910.06764).

Interfaces: __init__, forward, reset_memory, get_memory

__init__(input_dim, head_dim=128, embedding_dim=256, head_num=2, mlp_num=2, layer_num=3, memory_len=64, dropout_ratio=0.0, activation=nn.ReLU(), gru_gating=True, gru_bias=2.0, use_embedding_layer=True)

Overview

Init GTrXL Model.

Arguments:
- input_dim (:obj:int): The dimension of the input observation.
- head_dim (:obj:int, optional): The dimension of each head. Default is 128.
- embedding_dim (:obj:int, optional): The dimension of the embedding. Default is 256.
- head_num (:obj:int, optional): The number of heads for multi-head attention. Default is 2.
- mlp_num (:obj:int, optional): The number of MLP layers in the attention layer. Default is 2.
- layer_num (:obj:int, optional): The number of transformer layers. Default is 3.
- memory_len (:obj:int, optional): The length of memory. Default is 64.
- dropout_ratio (:obj:float, optional): The dropout ratio. Default is 0.
- activation (:obj:nn.Module, optional): The activation function. Default is nn.ReLU().
- gru_gating (:obj:bool, optional): If False, replace GRU gates with residual connections. Default is True.
- gru_bias (:obj:float, optional): The GRU gate bias. Default is 2.0.
- use_embedding_layer (:obj:bool, optional): If False, don't use an input embedding layer. Default is True.

Raises:
- AssertionError: If embedding_dim is not an even number.

reset_memory(batch_size=None, state=None)

Overview

Clear or set the memory of GTrXL.

Arguments:
- batch_size (:obj:Optional[int]): The batch size. Default is None.
- state (:obj:Optional[torch.Tensor]): The input memory with shape (layer_num, memory_len, bs, embedding_dim). Default is None.

get_memory()

Overview

Returns the memory of GTrXL.

Returns:
- memory (:obj:Optional[torch.Tensor]): The output memory, or None if the memory has not been initialized. The shape is (layer_num, memory_len, bs, embedding_dim).

forward(x, batch_first=False, return_mem=True)

Overview

Performs a forward pass on the GTrXL.

Arguments:
- x (:obj:torch.Tensor): The input tensor with shape (seq_len, bs, input_size).
- batch_first (:obj:bool, optional): If the input data has shape (bs, seq_len, input_size), set this parameter to True to transpose the first and second dimensions and obtain shape (seq_len, bs, input_size). This does not affect the output memory. Default is False.
- return_mem (:obj:bool, optional): If False, return only the output tensor without the dict. Default is True.

Returns:
- x (:obj:Dict[str, torch.Tensor]): A dictionary containing the transformer output of shape (seq_len, bs, embedding_size) and memory of shape (layer_num, seq_len, bs, embedding_size).

GRUGatingUnit

Bases: Module

Overview

The GRUGatingUnit module implements the GRU gating mechanism used in the GTrXL model.

Interfaces: __init__, forward

__init__(input_dim, bg=2.0)

Overview

Initialize the GRUGatingUnit module.

Arguments:
- input_dim (:obj:int): The dimensionality of the input.
- bg (:obj:float): The gate bias. By setting bg > 0 we can explicitly initialize the gating mechanism to be close to the identity map. This can greatly improve learning speed and stability, since it initializes the agent close to a Markovian policy (ignoring attention at the beginning).

forward(x, y)

Overview

Compute the output value using the GRU gating mechanism.

Arguments:
- x (:obj:torch.Tensor): The first input tensor.
- y (:obj:torch.Tensor): The second input tensor. x and y should have the same shape, and their last dimension should match input_dim.

Returns:
- g (:obj:torch.Tensor): The output of the GRU gating mechanism. The shape of g matches the shapes of x and y.
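Why bg > 0 biases the unit toward the identity map can be seen in a deliberately simplified scalar caricature (the real unit learns full GRU-style reset/update gates; the weights wz and uz here are made-up placeholders):

```python
import math

def gru_gate(x, y, wz=1.0, uz=1.0, bg=2.0):
    """Convex mix of skip input x and new branch y, gated by z.

    A large bias bg pushes z toward 0, so the output starts close to x
    (the identity map), which is the stabilisation trick used in GTrXL.
    """
    z = 1.0 / (1.0 + math.exp(-(wz * y + uz * x - bg)))
    return (1.0 - z) * x + z * y
```

Because the output is a convex combination of x and y, it always lies between them; increasing bg shifts it toward x at initialization.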

PopArt

Bases: Module

Overview

A linear layer with popart normalization. This class implements a linear transformation followed by PopArt normalization, which is a method to automatically adapt the contribution of each task to the agent's updates in multi-task learning, as described in the paper https://arxiv.org/abs/1809.04474.

Interfaces

__init__, reset_parameters, forward, update_parameters

__init__(input_features=None, output_features=None, beta=0.5)

Overview

Initialize the class with input features, output features, and the beta parameter.

Arguments:
- input_features (:obj:Union[int, None]): The size of each input sample.
- output_features (:obj:Union[int, None]): The size of each output sample.
- beta (:obj:float): The parameter for the moving average.

reset_parameters()

Overview

Reset the parameters including weights and bias using kaiming_uniform_ and uniform_ initialization.

forward(x)

Overview

Implement the forward computation of the linear layer and return both the output and the normalized output of the layer.

Arguments:
- x (:obj:torch.Tensor): Input tensor which is to be normalized.

Returns:
- output (:obj:Dict[str, torch.Tensor]): A dictionary containing 'pred' and 'unnormalized_pred'.

update_parameters(value)

Overview

Update the normalization parameters based on the given value and return the new mean and standard deviation after the update.

Arguments:
- value (:obj:torch.Tensor): The tensor to be used for updating parameters.

Returns:
- update_results (:obj:Dict[str, torch.Tensor]): A dictionary containing 'new_mean' and 'new_std'.
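The running statistics behind PopArt can be sketched as exponential moving averages of the value and its square (a sketch of the statistics update only; the real layer also rescales its weight and bias so that unnormalized predictions are preserved across the update):

```python
def popart_stats_update(mean, mean_sq, values, beta=0.5):
    """EMA update of the mean and second moment; std derives from them."""
    for v in values:
        mean = (1.0 - beta) * mean + beta * v
        mean_sq = (1.0 - beta) * mean_sq + beta * v * v
    # Variance floor keeps std well defined early in training.
    std = max(mean_sq - mean * mean, 1e-8) ** 0.5
    return mean, std
```

Feeding a constant target repeatedly drives the mean toward that constant and the std toward the floor.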

GatingType

Bases: Enum

Overview

Enum class defining different types of tensor gating and aggregation in modules.

SumMerge

Bases: Module

Overview

A PyTorch module that merges a list of tensors by computing their sum. All input tensors must have the same size. This module can work with any type of tensor (vector, units or visual).

Interfaces: __init__, forward

forward(tensors)

Overview

Forward pass of the SumMerge module, which sums the input tensors.

Arguments:
- tensors (:obj:List[Tensor]): List of input tensors to be summed. All tensors must have the same size.

Returns:
- summed (:obj:Tensor): Tensor resulting from the sum of all input tensors.

VectorMerge

Bases: Module

Overview

Merges multiple vector streams. Streams are first transformed through layer normalization, relu, and linear layers, then summed. They don't need to have the same size. Gating can also be used before the sum.

Interfaces: __init__, encode, _compute_gate, forward

.. note:: For more details about the gating types, please refer to the GatingType enum class.

__init__(input_sizes, output_size, gating_type=GatingType.NONE, use_layer_norm=True)

Overview

Initialize the VectorMerge module.

Arguments:
- input_sizes (:obj:Dict[str, int]): A dictionary mapping input names to their sizes. The size is a single integer for 1D inputs, or None for 0D inputs. If an input size is None, we assume it is ().
- output_size (:obj:int): The size of the output vector.
- gating_type (:obj:GatingType): The type of gating mechanism to use. Default is GatingType.NONE.
- use_layer_norm (:obj:bool): Whether to use layer normalization. Default is True.

encode(inputs)

Overview

Encode the input tensors using layer normalization, relu, and linear transformations.

Arguments:
- inputs (:obj:Dict[str, Tensor]): The input tensors.

Returns:
- gates (:obj:List[Tensor]): The gate tensors after transformations.
- outputs (:obj:List[Tensor]): The output tensors after transformations.

forward(inputs)

Overview

Forward pass through the VectorMerge module.

Arguments:
- inputs (:obj:Dict[str, Tensor]): The input tensors.

Returns:
- output (:obj:Tensor): The output tensor after passing through the module.

LabelSmoothCELoss

Bases: Module

Overview

Label smooth cross entropy loss.

Interfaces: __init__, forward.

__init__(ratio)

Overview

Initialize the LabelSmoothCELoss object using the given arguments.

Arguments:
- ratio (:obj:float): The label-smoothing ratio (a value in [0, 1]). A larger ratio means a greater extent of label smoothing.

forward(logits, labels)

Overview

Calculate label smooth cross entropy loss.

Arguments:
- logits (:obj:torch.Tensor): Predicted logits.
- labels (:obj:torch.LongTensor): Ground truth.

Returns:
- loss (:obj:torch.Tensor): Calculated loss.
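One common formulation spreads ratio uniformly over all classes and puts the remaining 1 - ratio on the true class; conventions differ, so treat this exact target construction as an assumption rather than this class's implementation. A scalar sketch:

```python
import math

def label_smooth_ce(logits, label, ratio):
    """Cross entropy against a label-smoothed target distribution."""
    n = len(logits)
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    log_probs = [v - log_z for v in logits]
    # Smoothed target: ratio spread uniformly, 1 - ratio on the true class.
    target = [ratio / n] * n
    target[label] += 1.0 - ratio
    return -sum(t * lp for t, lp in zip(target, log_probs))
```

With ratio = 0 this reduces to ordinary cross entropy.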

SoftFocalLoss

Bases: Module

Overview

Soft focal loss.

Interfaces: __init__, forward.

__init__(gamma=2, weight=None, size_average=True, reduce=None)

Overview

Initialize the SoftFocalLoss object using the given arguments.

Arguments:
- gamma (:obj:int): The extent of focus on hard samples. A smaller gamma leads to more focus on easy samples, while a larger gamma leads to more focus on hard samples.
- weight (:obj:Any): The weight for the loss of each class.
- size_average (:obj:bool): By default, the losses are averaged over each loss element in the batch. Note that for some losses, there are multiple elements per sample. If size_average is set to False, the losses are instead summed for each minibatch. Ignored when reduce is False.
- reduce (:obj:Optional[bool]): By default, the losses are averaged or summed over observations for each minibatch depending on size_average. When reduce is False, returns a loss per batch element instead and ignores size_average.

forward(inputs, targets)

Overview

Calculate soft focal loss.

Arguments:
- inputs (:obj:torch.Tensor): Predicted logits.
- targets (:obj:torch.LongTensor): Ground truth.

Returns:
- loss (:obj:torch.Tensor): Calculated loss.

MultiLogitsLoss

Bases: Module

Overview

Loss for matching multiple logits with multiple labels, using cross entropy or label-smoothed cross entropy as the underlying criterion.

Interfaces: __init__, forward.

__init__(criterion=None, smooth_ratio=0.1)

Overview

Initialization method, use cross_entropy as default criterion.

Arguments:
- criterion (:obj:str): Criterion type; supports ['cross_entropy', 'label_smooth_ce'].
- smooth_ratio (:obj:float): Smoothing ratio for label smoothing.

forward(logits, labels)

Overview

Calculate multiple logits loss.

Arguments:
- logits (:obj:torch.Tensor): Predicted logits, whose shape must be 2-dim, like (B, N).
- labels (:obj:torch.LongTensor): Ground truth.

Returns:
- loss (:obj:torch.Tensor): Calculated loss.

ContrastiveLoss

Bases: Module

Overview

The class for contrastive learning losses. Only InfoNCE loss is supported currently. Code Reference: https://github.com/rdevon/DIM. Paper Reference: https://arxiv.org/abs/1808.06670.

Interfaces: __init__, forward.

__init__(x_size, y_size, heads=[1, 1], encode_shape=64, loss_type='infoNCE', temperature=1.0)

Overview

Initialize the ContrastiveLoss object using the given arguments.

Arguments:
- x_size (:obj:Union[int, SequenceType]): Input shape for x; both the obs shape and the encoding shape are supported.
- y_size (:obj:Union[int, SequenceType]): Input shape for y; both the obs shape and the encoding shape are supported.
- heads (:obj:SequenceType): A list of 2 int elements, heads[0] for x and heads[1] for y. Used in the multi-head, global-local, local-local MI maximization process.
- encode_shape (:obj:Union[int, SequenceType]): The dimension of the encoder hidden state.
- loss_type (:obj:str): Only the InfoNCE loss is available now.
- temperature (:obj:float): The parameter to adjust the log_softmax.

forward(x, y)

Overview

Computes the noise contrastive estimation-based loss, a.k.a. infoNCE.

Arguments:
- x (:obj:torch.Tensor): The input x; both raw obs and encoding are supported.
- y (:obj:torch.Tensor): The input y; both raw obs and encoding are supported.

Returns:
- loss (:obj:torch.Tensor): The calculated loss value.

Examples:
>>> x_dim = [3, 16]
>>> encode_shape = 16
>>> x = np.random.normal(0, 1, size=x_dim)
>>> y = x ** 2 + 0.01 * np.random.normal(0, 1, size=x_dim)
>>> estimator = ContrastiveLoss(x_dim, x_dim, encode_shape=encode_shape)
>>> loss = estimator.forward(x, y)

Examples:
>>> x_dim = [3, 1, 16, 16]
>>> encode_shape = 16
>>> x = np.random.normal(0, 1, size=x_dim)
>>> y = x ** 2 + 0.01 * np.random.normal(0, 1, size=x_dim)
>>> estimator = ContrastiveLoss(x_dim, x_dim, encode_shape=encode_shape)
>>> loss = estimator.forward(x, y)

build_activation(activation, inplace=None)

Overview

Build and return the activation module according to the given type.

Arguments:
- activation (:obj:str): The type of activation module; now supports ['relu', 'glu', 'prelu', 'swish', 'gelu', 'tanh', 'sigmoid', 'softplus', 'elu', 'square', 'identity'].
- inplace (:obj:Optional[bool]): Execute the operation in-place in the activation. Defaults to None.

Returns:
- act_func (:obj:nn.Module): The corresponding activation module.

fc_block(in_channels, out_channels, activation=None, norm_type=None, use_dropout=False, dropout_probability=0.5)

Overview

Create a fully-connected block with activation, normalization, and dropout. Optional normalization can be done to dim 1 (across the channels).

x -> fc -> norm -> act -> dropout -> out

Arguments:
- in_channels (:obj:int): Number of channels in the input tensor.
- out_channels (:obj:int): Number of channels in the output tensor.
- activation (:obj:nn.Module, optional): The optional activation function.
- norm_type (:obj:str, optional): Type of the normalization.
- use_dropout (:obj:bool, optional): Whether to use dropout in the fully-connected block. Default is False.
- dropout_probability (:obj:float, optional): Probability of an element to be zeroed in the dropout. Default is 0.5.

Returns:
- block (:obj:nn.Sequential): A sequential list containing the torch layers of the fully-connected block.

.. note::

You can refer to nn.linear (https://pytorch.org/docs/master/generated/torch.nn.Linear.html).

conv2d_block(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, pad_type='zero', activation=None, norm_type=None, num_groups_for_gn=1, bias=True)

Overview

Create a 2-dimensional convolution layer with activation and normalization.

Arguments:
- in_channels (:obj:int): Number of channels in the input tensor.
- out_channels (:obj:int): Number of channels in the output tensor.
- kernel_size (:obj:int): Size of the convolving kernel.
- stride (:obj:int, optional): Stride of the convolution. Default is 1.
- padding (:obj:int, optional): Zero-padding added to both sides of the input. Default is 0.
- dilation (:obj:int): Spacing between kernel elements.
- groups (:obj:int, optional): Number of blocked connections from input channels to output channels. Default is 1.
- pad_type (:obj:str, optional): The way to add padding; one of ['zero', 'reflect', 'replicate']. Default is 'zero'.
- activation (:obj:nn.Module): The optional activation function.
- norm_type (:obj:str): The type of the normalization; now supports ['BN', 'LN', 'IN', 'GN', 'SyncBN']. Default is None, which means no normalization.
- num_groups_for_gn (:obj:int): Number of groups for GroupNorm.
- bias (:obj:bool): Whether to add a learnable bias to the nn.Conv2d. Default is True.

Returns:
- block (:obj:nn.Sequential): A sequential list containing the torch layers of the 2-dimensional convolution layer.

.. note:: Conv2d (https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html#torch.nn.Conv2d)

one_hot(val, num, num_first=False)

Overview

Convert a torch.LongTensor to one-hot encoding. This implementation can be slightly faster than torch.nn.functional.one_hot.

Arguments:
- val (:obj:torch.LongTensor): Each element contains the state to be encoded; the range should be [0, num-1].
- num (:obj:int): Number of states of the one-hot encoding.
- num_first (:obj:bool, optional): If False, the one-hot encoding is added as the last dimension; otherwise, it is added as the first dimension. Default is False.

Returns:
- one_hot (:obj:torch.FloatTensor): The one-hot encoded tensor.

Example:
>>> one_hot(2 * torch.ones([2, 2]).long(), 3)
tensor([[[0., 0., 1.],
         [0., 0., 1.]],
        [[0., 0., 1.],
         [0., 0., 1.]]])
>>> one_hot(2 * torch.ones([2, 2]).long(), 3, num_first=True)
tensor([[[0., 0.], [1., 0.]],
        [[0., 1.], [0., 0.]],
        [[1., 0.], [0., 1.]]])

deconv2d_block(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, activation=None, norm_type=None)

Overview

Create a 2-dimensional transpose convolution layer with activation and normalization.

Arguments:

- in_channels (:obj:int): Number of channels in the input tensor.
- out_channels (:obj:int): Number of channels in the output tensor.
- kernel_size (:obj:int): Size of the convolving kernel.
- stride (:obj:int, optional): Stride of the convolution. Default is 1.
- padding (:obj:int, optional): Zero-padding added to both sides of the input. Default is 0.
- output_padding (:obj:int, optional): Additional size added to one side of the output shape. Default is 0.
- groups (:obj:int, optional): Number of blocked connections from input channels to output channels. Default is 1.
- activation (:obj:nn.Module, optional): The optional activation function.
- norm_type (:obj:str, optional): Type of the normalization.

Returns:

- block (:obj:nn.Sequential): A sequential list containing the torch layers of the 2-dimensional transpose convolution layer.

.. note::

ConvTranspose2d (https://pytorch.org/docs/master/generated/torch.nn.ConvTranspose2d.html)

binary_encode(y, max_val)

Overview

Convert elements in a tensor to its binary representation.

Arguments:

- y (:obj:torch.Tensor): The tensor to be converted into its binary representation.
- max_val (:obj:torch.Tensor): The maximum value of the elements in the tensor.

Returns:

- binary (:obj:torch.Tensor): The input tensor in its binary representation.

Example:

    >>> binary_encode(torch.tensor([3, 2]), torch.tensor(8))
    tensor([[0, 0, 1, 1],
            [0, 0, 1, 0]])
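The bit layout used in the example above can be sketched in plain Python. `encode_bits` is a hypothetical helper for illustration only; the real function operates on torch tensors and encodes every element of the input at once.

```python
def encode_bits(y: int, max_val: int) -> list:
    """Return the binary representation of ``y``, most significant bit
    first, using as many bits as ``max_val`` needs (8 -> 4 bits)."""
    num_bits = max_val.bit_length()
    return [(y >> i) & 1 for i in range(num_bits - 1, -1, -1)]

print(encode_bits(3, 8))  # [0, 0, 1, 1]
print(encode_bits(2, 8))  # [0, 0, 1, 0]
```

This mirrors the docstring example row by row: 3 becomes `0011` and 2 becomes `0010` in 4 bits.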

noise_block(in_channels, out_channels, activation=None, norm_type=None, use_dropout=False, dropout_probability=0.5, sigma0=0.4)

Overview

Create a fully-connected noise layer with activation, normalization, and dropout. Optional normalization is applied over dim 1 (across the channels).

Arguments:

- in_channels (:obj:int): Number of channels in the input tensor.
- out_channels (:obj:int): Number of channels in the output tensor.
- activation (:obj:nn.Module, optional): The optional activation function. Default is None.
- norm_type (:obj:str, optional): Type of normalization. Default is None.
- use_dropout (:obj:bool, optional): Whether to use dropout in the fully-connected block.
- dropout_probability (:obj:float, optional): Probability of an element to be zeroed in the dropout. Default is 0.5.
- sigma0 (:obj:float, optional): The default noise volume used when initializing NoiseLinearLayer. Default is 0.4.

Returns:

- block (:obj:nn.Sequential): A sequential list containing the torch layers of the fully-connected block.

MLP(in_channels, hidden_channels, out_channels, layer_num, layer_fn=None, activation=None, norm_type=None, use_dropout=False, dropout_probability=0.5, output_activation=True, output_norm=True, last_linear_layer_init_zero=False)

Overview

Create a multi-layer perceptron using fully-connected blocks with activation, normalization, and dropout. Optional normalization is applied over dim 1 (across the channels). Each block follows the pattern: x -> fc -> norm -> act -> dropout -> out.

Arguments:

- in_channels (:obj:int): Number of channels in the input tensor.
- hidden_channels (:obj:int): Number of channels in the hidden tensor.
- out_channels (:obj:int): Number of channels in the output tensor.
- layer_num (:obj:int): Number of layers.
- layer_fn (:obj:Callable, optional): Layer function.
- activation (:obj:nn.Module, optional): The optional activation function.
- norm_type (:obj:str, optional): The type of the normalization.
- use_dropout (:obj:bool, optional): Whether to use dropout in the fully-connected block. Default is False.
- dropout_probability (:obj:float, optional): Probability of an element to be zeroed in the dropout. Default is 0.5.
- output_activation (:obj:bool, optional): Whether to use activation in the output layer. If True, the same activation as the front layers is used. Default is True.
- output_norm (:obj:bool, optional): Whether to use normalization in the output layer. If True, the same normalization as the front layers is used. Default is True.
- last_linear_layer_init_zero (:obj:bool, optional): Whether to use zero initialization for the last linear layer (including w and b), which can provide stable zero outputs at the beginning, usually used in the policy network in RL settings.

Returns:

- block (:obj:nn.Sequential): A sequential list containing the torch layers of the multi-layer perceptron.

.. note:: You can refer to nn.Linear (https://pytorch.org/docs/master/generated/torch.nn.Linear.html).
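As a sketch of how the channel sizes chain together, assuming `layer_num` counts the fully-connected layers (first layer maps in -> hidden, middle layers hidden -> hidden, last layer hidden -> out). `mlp_channel_plan` is a hypothetical helper for illustration, not part of the API.

```python
def mlp_channel_plan(in_channels, hidden_channels, out_channels, layer_num):
    """Return the (in, out) channel pair of each fully-connected layer
    in an MLP of ``layer_num`` layers, under the assumption stated above."""
    dims = [in_channels] + [hidden_channels] * (layer_num - 1) + [out_channels]
    return list(zip(dims[:-1], dims[1:]))

print(mlp_channel_plan(64, 128, 10, 3))  # [(64, 128), (128, 128), (128, 10)]
```

With `layer_num=1` the plan degenerates to a single in -> out layer, which is why `hidden_channels` is unused in that case.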

normed_linear(in_features, out_features, bias=True, device=None, dtype=None, scale=1.0)

Overview

Create a nn.Linear module but with normalized fan-in init.

Arguments:

- in_features (:obj:int): Number of features in the input tensor.
- out_features (:obj:int): Number of features in the output tensor.
- bias (:obj:bool, optional): Whether to add a learnable bias to the nn.Linear. Default is True.
- device (:obj:torch.device, optional): The device to put the created module on. Default is None.
- dtype (:obj:torch.dtype, optional): The desired data type of the created module. Default is None.
- scale (:obj:float, optional): The scale factor for initialization. Default is 1.0.

Returns:

- out (:obj:nn.Linear): A nn.Linear module with normalized fan-in init.
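A common recipe for "normalized fan-in" initialization rescales each output unit's weight vector to a fixed L2 norm of `scale`. The sketch below illustrates that recipe in plain Python; it is an assumption about the scheme for illustration, not code taken from the library.

```python
import math


def normed_rows(weight, scale=1.0):
    """Rescale each row (one output unit's fan-in weights) of a weight
    matrix so its L2 norm equals ``scale``."""
    out = []
    for row in weight:
        norm = math.sqrt(sum(w * w for w in row))
        out.append([w * scale / norm for w in row])
    return out

w = [[3.0, 4.0], [1.0, 0.0]]
print(normed_rows(w))  # [[0.6, 0.8], [1.0, 0.0]]
```

After this rescaling, the pre-activation variance of each output unit is controlled by `scale` regardless of the fan-in, which is the point of the init.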

normed_conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None, scale=1)

Overview

Create a nn.Conv2d module but with normalized fan-in init.

Arguments:

- in_channels (:obj:int): Number of channels in the input tensor.
- out_channels (:obj:int): Number of channels in the output tensor.
- kernel_size (:obj:Union[int, Tuple[int, int]]): Size of the convolving kernel.
- stride (:obj:Union[int, Tuple[int, int]], optional): Stride of the convolution. Default is 1.
- padding (:obj:Union[int, Tuple[int, int]], optional): Zero-padding added to both sides of the input. Default is 0.
- dilation (:obj:Union[int, Tuple[int, int]], optional): Spacing between kernel elements. Default is 1.
- groups (:obj:int, optional): Number of blocked connections from input channels to output channels. Default is 1.
- bias (:obj:bool, optional): Whether to add a learnable bias to the nn.Conv2d. Default is True.
- padding_mode (:obj:str, optional): The type of padding algorithm to use. Default is 'zeros'.
- device (:obj:torch.device, optional): The device to put the created module on. Default is None.
- dtype (:obj:torch.dtype, optional): The desired data type of the created module. Default is None.
- scale (:obj:float, optional): The scale factor for initialization. Default is 1.

Returns:

- out (:obj:nn.Conv2d): A nn.Conv2d module with normalized fan-in init.

conv1d_block(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, activation=None, norm_type=None)

Overview

Create a 1-dimensional convolution layer with activation and normalization.

Arguments:

- in_channels (:obj:int): Number of channels in the input tensor.
- out_channels (:obj:int): Number of channels in the output tensor.
- kernel_size (:obj:int): Size of the convolving kernel.
- stride (:obj:int, optional): Stride of the convolution. Default is 1.
- padding (:obj:int, optional): Zero-padding added to both sides of the input. Default is 0.
- dilation (:obj:int, optional): Spacing between kernel elements. Default is 1.
- groups (:obj:int, optional): Number of blocked connections from input channels to output channels. Default is 1.
- activation (:obj:nn.Module, optional): The optional activation function.
- norm_type (:obj:str, optional): Type of the normalization.

Returns:

- block (:obj:nn.Sequential): A sequential list containing the torch layers of the 1-dimensional convolution layer.

.. note:: Conv1d (https://pytorch.org/docs/stable/generated/torch.nn.Conv1d.html#torch.nn.Conv1d)

build_normalization(norm_type, dim=None)

Overview

Construct the corresponding normalization module. For beginners, refer to this article to learn more about batch normalization.

Arguments:

- norm_type (:obj:str): Type of the normalization. Currently supports ['BN', 'LN', 'IN', 'SyncBN'].
- dim (:obj:Optional[int]): Dimension of the normalization, applicable when norm_type is in ['BN', 'IN'].

Returns:

- norm_func (:obj:nn.Module): The corresponding normalization module.
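The dispatch logic can be sketched as a small lookup table. This is a simplified, hypothetical mirror of the real function: it returns class names as strings, whereas the real helper returns the corresponding nn.Module classes.

```python
def build_normalization_sketch(norm_type, dim=None):
    """Map a norm_type string (and, for 'BN'/'IN', a spatial dim) to the
    name of the PyTorch normalization class it would select."""
    dim_free = {'LN': 'LayerNorm', 'SyncBN': 'SyncBatchNorm'}
    dim_bound = {
        1: {'BN': 'BatchNorm1d', 'IN': 'InstanceNorm1d'},
        2: {'BN': 'BatchNorm2d', 'IN': 'InstanceNorm2d'},
    }
    if norm_type in dim_free:
        return dim_free[norm_type]
    if dim not in dim_bound or norm_type not in dim_bound[dim]:
        raise KeyError('invalid norm_type/dim: {}/{}'.format(norm_type, dim))
    return dim_bound[dim][norm_type]

print(build_normalization_sketch('BN', dim=2))  # BatchNorm2d
print(build_normalization_sketch('LN'))         # LayerNorm
```

The key design point is that 'LN' and 'SyncBN' need no `dim` argument, while 'BN' and 'IN' come in per-dimension variants.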

get_lstm(lstm_type, input_size, hidden_size, num_layers=1, norm_type='LN', dropout=0.0, seq_len=None, batch_size=None)

Overview

Build and return the corresponding LSTM cell based on the provided parameters.

Arguments:

- lstm_type (:obj:str): Version of RNN cell. Supported options are ['normal', 'pytorch', 'hpc', 'gru'].
- input_size (:obj:int): Size of the input vector.
- hidden_size (:obj:int): Size of the hidden state vector.
- num_layers (:obj:int): Number of LSTM layers. Default is 1.
- norm_type (:obj:str): Type of normalization. Default is 'LN'.
- dropout (:obj:float): Dropout rate. Default is 0.0.
- seq_len (:obj:Optional[int]): Sequence length. Default is None.
- batch_size (:obj:Optional[int]): Batch size. Default is None.

Returns:

- lstm (:obj:Union[LSTM, PytorchLSTM]): The corresponding LSTM cell.

sequence_mask(lengths, max_len=None)

Overview

Generate a boolean mask for a batch of sequences with differing lengths.

Arguments:

- lengths (:obj:torch.Tensor): A tensor with the lengths of each sequence. Shape can be (n, 1) or (n,).
- max_len (:obj:int, optional): The padding size. If max_len is None, the padding size is the max length of the sequences.

Returns:

- masks (:obj:torch.BoolTensor): A boolean mask tensor. The mask has the same device as lengths.
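The masking rule is simple enough to state in plain Python. `sequence_mask_sketch` is a hypothetical illustration; the real function works on tensors and returns a torch.BoolTensor on the same device as `lengths`.

```python
def sequence_mask_sketch(lengths, max_len=None):
    """Position j of row i is True iff j < lengths[i], i.e. the first
    ``lengths[i]`` positions are valid and the rest are padding."""
    if max_len is None:
        max_len = max(lengths)
    return [[j < n for j in range(max_len)] for n in lengths]

print(sequence_mask_sketch([1, 3], max_len=4))
# [[True, False, False, False], [True, True, True, False]]
```

Such a mask is typically multiplied into losses or attention scores so that padded timesteps contribute nothing.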

resnet18()

Overview

Creates a ResNet18 model.

Returns:

- model (:obj:nn.Module): ResNet18 model.

build_ce_criterion(cfg)

Overview

Get a cross entropy loss instance according to the given config.

Arguments:

- cfg (:obj:dict): Config dict. It contains:
    - type (:obj:str): Type of loss function, now supports ['cross_entropy', 'label_smooth_ce', 'soft_focal_loss'].
    - kwargs (:obj:dict): Arguments for the corresponding loss function.

Returns:

- loss (:obj:nn.Module): Loss function instance.

Full Source Code

../ding/torch_utils/__init__.py

```python
from .checkpoint_helper import build_checkpoint_helper, CountVar, auto_checkpoint
from .data_helper import to_device, to_tensor, to_ndarray, to_list, to_dtype, same_shape, tensor_to_list, \
    build_log_buffer, CudaFetcher, get_tensor_data, unsqueeze, squeeze, get_null_data, get_shape0, to_item, \
    zeros_like
from .distribution import CategoricalPd, CategoricalPdPytorch
from .metric import levenshtein_distance, hamming_distance
from .network import *
from .loss import *
from .optimizer_helper import Adam, RMSprop, calculate_grad_norm, calculate_grad_norm_without_bias_two_norm
from .nn_test_helper import is_differentiable
from .math_helper import cov
from .dataparallel import DataParallel
from .reshape_helper import fold_batch, unfold_batch, unsqueeze_repeat
from .parameter import NonegativeParameter, TanhParameter
```