
ding.torch_utils.network.resnet


This implementation of ResNet is a slightly modified version of https://github.com/rwightman/pytorch-image-models.git

AvgPool2dSame

Bases: AvgPool2d

Overview

Tensorflow-like 'SAME' wrapper for 2D average pooling.

Interfaces: __init__, forward

__init__(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True)

Overview

Initialize the AvgPool2dSame with given arguments.

Arguments:
- kernel_size (:obj:`int`): The size of the window to take an average over.
- stride (:obj:`Optional[Tuple[int, int]]`): The stride of the window. If None, defaults to kernel_size.
- padding (:obj:`int`): Implicit zero padding to be added on both sides.
- ceil_mode (:obj:`bool`): When True, will use ceil instead of floor to compute the output shape.
- count_include_pad (:obj:`bool`): When True, will include the zero-padding in the averaging calculation.

forward(x)

Overview

Forward pass of the AvgPool2dSame.

Argument:
- x (:obj:`torch.Tensor`): Input tensor.

Returns:
- (:obj:`torch.Tensor`): Output tensor after average pooling.
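As a rough standalone sketch (not the library class itself), the pad-then-pool behaviour described above can be reproduced with plain `torch.nn.functional` calls; the helper name `same_pad_amount` is made up for illustration:

```python
import math

import torch
import torch.nn.functional as F


def same_pad_amount(size: int, k: int, s: int, d: int = 1) -> int:
    # Total padding needed so that the output size equals ceil(size / stride),
    # mirroring TensorFlow's 'SAME' rule.
    return max((math.ceil(size / s) - 1) * s + (k - 1) * d + 1 - size, 0)


x = torch.randn(1, 8, 7, 7)  # odd spatial size on purpose
k, s = 3, 2
pad_h = same_pad_amount(7, k, s)
pad_w = same_pad_amount(7, k, s)
# Asymmetric split: the extra pixel (if any) goes to the bottom/right.
x = F.pad(x, [pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2])
y = F.avg_pool2d(x, k, s)
assert y.shape == (1, 8, 4, 4)  # 4 == ceil(7 / 2)
```

Plain `nn.AvgPool2d(3, 2)` on a 7x7 input would instead produce a 3x3 output; the 'SAME' wrapper preserves the `ceil(size / stride)` convention.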

ClassifierHead

Bases: Module

Overview

Classifier head with configurable global pooling and dropout.

Interfaces: __init__, forward

__init__(in_chs, num_classes, pool_type='avg', drop_rate=0.0, use_conv=False)

Overview

Initialize the ClassifierHead with given arguments.

Arguments:
- in_chs (:obj:`int`): Number of input channels.
- num_classes (:obj:`int`): Number of classes for the final classification.
- pool_type (:obj:`str`): The type of pooling to use; 'avg' for Average Pooling.
- drop_rate (:obj:`float`): The dropout rate.
- use_conv (:obj:`bool`): Whether to use convolution or not.

forward(x)

Overview

Forward pass of the ClassifierHead.

Argument:
- x (:obj:`torch.Tensor`): Input tensor.

Returns:
- (:obj:`torch.Tensor`): Output tensor after classification.
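A minimal sketch of what this head computes on the default average-pooling path (global average pool, then a linear classifier), assembled from stock torch modules rather than the library's helpers and assuming `drop_rate=0`:

```python
import torch
import torch.nn as nn

# Illustrative equivalent of a ClassifierHead with in_chs=64, num_classes=10.
head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),  # global average pool: (N, C, H, W) -> (N, C, 1, 1)
    nn.Flatten(1),            # -> (N, C)
    nn.Linear(64, 10),        # -> (N, num_classes)
)

features = torch.randn(2, 64, 7, 7)
logits = head(features)
assert logits.shape == (2, 10)
```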

BasicBlock

Bases: Module

Overview

The basic building block for models like ResNet. This class extends PyTorch's Module class. It represents a standard block of layers including two convolutions, batch normalization, an optional attention mechanism, and activation functions.

Interfaces: __init__, forward, zero_init_last_bn

Properties:
- expansion (:obj:`int`): Specifies the expansion factor for the planes of the conv layers.

__init__(inplanes, planes, stride=1, downsample=None, cardinality=1, base_width=64, reduce_first=1, dilation=1, first_dilation=None, act_layer=nn.ReLU, norm_layer=nn.BatchNorm2d, attn_layer=None, aa_layer=None, drop_block=None, drop_path=None)

Overview

Initialize the BasicBlock with given parameters.

Arguments:
- inplanes (:obj:`int`): Number of input channels.
- planes (:obj:`int`): Number of output channels.
- stride (:obj:`int`): The stride of the convolutional layer.
- downsample (:obj:`Callable`): Function for downsampling the inputs.
- cardinality (:obj:`int`): Group size for grouped convolution.
- base_width (:obj:`int`): Base width of the convolutions.
- reduce_first (:obj:`int`): Reduction factor for first convolution of each block.
- dilation (:obj:`int`): Spacing between kernel points.
- first_dilation (:obj:`int`): First dilation value.
- act_layer (:obj:`Callable`): Function for activation layer.
- norm_layer (:obj:`Callable`): Function for normalization layer.
- attn_layer (:obj:`Callable`): Function for attention layer.
- aa_layer (:obj:`Callable`): Function for anti-aliasing layer.
- drop_block (:obj:`Callable`): Method for dropping block.
- drop_path (:obj:`Callable`): Method for dropping path.

zero_init_last_bn()

Overview

Initialize the batch normalization layer with zeros.

forward(x)

Overview

Defines the computation performed at every call.

Arguments:
- x (:obj:`torch.Tensor`): The input tensor.

Returns:
- output (:obj:`torch.Tensor`): The output tensor after passing through the BasicBlock.
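Stripped of the optional attention, anti-aliasing, and drop layers, the forward pass above reduces to the classic two-conv residual pattern. A hedged standalone sketch (the class name `TinyBasicBlock` is made up, and only the identity-shortcut case is shown):

```python
import torch
import torch.nn as nn


class TinyBasicBlock(nn.Module):
    # Illustrative reduction of BasicBlock: conv-bn-relu, conv-bn,
    # add the shortcut, then a final relu. Identity shortcut only
    # (stride 1, inplanes == planes), so no downsample is needed.

    def __init__(self, planes: int) -> None:
        super().__init__()
        self.conv1 = nn.Conv2d(planes, planes, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shortcut = x
        x = self.act(self.bn1(self.conv1(x)))
        x = self.bn2(self.conv2(x))
        return self.act(x + shortcut)


block = TinyBasicBlock(16)
out = block(torch.randn(2, 16, 8, 8))
assert out.shape == (2, 16, 8, 8)  # residual blocks preserve shape at stride 1
```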

Bottleneck

Bases: Module

Overview

The Bottleneck class is a basic block used to build ResNet networks. It is part of the PyTorch implementation of ResNet. This block is built from several layers, including convolutional layers, normalization layers, activation layers, and optional attention, anti-aliasing, and dropout layers.

Interfaces: __init__, forward, zero_init_last_bn

Properties: expansion, inplanes, planes, stride, downsample, cardinality, base_width, reduce_first, dilation, first_dilation, act_layer, norm_layer, attn_layer, aa_layer, drop_block, drop_path

__init__(inplanes, planes, stride=1, downsample=None, cardinality=1, base_width=64, reduce_first=1, dilation=1, first_dilation=None, act_layer=nn.ReLU, norm_layer=nn.BatchNorm2d, attn_layer=None, aa_layer=None, drop_block=None, drop_path=None)

Overview

Initialize the Bottleneck class with various parameters.

Arguments:
- inplanes (:obj:`int`): The number of input planes.
- planes (:obj:`int`): The number of output planes.
- stride (:obj:`int`, optional): The stride size, defaults to 1.
- downsample (:obj:`nn.Module`, optional): The downsample method, defaults to None.
- cardinality (:obj:`int`, optional): The size of the group convolutions, defaults to 1.
- base_width (:obj:`int`, optional): The base width, defaults to 64.
- reduce_first (:obj:`int`, optional): The first reduction factor, defaults to 1.
- dilation (:obj:`int`, optional): The dilation factor, defaults to 1.
- first_dilation (:obj:`int`, optional): The first dilation factor, defaults to None.
- act_layer (:obj:`Type[nn.Module]`, optional): The activation layer type, defaults to nn.ReLU.
- norm_layer (:obj:`Type[nn.Module]`, optional): The normalization layer type, defaults to nn.BatchNorm2d.
- attn_layer (:obj:`Type[nn.Module]`, optional): The attention layer type, defaults to None.
- aa_layer (:obj:`Type[nn.Module]`, optional): The anti-aliasing layer type, defaults to None.
- drop_block (:obj:`Callable`): The dropout block, defaults to None.
- drop_path (:obj:`Callable`): The drop path, defaults to None.

zero_init_last_bn()

Overview

Initialize the last batch normalization layer with zero.

forward(x)

Overview

Defines the computation performed at every call.

Arguments:
- x (:obj:`Tensor`): The input tensor.

Returns:
- x (:obj:`Tensor`): The output tensor resulting from the computation.
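The internal 3x3 width follows the ResNeXt formula used in the constructor, `width = floor(planes * base_width / 64) * cardinality`, and the block outputs `planes * expansion` channels. A quick worked check (the helper name `bottleneck_widths` is made up for illustration):

```python
import math


def bottleneck_widths(planes: int, base_width: int = 64, cardinality: int = 1, expansion: int = 4):
    # Internal 3x3 conv width and final output channels, as computed
    # in Bottleneck.__init__ (expansion = 4 for this class).
    width = int(math.floor(planes * (base_width / 64)) * cardinality)
    return width, planes * expansion


# Plain ResNet-50 style bottleneck: 64 planes -> 64-wide 3x3 conv, 256 output channels.
assert bottleneck_widths(64) == (64, 256)
# ResNeXt-50 32x4d style: base_width=4, cardinality=32 -> 128-wide grouped 3x3 conv.
assert bottleneck_widths(64, base_width=4, cardinality=32) == (128, 256)
```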

ResNet

Bases: Module

Overview

Implements ResNet, ResNeXt, SE-ResNeXt, and SENet models. This implementation supports various modifications based on the v1c, v1d, v1e, and v1s variants included in the MXNet Gluon ResNetV1b model. For more details about the variants and options, please refer to the 'Bag of Tricks' paper: https://arxiv.org/pdf/1812.01187.

Interfaces: __init__, forward, zero_init_last_bn, get_classifier

__init__(block, layers, num_classes=1000, in_chans=3, cardinality=1, base_width=64, stem_width=64, stem_type='', replace_stem_pool=False, output_stride=32, block_reduce_first=1, down_kernel_size=1, avg_down=False, act_layer=nn.ReLU, norm_layer=nn.BatchNorm2d, aa_layer=None, drop_rate=0.0, drop_path_rate=0.0, drop_block_rate=0.0, global_pool='avg', zero_init_last_bn=True, block_args=None)

Overview

Initialize the ResNet model with given block, layers and other configuration options.

Arguments:
- block (:obj:`nn.Module`): Class for the residual block.
- layers (:obj:`List[int]`): Numbers of layers in each block.
- num_classes (:obj:`int`, optional): Number of classification classes. Default is 1000.
- in_chans (:obj:`int`, optional): Number of input (color) channels. Default is 3.
- cardinality (:obj:`int`, optional): Number of convolution groups for 3x3 conv in Bottleneck. Default is 1.
- base_width (:obj:`int`, optional): Factor determining bottleneck channels. Default is 64.
- stem_width (:obj:`int`, optional): Number of channels in stem convolutions. Default is 64.
- stem_type (:obj:`str`, optional): The type of stem. Default is ''.
- replace_stem_pool (:obj:`bool`, optional): Whether to replace stem pooling. Default is False.
- output_stride (:obj:`int`, optional): Output stride of the network. Default is 32.
- block_reduce_first (:obj:`int`, optional): Reduction factor for first convolution output width of residual blocks. Default is 1.
- down_kernel_size (:obj:`int`, optional): Kernel size of residual block downsampling path. Default is 1.
- avg_down (:obj:`bool`, optional): Whether to use average pooling for projection skip connection between stages/downsample. Default is False.
- act_layer (:obj:`nn.Module`, optional): Activation layer. Default is nn.ReLU.
- norm_layer (:obj:`nn.Module`, optional): Normalization layer. Default is nn.BatchNorm2d.
- aa_layer (:obj:`Optional[nn.Module]`, optional): Anti-aliasing layer. Default is None.
- drop_rate (:obj:`float`, optional): Dropout probability before classifier, for training. Default is 0.0.
- drop_path_rate (:obj:`float`, optional): Drop path rate. Default is 0.0.
- drop_block_rate (:obj:`float`, optional): Drop block rate. Default is 0.0.
- global_pool (:obj:`str`, optional): Global pooling type. Default is 'avg'.
- zero_init_last_bn (:obj:`bool`, optional): Whether to initialize last batch normalization with zero. Default is True.
- block_args (:obj:`Optional[dict]`, optional): Additional arguments for block. Default is None.

init_weights(zero_init_last_bn=True)

Overview

Initialize the weights in the model.

Arguments:
- zero_init_last_bn (:obj:`bool`, optional): Whether to initialize last batch normalization with zero. Default is True.

get_classifier()

Overview

Get the classifier module from the model.

Returns:
- classifier (:obj:`nn.Module`): The classifier module in the model.

reset_classifier(num_classes, global_pool='avg')

Overview

Reset the classifier with a new number of classes and pooling type.

Arguments:
- num_classes (:obj:`int`): New number of classification classes.
- global_pool (:obj:`str`, optional): New global pooling type. Default is 'avg'.

forward_features(x)

Overview

Forward pass through the feature layers of the model.

Arguments:
- x (:obj:`torch.Tensor`): The input tensor.

Returns:
- x (:obj:`torch.Tensor`): The output tensor after passing through feature layers.

forward(x)

Overview

Full forward pass through the model.

Arguments:
- x (:obj:`torch.Tensor`): The input tensor.

Returns:
- x (:obj:`torch.Tensor`): The output tensor after passing through the model.

to_2tuple(item)

Overview

Convert a scalar to a 2-tuple or return the item if it's not a scalar.

Arguments:
- item (:obj:`int`): An item to be converted to a 2-tuple.

Returns:
- (:obj:`tuple`): A 2-tuple of the item.
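Its behaviour is easy to pin down with a small equivalent sketch (the name `to_2tuple_sketch` is made up; `np.isscalar` treats ints and floats as scalars, and tuples and lists as non-scalars):

```python
import numpy as np


def to_2tuple_sketch(item):
    # Mirrors the documented behaviour: scalars are duplicated into a pair,
    # everything else is returned unchanged.
    return (item, item) if np.isscalar(item) else item


assert to_2tuple_sketch(3) == (3, 3)
assert to_2tuple_sketch((2, 5)) == (2, 5)
```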

get_same_padding(x, k, s, d)

Overview

Calculate asymmetric TensorFlow-like 'SAME' padding for a convolution.

Arguments:
- x (:obj:`int`): The size of the input.
- k (:obj:`int`): The size of the kernel.
- s (:obj:`int`): The stride of the convolution.
- d (:obj:`int`): The dilation of the convolution.

Returns:
- (:obj:`int`): The size of the padding.
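Plugging in numbers makes the formula from the function body, `max((ceil(x / s) - 1) * s + (k - 1) * d + 1 - x, 0)`, concrete:

```python
import math


def get_same_padding(x: int, k: int, s: int, d: int) -> int:
    # Total 'SAME' padding along one dimension (formula from the function body).
    return max((math.ceil(x / s) - 1) * s + (k - 1) * d + 1 - x, 0)


# 3x3 kernel, stride 2, no dilation, input size 7:
# (ceil(7/2) - 1) * 2 + (3 - 1) * 1 + 1 - 7 = 6 + 2 + 1 - 7 = 2
assert get_same_padding(7, 3, 2, 1) == 2
# At stride 1 it reduces to the familiar (k - 1) * d total padding:
assert get_same_padding(7, 3, 1, 1) == 2
```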

pad_same(x, k, s, d=(1, 1), value=0)

Overview

Dynamically pad input x with 'SAME' padding for conv with specified args.

Arguments:
- x (:obj:`Tensor`): The input tensor.
- k (:obj:`List[int]`): The size of the kernel.
- s (:obj:`List[int]`): The stride of the convolution.
- d (:obj:`List[int]`): The dilation of the convolution.
- value (:obj:`float`): Value to fill the padding.

Returns:
- (:obj:`Tensor`): The padded tensor.

avg_pool2d_same(x, kernel_size, stride, padding=(0, 0), ceil_mode=False, count_include_pad=True)

Overview

Apply average pooling with 'SAME' padding on the input tensor.

Arguments:
- x (:obj:`Tensor`): The input tensor.
- kernel_size (:obj:`List[int]`): The size of the kernel.
- stride (:obj:`List[int]`): The stride of the convolution.
- padding (:obj:`List[int]`): The size of the padding.
- ceil_mode (:obj:`bool`): When True, will use ceil instead of floor to compute the output shape.
- count_include_pad (:obj:`bool`): When True, will include the zero-padding in the averaging calculation.

Returns:
- (:obj:`Tensor`): The tensor after average pooling.

create_classifier(num_features, num_classes, pool_type='avg', use_conv=False)

Overview

Create a classifier with global pooling layer and fully connected layer.

Arguments:
- num_features (:obj:`int`): The number of features.
- num_classes (:obj:`int`): The number of classes for the final classification.
- pool_type (:obj:`str`): The type of pooling to use; 'avg' for Average Pooling.
- use_conv (:obj:`bool`): Whether to use convolution or not.

Returns:
- global_pool (:obj:`nn.Module`): The created global pooling layer.
- fc (:obj:`nn.Module`): The created fully connected layer.

create_attn(layer, plane)

Overview

Create an attention mechanism.

Arguments:
- layer (:obj:`nn.Module`): The layer where the attention is to be applied.
- plane (:obj:`int`): The plane on which the attention is to be applied.

Returns:
- None

get_padding(kernel_size, stride, dilation=1)

Overview

Compute the padding based on the kernel size, stride and dilation.

Arguments:
- kernel_size (:obj:`int`): The size of the kernel.
- stride (:obj:`int`): The stride of the convolution.
- dilation (:obj:`int`): The dilation factor.

Returns:
- padding (:obj:`int`): The computed padding.
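This is the symmetric per-side padding that keeps spatial size constant at stride 1; a quick check of `((stride - 1) + dilation * (kernel_size - 1)) // 2`:

```python
def get_padding(kernel_size: int, stride: int, dilation: int = 1) -> int:
    # Per-side padding, as in the function body above.
    return ((stride - 1) + dilation * (kernel_size - 1)) // 2


assert get_padding(3, 1) == 1              # the usual padding=1 for a 3x3 conv
assert get_padding(3, 1, dilation=2) == 2  # dilation widens the effective kernel
assert get_padding(7, 2) == 3              # ResNet stem: 7x7 conv, stride 2, padding 3
```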

downsample_conv(in_channels, out_channels, kernel_size, stride=1, dilation=1, first_dilation=None, norm_layer=None)

Overview

Create a sequential module for downsampling that includes a convolution layer and a normalization layer.

Arguments:
- in_channels (:obj:`int`): The number of input channels.
- out_channels (:obj:`int`): The number of output channels.
- kernel_size (:obj:`int`): The size of the kernel.
- stride (:obj:`int`, optional): The stride size, defaults to 1.
- dilation (:obj:`int`, optional): The dilation factor, defaults to 1.
- first_dilation (:obj:`int`, optional): The first dilation factor, defaults to None.
- norm_layer (:obj:`Type[nn.Module]`, optional): The normalization layer type, defaults to nn.BatchNorm2d.

Returns:
- nn.Sequential: A sequence of layers performing downsampling through convolution.
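For the common `kernel_size=1` case, such a module amounts to a 1x1 projection convolution with the block's stride followed by batch norm. A hedged sketch of the shape effect, assembled from stock modules rather than calling the library helper:

```python
import torch
import torch.nn as nn

# Illustrative projection shortcut for a stride-2 stage transition.
shortcut = nn.Sequential(
    nn.Conv2d(64, 256, kernel_size=1, stride=2, bias=False),
    nn.BatchNorm2d(256),
)

x = torch.randn(2, 64, 56, 56)
y = shortcut(x)
assert y.shape == (2, 256, 28, 28)  # channels projected, spatial size halved
```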

downsample_avg(in_channels, out_channels, kernel_size, stride=1, dilation=1, first_dilation=None, norm_layer=None)

Overview

Create a sequential module for downsampling that includes an average pooling layer, a convolution layer, and a normalization layer.

Arguments:
- in_channels (:obj:`int`): The number of input channels.
- out_channels (:obj:`int`): The number of output channels.
- kernel_size (:obj:`int`): The size of the kernel.
- stride (:obj:`int`, optional): The stride size, defaults to 1.
- dilation (:obj:`int`, optional): The dilation factor, defaults to 1.
- first_dilation (:obj:`int`, optional): The first dilation factor, defaults to None.
- norm_layer (:obj:`Type[nn.Module]`, optional): The normalization layer type, defaults to nn.BatchNorm2d.

Returns:
- nn.Sequential: A sequence of layers performing downsampling through average pooling.

drop_blocks(drop_block_rate=0.0)

Overview

Generate a list of None values based on the drop block rate.

Arguments:
- drop_block_rate (:obj:`float`, optional): The drop block rate, defaults to 0.

Returns:
- List[None]: A list of None values.

make_blocks(block_fn, channels, block_repeats, inplanes, reduce_first=1, output_stride=32, down_kernel_size=1, avg_down=False, drop_block_rate=0.0, drop_path_rate=0.0, **kwargs)

Overview

Create a list of blocks for the network, with each block having a given number of repeats. Also, create a feature info list that contains information about the output of each block.

Arguments:
- block_fn (:obj:`Type[nn.Module]`): The type of block to use.
- channels (:obj:`List[int]`): The list of output channels for each block.
- block_repeats (:obj:`List[int]`): The list of number of repeats for each block.
- inplanes (:obj:`int`): The number of input planes.
- reduce_first (:obj:`int`, optional): The first reduction factor, defaults to 1.
- output_stride (:obj:`int`, optional): The total stride of the network, defaults to 32.
- down_kernel_size (:obj:`int`, optional): The size of the downsample kernel, defaults to 1.
- avg_down (:obj:`bool`, optional): Whether to use average pooling for downsampling, defaults to False.
- drop_block_rate (:obj:`float`, optional): The drop block rate, defaults to 0.
- drop_path_rate (:obj:`float`, optional): The drop path rate, defaults to 0.

Returns:
- Tuple[List[Tuple[str, nn.Module]], List[Dict[str, Union[int, str]]]]: A tuple that includes a list of blocks for the network and a feature info list.

resnet18()

Overview

Creates a ResNet18 model.

Returns:
- model (:obj:`nn.Module`): ResNet18 model.

Full Source Code

../ding/torch_utils/network/resnet.py

1""" 2This implementation of ResNet is a bit modification version of `https://github.com/rwightman/pytorch-image-models.git` 3""" 4from typing import List, Callable, Optional, Tuple, Type, Dict, Union 5import math 6import numpy as np 7import torch 8import torch.nn as nn 9import torch.nn.functional as F 10 11from .nn_module import Flatten 12 13 14def to_2tuple(item: int) -> tuple: 15 """ 16 Overview: 17 Convert a scalar to a 2-tuple or return the item if it's not a scalar. 18 Arguments: 19 - item (:obj:`int`): An item to be converted to a 2-tuple. 20 Returns: 21 - (:obj:`tuple`): A 2-tuple of the item. 22 """ 23 if np.isscalar(item): 24 return (item, item) 25 else: 26 return item 27 28 29# Calculate asymmetric TensorFlow-like 'SAME' padding for a convolution 30def get_same_padding(x: int, k: int, s: int, d: int) -> int: 31 """ 32 Overview: 33 Calculate asymmetric TensorFlow-like 'SAME' padding for a convolution. 34 Arguments: 35 - x (:obj:`int`): The size of the input. 36 - k (:obj:`int`): The size of the kernel. 37 - s (:obj:`int`): The stride of the convolution. 38 - d (:obj:`int`): The dilation of the convolution. 39 Returns: 40 - (:obj:`int`): The size of the padding. 41 """ 42 return max((math.ceil(x / s) - 1) * s + (k - 1) * d + 1 - x, 0) 43 44 45# Dynamically pad input x with 'SAME' padding for conv with specified args 46def pad_same(x, k: List[int], s: List[int], d: List[int] = (1, 1), value: float = 0): 47 """ 48 Overview: 49 Dynamically pad input x with 'SAME' padding for conv with specified args. 50 Arguments: 51 - x (:obj:`Tensor`): The input tensor. 52 - k (:obj:`List[int]`): The size of the kernel. 53 - s (:obj:`List[int]`): The stride of the convolution. 54 - d (:obj:`List[int]`): The dilation of the convolution. 55 - value (:obj:`float`): Value to fill the padding. 56 Returns: 57 - (:obj:`Tensor`): The padded tensor. 
58 """ 59 ih, iw = x.size()[-2:] 60 pad_h, pad_w = get_same_padding(ih, k[0], s[0], d[0]), get_same_padding(iw, k[1], s[1], d[1]) 61 if pad_h > 0 or pad_w > 0: 62 x = F.pad(x, [pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2], value=value) 63 return x 64 65 66def avg_pool2d_same( 67 x, 68 kernel_size: List[int], 69 stride: List[int], 70 padding: List[int] = (0, 0), 71 ceil_mode: bool = False, 72 count_include_pad: bool = True 73): 74 """ 75 Overview: 76 Apply average pooling with 'SAME' padding on the input tensor. 77 Arguments: 78 - x (:obj:`Tensor`): The input tensor. 79 - kernel_size (:obj:`List[int]`): The size of the kernel. 80 - stride (:obj:`List[int]`): The stride of the convolution. 81 - padding (:obj:`List[int]`): The size of the padding. 82 - ceil_mode (:obj:`bool`): When True, will use ceil instead of floor to compute the output shape. 83 - count_include_pad (:obj:`bool`): When True, will include the zero-padding in the averaging calculation. 84 Returns: 85 - (:obj:`Tensor`): The tensor after average pooling. 86 """ 87 # FIXME how to deal with count_include_pad vs not for external padding? 88 x = pad_same(x, kernel_size, stride) 89 return F.avg_pool2d(x, kernel_size, stride, (0, 0), ceil_mode, count_include_pad) 90 91 92class AvgPool2dSame(nn.AvgPool2d): 93 """ 94 Overview: 95 Tensorflow-like 'SAME' wrapper for 2D average pooling. 96 Interfaces: 97 ``__init__``, ``forward`` 98 """ 99 100 def __init__( 101 self, 102 kernel_size: int, 103 stride: Optional[Tuple[int, int]] = None, 104 padding: int = 0, 105 ceil_mode: bool = False, 106 count_include_pad: bool = True 107 ) -> None: 108 """ 109 Overview: 110 Initialize the AvgPool2dSame with given arguments. 111 Arguments: 112 - kernel_size (:obj:`int`): The size of the window to take an average over. 113 - stride (:obj:`Optional[Tuple[int, int]]`): The stride of the window. If None, default to kernel_size. 114 - padding (:obj:`int`): Implicit zero padding to be added on both sides. 
115 - ceil_mode (:obj:`bool`): When True, will use `ceil` instead of `floor` to compute the output shape. 116 - count_include_pad (:obj:`bool`): When True, will include the zero-padding in the averaging calculation. 117 """ 118 kernel_size = to_2tuple(kernel_size) 119 stride = to_2tuple(stride) 120 super(AvgPool2dSame, self).__init__(kernel_size, stride, (0, 0), ceil_mode, count_include_pad) 121 122 def forward(self, x: torch.Tensor) -> torch.Tensor: 123 """ 124 Overview: 125 Forward pass of the AvgPool2dSame. 126 Argument: 127 - x (:obj:`torch.Tensor`): Input tensor. 128 Returns: 129 - (:obj:`torch.Tensor`): Output tensor after average pooling. 130 """ 131 x = pad_same(x, self.kernel_size, self.stride) 132 return F.avg_pool2d(x, self.kernel_size, self.stride, self.padding, self.ceil_mode, self.count_include_pad) 133 134 135def _create_pool(num_features: int, 136 num_classes: int, 137 pool_type: str = 'avg', 138 use_conv: bool = False) -> Tuple[nn.Module, int]: 139 """ 140 Overview: 141 Create a global pooling layer based on the given arguments. 142 Arguments: 143 - num_features (:obj:`int`): Number of input features. 144 - num_classes (:obj:`int`): Number of output classes. 145 - pool_type (:obj:`str`): Type of the pooling operation. Defaults to 'avg'. 146 - use_conv (:obj:`bool`): Whether to use convolutional layer after pooling. Defaults to False. 147 Returns: 148 - (:obj:`Tuple[nn.Module, int]`): The created global pooling layer and the number of pooled features. 
149 """ 150 flatten_in_pool = not use_conv # flatten when we use a Linear layer after pooling 151 if not pool_type: 152 assert num_classes == 0 or use_conv, \ 153 'Pooling can only be disabled if classifier is also removed or conv classifier is used' 154 flatten_in_pool = False # disable flattening if pooling is pass-through (no pooling) 155 assert flatten_in_pool 156 global_pool = nn.AdaptiveAvgPool2d(1) 157 num_pooled_features = num_features * 1 158 return global_pool, num_pooled_features 159 160 161def _create_fc(num_features: int, num_classes: int, use_conv: bool = False) -> nn.Module: 162 """ 163 Overview: 164 Create a fully connected layer based on the given arguments. 165 Arguments: 166 - num_features (:obj:`int`): Number of input features. 167 - num_classes (:obj:`int`): Number of output classes. 168 - use_conv (:obj:`bool`): Whether to use convolutional layer. Defaults to False. 169 Returns: 170 - (:obj:`nn.Module`): The created fully connected layer. 171 """ 172 if num_classes <= 0: 173 fc = nn.Identity() # pass-through (no classifier) 174 elif use_conv: 175 fc = nn.Conv2d(num_features, num_classes, 1, bias=True) 176 else: 177 # use nn.Linear for simplification 178 fc = nn.Linear(num_features, num_classes, bias=True) 179 return fc 180 181 182def create_classifier(num_features: int, 183 num_classes: int, 184 pool_type: str = 'avg', 185 use_conv: bool = False) -> Tuple[nn.Module, nn.Module]: 186 """ 187 Overview: 188 Create a classifier with global pooling layer and fully connected layer. 189 Arguments: 190 - num_features (:obj:`int`): The number of features. 191 - num_classes (:obj:`int`): The number of classes for the final classification. 192 - pool_type (:obj:`str`): The type of pooling to use; 'avg' for Average Pooling. 193 - use_conv (:obj:`bool`): Whether to use convolution or not. 194 Returns: 195 - global_pool (:obj:`nn.Module`): The created global pooling layer. 196 - fc (:obj:`nn.Module`): The created fully connected layer. 
197 """ 198 assert pool_type == 'avg' 199 global_pool, num_pooled_features = _create_pool(num_features, num_classes, pool_type, use_conv=use_conv) 200 fc = _create_fc(num_pooled_features, num_classes, use_conv=use_conv) 201 return global_pool, fc 202 203 204class ClassifierHead(nn.Module): 205 """ 206 Overview: 207 Classifier head with configurable global pooling and dropout. 208 Interfaces: 209 ``__init__``, ``forward`` 210 """ 211 212 def __init__( 213 self, 214 in_chs: int, 215 num_classes: int, 216 pool_type: str = 'avg', 217 drop_rate: float = 0., 218 use_conv: bool = False 219 ) -> None: 220 """ 221 Overview: 222 Initialize the ClassifierHead with given arguments. 223 Arguments: 224 - in_chs (:obj:`int`): Number of input channels. 225 - num_classes (:obj:`int`): Number of classes for the final classification. 226 - pool_type (:obj:`str`): The type of pooling to use; 'avg' for Average Pooling. 227 - drop_rate (:obj:`float`): The dropout rate. 228 - use_conv (:obj:`bool`): Whether to use convolution or not. 229 """ 230 super(ClassifierHead, self).__init__() 231 self.drop_rate = drop_rate 232 self.global_pool, num_pooled_features = _create_pool(in_chs, num_classes, pool_type, use_conv=use_conv) 233 self.fc = _create_fc(num_pooled_features, num_classes, use_conv=use_conv) 234 self.flatten = Flatten(1) if use_conv and pool_type else nn.Identity() 235 236 def forward(self, x: torch.Tensor) -> torch.Tensor: 237 """ 238 Overview: 239 Forward pass of the ClassifierHead. 240 Argument: 241 - x (:obj:`torch.Tensor`): Input tensor. 242 Returns: 243 - (:obj:`torch.Tensor`): Output tensor after classification. 244 """ 245 x = self.global_pool(x) 246 if self.drop_rate: 247 x = F.dropout(x, p=float(self.drop_rate), training=self.training) 248 x = self.fc(x) 249 x = self.flatten(x) 250 return x 251 252 253def create_attn(layer: nn.Module, plane: int) -> None: 254 """ 255 Overview: 256 Create an attention mechanism. 
257 Arguments: 258 - layer (:obj:`nn.Module`): The layer where the attention is to be applied. 259 - plane (:obj:`int`): The plane on which the attention is to be applied. 260 Returns: 261 - None 262 """ 263 return None 264 265 266def get_padding(kernel_size: int, stride: int, dilation: int = 1) -> int: 267 """ 268 Overview: 269 Compute the padding based on the kernel size, stride and dilation. 270 Arguments: 271 - kernel_size (:obj:`int`): The size of the kernel. 272 - stride (:obj:`int`): The stride of the convolution. 273 - dilation (:obj:`int`): The dilation factor. 274 Returns: 275 - padding (:obj:`int`): The computed padding. 276 """ 277 padding = ((stride - 1) + dilation * (kernel_size - 1)) // 2 278 return padding 279 280 281class BasicBlock(nn.Module): 282 """ 283 Overview: 284 The basic building block for models like ResNet. This class extends pytorch's Module class. 285 It represents a standard block of layers including two convolutions, batch normalization, 286 an optional attention mechanism, and activation functions. 287 Interfaces: 288 ``__init__``, ``forward``, ``zero_init_last_bn`` 289 Properties: 290 - expansion (:obj:int): Specifies the expansion factor for the planes of the conv layers. 291 """ 292 expansion = 1 293 294 def __init__( 295 self, 296 inplanes: int, 297 planes: int, 298 stride: int = 1, 299 downsample: Callable = None, 300 cardinality: int = 1, 301 base_width: int = 64, 302 reduce_first: int = 1, 303 dilation: int = 1, 304 first_dilation: int = None, 305 act_layer: Callable = nn.ReLU, 306 norm_layer: Callable = nn.BatchNorm2d, 307 attn_layer: Callable = None, 308 aa_layer: Callable = None, 309 drop_block: Callable = None, 310 drop_path: Callable = None 311 ) -> None: 312 """ 313 Overview: 314 Initialize the BasicBlock with given parameters. 315 Arguments: 316 - inplanes (:obj:`int`): Number of input channels. 317 - planes (:obj:`int`): Number of output channels. 318 - stride (:obj:`int`): The stride of the convolutional layer. 
319 - downsample (:obj:`Callable`): Function for downsampling the inputs. 320 - cardinality (:obj:`int`): Group size for grouped convolution. 321 - base_width (:obj:`int`): Base width of the convolutions. 322 - reduce_first (:obj:`int`): Reduction factor for first convolution of each block. 323 - dilation (:obj:`int`): Spacing between kernel points. 324 - first_dilation (:obj:`int`): First dilation value. 325 - act_layer (:obj:`Callable`): Function for activation layer. 326 - norm_layer (:obj:`Callable`): Function for normalization layer. 327 - attn_layer (:obj:`Callable`): Function for attention layer. 328 - aa_layer (:obj:`Callable`): Function for anti-aliasing layer. 329 - drop_block (:obj:`Callable`): Method for dropping block. 330 - drop_path (:obj:`Callable`): Method for dropping path. 331 """ 332 super(BasicBlock, self).__init__() 333 334 assert cardinality == 1, 'BasicBlock only supports cardinality of 1' 335 assert base_width == 64, 'BasicBlock does not support changing base width' 336 first_planes = planes // reduce_first 337 outplanes = planes * self.expansion 338 first_dilation = first_dilation or dilation 339 use_aa = aa_layer is not None and (stride == 2 or first_dilation != dilation) 340 341 self.conv1 = nn.Conv2d( 342 inplanes, 343 first_planes, 344 kernel_size=3, 345 stride=1 if use_aa else stride, 346 padding=first_dilation, 347 dilation=first_dilation, 348 bias=False 349 ) 350 self.bn1 = norm_layer(first_planes) 351 self.act1 = act_layer(inplace=True) 352 self.aa = aa_layer(channels=first_planes, stride=stride) if use_aa else None 353 354 self.conv2 = nn.Conv2d(first_planes, outplanes, kernel_size=3, padding=dilation, dilation=dilation, bias=False) 355 self.bn2 = norm_layer(outplanes) 356 357 self.se = create_attn(attn_layer, outplanes) 358 359 self.act2 = act_layer(inplace=True) 360 self.downsample = downsample 361 self.stride = stride 362 self.dilation = dilation 363 self.drop_block = drop_block 364 self.drop_path = drop_path 365 366 def 
zero_init_last_bn(self) -> None: 367 """ 368 Overview: 369 Initialize the batch normalization layer with zeros. 370 """ 371 nn.init.zeros_(self.bn2.weight) 372 373 def forward(self, x: torch.Tensor) -> torch.Tensor: 374 """ 375 Overview: 376 Defines the computation performed at every call. 377 Arguments: 378 - x (:obj:`torch.Tensor`): The input tensor. 379 Returns: 380 - output (:obj:`torch.Tensor`): The output tensor after passing through the BasicBlock. 381 """ 382 shortcut = x 383 384 x = self.conv1(x) 385 x = self.bn1(x) 386 if self.drop_block is not None: 387 x = self.drop_block(x) 388 x = self.act1(x) 389 if self.aa is not None: 390 x = self.aa(x) 391 392 x = self.conv2(x) 393 x = self.bn2(x) 394 if self.drop_block is not None: 395 x = self.drop_block(x) 396 397 if self.se is not None: 398 x = self.se(x) 399 400 if self.drop_path is not None: 401 x = self.drop_path(x) 402 403 if self.downsample is not None: 404 shortcut = self.downsample(shortcut) 405 x += shortcut 406 x = self.act2(x) 407 408 return x 409 410 411class Bottleneck(nn.Module): 412 """ 413 Overview: 414 The Bottleneck class is a basic block used to build ResNet networks. It is a part of the PyTorch's 415 implementation of ResNet. This block is designed with several layers including a convolutional layer, 416 normalization layer, activation layer, attention layer, anti-aliasing layer, and a dropout layer. 
417 Interfaces: 418 ``__init__``, ``forward``, ``zero_init_last_bn`` 419 Properties: 420 expansion, inplanes, planes, stride, downsample, cardinality, base_width, reduce_first, dilation, \ 421 first_dilation, act_layer, norm_layer, attn_layer, aa_layer, drop_block, drop_path 422 423 """ 424 expansion = 4 425 426 def __init__( 427 self, 428 inplanes: int, 429 planes: int, 430 stride: int = 1, 431 downsample: Optional[nn.Module] = None, 432 cardinality: int = 1, 433 base_width: int = 64, 434 reduce_first: int = 1, 435 dilation: int = 1, 436 first_dilation: Optional[int] = None, 437 act_layer: Type[nn.Module] = nn.ReLU, 438 norm_layer: Type[nn.Module] = nn.BatchNorm2d, 439 attn_layer: Optional[Type[nn.Module]] = None, 440 aa_layer: Optional[Type[nn.Module]] = None, 441 drop_block: Callable = None, 442 drop_path: Callable = None 443 ) -> None: 444 """ 445 Overview: 446 Initialize the Bottleneck class with various parameters. 447 448 Arguments: 449 - inplanes (:obj:`int`): The number of input planes. 450 - planes (:obj:`int`): The number of output planes. 451 - stride (:obj:`int`, optional): The stride size, defaults to 1. 452 - downsample (:obj:`nn.Module`, optional): The downsample method, defaults to None. 453 - cardinality (:obj:`int`, optional): The size of the group convolutions, defaults to 1. 454 - base_width (:obj:`int`, optional): The base width, defaults to 64. 455 - reduce_first (:obj:`int`, optional): The first reduction factor, defaults to 1. 456 - dilation (:obj:`int`, optional): The dilation factor, defaults to 1. 457 - first_dilation (:obj:`int`, optional): The first dilation factor, defaults to None. 458 - act_layer (:obj:`Type[nn.Module]`, optional): The activation layer type, defaults to nn.ReLU. 459 - norm_layer (:obj:`Type[nn.Module]`, optional): The normalization layer type, defaults to nn.BatchNorm2d. 460 - attn_layer (:obj:`Type[nn.Module]`, optional): The attention layer type, defaults to None. 
            - aa_layer (:obj:`Type[nn.Module]`, optional): The anti-aliasing layer type, defaults to None.
            - drop_block (:obj:`Callable`, optional): The dropout block, defaults to None.
            - drop_path (:obj:`Callable`, optional): The drop path, defaults to None.
        """
        super(Bottleneck, self).__init__()

        width = int(math.floor(planes * (base_width / 64)) * cardinality)
        first_planes = width // reduce_first
        outplanes = planes * self.expansion
        first_dilation = first_dilation or dilation
        use_aa = aa_layer is not None and (stride == 2 or first_dilation != dilation)

        self.conv1 = nn.Conv2d(inplanes, first_planes, kernel_size=1, bias=False)
        self.bn1 = norm_layer(first_planes)
        self.act1 = act_layer(inplace=True)

        self.conv2 = nn.Conv2d(
            first_planes,
            width,
            kernel_size=3,
            stride=1 if use_aa else stride,
            padding=first_dilation,
            dilation=first_dilation,
            groups=cardinality,
            bias=False
        )
        self.bn2 = norm_layer(width)
        self.act2 = act_layer(inplace=True)
        self.aa = aa_layer(channels=width, stride=stride) if use_aa else None

        self.conv3 = nn.Conv2d(width, outplanes, kernel_size=1, bias=False)
        self.bn3 = norm_layer(outplanes)

        self.se = create_attn(attn_layer, outplanes)

        self.act3 = act_layer(inplace=True)
        self.downsample = downsample
        self.stride = stride
        self.dilation = dilation
        self.drop_block = drop_block
        self.drop_path = drop_path

    def zero_init_last_bn(self) -> None:
        """
        Overview:
            Initialize the last batch normalization layer with zero.
        """
        nn.init.zeros_(self.bn3.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Overview:
            Defines the computation performed at every call.
        Arguments:
            - x (:obj:`torch.Tensor`): The input tensor.
        Returns:
            - x (:obj:`torch.Tensor`): The output tensor resulting from the computation.
        """
        shortcut = x

        x = self.conv1(x)
        x = self.bn1(x)
        if self.drop_block is not None:
            x = self.drop_block(x)
        x = self.act1(x)

        x = self.conv2(x)
        x = self.bn2(x)
        if self.drop_block is not None:
            x = self.drop_block(x)
        x = self.act2(x)
        if self.aa is not None:
            x = self.aa(x)

        x = self.conv3(x)
        x = self.bn3(x)
        if self.drop_block is not None:
            x = self.drop_block(x)

        if self.se is not None:
            x = self.se(x)

        if self.drop_path is not None:
            x = self.drop_path(x)

        if self.downsample is not None:
            shortcut = self.downsample(shortcut)
        x += shortcut
        x = self.act3(x)

        return x


def downsample_conv(
        in_channels: int,
        out_channels: int,
        kernel_size: int,
        stride: int = 1,
        dilation: int = 1,
        first_dilation: Optional[int] = None,
        norm_layer: Optional[Type[nn.Module]] = None
) -> nn.Sequential:
    """
    Overview:
        Create a sequential module for downsampling that includes a convolution layer and a normalization layer.
    Arguments:
        - in_channels (:obj:`int`): The number of input channels.
        - out_channels (:obj:`int`): The number of output channels.
        - kernel_size (:obj:`int`): The size of the kernel.
        - stride (:obj:`int`, optional): The stride size, defaults to 1.
        - dilation (:obj:`int`, optional): The dilation factor, defaults to 1.
        - first_dilation (:obj:`int`, optional): The first dilation factor, defaults to None.
        - norm_layer (:obj:`Type[nn.Module]`, optional): The normalization layer type, defaults to nn.BatchNorm2d.
    Returns:
        - (:obj:`nn.Sequential`): A sequence of layers performing downsampling through convolution.
    """
    norm_layer = norm_layer or nn.BatchNorm2d
    kernel_size = 1 if stride == 1 and dilation == 1 else kernel_size
    first_dilation = (first_dilation or dilation) if kernel_size > 1 else 1
    p = get_padding(kernel_size, stride, first_dilation)

    return nn.Sequential(
        *[
            nn.Conv2d(
                in_channels, out_channels, kernel_size, stride=stride, padding=p, dilation=first_dilation, bias=False
            ),
            norm_layer(out_channels)
        ]
    )


def downsample_avg(
        in_channels: int,
        out_channels: int,
        kernel_size: int,
        stride: int = 1,
        dilation: int = 1,
        first_dilation: Optional[int] = None,
        norm_layer: Optional[Type[nn.Module]] = None
) -> nn.Sequential:
    """
    Overview:
        Create a sequential module for downsampling that includes an average pooling layer, a convolution layer,
        and a normalization layer.
    Arguments:
        - in_channels (:obj:`int`): The number of input channels.
        - out_channels (:obj:`int`): The number of output channels.
        - kernel_size (:obj:`int`): The size of the kernel.
        - stride (:obj:`int`, optional): The stride size, defaults to 1.
        - dilation (:obj:`int`, optional): The dilation factor, defaults to 1.
        - first_dilation (:obj:`int`, optional): The first dilation factor, defaults to None.
        - norm_layer (:obj:`Type[nn.Module]`, optional): The normalization layer type, defaults to nn.BatchNorm2d.
    Returns:
        - (:obj:`nn.Sequential`): A sequence of layers performing downsampling through average pooling.
    """
    norm_layer = norm_layer or nn.BatchNorm2d
    avg_stride = stride if dilation == 1 else 1
    if stride == 1 and dilation == 1:
        pool = nn.Identity()
    else:
        avg_pool_fn = AvgPool2dSame if avg_stride == 1 and dilation > 1 else nn.AvgPool2d
        pool = avg_pool_fn(2, avg_stride, ceil_mode=True, count_include_pad=False)

    return nn.Sequential(
        *[
            pool,
            nn.Conv2d(in_channels, out_channels, 1, stride=1, padding=0, bias=False),
            norm_layer(out_channels)
        ]
    )


def drop_blocks(drop_block_rate: float = 0.) -> List[None]:
    """
    Overview:
        Generate a list of None values based on the drop block rate. Only a zero drop block rate is supported.
    Arguments:
        - drop_block_rate (:obj:`float`, optional): The drop block rate, defaults to 0.
    Returns:
        - (:obj:`List[None]`): A list of four None values, one per stage.
    """
    assert drop_block_rate == 0., drop_block_rate
    return [None for _ in range(4)]


def make_blocks(
        block_fn: Type[nn.Module],
        channels: List[int],
        block_repeats: List[int],
        inplanes: int,
        reduce_first: int = 1,
        output_stride: int = 32,
        down_kernel_size: int = 1,
        avg_down: bool = False,
        drop_block_rate: float = 0.,
        drop_path_rate: float = 0.,
        **kwargs
) -> Tuple[List[Tuple[str, nn.Module]], List[Dict[str, Union[int, str]]]]:
    """
    Overview:
        Create a list of blocks for the network, with each block having a given number of repeats. Also, create a
        feature info list that contains information about the output of each block.
    Arguments:
        - block_fn (:obj:`Type[nn.Module]`): The type of block to use.
        - channels (:obj:`List[int]`): The list of output channels for each block.
        - block_repeats (:obj:`List[int]`): The list of numbers of repeats for each block.
        - inplanes (:obj:`int`): The number of input planes.
        - reduce_first (:obj:`int`, optional): The first reduction factor, defaults to 1.
        - output_stride (:obj:`int`, optional): The total stride of the network, defaults to 32.
        - down_kernel_size (:obj:`int`, optional): The size of the downsample kernel, defaults to 1.
        - avg_down (:obj:`bool`, optional): Whether to use average pooling for downsampling, defaults to False.
        - drop_block_rate (:obj:`float`, optional): The drop block rate, defaults to 0.
        - drop_path_rate (:obj:`float`, optional): The drop path rate, defaults to 0.
    Returns:
        - (:obj:`Tuple[List[Tuple[str, nn.Module]], List[Dict[str, Union[int, str]]]]`): \
            A tuple that includes a list of blocks for the network and a feature info list.
    """
    stages = []
    feature_info = []
    net_num_blocks = sum(block_repeats)
    net_block_idx = 0
    net_stride = 4
    dilation = prev_dilation = 1
    for stage_idx, (planes, num_blocks, db) in enumerate(zip(channels, block_repeats, drop_blocks(drop_block_rate))):
        stage_name = f'layer{stage_idx + 1}'  # never liked this name, but weight compat requires it
        stride = 1 if stage_idx == 0 else 2
        if net_stride >= output_stride:
            dilation *= stride
            stride = 1
        else:
            net_stride *= stride

        downsample = None
        if stride != 1 or inplanes != planes * block_fn.expansion:
            down_kwargs = dict(
                in_channels=inplanes,
                out_channels=planes * block_fn.expansion,
                kernel_size=down_kernel_size,
                stride=stride,
                dilation=dilation,
                first_dilation=prev_dilation,
                norm_layer=kwargs.get('norm_layer')
            )
            downsample = downsample_avg(**down_kwargs) if avg_down else downsample_conv(**down_kwargs)

        block_kwargs = dict(reduce_first=reduce_first, dilation=dilation, drop_block=db, **kwargs)
        blocks = []
        for block_idx in range(num_blocks):
            downsample = downsample if block_idx == 0 else None
            stride = stride if block_idx == 0 else 1
            block_dpr = drop_path_rate * net_block_idx / (net_num_blocks - 1)  # stochastic depth linear decay rule
            blocks.append(
                block_fn(
                    inplanes, planes, stride, downsample, first_dilation=prev_dilation, drop_path=None,
                    **block_kwargs
                )
            )
            prev_dilation = dilation
            inplanes = planes * block_fn.expansion
            net_block_idx += 1

        stages.append((stage_name, nn.Sequential(*blocks)))
        feature_info.append(dict(num_chs=inplanes, reduction=net_stride, module=stage_name))

    return stages, feature_info


class ResNet(nn.Module):
    """
    Overview:
        Implements ResNet, ResNeXt, SE-ResNeXt, and SENet models. This implementation supports various modifications
        based on the v1c, v1d, v1e, and v1s variants included in the MXNet Gluon ResNetV1b model. For more details
        about the variants and options, please refer to the 'Bag of Tricks' paper: https://arxiv.org/pdf/1812.01187.
    Interfaces:
        ``__init__``, ``forward``, ``zero_init_last_bn``, ``get_classifier``
    """

    def __init__(
            self,
            block: nn.Module,
            layers: List[int],
            num_classes: int = 1000,
            in_chans: int = 3,
            cardinality: int = 1,
            base_width: int = 64,
            stem_width: int = 64,
            stem_type: str = '',
            replace_stem_pool: bool = False,
            output_stride: int = 32,
            block_reduce_first: int = 1,
            down_kernel_size: int = 1,
            avg_down: bool = False,
            act_layer: nn.Module = nn.ReLU,
            norm_layer: nn.Module = nn.BatchNorm2d,
            aa_layer: Optional[nn.Module] = None,
            drop_rate: float = 0.0,
            drop_path_rate: float = 0.0,
            drop_block_rate: float = 0.0,
            global_pool: str = 'avg',
            zero_init_last_bn: bool = True,
            block_args: Optional[dict] = None
    ) -> None:
        """
        Overview:
            Initialize the ResNet model with the given block, layers and other configuration options.
        Arguments:
            - block (:obj:`nn.Module`): Class for the residual block.
            - layers (:obj:`List[int]`): Number of blocks in each stage.
            - num_classes (:obj:`int`, optional): Number of classification classes. Default is 1000.
            - in_chans (:obj:`int`, optional): Number of input (color) channels. Default is 3.
            - cardinality (:obj:`int`, optional): Number of convolution groups for 3x3 conv in Bottleneck. \
                Default is 1.
            - base_width (:obj:`int`, optional): Factor determining bottleneck channels. Default is 64.
            - stem_width (:obj:`int`, optional): Number of channels in stem convolutions. Default is 64.
            - stem_type (:obj:`str`, optional): The type of stem. Default is ''.
            - replace_stem_pool (:obj:`bool`, optional): Whether to replace stem max-pooling with convolution. \
                Default is False.
            - output_stride (:obj:`int`, optional): Output stride of the network. Default is 32.
            - block_reduce_first (:obj:`int`, optional): Reduction factor for first convolution output width of \
                residual blocks. Default is 1.
            - down_kernel_size (:obj:`int`, optional): Kernel size of residual block downsampling path. Default is 1.
            - avg_down (:obj:`bool`, optional): Whether to use average pooling for the projection skip connection \
                between stages/downsample. Default is False.
            - act_layer (:obj:`nn.Module`, optional): Activation layer. Default is nn.ReLU.
            - norm_layer (:obj:`nn.Module`, optional): Normalization layer. Default is nn.BatchNorm2d.
            - aa_layer (:obj:`Optional[nn.Module]`, optional): Anti-aliasing layer. Default is None.
            - drop_rate (:obj:`float`, optional): Dropout probability before the classifier, for training. \
                Default is 0.0.
            - drop_path_rate (:obj:`float`, optional): Drop path rate. Default is 0.0.
            - drop_block_rate (:obj:`float`, optional): Drop block rate. Default is 0.0.
            - global_pool (:obj:`str`, optional): Global pooling type. Default is 'avg'.
            - zero_init_last_bn (:obj:`bool`, optional): Whether to initialize the last batch normalization layer \
                with zero. Default is True.
            - block_args (:obj:`Optional[dict]`, optional): Additional arguments for blocks. Default is None.
        """
        block_args = block_args or dict()
        assert output_stride in (8, 16, 32)
        self.num_classes = num_classes
        self.drop_rate = drop_rate
        super(ResNet, self).__init__()

        # Stem
        deep_stem = 'deep' in stem_type
        inplanes = stem_width * 2 if deep_stem else 64
        if deep_stem:
            stem_chs = (stem_width, stem_width)
            if 'tiered' in stem_type:
                stem_chs = (3 * (stem_width // 4), stem_width)
            self.conv1 = nn.Sequential(
                *[
                    nn.Conv2d(in_chans, stem_chs[0], 3, stride=2, padding=1, bias=False),
                    norm_layer(stem_chs[0]),
                    act_layer(inplace=True),
                    nn.Conv2d(stem_chs[0], stem_chs[1], 3, stride=1, padding=1, bias=False),
                    norm_layer(stem_chs[1]),
                    act_layer(inplace=True),
                    nn.Conv2d(stem_chs[1], inplanes, 3, stride=1, padding=1, bias=False)
                ]
            )
        else:
            self.conv1 = nn.Conv2d(in_chans, inplanes, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = norm_layer(inplanes)
        self.act1 = act_layer(inplace=True)
        self.feature_info = [dict(num_chs=inplanes, reduction=2, module='act1')]

        # Stem Pooling
        if replace_stem_pool:
            self.maxpool = nn.Sequential(
                *filter(
                    None, [
                        nn.Conv2d(inplanes, inplanes, 3, stride=1 if aa_layer else 2, padding=1, bias=False),
                        aa_layer(channels=inplanes, stride=2) if aa_layer else None,
                        norm_layer(inplanes),
                        act_layer(inplace=True)
                    ]
                )
            )
        else:
            if aa_layer is not None:
                self.maxpool = nn.Sequential(
                    *[nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                      aa_layer(channels=inplanes, stride=2)]
                )
            else:
                self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        # Feature Blocks
        channels = [64, 128, 256, 512]
        stage_modules, stage_feature_info = make_blocks(
            block,
            channels,
            layers,
            inplanes,
            cardinality=cardinality,
            base_width=base_width,
            output_stride=output_stride,
            reduce_first=block_reduce_first,
            avg_down=avg_down,
            down_kernel_size=down_kernel_size,
            act_layer=act_layer,
            norm_layer=norm_layer,
            aa_layer=aa_layer,
            drop_block_rate=drop_block_rate,
            drop_path_rate=drop_path_rate,
            **block_args
        )
        for stage in stage_modules:
            self.add_module(*stage)  # layer1, layer2, etc.
        self.feature_info.extend(stage_feature_info)

        # Head (Pooling and Classifier)
        self.num_features = 512 * block.expansion
        self.global_pool, self.fc = create_classifier(self.num_features, self.num_classes, pool_type=global_pool)

        self.init_weights(zero_init_last_bn=zero_init_last_bn)

    def init_weights(self, zero_init_last_bn: bool = True) -> None:
        """
        Overview:
            Initialize the weights in the model.
        Arguments:
            - zero_init_last_bn (:obj:`bool`, optional): Whether to initialize the last batch normalization layer \
                with zero. Default is True.
        """
        for n, m in self.named_modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
        if zero_init_last_bn:
            for m in self.modules():
                if hasattr(m, 'zero_init_last_bn'):
                    m.zero_init_last_bn()

    def get_classifier(self) -> nn.Module:
        """
        Overview:
            Get the classifier module from the model.
        Returns:
            - classifier (:obj:`nn.Module`): The classifier module in the model.
        """
        return self.fc

    def reset_classifier(self, num_classes: int, global_pool: str = 'avg') -> None:
        """
        Overview:
            Reset the classifier with a new number of classes and pooling type.
        Arguments:
            - num_classes (:obj:`int`): New number of classification classes.
            - global_pool (:obj:`str`, optional): New global pooling type. Default is 'avg'.
        """
        self.num_classes = num_classes
        self.global_pool, self.fc = create_classifier(self.num_features, self.num_classes, pool_type=global_pool)

    def forward_features(self, x: torch.Tensor) -> torch.Tensor:
        """
        Overview:
            Forward pass through the feature layers of the model.
        Arguments:
            - x (:obj:`torch.Tensor`): The input tensor.
        Returns:
            - x (:obj:`torch.Tensor`): The output tensor after passing through the feature layers.
        """
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.act1(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        return x

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Overview:
            Full forward pass through the model.
        Arguments:
            - x (:obj:`torch.Tensor`): The input tensor.
        Returns:
            - x (:obj:`torch.Tensor`): The output tensor after passing through the model.
        """
        x = self.forward_features(x)
        x = self.global_pool(x)
        x = x.view(x.shape[0], -1)
        if self.drop_rate:
            x = F.dropout(x, p=float(self.drop_rate), training=self.training)
        x = self.fc(x)
        return x


def resnet18() -> nn.Module:
    """
    Overview:
        Creates a ResNet18 model.
    Returns:
        - model (:obj:`nn.Module`): ResNet18 model.
    """
    return ResNet(block=BasicBlock, layers=[2, 2, 2, 2])
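The channel arithmetic at the top of ``Bottleneck.__init__`` (``width``, ``first_planes``, ``outplanes``) is what also covers the ResNeXt case: the 3x3 conv width scales with ``base_width`` and ``cardinality``, while the expansion conv always outputs ``planes * expansion`` channels. A minimal torch-free sketch of just that arithmetic (the helper name is hypothetical):

```python
import math


def bottleneck_channels(planes: int, base_width: int = 64, cardinality: int = 1,
                        reduce_first: int = 1, expansion: int = 4):
    # Mirrors the arithmetic in Bottleneck.__init__ above.
    width = int(math.floor(planes * (base_width / 64)) * cardinality)
    first_planes = width // reduce_first
    outplanes = planes * expansion
    return first_planes, width, outplanes


# Plain ResNet-50 stage 1: 64 -> 64 -> 256
print(bottleneck_channels(64))                                # (64, 64, 256)
# ResNeXt-50 32x4d style settings: a 4 * 32 = 128-wide grouped 3x3 conv
print(bottleneck_channels(64, base_width=4, cardinality=32))  # (128, 128, 256)
```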
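The stride/dilation bookkeeping at the top of the stage loop in ``make_blocks`` is what implements ``output_stride``: once the accumulated network stride reaches the target, later stages trade their stride for dilation. A standalone sketch of that loop (hypothetical helper name, no torch needed):

```python
def stage_strides(output_stride: int = 32, num_stages: int = 4):
    # Mirrors make_blocks: net_stride starts at 4 (stem conv + pool); every
    # stage after the first would stride by 2, but once net_stride reaches
    # output_stride the stage dilates instead of striding.
    net_stride, dilation = 4, 1
    plan = []
    for stage_idx in range(num_stages):
        stride = 1 if stage_idx == 0 else 2
        if net_stride >= output_stride:
            dilation *= stride
            stride = 1
        else:
            net_stride *= stride
        plan.append((stride, dilation))
    return plan


print(stage_strides(32))  # [(1, 1), (2, 1), (2, 1), (2, 1)]
print(stage_strides(8))   # [(1, 1), (2, 1), (1, 2), (1, 4)]
```

This is why ``ResNet.__init__`` asserts ``output_stride in (8, 16, 32)``: 32 is the ordinary classification layout, while 8 and 16 keep the last stages at higher resolution via dilation.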
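``make_blocks`` also computes ``block_dpr`` with the stochastic depth linear decay rule (note that this listing then passes ``drop_path=None`` to the blocks, so the computed rate is not applied here). A sketch of just the decay rule, with a hypothetical helper name:

```python
def drop_path_rates(drop_path_rate: float, block_repeats):
    # Linear decay rule from make_blocks: the per-block rate grows from 0 on
    # the first block to drop_path_rate on the last block of the network.
    net_num_blocks = sum(block_repeats)
    return [drop_path_rate * i / (net_num_blocks - 1) for i in range(net_num_blocks)]


rates = drop_path_rates(0.1, [3, 4, 6, 3])  # ResNet-50 layout: 16 blocks total
print(rates[0], rates[-1])  # 0.0 0.1
```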
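``downsample_conv`` collapses to a 1x1 convolution whenever neither stride nor dilation is involved; otherwise it keeps ``down_kernel_size`` and derives symmetric padding from ``get_padding``. A sketch of that selection logic, assuming the usual ``get_padding`` formula ``((stride - 1) + dilation * (kernel_size - 1)) // 2`` (the formula and helper name are assumptions, not part of this listing):

```python
def effective_downsample(kernel_size: int, stride: int = 1, dilation: int = 1,
                         first_dilation: int = None):
    # Mirrors the head of downsample_conv: fall back to a 1x1 conv unless
    # stride or dilation forces the requested kernel; padding then follows the
    # assumed get_padding formula.
    kernel_size = 1 if stride == 1 and dilation == 1 else kernel_size
    first_dilation = (first_dilation or dilation) if kernel_size > 1 else 1
    padding = ((stride - 1) + first_dilation * (kernel_size - 1)) // 2
    return kernel_size, first_dilation, padding


print(effective_downsample(3, stride=1, dilation=1))  # (1, 1, 0): plain 1x1 projection
print(effective_downsample(3, stride=2, dilation=1))  # (3, 1, 1): strided 3x3 with pad 1
```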
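``downsample_avg`` selects ``AvgPool2dSame`` exactly when the pool itself cannot stride (``avg_stride == 1`` under dilation), relying on TensorFlow-style 'SAME' padding to preserve the spatial grid. The 'SAME' rule along one dimension is: output size is ``ceil(size / stride)``, and total padding is whatever makes that come out, split left/right with the extra pixel on the right. A sketch of that arithmetic (hypothetical helper name; the real ``AvgPool2dSame`` applies this per spatial dimension):

```python
import math


def same_pad_1d(size: int, kernel_size: int, stride: int):
    # TensorFlow 'SAME' rule along one dimension.
    out = math.ceil(size / stride)
    pad_total = max((out - 1) * stride + kernel_size - size, 0)
    return out, pad_total // 2, pad_total - pad_total // 2  # (output, pad_left, pad_right)


# Odd input, 2x2 pool with stride 2: one pixel of padding on the right keeps
# the output at ceil(7 / 2) = 4 instead of PyTorch's default floor behavior.
print(same_pad_1d(7, kernel_size=2, stride=2))  # (4, 0, 1)
```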