basecls.models
- basecls.models.build_model(cfg)
The factory function to build a model.
Note
If cfg.model does not have the attribute head, this function builds the model with the default head. Otherwise, if cfg.model.head is None, it builds the model without any head.
Note
If cfg.model.head does not have the attribute w_out and cfg.num_classes exists, w_out is set to cfg.num_classes.
- Parameters
cfg (ConfigDict) – config for building the model.
- Return type
- Returns
A model.
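The two notes above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `resolve_head`, `DEFAULT_HEAD`, and the use of plain dicts in place of `ConfigDict` are hypothetical, and whether the default head also receives `num_classes` is an assumption of this sketch.

```python
from typing import Any, Dict, Optional

# Hypothetical placeholder for whatever default head the library uses.
DEFAULT_HEAD: Dict[str, Any] = {"name": "default"}


def resolve_head(cfg: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the head arguments implied by cfg, or None for a headless model."""
    model_cfg = cfg["model"]
    if "head" not in model_cfg:
        # First note, first case: no `head` attribute means the default head.
        head = dict(DEFAULT_HEAD)
    elif model_cfg["head"] is None:
        # First note, second case: `head` is None means no head at all.
        return None
    else:
        head = dict(model_cfg["head"])
    # Second note: take w_out from num_classes only when w_out is absent.
    if "w_out" not in head and "num_classes" in cfg:
        head["w_out"] = cfg["num_classes"]
    return head
```

An explicit `w_out` in the head wins over `num_classes`, which matches the wording of the second note.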
- basecls.models.sync_model(model)
Sync parameters and buffers.
- Parameters
model (Module) – model for syncing.
- class basecls.models.EffNet(stem_w, block_name, depths, widths, strides, kernels, exp_rs=1.0, se_rs=0.0, drop_path_prob=0.0, depth_mult=1.0, width_mult=1.0, omit_mult=False, norm_name='BN', act_name='silu', head=None)
Bases: Module
EfficientNet model.
- Parameters
stem_w (int) – stem width.
block_name (Union[str, Callable, Sequence[Union[str, Callable]]]) – block name.
depths (Sequence[int]) – depth for each stage (number of blocks in the stage).
widths (Sequence[int]) – width for each stage (width of each block in the stage).
strides (Sequence[int]) – strides for each stage (applies to the first block of each stage).
exp_rs (Union[float, Sequence[Union[float, Sequence[float]]]]) – expansion ratios for MBConv blocks in each stage. Default: 1.0
se_rs – Squeeze-and-Excitation (SE) ratios. Default: 0.0
drop_path_prob (float) – drop path probability. Default: 0.0
depth_mult (float) – depth multiplier. Default: 1.0
width_mult (float) – width multiplier. Default: 1.0
omit_mult (bool) – whether to omit the multiplier for the stem width, the head width, the first stage depth and the last stage depth, as in EfficientNet-Lite. Default: False
norm_name (str) – normalization function. Default: "BN"
act_name (str) – activation function. Default: "silu"
head (Optional[Mapping[str, Any]]) – head args. Default: None
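To make the effect of depth_mult, width_mult and omit_mult concrete, here is a small sketch of how such multipliers are commonly applied. The rounding follows the widely used reference EfficientNet recipe (round widths to a multiple of 8, round depths up). Whether EffNet rounds in exactly this way is an assumption, and `scale_width` and `scale_depths` are hypothetical helper names.

```python
import math
from typing import List, Sequence


def scale_width(width: int, width_mult: float, divisor: int = 8) -> int:
    """Scale a channel count and round it to a multiple of `divisor`."""
    scaled = width * width_mult
    rounded = max(divisor, int(scaled + divisor / 2) // divisor * divisor)
    # Do not let rounding shrink the width by more than 10%.
    if rounded < 0.9 * scaled:
        rounded += divisor
    return rounded


def scale_depths(
    depths: Sequence[int], depth_mult: float, omit_mult: bool = False
) -> List[int]:
    """Scale per-stage depths, rounding up; optionally skip first/last stage."""
    last = len(depths) - 1
    scaled = []
    for i, depth in enumerate(depths):
        if omit_mult and i in (0, last):
            # Assumed EfficientNet-Lite behaviour: first and last stage unscaled.
            scaled.append(depth)
        else:
            scaled.append(math.ceil(depth * depth_mult))
    return scaled
```

For example, `scale_depths([1, 2, 2, 3, 3, 4, 1], 1.1)` deepens every stage, while passing `omit_mult=True` leaves the first and last entries at 1.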
- class basecls.models.HRNet(stage_modules, stage_blocks, stage_block_names, stage_channels, w_stem=64, multi_scale_output=True, merge_block_name='bottleneck', merge_channels=[32, 64, 128, 256], norm_name='BN', act_name='relu', head=None, **kwargs)
Bases: Module
HRNet model.
- Parameters
stage_modules (List[int]) – Number of modules for each stage.
stage_blocks (List[List[int]]) – Number of blocks for each module in stages.
stage_block_names (List[str]) – Branch block types for each stage.
stage_channels (List[List[int]]) – Number of channels for each stage.
w_stem (int) – Stem width. Default: 64
multi_scale_output (bool) – Whether to output multi-scale features. Default: True
merge_block_name (str) – Merge block type. Default: "bottleneck"
merge_channels (List[int]) – Channels of each scale in merge block. Default: [32, 64, 128, 256]
norm_name (str) – Normalization layer. Default: "BN"
act_name (str) – Activation function. Default: "relu"
head (Optional[Mapping[str, Any]]) – head args. Default: None
- class basecls.models.MBNet(stem_w, depths, widths, strides, kernels, exp_rs=1.0, se_rs=0.0, stage_act_names=None, has_proj_act=False, has_skip=True, drop_path_prob=0.0, width_mult=1.0, norm_name='BN', act_name='relu6', head=None)
Bases: Module
MobileNet model.
- Parameters
stem_w (int) – stem width.
depths (Sequence[int]) – depth for each stage (number of blocks in the stage).
widths (Sequence[int]) – width for each stage (width of each block in the stage).
strides (Sequence[int]) – strides for each stage (applies to the first block of each stage).
exp_rs (Union[float, Sequence[Union[float, Sequence[float]]]]) – expansion ratios for MobileNet basic blocks in each stage. Default: 1.0
se_rs (Union[float, Sequence[Union[float, Sequence[float]]]]) – Squeeze-and-Excitation (SE) ratios. Default: 0.0
stage_act_names (Optional[Sequence[str]]) – activation function for stages. Default: None
has_proj_act (bool) – whether to apply activation to output. Default: False
has_skip (bool) – whether to apply skip connection. Default: True
drop_path_prob (float) – drop path probability. Default: 0.0
width_mult (float) – width multiplier. Default: 1.0
norm_name (str) – normalization function. Default: "BN"
act_name (str) – activation function. Default: "relu6"
head (Optional[Mapping[str, Any]]) – head args. Default: None
- class basecls.models.RegNet(stem_name, stem_w, block_name, depth, w0, wa, wm, group_w, stride=2, bot_mul=1.0, se_r=0.0, drop_path_prob=0.0, zero_init_final_gamma=False, norm_name='BN', act_name='relu', head=None)
Bases: ResNet
RegNet model.
- Parameters
stem_w (int) – stem width.
depth (int) – depth.
w0 (int) – initial width.
wa (float) – slope.
wm (float) – quantization.
group_w (int) – group width for each stage (applies to bottleneck block).
stride (int) – stride for each stage (applies to the first block of each stage). Default: 2
bot_mul (float) – bottleneck multiplier for each stage (applies to bottleneck block). Default: 1.0
se_r (float) – Squeeze-and-Excitation (SE) ratio. Default: 0.0
drop_path_prob (float) – drop path probability. Default: 0.0
zero_init_final_gamma (bool) – whether to enable zero initialization. Default: False
norm_name (str) – normalization function. Default: "BN"
act_name (str) – activation function. Default: "relu"
head (Optional[Mapping[str, Any]]) – head args. Default: None
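The parameters depth, w0, wa and wm describe per-stage widths and depths only implicitly. The sketch below shows how the RegNet design-space formulation turns them into stages: a linear width per block, snapped to a power of wm and then to a multiple of 8. It is an illustration under assumptions rather than this class's code: `regnet_stages` and the divisor `q` are hypothetical names, and the later adjustment of widths for group_w and bot_mul is left out.

```python
import math
from itertools import groupby
from typing import List, Tuple


def regnet_stages(
    depth: int, w0: int, wa: float, wm: float, q: int = 8
) -> Tuple[List[int], List[int]]:
    """Return (stage_widths, stage_depths) from the RegNet width parameters."""
    block_widths = []
    for j in range(depth):
        u = w0 + wa * j                              # linear width of block j
        k = round(math.log(u / w0) / math.log(wm))   # nearest power of wm
        w = w0 * wm ** k                             # quantized width
        block_widths.append(int(round(w / q)) * q)   # snap to a multiple of q
    # Consecutive blocks with the same width form one stage.
    stage_widths, stage_depths = [], []
    for width, run in groupby(block_widths):
        stage_widths.append(width)
        stage_depths.append(len(list(run)))
    return stage_widths, stage_depths
```

With `depth=13, w0=24, wa=36.44, wm=2.49` this yields four stages, which shows how a handful of scalars replaces the explicit depths and widths lists that ResNet takes.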
- class basecls.models.RepVGG(num_blocks, width_multiplier, head=None, groups=1, se_r=0.0, act_name='relu', deploy=False)
Bases: Module
RepVGG model.
Use RepVGG.convert_to_deploy() to convert a training-mode RepVGG to its deploy form:
    model = RepVGG(..., deploy=False)
    model.load_state_dict(...)
    _ = RepVGG.convert_to_deploy(model)
- Parameters
width_multiplier (Sequence[int]) – RepVGG widths; base_width is [64, 128, 256, 512].
head (Optional[Mapping[str, Any]]) – head args. Default: None
groups (Union[int, List[Union[int, List[int]]]]) – number of groups for blocks. Default: 1
se_r (float) – Squeeze-and-Excitation (SE) ratio. Default: 0.0
act_name (str) – activation function. Default: "relu"
deploy (bool) – switch a reparameterized RepVGG into deploy mode. Default: False
- class basecls.models.ResMLP(img_size=224, patch_size=16, in_chans=3, embed_dim=768, depth=12, drop_rate=0.0, drop_path_rate=0.0, embed_layer=PatchEmbed, init_scale=1e-4, ffn_ratio=4.0, act_name='gelu', num_classes=1000, **kwargs)
Bases: Module
ResMLP model.
- Parameters
img_size (int) – Input image size. Default: 224
patch_size (int) – Patch token size. Default: 16
in_chans (int) – Number of input image channels. Default: 3
embed_dim (int) – Number of linear projection output channels. Default: 768
depth (int) – Depth of Transformer Encoder layer. Default: 12
drop_rate (float) – Dropout rate. Default: 0.0
drop_path_rate (float) – Stochastic depth rate. Default: 0.0
embed_layer (Module) – Patch embedding layer. Default: PatchEmbed
init_scale (float) – Initial value for LayerScale. Default: 1e-4
ffn_ratio (float) – Ratio of ffn hidden dim to embedding dim. Default: 4.0
act_name (str) – Activation function. Default: "gelu"
num_classes (int) – Number of classes. Default: 1000
- class basecls.models.ResNet(stem_name, stem_w, block_name, depths, widths, strides, bot_muls=1.0, group_ws=None, se_r=0.0, avg_down=False, drop_path_prob=0.0, zero_init_final_gamma=False, norm_name='BN', act_name='relu', head=None)
Bases: Module
ResNet model.
- Parameters
stem_w (int) – stem width.
depths (Sequence[int]) – depth for each stage (number of blocks in the stage).
widths (Sequence[int]) – width for each stage (width of each block in the stage).
strides (Sequence[int]) – strides for each stage (applies to the first block of each stage).
bot_muls (Union[float, Sequence[float]]) – bottleneck multipliers for each stage (applies to bottleneck block). Default: 1.0
group_ws (Optional[Sequence[int]]) – group widths for each stage (applies to bottleneck block). Default: None
se_r (float) – Squeeze-and-Excitation (SE) ratio. Default: 0.0
drop_path_prob (float) – drop path probability. Default: 0.0
zero_init_final_gamma (bool) – whether to enable zero initialization. Default: False
norm_name (str) – normalization function. Default: "BN"
act_name (str) – activation function. Default: "relu"
head (Optional[Mapping[str, Any]]) – head args. Default: None
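The interaction between widths, bot_muls and group_ws is easiest to see with numbers. The following sketch assumes the convention common in ResNet-style codebases: the bottleneck width is the stage width times the multiplier, and the number of convolution groups is the bottleneck width divided by the group width. `bottleneck_plan` is a hypothetical helper, and treating `group_ws=None` as a single group is an assumption.

```python
from typing import List, Optional, Sequence, Tuple, Union


def bottleneck_plan(
    widths: Sequence[int],
    bot_muls: Union[float, Sequence[float]] = 1.0,
    group_ws: Optional[Sequence[int]] = None,
) -> List[Tuple[int, int]]:
    """Return (bottleneck_width, num_groups) for each stage."""
    if isinstance(bot_muls, (int, float)):
        # Broadcast a scalar multiplier to every stage.
        bot_muls = [bot_muls] * len(widths)
    plan = []
    for i, (w_out, bot_mul) in enumerate(zip(widths, bot_muls)):
        w_b = int(round(w_out * bot_mul))
        # Assumption: no group width means an ordinary single-group convolution.
        groups = 1 if group_ws is None else w_b // group_ws[i]
        plan.append((w_b, groups))
    return plan
```

Under these assumptions, `bottleneck_plan([256, 512, 1024, 2048], 0.5, [4, 8, 16, 32])` gives 32 groups in every stage, the familiar "32x4d" ResNeXt layout.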
- class basecls.models.SNetV2(block, stem_w, depths, widths, strides, kernels, use_maxpool=True, se_r=0.0, drop_path_prob=0.0, norm_name='BN', act_name='relu', head=None)
Bases: Module
ShuffleNetV2 model.
- Parameters
block (Callable) – building block to use, SNV2XceptionBlock for v2+.
stem_w (int) – width for stem layer.
depths (Sequence[int]) – depth for each stage (number of blocks in the stage).
widths (Sequence[int]) – width for each stage (width of each block in the stage).
strides (Sequence[int]) – strides for each stage (applies to the first block of each stage).
use_maxpool (bool) – whether to use a stride-2 maxpool after the stem. Default: True
se_r (float) – Squeeze-and-Excitation (SE) ratio. Default: 0.0
drop_path_prob (float) – drop path probability. Default: 0.0
norm_name (str) – normalization function. Default: "BN"
act_name (str) – activation function. Default: "relu"
head (Optional[Mapping[str, Any]]) – head args. Default: None
- class basecls.models.SwinTransformer(img_size=224, patch_size=4, in_chans=3, embed_dim=96, depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], window_size=7, ffn_ratio=4.0, qkv_bias=True, qk_scale=None, ape=False, patch_norm=True, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.1, embed_layer=PatchEmbed, norm_name='LN', act_name='gelu', num_classes=1000, **kwargs)
Bases: Module
Swin Transformer
An implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" - https://arxiv.org/pdf/2103.14030
- Parameters
img_size (int) – Input image size. Default: 224
patch_size (int) – Patch size. Default: 4
in_chans (int) – Number of input image channels. Default: 3
embed_dim (int) – Patch embedding dimension. Default: 96
depths (Sequence[int]) – Depth of each Swin Transformer layer. Default: [2, 2, 6, 2]
num_heads (Sequence[int]) – Number of attention heads in different layers. Default: [3, 6, 12, 24]
window_size (int) – Window size. Default: 7
ffn_ratio (float) – Ratio of ffn hidden dim to embedding dim. Default: 4.0
qkv_bias (bool) – If True, add a learnable bias to query, key, value. Default: True
qk_scale (Optional[float]) – Override default qk scale of head_dim ** -0.5 if set. Default: None
ape (bool) – If True, add absolute position embedding to the patch embedding. Default: False
patch_norm (bool) – If True, add normalization after patch embedding. Default: True
drop_rate (float) – Dropout rate. Default: 0.0
attn_drop_rate (float) – Attention dropout rate. Default: 0.0
drop_path_rate (float) – Stochastic depth rate. Default: 0.1
embed_layer (Module) – Patch embedding layer. Default: PatchEmbed
norm_name (str) – Normalization layer. Default: "LN"
act_name (str) – Activation layer. Default: "gelu"
num_classes (int) – Number of classes for classification head. Default: 1000
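As a sanity check on how img_size, patch_size, embed_dim, depths and num_heads fit together, the sketch below tabulates the per-stage token resolution and channel width. It assumes the layout described in the Swin Transformer paper, where patch merging between stages halves the resolution and doubles the channels. `swin_stage_shapes` is a hypothetical helper, not part of basecls.

```python
from typing import Dict, List, Sequence


def swin_stage_shapes(
    img_size: int = 224,
    patch_size: int = 4,
    embed_dim: int = 96,
    depths: Sequence[int] = (2, 2, 6, 2),
    num_heads: Sequence[int] = (3, 6, 12, 24),
) -> List[Dict[str, int]]:
    """Tabulate resolution, width and per-head dimension for each stage."""
    base_resolution = img_size // patch_size  # tokens per side after patch embedding
    stages = []
    for i, (depth, heads) in enumerate(zip(depths, num_heads)):
        dim = embed_dim * 2 ** i
        stages.append(
            {
                "stage": i,
                "depth": depth,
                "resolution": base_resolution // 2 ** i,
                "dim": dim,
                "head_dim": dim // heads,
            }
        )
    return stages
```

With the default arguments every stage resolution is a multiple of window_size=7, so the feature map tiles evenly into attention windows.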
- class basecls.models.VGG(depths, widths, norm_name=None, act_name='relu', head=None)
Bases: Module
VGG model.
- Parameters
depths (Sequence[int]) – depth for each stage (number of blocks in the stage).
widths (Sequence[int]) – width for each stage (width of each block in the stage).
norm_name (Optional[str]) – normalization function. Default: None
act_name (str) – activation function. Default: "relu"
head (Optional[Mapping[str, Any]]) – head args. Default: None
- class basecls.models.ViT(img_size=224, patch_size=16, in_chans=3, embed_dim=768, depth=12, num_heads=12, ffn_ratio=4.0, qkv_bias=True, qk_scale=None, representation_size=None, distilled=False, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.0, embed_layer=PatchEmbed, norm_name='LN', act_name='gelu', num_classes=1000, **kwargs)
Bases: Module
ViT model.
- Parameters
img_size (int) – Input image size. Default: 224
patch_size (int) – Patch token size. Default: 16
in_chans (int) – Number of input image channels. Default: 3
embed_dim (int) – Number of linear projection output channels. Default: 768
depth (int) – Depth of Transformer Encoder layer. Default: 12
num_heads (int) – Number of attention heads. Default: 12
ffn_ratio (float) – Ratio of ffn hidden dim to embedding dim. Default: 4.0
qkv_bias (bool) – If True, add a learnable bias to query, key, value. Default: True
qk_scale (Optional[float]) – Override default qk scale of head_dim ** -0.5 if set. Default: None
representation_size (Optional[int]) – Size of representation layer (pre-logits). Default: None
distilled (bool) – Includes a distillation token and head. Default: False
drop_rate (float) – Dropout rate. Default: 0.0
attn_drop_rate (float) – Attention dropout rate. Default: 0.0
drop_path_rate (float) – Stochastic depth rate. Default: 0.0
embed_layer (Module) – Patch embedding layer. Default: PatchEmbed
norm_name (str) – Normalization layer. Default: "LN"
act_name (str) – Activation function. Default: "gelu"
num_classes (int) – Number of classes. Default: 1000
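Two quantities follow directly from these parameters: the token sequence length and the default attention scale. The sketch below computes both. It assumes one class token is prepended and that `distilled=True` adds exactly one more token, as in the usual ViT and DeiT formulations. Both helper names are hypothetical.

```python
def vit_sequence_length(
    img_size: int = 224, patch_size: int = 16, distilled: bool = False
) -> int:
    """Number of tokens seen by the encoder (patches plus special tokens)."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    num_patches = (img_size // patch_size) ** 2
    # Assumption: one class token, plus one distillation token when distilled.
    extra_tokens = 2 if distilled else 1
    return num_patches + extra_tokens


def default_qk_scale(embed_dim: int = 768, num_heads: int = 12) -> float:
    """The scale used when qk_scale is None: head_dim ** -0.5."""
    head_dim = embed_dim // num_heads
    return head_dim ** -0.5
```

With the default arguments the per-head dimension is 64, so the default scale is 1/8.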
- load_state_dict(state_dict, strict=True)
Loads a given dictionary created by state_dict() into this module. If strict is True, the keys of the given dictionary must exactly match the keys returned by state_dict().
Users can also pass a closure, Function[key: str, var: Tensor] -> Optional[np.ndarray], as a state_dict in order to handle complex situations. For example, to load everything except for the final linear classifier:
    state_dict = {...}  # Dict[str, np.ndarray]
    model.load_state_dict({
        k: None if k.startswith('fc') else v
        for k, v in state_dict.items()
    }, strict=False)
Here returning None means skipping parameter k.
To prevent shape mismatch (e.g. when loading PyTorch weights), we can reshape before loading:
    state_dict = {...}
    def reshape_accordingly(k, v):
        return state_dict[k].reshape(v.shape)
    model.load_state_dict(reshape_accordingly)
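The closure form described above can be modelled without any framework. The sketch below is a plain-Python stand-in for the documented behaviour, not the method's implementation: `load_into` is a hypothetical helper, lists stand in for tensors and arrays, and the `strict` key check is left out.

```python
from typing import Callable, Dict, List, Mapping, Optional, Union

Value = List[float]
Loader = Callable[[str, Value], Optional[Value]]


def load_into(
    params: Dict[str, Value],
    source: Union[Mapping[str, Optional[Value]], Loader],
) -> List[str]:
    """Update params in place and return the keys that were skipped."""
    skipped = []
    for key, current in params.items():
        # A closure receives the key and the current value; a mapping is looked up.
        new = source(key, current) if callable(source) else source.get(key)
        if new is None:
            skipped.append(key)      # None means: leave this parameter untouched
        else:
            params[key] = list(new)
    return skipped
```

A closure such as `lambda k, v: None if k.startswith("fc") else [0.0] * len(v)` overwrites everything except the classifier, mirroring the first example in the text.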
We can also perform in-place re-initialization or pruning:
    def reinit_and_pruning(k, v):
        if 'bias' in k:
            M.init.zero_(v)
        if 'conv' in k: