Zhangzhe's Blog

The projection of my life.

Normalizations

Comparing the four Norms

[Figure: norms.png]

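  • BN: normalizes over every dimension except C (that is, over N, H, W), giving one mean and variance per channel
  • LN: normalizes over every dimension except N (that is, over C, H, W), giving one mean and variance per sample
  • IN: normalizes over every dimension except N and C (that is, over H, W), giving one mean and variance per (sample, channel) pair
  • GN: normalizes over H, W and the channels inside each group, giving one mean and variance per (sample, group) pair. With g the number of groups (num_groups in PyTorch) and c the number of channels: $GN = \begin{cases} LN, & g = 1 \\ IN, & g = c \end{cases}$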
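The reduction axes listed above can be checked by hand. The following is a minimal sketch, assuming a 4-D input of shape (N, C, H, W) and no affine parameters; the helper names normalize and group_normalize are hypothetical and introduced only for illustration. It compares the manual results against the torch.nn.functional implementations and checks the two special cases of GN.

```python
import torch
import torch.nn.functional as F

EPS = 1e-5  # default eps of the PyTorch normalization layers


def normalize(x, dims):
    """Standardize x with the mean and biased variance taken over dims.

    Hypothetical helper for illustration; not part of PyTorch.
    """
    mean = x.mean(dim=dims, keepdim=True)
    var = x.var(dim=dims, unbiased=False, keepdim=True)
    return (x - mean) / torch.sqrt(var + EPS)


def group_normalize(x, num_groups):
    """Split C into num_groups groups, then standardize each (sample, group)."""
    n, c, h, w = x.shape
    grouped = x.reshape(n, num_groups, c // num_groups, h, w)
    return normalize(grouped, dims=(2, 3, 4)).reshape(n, c, h, w)


torch.manual_seed(0)
x = torch.randn(3, 4, 5, 6)  # (N, C, H, W)

bn = normalize(x, dims=(0, 2, 3))      # one statistic per channel
ln = normalize(x, dims=(1, 2, 3))      # one statistic per sample
inorm = normalize(x, dims=(2, 3))      # one statistic per (sample, channel)
gn = group_normalize(x, num_groups=2)  # one statistic per (sample, group)

# Compare with the library implementations (no affine parameters).
bn_matches = torch.allclose(bn, F.batch_norm(x, None, None, training=True), atol=1e-5)
ln_matches = torch.allclose(ln, F.layer_norm(x, x.shape[1:]), atol=1e-5)
in_matches = torch.allclose(inorm, F.instance_norm(x), atol=1e-5)
gn_matches = torch.allclose(gn, F.group_norm(x, 2), atol=1e-5)

# Special cases of GN: one group behaves like LN, one channel per group like IN.
gn_equals_ln = torch.allclose(group_normalize(x, num_groups=1), ln, atol=1e-5)
gn_equals_in = torch.allclose(group_normalize(x, num_groups=4), inorm, atol=1e-5)

print(bn_matches, ln_matches, in_matches, gn_matches, gn_equals_ln, gn_equals_in)
```

Each check is expected to hold up to floating-point tolerance. Note that the equivalence with LN and IN concerns only the normalization statistics; the affine parameters of the corresponding modules still differ in shape.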

Affine parameter shapes of the four Norms

import torch

inp = torch.randn(3, 4, 5, 6)

# batch norm
batchnorm = torch.nn.BatchNorm2d(num_features=4)
batchnorm.weight.shape # torch.Size([4])
out = batchnorm(inp)
out.shape # torch.Size([3, 4, 5, 6])
out.permute(1,0,2,3).reshape(4, -1).mean(1).data # tensor([-5.9605e-09, 1.5895e-08, -5.2982e-09, -7.9473e-09]), approximately 0
out.permute(1,0,2,3).reshape(4, -1).std(1).data # tensor([1.0056, 1.0056, 1.0056, 1.0056]), approximately 1

# layer norm
layernorm = torch.nn.LayerNorm(normalized_shape=(4, 5, 6))
layernorm.weight.shape # torch.Size([4, 5, 6])
out = layernorm(inp)
out.shape # torch.Size([3, 4, 5, 6])
out.reshape(3, -1).mean(1).data # tensor([ 3.9736e-09, -5.9605e-09, 2.7816e-08]), approximately 0
out.reshape(3, -1).std(1).data # tensor([1.0042, 1.0042, 1.0042]), approximately 1

# instance norm
instancenorm = torch.nn.InstanceNorm2d(num_features=4) # affine=False by default, so instancenorm.weight is None
out = instancenorm(inp)
out.shape # torch.Size([3, 4, 5, 6])
out.reshape(12, -1).mean(1).data # tensor([ 1.7881e-08, 1.7881e-08, -2.3842e-08, -1.9868e-08, 0.0000e+00, -3.9736e-09, 0.0000e+00, 0.0000e+00, 0.0000e+00, 7.9473e-09, 1.9868e-08, -2.1855e-08]), approximately 0
out.reshape(12, -1).std(1).data # tensor([1.0171, 1.0171, 1.0171, 1.0171, 1.0171, 1.0171, 1.0171, 1.0171, 1.0171, 1.0171, 1.0171, 1.0171]), approximately 1

# group norm
groupnorm = torch.nn.GroupNorm(num_groups=2, num_channels=4)
groupnorm.weight.shape # torch.Size([4]), one scale per channel
out = groupnorm(inp)
out.shape # torch.Size([3, 4, 5, 6])
out.reshape(6, -1).mean(1).data # tensor([ 1.9868e-09, 3.1789e-08, -1.9868e-08, 0.0000e+00, 1.5895e-08, 5.9605e-08]), approximately 0
out.reshape(6, -1).std(1).data # tensor([1.0084, 1.0084, 1.0084, 1.0084, 1.0084, 1.0084]), approximately 1
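
The standard deviations printed above sit slightly above 1 and differ from one Norm to the next. A likely explanation is that the layers divide by the biased standard deviation (dividing by n), while Tensor.std() defaults to the unbiased estimate (dividing by n - 1), so the printed value is roughly sqrt(n / (n - 1)), where n is the number of elements sharing one statistic. The sketch below, which uses only the standard library and ignores the small effect of eps, reproduces the four values for the (3, 4, 5, 6) input; the helper name unbiased_std_factor is hypothetical.

```python
import math

N, C, H, W = 3, 4, 5, 6  # shape of inp in the example above
NUM_GROUPS = 2           # num_groups used for GroupNorm above

# Number of elements that share one mean/variance for each Norm.
elements_per_statistic = {
    "BatchNorm2d": N * H * W,                # per channel
    "LayerNorm": C * H * W,                  # per sample
    "InstanceNorm2d": H * W,                 # per (sample, channel)
    "GroupNorm": (C // NUM_GROUPS) * H * W,  # per (sample, group)
}


def unbiased_std_factor(n):
    """Ratio of the unbiased to the biased standard deviation for n elements."""
    return math.sqrt(n / (n - 1))


for name, n in elements_per_statistic.items():
    print(f"{name}: n={n}, expected std ~ {unbiased_std_factor(n):.4f}")
```

The factors come out as 1.0056, 1.0042, 1.0171 and 1.0084, matching the values printed for BN, LN, IN and GN respectively.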