When using torch.max and F.softmax, I am always a little unsure which dimension to pass, so here is a summary.
First, an example with a 2-D tensor:
```python
import torch
import torch.nn.functional as F

input = torch.randn(3, 4)
print(input)
# tensor([[-0.5526, -0.0194,  2.1469, -0.2567],
#         [-0.3337, -0.9229,  0.0376, -0.0801],
#         [ 1.4721,  0.1181, -2.6214,  1.7721]])

b = F.softmax(input, dim=0)  # softmax down the columns; each column sums to 1
print(b)
# tensor([[0.1018, 0.3918, 0.8851, 0.1021],
#         [0.1268, 0.1587, 0.1074, 0.1218],
#         [0.7714, 0.4495, 0.0075, 0.7762]])

c = F.softmax(input, dim=1)  # softmax along the rows; each row sums to 1
print(c)
# tensor([[0.0529, 0.0901, 0.7860, 0.0710],
#         [0.2329, 0.1292, 0.3377, 0.3002],
#         [0.3810, 0.0984, 0.0064, 0.5143]])

d = torch.max(input, dim=0)  # max down the columns
print(d)
# torch.return_types.max(
#     values=tensor([1.4721, 0.1181, 2.1469, 1.7721]),
#     indices=tensor([2, 2, 0, 2]))

e = torch.max(input, dim=1)  # max along the rows
print(e)
# torch.return_types.max(
#     values=tensor([2.1469, 0.0376, 1.7721]),
#     indices=tensor([2, 2, 3]))
```
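A useful rule of thumb: `dim` names the dimension that gets reduced away (for `torch.max`) or normalized (for `F.softmax`). The sanity check below is my addition, not part of the original post; it reuses the `input` tensor defined above.

```python
# torch.max returns a (values, indices) namedtuple, so it can be unpacked;
# indices agrees with argmax over the same dimension.
values, indices = torch.max(input, dim=1)
assert torch.equal(indices, input.argmax(dim=1))

# Summing softmax output over the same dim it was computed on gives ones.
print(F.softmax(input, dim=0).sum(dim=0))  # tensor([1., 1., 1., 1.])
print(F.softmax(input, dim=1).sum(dim=1))  # tensor([1., 1., 1.])
```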
Now an example with a 3-D tensor.
F.softmax turns the given tensor into a probability distribution along the chosen dimension. For instance, with `a = torch.rand(3, 16, 20)`, `b = F.softmax(a, dim=0)` is normalized along dim 0, so b[0][5][6] + b[1][5][6] + b[2][5][6] = 1.
```python
a = torch.rand(3, 16, 20)
b = F.softmax(a, dim=0)
c = F.softmax(a, dim=1)
d = F.softmax(a, dim=2)
```

The same thing on a smaller (3, 4, 5) tensor, in an IPython session:

```
In [1]: import torch as t

In [2]: import torch.nn.functional as F

In [4]: a = t.Tensor(3, 4, 5)

In [5]: b = F.softmax(a, dim=0)

In [6]: c = F.softmax(a, dim=1)

In [7]: d = F.softmax(a, dim=2)

In [8]: a
Out[8]:
tensor([[[-0.1581,  0.0000,  0.0000,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000],
         [-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000]],

        [[-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000],
         [-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000]],

        [[-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000],
         [-0.0344,  0.0000, -0.0344,  0.0000, -0.0344],
         [ 0.0000, -0.0344,  0.0000, -0.0344,  0.0000]]])

In [9]: b
Out[9]:
tensor([[[0.3064, 0.3333, 0.3410, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333]],

        [[0.3468, 0.3333, 0.3295, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333]],

        [[0.3468, 0.3333, 0.3295, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333],
         [0.3333, 0.3333, 0.3333, 0.3333, 0.3333]]])

In [10]: b.sum()
Out[10]: tensor(20.0000)

In [11]: b[0][0][0] + b[1][0][0] + b[2][0][0]
Out[11]: tensor(1.0000)

In [12]: c.sum()
Out[12]: tensor(15.)

In [13]: c
Out[13]:
tensor([[[0.2235, 0.2543, 0.2521, 0.2543, 0.2457],
         [0.2618, 0.2457, 0.2521, 0.2457, 0.2543],
         [0.2529, 0.2543, 0.2436, 0.2543, 0.2457],
         [0.2618, 0.2457, 0.2521, 0.2457, 0.2543]],

        [[0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543],
         [0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543]],

        [[0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543],
         [0.2457, 0.2543, 0.2457, 0.2543, 0.2457],
         [0.2543, 0.2457, 0.2543, 0.2457, 0.2543]]])

In [14]: n = t.rand(3, 4)

In [15]: n
Out[15]:
tensor([[0.2769, 0.3475, 0.8914, 0.6845],
        [0.9251, 0.3976, 0.8690, 0.4510],
        [0.8249, 0.1157, 0.3075, 0.3799]])

In [16]: m = t.argmax(n, dim=0)

In [17]: m
Out[17]: tensor([1, 1, 0, 0])

In [18]: p = t.argmax(n, dim=1)

In [19]: p
Out[19]: tensor([2, 0, 0])

In [20]: d.sum()
Out[20]: tensor(12.0000)

In [22]: d
Out[22]:
tensor([[[0.1771, 0.2075, 0.2075, 0.2075, 0.2005],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027],
         [0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027]],

        [[0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027],
         [0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027]],

        [[0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027],
         [0.1972, 0.2041, 0.1972, 0.2041, 0.1972],
         [0.2027, 0.1959, 0.2027, 0.1959, 0.2027]]])

In [23]: d[0][0].sum()
Out[23]: tensor(1.)
```
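The totals above follow directly from which dimension was normalized: softmax over `dim=k` makes every fiber along dimension k sum to 1, so the grand total equals the number of such fibers, i.e. the product of the other dimensions. For the (3, 4, 5) tensor that gives b.sum() = 4×5 = 20, c.sum() = 3×5 = 15, and d.sum() = 3×4 = 12. A generic check (my addition, not from the original post):

```python
import torch
import torch.nn.functional as F

a = torch.rand(3, 4, 5)
for k in range(a.dim()):
    total = F.softmax(a, dim=k).sum().item()
    n_fibers = a.numel() / a.shape[k]  # product of the other dimensions
    print(k, total, n_fibers)          # total matches n_fibers for every k
```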
Bonus: using torch.nn.Softmax for multi-class problems
Why bring this up? At work I was doing semantic segmentation whose prediction had 16 output feature maps, i.e. a 16-class problem. Since the value at each pixel of a channel measures how strongly that pixel belongs to the channel's class, and I wanted to render the result on a single image in different colors, I had to learn how torch.nn.Softmax works.
First a simple example. Suppose the output has shape (3, 4, 4), i.e. three 4×4 feature maps.
```python
import torch

img = torch.rand((3, 4, 4))
print(img)
```
The output is:
```
tensor([[[0.0413, 0.8728, 0.8926, 0.0693],
         [0.4072, 0.0302, 0.9248, 0.6676],
         [0.4699, 0.9197, 0.3333, 0.4809],
         [0.3877, 0.7673, 0.6132, 0.5203]],

        [[0.4940, 0.7996, 0.5513, 0.8016],
         [0.1157, 0.8323, 0.9944, 0.2127],
         [0.3055, 0.4343, 0.8123, 0.3184],
         [0.8246, 0.6731, 0.3229, 0.1730]],

        [[0.0661, 0.1905, 0.4490, 0.7484],
         [0.4013, 0.1468, 0.2145, 0.8838],
         [0.0083, 0.5029, 0.0141, 0.8998],
         [0.8673, 0.2308, 0.8808, 0.0532]]])
```
There are three feature maps; the larger the value at a given position in a map, the more likely that pixel belongs to the class the map represents.
```python
import torch.nn as nn

softmax = nn.Softmax(dim=0)
img = softmax(img)
print(img)
```
The output is:
```
tensor([[[0.2780, 0.4107, 0.4251, 0.1979],
         [0.3648, 0.2297, 0.3901, 0.3477],
         [0.4035, 0.4396, 0.2993, 0.2967],
         [0.2402, 0.4008, 0.3273, 0.4285]],

        [[0.4371, 0.3817, 0.3022, 0.4117],
         [0.2726, 0.5122, 0.4182, 0.2206],
         [0.3423, 0.2706, 0.4832, 0.2522],
         [0.3718, 0.3648, 0.2449, 0.3028]],

        [[0.2849, 0.2076, 0.2728, 0.3904],
         [0.3627, 0.2581, 0.1917, 0.4317],
         [0.2543, 0.2898, 0.2175, 0.4511],
         [0.3880, 0.2344, 0.4278, 0.2686]]])
```
The code applies softmax across the three maps at each spatial position, so for any fixed (row, col) the three channel values sum to 1; for instance, at position (0, 0): 0.2780 + 0.4371 + 0.2849 = 1.
In other words, Softmax normalizes each pixel along the chosen dimension (dim=0 here, the channel dimension), mapping the values into the range 0 to 1 while leaving the tensor's shape unchanged.
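A quick sanity check of both properties (my addition, not part of the original post; it reuses the `img` tensor from above):

```python
# Shape is preserved, and summing over the softmaxed dim gives all ones.
print(img.shape)        # torch.Size([3, 4, 4])
print(img.sum(dim=0))   # a 4x4 tensor of (approximately) ones
```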
```python
print(torch.max(img, 0))
```
The output is:
```
torch.return_types.max(
    values=tensor([[0.4371, 0.4107, 0.4251, 0.4117],
                   [0.3648, 0.5122, 0.4182, 0.4317],
                   [0.4035, 0.4396, 0.4832, 0.4511],
                   [0.3880, 0.4008, 0.4278, 0.4285]]),
    indices=tensor([[1, 0, 0, 1],
                    [0, 1, 1, 2],
                    [0, 0, 1, 2],
                    [2, 0, 2, 0]]))
```
Here the 3×4×4 tensor collapses to 4×4 (pass keepdim=True if you want 1×4×4; see the sketch below): values holds, for each pixel, the maximum across the three channels, and indices holds the index of the winning channel, i.e. the predicted class.
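Two side notes worth knowing here (my additions, not from the original post). First, torch.max drops the reduced dimension unless you ask it not to. Second, softmax is strictly increasing along the normalized dimension, so taking the per-pixel argmax before or after softmax gives exactly the same mask; the softmax step only matters when you want calibrated per-class probabilities.

```python
import torch
import torch.nn as nn

# keepdim controls whether the reduced dimension is kept as size 1.
vals, idx = torch.max(img, dim=0)                    # vals.shape == (4, 4)
vals_k, idx_k = torch.max(img, dim=0, keepdim=True)  # vals_k.shape == (1, 4, 4)

# softmax is monotonic, so the argmax mask is unchanged by it.
raw = torch.rand((3, 4, 4))
probs = nn.Softmax(dim=0)(raw)
assert torch.equal(raw.argmax(dim=0), probs.argmax(dim=0))
```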
With the flow above clear, the real case is easy to handle.
In the actual example, the network output `output` has shape 16×416×416.
```python
import numpy as np
import cv2
import torch
import torch.nn as nn

output = torch.tensor(output)   # raw network output, 16 x 416 x 416
sm = nn.Softmax(dim=0)          # normalize across the 16 class channels
output = sm(output)
mask = torch.max(output, 0).indices.numpy()  # 416 x 416 map of class indices

# One color per class index (the original if-chain, written as a table)
palette = {
    0: (255, 255, 255),  1: (255, 180, 0),    2: (255, 180, 180),
    3: (255, 180, 255),  4: (255, 255, 180),  5: (255, 255, 0),
    6: (255, 0, 180),    7: (255, 0, 255),    8: (255, 0, 0),
    9: (180, 0, 0),      10: (180, 255, 255), 11: (180, 0, 180),
    12: (180, 0, 255),   13: (180, 255, 180), 14: (0, 180, 255),
    15: (0, 0, 0),
}

# Expand the class mask into a 3-channel color image
rgb_img = np.zeros((output.shape[1], output.shape[2], 3))
for i in range(len(mask)):
    for j in range(len(mask[0])):
        rgb_img[i][j] = palette[mask[i][j]]

# Note: cv2.imwrite treats the channels as BGR, not RGB, so the saved
# colors are channel-swapped relative to the palette (as in the original).
cv2.imwrite('output.jpg', rgb_img)
```
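As a follow-up, the pixel-by-pixel double loop can be replaced by a single fancy-indexing lookup. This vectorized version is my addition, a sketch assuming the same `mask` and `palette` as above, not part of the original post:

```python
# Build a (16, 3) lookup table and index it with the whole mask at once.
lut = np.array([palette[k] for k in range(16)], dtype=np.uint8)
rgb_img = lut[mask]                                  # (416, 416, 3), RGB order
cv2.imwrite('output_fast.jpg', rgb_img[..., ::-1])   # flip to BGR for OpenCV
```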
The final saved image (shown in the original post) displays each class in its own color.
That is everything in this short note on the dim argument of torch.max and F.softmax in PyTorch; I hope it gives you a useful reference.
Original article: https://blog.csdn.net/Jasminexjf/article/details/90402990