How to make PyTorch code reproducible by setting random seeds


Introduction

Sample code

Setting the PyTorch random seed

Setting the PyTorch seed and cuDNN
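Introduction
In deep learning research, reproducing published results is a major problem. Whether the code comes from a paper or you wrote it yourself, reproducibility is hard to guarantee: even with the same network architecture, the same dataset, and the same machine, two training runs can give different results. The cause is randomness in the training process, which comes from several sources: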

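Random initialization of the network parameters

Regularization methods, for example dropout, which randomly drops nodes of the network during training

The optimization process: training with methods such as SGD, RMSProp, or Adam also involves randomness, for example through random initialization and shuffled mini-batches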

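Tip: PyTorch reproducibility also depends on the PyTorch version and on the operating system platform, so identical results are only to be expected on the same setup.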
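The first two sources of randomness listed above can be observed directly. The following is a minimal sketch, not part of the original sample code: the helper name `init_and_dropout` and the layer sizes are chosen only for illustration. It builds a small linear layer, applies dropout, and shows that both the initial weights and the dropout mask repeat once the seed is fixed.

```python
import torch

def init_and_dropout(seed=None):
    """Create a layer and apply dropout, optionally seeding PyTorch first.

    Hypothetical helper for illustration only.
    """
    if seed is not None:
        torch.manual_seed(seed)
    layer = torch.nn.Linear(4, 3)       # initial weights come from PyTorch's global RNG
    dropout = torch.nn.Dropout(p=0.5)   # new modules start in training mode, so a random mask is drawn
    out = dropout(layer(torch.ones(1, 4)))
    return layer.weight.detach().clone(), out.detach().clone()

# Same seed: same initial weights and same dropout mask.
w1, o1 = init_and_dropout(seed=3)
w2, o2 = init_and_dropout(seed=3)
print(torch.equal(w1, w2), torch.equal(o1, o2))

# No seed: the RNG simply continues, so the weights differ from the seeded call.
w3, _ = init_and_dropout()
print(torch.equal(w1, w3))
```

Running it on the CPU is enough to see the effect; no GPU is assumed here.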
Sample code
The following example shows how to set the PyTorch random seed.

# Train a model to fit a line y=mx using given data points

import torch

## Uncomment the two lines below to make the training reproducible.
#seed = 3
#torch.manual_seed(seed)

# set device to CUDA if available, else to CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Device:', device)

# N - number of data points
# n_inputs - number of input variables
# n_hidden - number of units in the hidden layer
# n_outputs - number of outputs
N, n_inputs, n_hidden, n_outputs = 8, 1, 100, 1

# Input 8 pairs of (x, y) values
x = torch.tensor([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0]], device=device)
y = torch.tensor([[0.0], [10.0], [20.0], [30.0], [40.0], [50.0], [60.0], [70.0]], device=device)

# Make a 3 layer neural network with an input layer, hidden layer and output layer
model = torch.nn.Sequential(
    torch.nn.Linear(n_inputs, n_hidden),
    torch.nn.ReLU(),
    torch.nn.Linear(n_hidden, n_outputs)
)
# Move the model to the device
model.to(device)

# Define the loss function to be the mean squared error loss
loss_fn = torch.nn.MSELoss(reduction='sum')

# Do forward pass through the data points, compute loss, compute gradients using backward propagation and update the weights using the gradients.
learning_rate = 1e-4
for t in range(1000):
    y_out = model(x)
    loss = loss_fn(y_out, y)
    if t % 100 == 99:
        print(t, loss.item())
    #  print(y_out)

    # Gradients are made to zero prior to backward pass.
    model.zero_grad()
    loss.backward()

    # Update weights using gradient descent
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad

Run the code above twice. Output of the first run:
Device: cuda
99 13.865872383117676
199 5.772928714752197
299 3.566026210784912
399 2.5292069911956787
499 1.8655864000320435
599 1.3915504217147827
699 1.0447190999984741
799 0.7871285676956177
899 0.5957959890365601
999 0.45342087745666504
Output of the second run:
Device: cuda
99 6.1840715408325195
199 3.0933115482330322
299 1.9355353116989136
399 1.3561317920684814
499 0.998731791973114
599 0.7554249167442322
699 0.5831341743469238
799 0.45905551314353943
899 0.3688798248767853
999 0.30284053087234497
Setting the PyTorch random seed
Uncomment lines 6 and 7 of the code:

seed = 3
torch.manual_seed(seed)

Run the code twice again. Output of the first run:
Device: cuda
99 10.655608177185059
199 3.6195263862609863
299 1.653144359588623
399 0.9989959001541138
499 0.712784469127655
599 0.5509689450263977
699 0.44407185912132263
799 0.368024617433548
899 0.3116675019264221
999 0.2681158781051636
Output of the second run:
Device: cuda
99 10.655608177185059
199 3.6195263862609863
299 1.653144359588623
399 0.9989959001541138
499 0.712784469127655
599 0.5509689450263977
699 0.44407185912132263
799 0.368024617433548
899 0.3116675019264221
999 0.2681158781051636
The two runs now produce identical results.
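Comparing two console outputs by eye gets tedious, so the check can be automated. The sketch below is an assumption-laden rewrite of the sample: it wraps the same network and update rule in a hypothetical function `train(seed, steps)`, runs on the CPU only, and compares the recorded losses of two runs.

```python
import torch

def train(seed, steps=200):
    """Train the small y = 10x model and return the loss at every step.

    Hypothetical wrapper around the sample training loop; CPU only.
    """
    torch.manual_seed(seed)  # fixes the random initialization of the weights
    x = torch.arange(8, dtype=torch.float32).unsqueeze(1)  # [[0.], [1.], ..., [7.]]
    y = 10.0 * x
    model = torch.nn.Sequential(
        torch.nn.Linear(1, 100),
        torch.nn.ReLU(),
        torch.nn.Linear(100, 1),
    )
    loss_fn = torch.nn.MSELoss(reduction='sum')
    learning_rate = 1e-4
    losses = []
    for _ in range(steps):
        loss = loss_fn(model(x), y)
        losses.append(loss.item())
        model.zero_grad()
        loss.backward()
        # Plain gradient descent, as in the sample code
        with torch.no_grad():
            for param in model.parameters():
                param -= learning_rate * param.grad
    return losses

losses_a = train(seed=3)
losses_b = train(seed=3)
print('identical runs:', losses_a == losses_b)
```

With the same seed the two loss lists should match element for element; calling `train` with two different seeds should give different lists.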
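Setting the PyTorch seed and cuDNN
The simple example above only sets the PyTorch random seed, but this is not enough once convolution operations are involved. Convolutions on the GPU are accelerated by cuDNN, which may choose different algorithms from run to run. In practice it is enough to add two more lines: deterministic = True makes cuDNN use deterministic algorithms, and benchmark = False turns off the autotuner that selects algorithms at run time.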

seed = 3
torch.manual_seed(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

Output of the first run after adding the code above to the PyTorch image classification code:
Device: cuda
[1, 2000] loss: 2.192
[1, 4000] loss: 1.824
[1, 6000] loss: 1.613
[1, 8000] loss: 1.532
[1, 10000] loss: 1.470
[1, 12000] loss: 1.429
[2, 2000] loss: 1.378
[2, 4000] loss: 1.317
[2, 6000] loss: 1.291
[2, 8000] loss: 1.298
[2, 10000] loss: 1.264
[2, 12000] loss: 1.255
Finished Training
Output of the second run:
Device: cuda
[1, 2000] loss: 2.192
[1, 4000] loss: 1.824
[1, 6000] loss: 1.613
[1, 8000] loss: 1.532
[1, 10000] loss: 1.470
[1, 12000] loss: 1.429
[2, 2000] loss: 1.378
[2, 4000] loss: 1.317
[2, 6000] loss: 1.291
[2, 8000] loss: 1.298
[2, 10000] loss: 1.264
[2, 12000] loss: 1.255
Finished Training
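The image classification code itself is not shown here, so the following is only a small stand-in: a single convolution layer, with a hypothetical helper `conv_forward` and arbitrary tensor sizes. It applies the seed and the two cuDNN flags, runs the same forward pass twice, and checks that the outputs match. On a machine without a GPU the flags have no effect and the check runs on the CPU.

```python
import torch

def conv_forward(seed):
    """Seed PyTorch, set the cuDNN flags, and run one convolution.

    Hypothetical helper; layer and input sizes are arbitrary.
    """
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True  # ask cuDNN for deterministic algorithms
    torch.backends.cudnn.benchmark = False     # disable the run-time algorithm autotuner

    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    conv = torch.nn.Conv2d(3, 8, kernel_size=3).to(device)  # randomly initialized weights
    x = torch.randn(2, 3, 16, 16, device=device)            # random input batch
    return conv(x).detach().cpu()

out_a = conv_forward(seed=3)
out_b = conv_forward(seed=3)
print('identical outputs:', torch.equal(out_a, out_b))
```

Newer PyTorch releases also provide torch.use_deterministic_algorithms(True), which raises an error when an operation has no deterministic implementation. It is not used in this sketch.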
If the code also uses NumPy or Python's random module, their seeds must be set as well:

import random
import numpy as np
import torch

seed = 3
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
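These lines are easy to forget in a new script, so one option is to collect them in a single function. The sketch below assumes a helper named `set_seed` (the name is not from any library) and checks that all three random number generators repeat their first draw after re-seeding.

```python
import random

import numpy as np
import torch

def set_seed(seed):
    """Seed Python, NumPy, and PyTorch, and set the cuDNN flags.

    Hypothetical convenience wrapper around the lines shown above.
    """
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

def draw():
    """Take one sample from each of the three generators."""
    return random.random(), float(np.random.rand()), torch.rand(1).item()

set_seed(3)
first = draw()
set_seed(3)
second = draw()
print('identical draws:', first == second)
```

Call `set_seed` once at the top of the training script, before the model and the data loaders are created, so that everything built afterwards uses the seeded generators.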
