Deeper Look at GD


📒 Gradient Descent


We work through gradient descent with a simple model that has no bias term.

📝 Dummy Data


Hours (x)   Points (y)
1           1
2           2
3           3
x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[1], [2], [3]])
  • H(x) = x is the correct model (W = 1 is the best value).
📝 Cost function

  • When W = 1 the cost is 0, and the cost grows the farther W is from 1 (see the sketch after this list).
  • The error is measured with mean squared error (MSE).
  • cost = torch.mean((hypothesis - y_train) ** 2)
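
    As a minimal check (this snippet is an addition, not part of the lecture code),
    the sketch below evaluates the MSE cost at a few candidate values of W to show
    that it bottoms out at W = 1:

    import torch

    x_train = torch.FloatTensor([[1], [2], [3]])
    y_train = torch.FloatTensor([[1], [2], [3]])

    # evaluate the cost at several candidate W values;
    # the minimum should occur at W = 1, where H(x) = x fits the data exactly
    for W in [0.0, 0.5, 1.0, 1.5, 2.0]:
        hypothesis = x_train * W
        cost = torch.mean((hypothesis - y_train) ** 2)
        print('W: {:.1f}, Cost: {:.4f}'.format(W, cost.item()))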

📝 Gradient descent

  • If the gradient is positive, decrease W; if it is negative, increase W (a quick autograd check follows the snippet below).
  • gradient = 2 * torch.mean((W * x_train - y_train) * x_train)
    lr = 0.1
    W -= lr * gradient
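
    As a sanity check (again an addition, not the original code), the sketch below
    compares this hand-derived gradient with the one autograd computes through
    cost.backward(); the two values should agree:

    import torch

    x_train = torch.FloatTensor([[1], [2], [3]])
    y_train = torch.FloatTensor([[1], [2], [3]])

    W = torch.tensor(0.5, requires_grad=True)

    # analytic gradient of the MSE cost with respect to W
    manual_grad = 2 * torch.mean((W.detach() * x_train - y_train) * x_train)

    # the same gradient via autograd
    cost = torch.mean((x_train * W - y_train) ** 2)
    cost.backward()

    print(manual_grad.item(), W.grad.item())  # both should print the same value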

📝 Full code

    import torch

    # data
    x_train = torch.FloatTensor([[1], [2], [3]])
    y_train = torch.FloatTensor([[1], [2], [3]])
    # initialize the model parameter
    W = torch.zeros(1)
    # set the learning rate
    lr = 0.1
    nb_epochs = 10
    for epoch in range(nb_epochs + 1):
        # compute H(x)
        hypothesis = x_train * W
        # compute the cost and its gradient
        cost = torch.mean((hypothesis - y_train) ** 2)
        gradient = torch.sum((W * x_train - y_train) * x_train)
        print('Epoch {:4d}/{} W: {:.3f}, Cost: {:.6f}'.format(
            epoch, nb_epochs, W.item(), cost.item()))
        # improve H(x) with the cost gradient
        W -= lr * gradient
  • W converges toward 1 and the cost steadily decreases. (Here the gradient is a sum rather than the 2 × mean used above; the constant factor is simply absorbed into the learning rate.)
📝 Full code with torch.optim

  • torch.optim makes gradient descent easier to implement.
    import torch

    # data
    x_train = torch.FloatTensor([[1], [2], [3]])
    y_train = torch.FloatTensor([[1], [2], [3]])
    # initialize the model parameter
    W = torch.zeros(1, requires_grad=True)
    # set up the optimizer
    optimizer = torch.optim.SGD([W], lr=0.15)
    nb_epochs = 10
    for epoch in range(nb_epochs + 1):
        # compute H(x)
        hypothesis = x_train * W
        # compute the cost
        cost = torch.mean((hypothesis - y_train) ** 2)
        print('Epoch {:4d}/{} W: {:.3f} Cost: {:.6f}'.format(
            epoch, nb_epochs, W.item(), cost.item()))
        # improve H(x) using the cost
        optimizer.zero_grad()  # reset gradients to zero
        cost.backward()        # compute gradients
        optimizer.step()       # gradient descent update (applied to W)
  • Here too, W converges to 1 and the cost decreases; the sketch below shows what the zero_grad / backward / step calls do by hand.
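
    As a rough illustration (a sketch added to these notes, not the lecture code),
    the snippet below performs one update manually: for plain SGD without momentum,
    optimizer.step() amounts to W -= lr * W.grad, and optimizer.zero_grad()
    corresponds to clearing W.grad:

    import torch

    x_train = torch.FloatTensor([[1], [2], [3]])
    y_train = torch.FloatTensor([[1], [2], [3]])

    W = torch.zeros(1, requires_grad=True)
    lr = 0.15

    cost = torch.mean((x_train * W - y_train) ** 2)
    cost.backward()           # fills W.grad with d(cost)/dW

    with torch.no_grad():     # update outside the autograd graph
        W -= lr * W.grad      # what optimizer.step() does for plain SGD
    W.grad.zero_()            # what optimizer.zero_grad() does

    print(W.item())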