[Week 7] Face generation example


HDF5


Working with the data
HDF stands for Hierarchical Data Format.
It is a data format that gives efficient access to large volumes of data.
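To make the "hierarchical" part concrete, here is a minimal sketch of how an HDF5 file is organised, assuming the h5py and NumPy packages are installed. The file name, group name and array sizes are made up for illustration and are not taken from the CelebA data.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and group names, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")

# Write: groups behave like folders, datasets like arrays stored on disk.
with h5py.File(path, "w") as hf:
    images = hf.create_group("images")
    images.create_dataset("0", data=np.zeros((4, 4, 3), dtype=np.uint8))
    images.create_dataset("1", data=np.full((4, 4, 3), 255, dtype=np.uint8))

# Read: index a dataset by its path and slice it like a NumPy array.
with h5py.File(path, "r") as hf:
    names = sorted(hf["images"].keys())
    corner = hf["images/1"][0, 0, :]  # returned as a NumPy array

print(names, corner)
```

Slicing a dataset returns only the requested part as an array, which is why the whole file does not have to be held in memory.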
  • HDF data sample code
  • import h5py
    import zipfile
    import imageio
    import os
    
    %%time
    
    # location of the HDF5 package, yours may be under /gan/ not /myo_gan/
    hdf5_file = 'mount/My Drive/Colab Notebooks/myo_gan/celeba_dataset/celeba_aligned_small.h5py'
    
    # how many of the 202,599 images to extract and package into HDF5
    total_images = 20000
    
    with h5py.File(hdf5_file, 'w') as hf:
    
        count = 0
    
        with zipfile.ZipFile('celeba/img_align_celeba.zip', 'r') as zf:
            for i in zf.namelist():
                if i.endswith('.jpg'):
                    # extract image, read it into an array, then delete the extracted file
                    ofile = zf.extract(i)
                    img = imageio.imread(ofile)
                    os.remove(ofile)

                    # add image data to HDF5 file with new name
                    hf.create_dataset('img_align_celeba/'+str(count)+'.jpg', data=img, compression="gzip", compression_opts=9)

                    count = count + 1
                    if count % 1000 == 0:
                        print("images done .. ", count)

                    # stop when total_images reached
                    if count == total_images:
                        break
    
    # open read-only; older h5py versions default to a writable mode
    with h5py.File('celeba_aligned_small.h5py', 'r') as file_object:
        
        for group in file_object:
            print(group)
    The h5py library can be used to access the dataset.

    The dataset is then accessed in a way similar to a Python dictionary.
    import numpy as np
    import matplotlib.pyplot as plt
    with h5py.File('celeba_aligned_small.h5py', 'r') as file_object:
        dataset = file_object['img_align_celeba']
        image = np.array(dataset['7.jpg'])
        plt.imshow(image, interpolation='none')
    image.shape

    The shape shows a full-colour image with 3 channels (RGB).

    data loader

    import torch
    from torch.utils.data import Dataset
    
    class CelebADataset(Dataset):
        
        def __init__(self, file):
            # open the HDF5 file read-only and keep a handle to the image group
            self.file_object = h5py.File(file, 'r')
            self.dataset = self.file_object['img_align_celeba']
            
        def __len__(self):
            return len(self.dataset)
        
        def __getitem__(self, index):
            if (index >= len(self.dataset)):
                raise IndexError()
            img = np.array(self.dataset[str(index)+'.jpg'])
            # scale pixel values from 0..255 to 0..1; the CUDA tensor type requires a GPU
            return torch.cuda.FloatTensor(img) / 255.0
        
        def plot_image(self, index):
            plt.imshow(np.array(self.dataset[str(index)+'.jpg']), interpolation='nearest')
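The dataset class above can be tried without the real CelebA file or a GPU. The following is a rough sketch under stated assumptions: `TinyFaceDataset`, the temporary file and the three random "images" are hypothetical stand-ins, and the `device` argument is an addition that is not in the original class.

```python
import os
import tempfile

import h5py
import numpy as np
import torch
from torch.utils.data import Dataset

# Build a tiny stand-in file with the same layout as the real one
# (group 'img_align_celeba', datasets '0.jpg', '1.jpg', ...). Content is random.
path = os.path.join(tempfile.mkdtemp(), "tiny.h5py")
with h5py.File(path, "w") as hf:
    for k in range(3):
        fake = np.random.randint(0, 256, size=(218, 178, 3), dtype=np.uint8)
        hf.create_dataset("img_align_celeba/" + str(k) + ".jpg", data=fake)


class TinyFaceDataset(Dataset):
    """Hypothetical device-agnostic variant of the dataset class above."""

    def __init__(self, file, device):
        self.file_object = h5py.File(file, "r")
        self.dataset = self.file_object["img_align_celeba"]
        self.device = device

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        if index >= len(self.dataset):
            raise IndexError()
        img = np.array(self.dataset[str(index) + ".jpg"])
        # scale 0..255 pixel values to 0..1 floats on the chosen device
        return torch.tensor(img, dtype=torch.float32, device=self.device) / 255.0


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tiny = TinyFaceDataset(path, device)
sample = tiny[0]
print(len(tiny), sample.shape, sample.min().item(), sample.max().item())
```

Raising `IndexError` for an out-of-range index is what lets a plain `for` loop over the dataset stop at the end.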

    Discriminator

    # discriminator class
    import pandas as pd   # used by plot_progress()
    import torch.nn as nn
    
    class View(nn.Module):
        def __init__(self, shape):
            super().__init__()
            # the trailing comma wraps shape in a tuple so forward() can unpack it
            self.shape = shape,
        
        def forward(self, x):
            return x.view(*self.shape)
    
    class Discriminator(nn.Module):
        
        def __init__(self):
            # initialise parent pytorch class
            super().__init__()
            
            # define neural network layers
            self.model = nn.Sequential(
                View(218*178*3),
                nn.Linear(3*218*178, 100),
                nn.LeakyReLU(),
                nn.LayerNorm(100),
                nn.Linear(100, 1),
                nn.Sigmoid()
            )
            
            # create loss function
            self.loss_function = nn.BCELoss()
    
            # create optimiser, using Adam
            self.optimiser = torch.optim.Adam(self.parameters(), lr=0.0001)
    
            # counter and accumulator for progress
            self.counter = 0
            self.progress = []
    
        def forward(self, inputs):
            # simply run model
            return self.model(inputs)
        
        def train(self, inputs, targets):
            # calculate the output of the network
            outputs = self.forward(inputs)
            
            # calculate loss
            loss = self.loss_function(outputs, targets)
    
            # increase counter and accumulate error every 10
            self.counter += 1
            if (self.counter % 10 == 0):
                self.progress.append(loss.item())
            if (self.counter % 10000 == 0):
                print("counter = ", self.counter)
    
            # zero gradients, perform a backward pass, update weights
            self.optimiser.zero_grad()
            loss.backward()
            self.optimiser.step()
        
        def plot_progress(self):
            df = pd.DataFrame(self.progress, columns=['loss'])
            df.plot(ylim=(0), figsize=(16,8), alpha=0.1, marker='.', grid=True, yticks=(0, 0.25, 0.5, 1.0, 5.0))
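    This dataset consists of RGB images of size 218 x 178.
    To be used as input, each image has to be flattened.
    That is, the network has to accept 218 x 178 x 3 = 116,412 input values.
    The flattening here is not done with a ready-made PyTorch module (newer PyTorch versions do provide nn.Flatten).
    Instead, the following code is used.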
    class View(nn.Module):
        def __init__(self, shape):
            super().__init__()
            self.shape = shape,
        
        def forward(self, x):
            return x.view(*self.shape)
    With the View class, the flattening can be done inside the discriminator model.
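As a quick check of the numbers, the sketch below applies the same `View` idea to a random tensor standing in for one image. It assumes PyTorch is installed and runs on the CPU; the random tensor is not real image data.

```python
import torch
import torch.nn as nn


class View(nn.Module):
    def __init__(self, shape):
        super().__init__()
        self.shape = shape,  # stored as a 1-tuple, unpacked in forward()

    def forward(self, x):
        return x.view(*self.shape)


image = torch.rand(218, 178, 3)       # stand-in for one RGB image
flat = View(218 * 178 * 3)(image)     # 3D -> 1D
restored = View((218, 178, 3))(flat)  # 1D -> 3D

print(flat.shape, restored.shape)
```

The same module works in both directions: the discriminator uses it to flatten, and a generator can use it to turn a flat output back into an image-shaped tensor.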
    The random image and random seed helpers are the same as in the MNIST example.
    def generate_random_image(size):
        random_data = torch.rand(size)
        return random_data
    
    def generate_random_seed(size):
        random_data = torch.randn(size)
        return random_data
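The two helpers differ only in the distribution they draw from: `torch.rand` samples uniformly from [0, 1), which matches the pixel range of the scaled images, while `torch.randn` samples from a standard normal distribution. A small sketch, assuming PyTorch on the CPU and an arbitrary sample size:

```python
import torch

torch.manual_seed(0)  # fixed seed, only so the sketch is repeatable

uniform = torch.rand(100000)  # what generate_random_image() draws
normal = torch.randn(100000)  # what generate_random_seed() draws

print("rand : min", uniform.min().item(), "max", uniform.max().item())
print("randn: mean", normal.mean().item(), "std", normal.std().item())
```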
    %%time
    import torch
    
    # choose the device first so that it is defined before the model is moved to it
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    
    if torch.cuda.is_available():
        torch.set_default_tensor_type(torch.cuda.FloatTensor)
    
    D = Discriminator()
    D.to(device)
    
    # celeba_dataset is a CelebADataset instance; the CUDA tensors below require a GPU
    for image_data_tensor in celeba_dataset:
        D.train(image_data_tensor, torch.cuda.FloatTensor([1.0]))
        D.train(generate_random_image((218,178,3)), torch.cuda.FloatTensor([0.0]))
    Let's see whether the discriminator can tell real images from fake ones.

    Judging from the training loss, it can recognise the real images.

    Generator

    # generator class
    
    class Generator(nn.Module):
        
        def __init__(self):
            # initialise parent pytorch class
            super().__init__()
            
            # define neural network layers
            self.model = nn.Sequential(
                nn.Linear(100, 3*10*10),
                nn.LeakyReLU(),
                nn.LayerNorm(3*10*10),
    
                nn.Linear(3*10*10, 3*218*178),
                nn.Sigmoid(),
                View((218,178,3))
            )
            
            # create optimiser, using Adam
            self.optimiser = torch.optim.Adam(self.parameters(), lr=0.0001)
    
            # counter and accumulator for progress
            self.counter = 0
            self.progress = []
        
        def forward(self, inputs):        
            # simply run model
            return self.model(inputs)
            
        def train(self, D, inputs, targets):
            # calculate the output of the network
            g_output = self.forward(inputs)
            
            # pass onto Discriminator
            d_output = D.forward(g_output)
            
            # calculate error
            loss = D.loss_function(d_output, targets)
    
            # increase counter and accumulate error every 10
            self.counter += 1
            if (self.counter % 10 == 0):
                self.progress.append(loss.item())
    
            # zero gradients, perform a backward pass, update weights
            self.optimiser.zero_grad()
            loss.backward()
            self.optimiser.step()
        
        def plot_progress(self):
            df = pd.DataFrame(self.progress, columns=['loss'])
            df.plot(ylim=(0), figsize=(16,8), alpha=0.1, marker='.', grid=True, yticks=(0, 0.25, 0.5, 1.0, 5.0))
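    In contrast to the discriminator, the input and output sizes are reversed here.
    The View class is used to reshape the flat values into a 3D tensor.
  • Checking the generator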
  • G = Generator()
    G.to(device)
    
    output = G.forward(generate_random_seed(100))
    img = output.detach().cpu().numpy()
    plt.imshow(img, interpolation='none', cmap='Blues')

    The output has the look of an arbitrary (random) image.

    Training

    D = Discriminator()
    D.to(device)
    G = Generator()
    G.to(device)
    
    epochs = 1
    
    for epoch in range(epochs):
        print("epoch = ", epoch + 1)
        
        for image_data_tensor in celeba_dataset:
            # step 1: train the discriminator on a real image
            D.train(image_data_tensor, torch.cuda.FloatTensor([1.0]))
            
            # step 2: train the discriminator on a fake image
            D.train(G.forward(generate_random_seed(100)).detach(), torch.cuda.FloatTensor([0.0]))
            
            # step 3: train the generator
            G.train(D, generate_random_seed(100), torch.cuda.FloatTensor([1.0]))
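The `.detach()` in step 2 matters: it keeps the discriminator's loss from sending gradients back into the generator. The sketch below illustrates this with two tiny stand-in networks whose layer sizes are arbitrary and unrelated to the models above. It assumes PyTorch and runs on the CPU.

```python
import torch
import torch.nn as nn

# Tiny stand-in networks (hypothetical sizes), small enough to run on CPU.
G = nn.Linear(4, 8)
D = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid())
loss_function = nn.BCELoss()

seed = torch.randn(4)

# Discriminator step on a fake sample: detach() cuts the graph at G's output.
d_loss = loss_function(D(G(seed).detach()), torch.tensor([0.0]))
d_loss.backward()
g_grad_after_d_step = G.weight.grad  # stays None: nothing flowed back into G

# Generator step: no detach(), so the gradient reaches G through D.
g_loss = loss_function(D(G(seed)), torch.tensor([1.0]))
g_loss.backward()
g_grad_after_g_step = G.weight.grad

print(g_grad_after_d_step is None, g_grad_after_g_step is not None)
```

In step 3 the gradient passes through the discriminator, but only the generator's optimiser takes a step, so the discriminator's weights are left unchanged there.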
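  • Analysis of the training loss
  • The loss converged to about 0.69, the ideal value for BCE (ln 2 ≈ 0.693, the loss when the discriminator outputs 0.5).
    That is, the generator and the discriminator are in an ideally balanced state.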

  • Discriminator


  • Generator
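The value 0.69 can be reproduced by hand. When the discriminator cannot tell real from fake, it outputs 0.5 for both, and the binary cross-entropy is then -ln(0.5) = ln 2. The helper `bce` below is a hand-written illustration of the formula for a single prediction, not the PyTorch implementation.

```python
import math


def bce(prediction, target):
    """Binary cross-entropy for one prediction in (0, 1) and a target of 0 or 1."""
    return -(target * math.log(prediction) + (1 - target) * math.log(1 - prediction))


# An undecided discriminator answers 0.5 for real (target 1) and fake (target 0) alike.
loss_real = bce(0.5, 1.0)
loss_fake = bce(0.5, 0.0)

print(loss_real, loss_fake, math.log(2))
```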

  • Checking the generated images
    f, axarr = plt.subplots(2,3, figsize=(16, 8))
    for i in range(2):
        for j in range(3):
            output = G.forward(generate_random_seed(100))
            img = output.detach().cpu().numpy()
            axarr[i, j].imshow(img, interpolation='none', cmap='Blues')
  • Result after increasing the number of epochs to 6 (takes 41 minutes)

  • The images are sharper