Deep Networks


ImageNet is a name you are bound to encounter when studying computer vision.

Getting Started with Deep Networks


AlexNet


AlexNet architecture
Paper: ImageNet Classification with Deep Convolutional Neural Networks

VGG


VGG, like AlexNet, is a model introduced at the ImageNet challenge (ILSVRC); it finished as runner-up in the 2014 challenge. The winning networks up to that point had fewer than 10 CNN layers, but, as the numbers in the names VGG16 and VGG19 indicate, VGG consists of 16 and 19 layers respectively.
VGGNet
Many CNN models exist besides the AlexNet and VGG described above; see the following link.
Major CNN models

Vanishing Gradients


Just as you cannot hear the voice of someone speaking far away, the gradients a model learns from fade away as the model gets deeper. This phenomenon is called the vanishing gradient problem.
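As a rough numerical illustration (not part of the original text): backpropagation multiplies one local derivative per layer, and the sigmoid's derivative is at most 0.25, so even in the best case the product shrinks geometrically with depth:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_scale(depth, z=0.0):
    """Product of sigmoid local derivatives across `depth` layers.

    sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)), which peaks at 0.25
    when z = 0, so even this best case decays geometrically.
    """
    s = sigmoid(z)
    return (s * (1.0 - s)) ** depth

print(gradient_scale(5))    # 0.25**5 ≈ 0.00098
print(gradient_scale(50))   # ≈ 7.9e-31: the gradient has effectively vanished
```

With 50 sigmoid layers the signal reaching the early layers is on the order of 10⁻³¹, which is why deep stacks of saturating activations stop learning.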

Let's Take a Shortcut


ResNet solves the vanishing/exploding gradient problem that comes with deeper layers by using a structure called a skip connection.
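A minimal NumPy sketch of the idea (hypothetical names, not the ResNet code later in this post): a residual block computes relu(F(x) + x), so even when the learned branch F contributes nothing, the input still passes through, and gradients always have an identity path around F:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(F(x) + x), where F is a stand-in for the conv stack."""
    f = relu(x @ w1) @ w2          # the learned branch F(x)
    return relu(f + x)             # skip connection: add the input back in

x = np.array([[1.0, -2.0, 3.0, 0.5]])
zeros = np.zeros((4, 4))
# With F == 0, the block reduces to relu(x): the identity path survives,
# so stacking many such blocks cannot silence the gradient the way a
# plain stack of layers can.
print(residual_block(x, zeros, zeros))
```

This is exactly why the `conv_block`/`identity_block` functions below end with `layers.add([x, shortcut])` before the final activation.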

Exploring Deep Networks


Model API


Tensorflow


TensorFlow provides its pre-trained models through a high-level API called slim; see the TensorFlow GitHub for more details.

Keras


Keras provides pre-trained models through Keras Applications. The supported models are listed in the keras.applications docs, and the implementation code can be viewed on the keras applications GitHub.
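For instance, the VGG16 we implement by hand below can also be instantiated directly from Keras Applications. A small sketch (with `weights=None` so nothing is downloaded; passing `weights='imagenet'` would instead load the pre-trained ImageNet weights, which requires the 224×224 input shape):

```python
from tensorflow import keras

# Build VGG16 from Keras Applications. weights=None gives a randomly
# initialized network sized for CIFAR-100-style 32x32 inputs.
model = keras.applications.VGG16(weights=None,
                                 include_top=True,
                                 input_shape=(32, 32, 3),
                                 classes=100)

# VGG16 = 13 convolutional layers + 3 fully connected layers = 16 weight layers
n_weight_layers = sum(
    1 for layer in model.layers
    if isinstance(layer, (keras.layers.Conv2D, keras.layers.Dense)))
print(n_weight_layers)   # 16
```

Counting only the layers that carry weights confirms the "16" in the name, which is the same layer budget the hand-written implementation below follows.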

VGG-16


Let's implement the VGG16 we studied earlier in code.
Excluding the Max pooling layers and activation functions such as Softmax, the convolutional and fully connected layers add up to 16 layers, as the figure below shows.

VGG16 code implementation
  • Block 1 : ~ first Max pooling
  • Block 2 : ~ second Max pooling
  • Block 3 : ~ third Max pooling
  • Block 4 : ~ fourth Max pooling
  • Block 5 : ~ fifth Max pooling
  • Block 6 : ~ Fully Connected Layer + softmax
  • The blocks above are implemented in code as follows.
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers
    from tensorflow.keras.applications import imagenet_utils
    
    # Load the CIFAR-100 dataset.
    cifar100 = keras.datasets.cifar100
    
    (x_train, y_train), (x_test, y_test) = cifar100.load_data()
    x_train, x_test = x_train / 255.0, x_test / 255.0
    
    img_input = keras.Input(shape=(32, 32, 3))

    Block 1

    x = layers.Conv2D(64, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block1_conv1')(img_input)
    x = layers.Conv2D(64, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block1_conv2')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    Block 2

    x = layers.Conv2D(128, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block2_conv1')(x)
    x = layers.Conv2D(128, (3, 3),
                      activation='relu',
                      padding='same',
                      name='block2_conv2')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    Block 3

    x = layers.Conv2D(
      256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = layers.Conv2D(
      256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = layers.Conv2D(
      256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    Block 4

    x = layers.Conv2D(
      512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = layers.Conv2D(
      512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = layers.Conv2D(
      512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    Block 5

    x = layers.Conv2D(
      512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = layers.Conv2D(
      512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = layers.Conv2D(
      512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = layers.MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    Block 6

    x = layers.Flatten(name='flatten')(x)
    x = layers.Dense(4096, activation='relu', name='fc1')(x)
    x = layers.Dense(4096, activation='relu', name='fc2')(x)
                         
    classes=100
    x = layers.Dense(classes, activation='softmax', name='predictions')(x) 

    Create and train the model

    model = keras.Model(name="VGG-16", inputs=img_input, outputs=x)
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=1)

    ResNet50


    Let's implement ResNet, which adds skip connections.

    ResNet50 code implementation
    To build the ResNet50 model we define conv_block and identity_block, which let us express the complex 50-layer structure concisely.
    from tensorflow.keras import backend
    from tensorflow.keras import regularizers
    from tensorflow.keras import initializers
    from tensorflow.keras import models
    
    # Declare the L2 regularizer that is reused inside the blocks.
    def _gen_l2_regularizer(use_l2_regularizer=True, l2_weight_decay=1e-4):
      return regularizers.l2(l2_weight_decay) if use_l2_regularizer else None
    [conv_block]
    def conv_block(input_tensor,
                   kernel_size,
                   filters,
                   stage,
                   block,
                   strides=(2, 2),
                   use_l2_regularizer=True,
                   batch_norm_decay=0.9,
                   batch_norm_epsilon=1e-5):
      filters1, filters2, filters3 = filters
      
      if backend.image_data_format() == 'channels_last':
        bn_axis = 3
      else:
        bn_axis = 1
      conv_name_base = 'res' + str(stage) + block + '_branch'
      bn_name_base = 'bn' + str(stage) + block + '_branch'
    
      x = layers.Conv2D(
          filters1, (1, 1),
          use_bias=False,
          kernel_initializer='he_normal',
          kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
          name=conv_name_base + '2a')(input_tensor)
              
      x = layers.BatchNormalization(
          axis=bn_axis,
          momentum=batch_norm_decay,
          epsilon=batch_norm_epsilon,
          name=bn_name_base + '2a')(x)
    
      x = layers.Activation('relu')(x)
    
      x = layers.Conv2D(
          filters2,
          kernel_size,
          strides=strides,
          padding='same',
          use_bias=False,
          kernel_initializer='he_normal',
          kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
          name=conv_name_base + '2b')(x)
          
      x = layers.BatchNormalization(
          axis=bn_axis,
          momentum=batch_norm_decay,
          epsilon=batch_norm_epsilon,
          name=bn_name_base + '2b')(x)
          
      x = layers.Activation('relu')(x)
    
      x = layers.Conv2D(
          filters3, (1, 1),
          use_bias=False,
          kernel_initializer='he_normal',
          kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
          name=conv_name_base + '2c')(x)
          
      x = layers.BatchNormalization(
          axis=bn_axis,
          momentum=batch_norm_decay,
          epsilon=batch_norm_epsilon,
          name=bn_name_base + '2c')(x)
    
      shortcut = layers.Conv2D(
          filters3, (1, 1),
          strides=strides,
          use_bias=False,
          kernel_initializer='he_normal',
          kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
          name=conv_name_base + '1')(input_tensor)
              
      shortcut = layers.BatchNormalization(
          axis=bn_axis,
          momentum=batch_norm_decay,
          epsilon=batch_norm_epsilon,
          name=bn_name_base + '1')(shortcut)
    
      x = layers.add([x, shortcut])
      x = layers.Activation('relu')(x)
      
      return x
    [identity_block]
    def identity_block(input_tensor,
                       kernel_size,
                       filters,
                       stage,
                       block,
                       use_l2_regularizer=True,
                       batch_norm_decay=0.9,
                       batch_norm_epsilon=1e-5):
    
      filters1, filters2, filters3 = filters
      
      if backend.image_data_format() == 'channels_last':
        bn_axis = 3
      else:
        bn_axis = 1
      conv_name_base = 'res' + str(stage) + block + '_branch'
      bn_name_base = 'bn' + str(stage) + block + '_branch'
    
      x = layers.Conv2D(
          filters1, (1, 1),
          use_bias=False,
          kernel_initializer='he_normal',
          kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
          name=conv_name_base + '2a')(input_tensor)
          
      x = layers.BatchNormalization(
          axis=bn_axis,
          momentum=batch_norm_decay,
          epsilon=batch_norm_epsilon,
          name=bn_name_base + '2a')(x)
          
      x = layers.Activation('relu')(x)
    
      x = layers.Conv2D(
          filters2,
          kernel_size,
          padding='same',
          use_bias=False,
          kernel_initializer='he_normal',
          kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
          name=conv_name_base + '2b')(x)
          
      x = layers.BatchNormalization(
          axis=bn_axis,
          momentum=batch_norm_decay,
          epsilon=batch_norm_epsilon,
          name=bn_name_base + '2b')(x)
          
      x = layers.Activation('relu')(x)
    
      x = layers.Conv2D(
          filters3, (1, 1),
          use_bias=False,
          kernel_initializer='he_normal',
          kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
          name=conv_name_base + '2c')(x)
          
      x = layers.BatchNormalization(
          axis=bn_axis,
          momentum=batch_norm_decay,
          epsilon=batch_norm_epsilon,
          name=bn_name_base + '2c')(x)
    
      x = layers.add([x, input_tensor])
      x = layers.Activation('relu')(x)
      
      return x
    [resnet50 function]
    def resnet50(num_classes,
                 batch_size=None,
                 use_l2_regularizer=True,
                 rescale_inputs=False,
                 batch_norm_decay=0.9,
                 batch_norm_epsilon=1e-5):
                 
      # input_shape for CIFAR-100
      input_shape = (32, 32, 3)  
      img_input = layers.Input(shape=input_shape, batch_size=batch_size)
    
      if rescale_inputs:
        # Hub image modules expect inputs in the range [0, 1]. This rescales
        # these inputs to the range expected by the trained model.
        # NOTE: imagenet_preprocessing comes from the TensorFlow official
        # models repo; it is not needed here since rescale_inputs defaults
        # to False.
        x = layers.Lambda(
            lambda x: x * 255.0 - backend.constant(
                imagenet_preprocessing.CHANNEL_MEANS,
                shape=[1, 1, 3],
                dtype=x.dtype),
            name='rescale')(img_input)
      else:
        x = img_input
    
      if backend.image_data_format() == 'channels_first':
        x = layers.Permute((3, 1, 2))(x)
        bn_axis = 1
      else:  # channels_last
        bn_axis = 3
    
      block_config = dict(
          use_l2_regularizer=use_l2_regularizer,
          batch_norm_decay=batch_norm_decay,
          batch_norm_epsilon=batch_norm_epsilon)
      x = layers.ZeroPadding2D(padding=(3, 3), name='conv1_pad')(x)
      x = layers.Conv2D(
          64, (7, 7),
          strides=(2, 2),
          padding='valid',
          use_bias=False,
          kernel_initializer='he_normal',
          kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
          name='conv1')(x)
      x = layers.BatchNormalization(
          axis=bn_axis,
          momentum=batch_norm_decay,
          epsilon=batch_norm_epsilon,
          name='bn_conv1')(x)
      x = layers.Activation('relu')(x)
      x = layers.MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x)
    
      x = conv_block(
          x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1), **block_config)
      x = identity_block(x, 3, [64, 64, 256], stage=2, block='b', **block_config)
      x = identity_block(x, 3, [64, 64, 256], stage=2, block='c', **block_config)
    
      x = conv_block(x, 3, [128, 128, 512], stage=3, block='a', **block_config)
      x = identity_block(x, 3, [128, 128, 512], stage=3, block='b', **block_config)
      x = identity_block(x, 3, [128, 128, 512], stage=3, block='c', **block_config)
      x = identity_block(x, 3, [128, 128, 512], stage=3, block='d', **block_config)
    
      x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a', **block_config)
      x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b', **block_config)
      x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c', **block_config)
      x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d', **block_config)
      x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e', **block_config)
      x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f', **block_config)
    
      x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a', **block_config)
      x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b', **block_config)
      x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c', **block_config)
    
      x = layers.GlobalAveragePooling2D()(x)
      x = layers.Dense(
          num_classes,
          kernel_initializer=initializers.RandomNormal(stddev=0.01),
          kernel_regularizer=_gen_l2_regularizer(use_l2_regularizer),
          bias_regularizer=_gen_l2_regularizer(use_l2_regularizer),
          name='fc1000')(x)
    
      # The softmax that feeds into the model loss cannot be computed in
      # float16 due to numeric issues, so we pass dtype=float32.
      x = layers.Activation('softmax', dtype='float32')(x)
    
      # Create model.
      return models.Model(img_input, x, name='resnet50')
    Use the resnet50 function above to create the model and train it.
    model = resnet50(num_classes=100)
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(x_train, y_train, epochs=1)