TensorFlow Image Data Preprocessing

Computer Vision

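Preface
The examples introduced earlier fed the raw pixel matrix of each image straight into the model. Preprocessing the images before input, however, helps keep the model from being influenced by irrelevant factors as far as possible. In most image recognition problems, preprocessing can improve the accuracy of the model.
1. Image encoding and decoding
The RGB images we usually talk about can be viewed as a three-dimensional matrix whose entries give the brightness of each color at each position in the image. When an image is stored, however, these numbers are not written out directly; what is stored is the result of compression encoding. The image therefore has to be decoded before it can be used. TensorFlow provides functions for decoding images in the jpg and png formats: tf.image.decode_jpeg() and tf.image.decode_png(). The code is as follows.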

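import matplotlib.pyplot as plt
import tensorflow as tf  # the code in this post uses the TensorFlow 1.x API

# Read the raw, still-encoded bytes. The path is a placeholder; use your own JPEG.
with open("path/to/picture.jpg", "rb") as f:
    image_raw_data = f.read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)

    # Print the decoded three-dimensional pixel matrix.
    print(img_data.eval())
    img_data.set_shape([1797, 2673, 3])  # height, width, channels of the sample image
    print(img_data.get_shape())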
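
To make the difference between the stored bytes and the pixel matrix concrete, here is a minimal standard-library sketch. It is only an analogy: zlib stands in for a real image codec (PNG uses the same DEFLATE compression internally, while JPEG is lossy and works differently), and the tiny 8x8 image is made up for illustration.

```python
import zlib

# A tiny 8x8 RGB image stored as height x width x channel, values 0-255.
height, width = 8, 8
pixels = [[[255, 128, 0] for _ in range(width)] for _ in range(height)]

# What lives in memory: one byte per channel value, 8 * 8 * 3 = 192 bytes.
raw = bytes(value for row in pixels for pixel in row for value in pixel)

# What lives on disk: a compressed ("encoded") byte string.
encoded = zlib.compress(raw)

# Decoding restores the raw bytes, which we reshape back to H x W x 3.
decoded = zlib.decompress(encoded)
restored = [
    [list(decoded[(y * width + x) * 3:(y * width + x) * 3 + 3]) for x in range(width)]
    for y in range(height)
]

print(len(raw), len(encoded), restored == pixels)
```

The encoded string is much shorter than the raw matrix, which is why a decoding step is needed before the pixels can be handed to a model.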

Displaying the image:

# Display the decoded image with matplotlib.
with tf.Session() as sess:
    plt.imshow(img_data.eval())
    plt.show()

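2. Resizing images
The images we collect usually come in different sizes, but the number of input nodes of a neural network is fixed. The images therefore have to be brought to a uniform size before their pixels are fed to the network.
(1) Resize with an interpolation algorithm so that the new image keeps as much of the information in the original image as possible. TensorFlow provides the tf.image.resize_images() function for this.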

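# Resize the image.
with tf.Session() as sess:
    # If integer data in the 0-255 range is passed straight to resize_images, the output
    # is floats in the 0-255 range, which is awkward for later processing. It is better
    # to convert the image to floats in the 0-1 range before resizing.
    image_float = tf.image.convert_image_dtype(img_data, tf.float32)

    # method selects the interpolation: 0 = bilinear, 1 = nearest neighbor, 2 = bicubic, 3 = area
    resized = tf.image.resize_images(image_float, [300, 300], method=0)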
    
    plt.imshow(resized.eval())
    plt.show()
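
To show what a resizing algorithm actually does, here is a rough pure-Python sketch of nearest-neighbor sampling, the simplest of the four methods (method=1). It is not TensorFlow's implementation, and the helper name resize_nearest is made up for this post.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2-D grid (a list of rows) by nearest-neighbor sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        # Each output pixel copies the input pixel at the proportional position.
        [image[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


small = [[1, 2],
         [3, 4]]
big = resize_nearest(small, 4, 4)
for row in big:
    print(row)
```

Upscaling repeats pixels and downscaling drops them. Bilinear and bicubic interpolation blend neighboring pixels instead, which usually gives smoother results.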

(2) Cropping or padding the image

# Crop the image if it is larger than the target size; pad it if it is smaller.
with tf.Session() as sess:
    cropped = tf.image.resize_image_with_crop_or_pad(resized, 100, 100)
    padded = tf.image.resize_image_with_crop_or_pad(resized, 1000, 1000)
    plt.imshow(cropped.eval())
    plt.show()
    plt.imshow(padded.eval())
    plt.show()
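
The idea behind crop-or-pad can be sketched on a plain 2-D grid as follows. This assumes centered cropping and zero padding; the helper crop_or_pad is hypothetical, and TensorFlow's rounding for odd size differences may not match it exactly.

```python
def crop_or_pad(image, target_h, target_w, fill=0):
    """Center-crop dimensions that are too large, center-pad ones that are too small."""
    h, w = len(image), len(image[0])

    # Crop: drop an equal margin on both sides of any oversized dimension.
    top = max((h - target_h) // 2, 0)
    left = max((w - target_w) // 2, 0)
    cropped = [row[left:left + target_w] for row in image[top:top + target_h]]

    # Pad: place what is left in the middle of a canvas filled with `fill`.
    h, w = len(cropped), len(cropped[0])
    pad_top = (target_h - h) // 2
    pad_left = (target_w - w) // 2
    canvas = [[fill] * target_w for _ in range(target_h)]
    for y, row in enumerate(cropped):
        canvas[pad_top + y][pad_left:pad_left + w] = row
    return canvas


grid = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
print(crop_or_pad(grid, 2, 2))              # keeps the central 2x2 block
print(crop_or_pad([[1, 2], [3, 4]], 4, 4))  # surrounds the image with zeros
```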

(3) Cropping by fraction

# Keep the central 50% of the image.
with tf.Session() as sess:   
    central_cropped = tf.image.central_crop(resized, 0.5)
    plt.imshow(central_cropped.eval())
    plt.show()
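
The fraction passed to central_crop applies to each side of the image, not to its area. A minimal sketch of the arithmetic, assuming equal margins are removed on both sides (the helper below is illustrative, not TensorFlow's code):

```python
def central_crop(image, fraction):
    """Keep the central `fraction` of each dimension of a 2-D grid."""
    h, w = len(image), len(image[0])
    # Margin removed from each side, rounded down to whole pixels.
    top = int((h - h * fraction) / 2)
    left = int((w - w * fraction) / 2)
    return [row[left:w - left] for row in image[top:h - top]]


grid = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
print(central_crop(grid, 0.5))
```

With a fraction of 0.5 the result has half the height and half the width, so it covers a quarter of the original area.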

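3. Flipping images
In many image recognition problems, flipping an image does not change the recognition result, so the training set can be flipped as a preprocessing step to increase the number of training samples.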

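# Flip the image.
with tf.Session() as sess:
    # Flip upside down.
    flipped1 = tf.image.flip_up_down(img_data)
    # Flip left to right.
    flipped2 = tf.image.flip_left_right(img_data)

    # Flip along the diagonal (transpose).
    transposed = tf.image.transpose_image(resized)
    #plt.imshow(transposed.eval())
    #plt.show()

    # Flip upside down with a probability of 50%.
    flipped = tf.image.random_flip_up_down(img_data)
    # Flip left to right with a probability of 50%.
    flipped = tf.image.random_flip_left_right(img_data)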
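
On a plain 2-D grid the flips reduce to reversing rows or columns. The following sketch uses made-up helper names that mirror the TensorFlow functions above; it only illustrates the idea and does not reproduce the library code.

```python
import random


def flip_up_down(image):
    """Reverse the order of the rows."""
    return image[::-1]


def flip_left_right(image):
    """Reverse the pixels inside every row."""
    return [row[::-1] for row in image]


def transpose(image):
    """Swap rows and columns (flip along the main diagonal)."""
    return [list(column) for column in zip(*image)]


def random_flip_left_right(image, rng=random):
    """Flip with probability 0.5, otherwise return the image unchanged."""
    return flip_left_right(image) if rng.random() < 0.5 else image


grid = [[1, 2, 3],
        [4, 5, 6]]
print(flip_up_down(grid))
print(flip_left_right(grid))
print(transpose(grid))
```

Because the random variant decides per call, the same training image can appear in both orientations over the course of training.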

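4. Adjusting image colors
Color preprocessing serves the same purpose as flipping: these changes usually do not affect the recognition result, so they can be used to augment the training data.
(1) Adjusting brightness and contrast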

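with tf.Session() as sess:
    # Convert the image to float before a series of adjustments; this helps preserve precision.
    image_float = tf.image.convert_image_dtype(img_data, tf.float32)

    # Decrease the brightness by 0.5.
    #adjusted = tf.image.adjust_brightness(image_float, -0.5)

    # Increase the brightness by 0.5.
    #adjusted = tf.image.adjust_brightness(image_float, 0.5)

    # Randomly adjust the brightness by a delta drawn from [-max_delta, max_delta).
    adjusted = tf.image.random_brightness(image_float, max_delta=0.5)

    # Adjust the contrast with a factor of -5.
    #adjusted = tf.image.adjust_contrast(image_float, -5)

    # Adjust the contrast with a factor of 5.
    #adjusted = tf.image.adjust_contrast(image_float, 5)

    # Randomly adjust the contrast with a factor drawn from [lower, upper] (define both first).
    #adjusted = tf.image.random_contrast(image_float, lower, upper)

    # Before the final output, clip the float values back into the 0-1 range.
    adjusted = tf.clip_by_value(adjusted, 0.0, 1.0)
    plt.imshow(adjusted.eval())
    plt.show()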
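
The arithmetic behind these adjustments is simple enough to sketch by hand. The version below works on a flat list of single-channel float pixels and assumes that brightness adds a constant while contrast scales the distance from the mean; the helper names are made up, and the real functions operate on whole tensors, per channel.

```python
import random


def adjust_brightness(pixels, delta):
    """Add the same offset to every pixel."""
    return [p + delta for p in pixels]


def adjust_contrast(pixels, factor):
    """Scale each pixel's distance from the mean by `factor`."""
    mean = sum(pixels) / len(pixels)
    return [(p - mean) * factor + mean for p in pixels]


def random_brightness(pixels, max_delta, rng=random):
    """Pick one delta at random and apply it to all pixels."""
    return adjust_brightness(pixels, rng.uniform(-max_delta, max_delta))


def clip(pixels, low=0.0, high=1.0):
    """Force every value back into [low, high]."""
    return [min(max(p, low), high) for p in pixels]


pixels = [0.25, 0.5, 0.75]
print(adjust_brightness(pixels, 0.5))        # some values now exceed 1.0
print(clip(adjust_brightness(pixels, 0.5)))  # clipped back into range
print(adjust_contrast(pixels, 2.0))          # values spread away from the mean
```

This also shows why the final clip_by_value call matters: both operations can easily push values outside the 0-1 range that the plotting code expects.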

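(2) Adjusting hue and saturation

with tf.Session() as sess:
    # Convert the image to float before a series of adjustments; this helps preserve precision.
    image_float = tf.image.convert_image_dtype(img_data, tf.float32)

    # Shift the hue by 0.1; the commented lines try other offsets.
    adjusted = tf.image.adjust_hue(image_float, 0.1)
    #adjusted = tf.image.adjust_hue(image_float, 0.3)
    #adjusted = tf.image.adjust_hue(image_float, 0.6)
    #adjusted = tf.image.adjust_hue(image_float, 0.9)

    # Randomly shift the hue by a delta drawn from [-max_delta, max_delta]. max_delta must lie in [0, 0.5].
    #adjusted = tf.image.random_hue(image_float, max_delta)

    # Adjust the saturation with a factor of -5.
    #adjusted = tf.image.adjust_saturation(image_float, -5)
    # Adjust the saturation with a factor of 5.
    #adjusted = tf.image.adjust_saturation(image_float, 5)
    # Randomly adjust the saturation with a factor drawn from [lower, upper] (define both first).
    #adjusted = tf.image.random_saturation(image_float, lower, upper)

    # Standardize the three-dimensional matrix of the image to zero mean and unit variance.
    # (This function was called per_image_whitening before TensorFlow 1.0.)
    #adjusted = tf.image.per_image_standardization(image_float)

    # Before the final output, clip the float values back into the 0-1 range.
    adjusted = tf.clip_by_value(adjusted, 0.0, 1.0)
    plt.imshow(adjusted.eval())
    plt.show()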
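
Hue and saturation are easiest to understand in HSV space. Here is a small per-pixel sketch using Python's standard colorsys module. It assumes that a hue adjustment rotates the H channel and a saturation adjustment multiplies the S channel; the helpers are illustrative stand-ins, not the TensorFlow implementation.

```python
import colorsys


def adjust_hue(rgb, delta):
    """Rotate the hue of one RGB pixel (floats in 0-1) by `delta` of a full turn."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb((h + delta) % 1.0, s, v)


def adjust_saturation(rgb, factor):
    """Multiply the saturation of one RGB pixel by `factor`, clamped to 0-1."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb(h, min(max(s * factor, 0.0), 1.0), v)


red = (1.0, 0.0, 0.0)
print(adjust_hue(red, 0.5))         # half a turn around the color wheel: cyan
print(adjust_saturation(red, 0.0))  # no saturation left: white at full value
```

Applying the same function to every pixel gives the whole-image effect; the random variants simply draw the delta or factor from a range first.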

5. Working with bounding boxes
In many image recognition problems, the objects of interest in an image are enclosed in bounding boxes.

with tf.Session() as sess:
    # Each box is [y_min, x_min, y_max, x_max], normalized to the 0-1 range.
    boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])

    # The image is converted to float here before sampling and slicing.
    image_float = tf.image.convert_image_dtype(img_data, tf.float32)

    begin, size, bbox_for_draw = tf.image.sample_distorted_bounding_box(
        tf.shape(image_float), bounding_boxes=boxes, min_object_covered=0.4)

    # The randomly cropped image.
    distorted_image = tf.slice(image_float, begin, size)
    plt.imshow(distorted_image.eval())
    plt.show()

    # Draw the cropped region on the original image. Because the original resolution is high
    # (2673x1797), the box is hard to see in Jupyter Notebook, so the image is shrunk first.
    image_small = tf.image.resize_images(image_float, [180, 267], method=0)
    batched_img = tf.expand_dims(image_small, 0)
    image_with_box = tf.image.draw_bounding_boxes(batched_img, bbox_for_draw)
    print(bbox_for_draw.eval())
    plt.imshow(image_with_box[0].eval())
    plt.show()
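
Since the boxes are given in normalized [y_min, x_min, y_max, x_max] form, it can help to see what they mean in pixels. The sketch below converts the two boxes from the example to pixel coordinates on the shrunken 180x267 image; to_pixel_box is a hypothetical helper, and rounding to the nearest pixel is our own choice.

```python
def to_pixel_box(box, height, width):
    """Convert a normalized [y_min, x_min, y_max, x_max] box to pixel coordinates."""
    y_min, x_min, y_max, x_max = box
    return (round(y_min * height), round(x_min * width),
            round(y_max * height), round(x_max * width))


boxes = [[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]
for box in boxes:
    print(to_pixel_box(box, 180, 267))
```

Keeping the boxes normalized means the same coordinates remain valid after the image is resized, which is why the code above can shrink the image before drawing.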

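These are the basic operations for preprocessing image data with TensorFlow. A complete code sample will be uploaded to GitHub later; it covers the whole process of cropping image regions, resizing, flipping, and adjusting colors.
That's all.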
2018.06.10