[tensorflow] Checking quantization parameters



What is quantization?

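Quantization converts a model's parameters and intermediate results from floating point (float32, etc.) to fixed point (int8, etc.).
Its benefits include a smaller model size and fast inference even on devices without a floating-point unit, such as embedded devices.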

An ordinary Keras model stores its parameters as floating-point types, and the inputs and outputs of each layer are also floating point.

output(float32) = input(float32) * weight(float32) + bias(float32)

A quantized model, on the other hand, stores its parameters as fixed-point types, and the inputs and outputs of each operator are also fixed point.

output(int8) = input(int8) * weight(int8) + bias(int32)
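
The integer-only form above can be sketched in plain Python. This is a simplified illustration and not TensorFlow Lite's actual kernel: the helper names (`quantize_symmetric`, `quantized_dot`) and all scale values are assumptions made up for the example, and zero points are fixed at 0 to keep the arithmetic short.

```python
def quantize_symmetric(values, scale, qmin=-128, qmax=127):
    """Map floats to int8-range integers: q = round(x / scale), clamped."""
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def quantized_dot(q_input, q_weight, q_bias):
    """Integer-only multiply-accumulate; the running sum needs int32 range."""
    acc = q_bias
    for x, w in zip(q_input, q_weight):
        acc += x * w
    return acc


# Hypothetical scales chosen for illustration only.
input_scale, weight_scale = 0.5, 0.25
bias_scale = input_scale * weight_scale  # bias shares the accumulator's scale

inputs, weights, bias = [1.0, -2.0, 3.0], [0.5, 0.25, -1.0], 0.75

q_in = quantize_symmetric(inputs, input_scale)    # int8-range inputs
q_w = quantize_symmetric(weights, weight_scale)   # int8-range weights
q_b = round(bias / bias_scale)                    # int32-range bias

acc = quantized_dot(q_in, q_w, q_b)               # integer accumulator
approx = acc * bias_scale                         # back to a real value
exact = sum(x * w for x, w in zip(inputs, weights)) + bias
print(acc, approx, exact)
```

The values here were picked so that the quantized result matches the float result exactly; with arbitrary inputs a rounding error appears. A real int8 operator would additionally rescale the int32 accumulator to the output tensor's own scale and zero point, which this sketch leaves out.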

Checking the parameter types of a Keras model

The data types of a Keras model's parameters can be checked with the following attributes.

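from keras.applications.resnet import ResNet50

# include_top takes a boolean; the string 'False' is truthy and would keep the top layer
model = ResNet50(include_top=False, weights='imagenet')

# compute_dtype: dtype used for computations; dtype: dtype of the variables (weights)
print("Compute dtype is {}".format(model.compute_dtype))
print("Variable dtype is {}".format(model.dtype))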

Compute dtype is float32
Variable dtype is float32

Checking the parameter types of a quantized model

The data types of a quantized model's parameters can be checked with get_tensor_details.

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="resnet50.tflite")

tensors = interpreter.get_tensor_details()
for tensor in tensors:
    print('index:{}, name:{}, data type:{}'.format(tensor['index'], tensor['name'], tensor['dtype']))

index:0, name:serving_default_input_1:0, data type:<class 'numpy.int8'>
index:1, name:resnet50/conv1_pad/Pad/paddings, data type:<class 'numpy.int32'>
index:2, name:resnet50/pool1_pad/Pad/paddings, data type:<class 'numpy.int32'>
index:3, name:resnet50/avg_pool/Mean/reduction_indices, data type:<class 'numpy.int32'>
index:4, name:resnet50/conv1_conv/Conv2D, data type:<class 'numpy.int8'>
index:5, name:resnet50/conv1_bn/FusedBatchNormV3, data type:<class 'numpy.int32'>
…

TensorFlow v2.8 also includes a feature called the TensorFlow Lite Model Analyzer, which can be used to check the same information as follows.

import tensorflow as tf

tf.lite.experimental.Analyzer.analyze(model_path="resnet50.tflite")

…
Tensors of Subgraph#0
  T#0(serving_default_input_1:0) shape_signature:[-1, 224, 224, 3], type:INT8
  T#1(resnet50/conv1_pad/Pad/paddings) shape:[4, 2], type:INT32 RO 32 bytes
  T#2(resnet50/pool1_pad/Pad/paddings) shape:[4, 2], type:INT32 RO 32 bytes
  T#3(resnet50/avg_pool/Mean/reduction_indices) shape:[2], type:INT32 RO 8 bytes
  T#4(resnet50/conv1_conv/Conv2D) shape:[64, 7, 7, 3], type:INT8 RO 9408 bytes
  T#5(resnet50/conv1_bn/FusedBatchNormV3) shape:[64], type:INT32 RO 256 bytes
…
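
The byte counts in the Analyzer output above can be cross-checked from each tensor's shape and type. The following is a small sketch; the `tensor_bytes` helper and the `BYTES_PER_ELEMENT` table are hypothetical names introduced for this example and are not part of TensorFlow.

```python
from math import prod

# Bytes per element for the tensor types that appear in the output above.
BYTES_PER_ELEMENT = {"INT8": 1, "INT32": 4, "FLOAT32": 4}


def tensor_bytes(shape, dtype):
    """Return the storage size of a dense tensor in bytes."""
    return prod(shape) * BYTES_PER_ELEMENT[dtype]


print(tensor_bytes([64, 7, 7, 3], "INT8"))     # T#4 as listed (INT8)
print(tensor_bytes([64], "INT32"))             # T#5 as listed (INT32)
print(tensor_bytes([64, 7, 7, 3], "FLOAT32"))  # same shape as T#4 if stored as float32
```

The first two results agree with the sizes the Analyzer reports for T#4 and T#5, and the third shows that the same tensor would take four times as much space in float32.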

per-tensor and per-axis

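A quantized model holds scale and zero-point values for the tensors of each operator (its inputs, outputs, and weights), and a quantized value is converted back to the value in the original Keras model as follows.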

real_value(float32) = (int_value(int8) - zero_point(int32)) * scale(float32)
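
As a worked example of the formula above, the sketch below quantizes one value and converts it back. The `quantize` and `dequantize` helpers and the scale and zero-point values are assumptions chosen for illustration; they are not taken from the ResNet50 model.

```python
def quantize(real_value, scale, zero_point, qmin=-128, qmax=127):
    """Inverse of the formula above: int_value = round(real / scale) + zero_point."""
    q = round(real_value / scale) + zero_point
    return max(qmin, min(qmax, q))  # clamp to the int8 range


def dequantize(int_value, scale, zero_point):
    """real_value = (int_value - zero_point) * scale"""
    return (int_value - zero_point) * scale


# Hypothetical parameters chosen for illustration only.
scale, zero_point = 0.5, -10

q = quantize(3.2, scale, zero_point)
restored = dequantize(q, scale, zero_point)
print(q, restored)
```

The restored value differs from the original 3.2 because of rounding; as long as no clamping occurs, the error is at most half of the scale. The integer equal to zero_point always maps back to exactly 0.0.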

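The zero_point and scale values can be read from ['quantization_parameters'] in the entries returned by get_tensor_details.
zero_point and scale are stored either as a single value per tensor (per-tensor) or as one value per channel of the tensor (per-axis).
In TensorFlow, which of the two is used is fixed for each operator, and this is described in the TensorFlow Lite quantization specification.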
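
One way to tell the two layouts apart is to count the scales stored for a tensor. The sketch below assumes that each entry of get_tensor_details carries a 'quantization_parameters' dictionary with 'scales', 'zero_points', and 'quantized_dimension' keys. The `quantization_layout` helper is a hypothetical name, and the example entries are hand-written stand-ins with made-up numbers so that the code runs without a model file.

```python
def quantization_layout(tensor_detail):
    """Classify one entry of get_tensor_details() by its number of scales."""
    params = tensor_detail["quantization_parameters"]
    n_scales = len(params["scales"])
    if n_scales == 0:
        return "not quantized"
    if n_scales == 1:
        return "per-tensor"
    return "per-axis (axis {})".format(params["quantized_dimension"])


# Hand-written stand-ins for interpreter.get_tensor_details() entries.
# The numbers are made up; a real interpreter returns NumPy arrays here.
examples = [
    {"name": "activation", "quantization_parameters": {
        "scales": [0.02], "zero_points": [-128], "quantized_dimension": 0}},
    {"name": "conv_weight", "quantization_parameters": {
        "scales": [0.01, 0.03, 0.02], "zero_points": [0, 0, 0],
        "quantized_dimension": 0}},
    {"name": "paddings", "quantization_parameters": {
        "scales": [], "zero_points": [], "quantized_dimension": 0}},
]

for detail in examples:
    print(detail["name"], "->", quantization_layout(detail))
```

With a real model, the same function could be applied to each element of interpreter.get_tensor_details(). A per-axis tensor that has only one channel would be reported as per-tensor by this simple count.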

Author And Source

More information on this topic ([tensorflow] Checking quantization parameters) can be found at https://qiita.com/t226/items/36f1e6aa78d5c83a3873

Author attribution: the original author's information is given at the original URL. Copyright belongs to the original author.
