Python 3: encoding error when reading a file

Exception in thread Thread-4:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/yangguang/machineLearning/learn_machineLearning/Tensorflow_learning/cnn_own/data_prepare/src/tfrecord.py", line 129, in _process_image_files_batch
    image_buffer, height, width = _process_image(filename, coder)
  File "/home/yangguang/machineLearning/learn_machineLearning/Tensorflow_learning/cnn_own/data_prepare/src/tfrecord.py", line 71, in _process_image
    image_data = f.read()
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

INFO:root:2018-05-25 11:11:55.104869: Finished writing all 4800 images in data set.

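This error occurs because the file being read is binary (here an image), but the code opens it in text mode, so Python 3 tries to decode its contents as UTF-8:
with open(filename, 'r') as f:
    image_data = f.read()

Solution: change the mode passed to open from 'r' to 'rb' (the b stands for binary), so that read returns the raw bytes without decoding:
with open(filename, 'rb') as f:
    image_data = f.read()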
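
The difference between the two modes can be reproduced without the original data set. The following is a minimal sketch, not the code from tfrecord.py: the helper read_image_bytes and the four-byte stand-in file are assumptions made for illustration (the bytes FF D8 FF E0 are how a JPEG file typically begins, which would match the 0xff at position 0 in the traceback).

```python
import os
import tempfile


def read_image_bytes(filename):
    """Return the raw contents of a file, opened in binary mode."""
    with open(filename, 'rb') as f:
        return f.read()


# Hypothetical stand-in for an image file: the first four bytes of a JPEG header.
fd, path = tempfile.mkstemp(suffix='.jpg')
with os.fdopen(fd, 'wb') as f:
    f.write(b'\xff\xd8\xff\xe0')

# Text mode tries to decode the contents as UTF-8 and fails on the first byte.
try:
    with open(path, 'r', encoding='utf-8') as f:
        f.read()
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

# Binary mode returns the bytes unchanged.
image_data = read_image_bytes(path)
os.remove(path)
```

In text mode the failure happens inside read, exactly as in the traceback, because decoding is done lazily when the data is read rather than when the file is opened.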
A related point: in Python 2, a string can be passed directly to TensorFlow's BytesList (tf.train.BytesList), but in Python 3 a str must first be converted to bytes:
text_b = bytes(text, encoding='utf-8')
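
When some values are already bytes (such as image data read with 'rb') and others are str (such as file names or labels), a small helper avoids encoding twice. This is a sketch under that assumption; to_bytes is a hypothetical name, not a TensorFlow function, and TensorFlow itself is deliberately not imported here.

```python
def to_bytes(value, encoding='utf-8'):
    """Return value as bytes, encoding it first if it is a str."""
    if isinstance(value, bytes):
        return value  # already bytes, e.g. data read in 'rb' mode
    if isinstance(value, str):
        return value.encode(encoding)  # same result as bytes(value, encoding=...)
    raise TypeError('expected str or bytes, got %s' % type(value).__name__)


text = 'cat.jpg'  # hypothetical file name used only for illustration
text_b = to_bytes(text)
```

The result text_b is a bytes object and could then be placed in the value list given to BytesList.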