GitHub project: mmdetection model training


[MMDetection: Open MMLab Detection Toolbox and Benchmark - 2019](https://arxiv.org/abs/1906.07155)

[github open-mmlab/mmdetection](https://github.com/open-mmlab/mmdetection)
[Github - mmdetection - AIUAI](https://www.aiuai.cn/aifarm1216.html)

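mmdetection implements distributed and non-distributed training, using MMDistributedDataParallel and MMDataParallel respectively.
All outputs produced during training, including log files and checkpoint files, are saved automatically to the work_dir path set in the config file.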

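Learning rate (lr) settings

[1] - The default learning rate in the config files is for 8 GPUs and 2 img/gpu (batch size = 8 x 2 = 16).
[2] - Linear scaling rule
Set the learning rate in proportion to the batch size, which is the number of GPUs times the number of images per GPU. For example, set lr=0.01 for 4 GPUs x 2 img/gpu = 8 (batch size), and lr=0.08 for 16 GPUs x 4 img/gpu = 64 (batch size).

Single-GPU training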

python3 tools/train.py ${CONFIG_FILE} \
    --work_dir ${YOUR_WORK_DIR}  # specify the work_dir path
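
With a single GPU the batch size is much smaller than the 16 that the config defaults assume, so the learning rate from the config usually needs to be scaled down by the linear scaling rule from the learning-rate section. The following is a minimal sketch of that arithmetic. The function name `scaled_lr` is made up for illustration, and the reference point (lr=0.01 at batch size 8) is taken from the example given for the rule.

```python
def scaled_lr(num_gpus, imgs_per_gpu, ref_lr=0.01, ref_batch_size=8):
    """Scale the learning rate linearly with the total batch size.

    ref_lr / ref_batch_size come from the example above
    (lr=0.01 for 4 GPUs x 2 img/gpu = 8).
    """
    batch_size = num_gpus * imgs_per_gpu
    return ref_lr * batch_size / ref_batch_size


# One GPU with 2 images: batch size 2.
single_gpu_lr = scaled_lr(num_gpus=1, imgs_per_gpu=2)
# Sixteen GPUs with 4 images each: batch size 64.
large_batch_lr = scaled_lr(num_gpus=16, imgs_per_gpu=4)

print(single_gpu_lr, large_batch_lr)
```

The resulting value would then be written into the optimizer settings of the config before starting the run.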

Multi-GPU training

dist_train.sh:

#!/usr/bin/env bash

PYTHON=${PYTHON:-"python3"}

CONFIG=$1
GPUS=$2

$PYTHON -m torch.distributed.launch \
    --nproc_per_node=$GPUS \
    $(dirname "$0")/train.py $CONFIG --launcher pytorch ${@:3}

Multi-GPU training:
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
Optional arguments:
[1] - --validate (strongly recommended): run evaluation every k epochs during training (default k=1).
[2] - --work_dir ${WORK_DIR}: override the working directory set in the config file.
[3] - --resume_from ${CHECKPOINT_FILE}: resume training from a checkpoint file.
[4] - Difference between resume_from and load_from:
resume_from loads both the model weights and the optimizer status, and the epoch is inherited from the specified checkpoint. It is typically used to resume a training run that was interrupted unexpectedly.
load_from loads only the model weights, and training starts from epoch 0. It is typically used for fine-tuning.
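
The two options also exist as top-level keys (`resume_from` and `load_from`) in the config files. The fragment below sketches the two use cases side by side as plain dicts. The dict names and the checkpoint paths are placeholders chosen for illustration, not files from this article.

```python
# Case 1: continue an interrupted run.
# Weights, optimizer state and the epoch counter all come from the checkpoint.
resume_settings = dict(
    resume_from='work_dirs/example_run/epoch_5.pth',  # placeholder path
    load_from=None,
)

# Case 2: fine-tune from pretrained weights.
# Only the weights are loaded; training starts again at epoch 0.
finetune_settings = dict(
    resume_from=None,
    load_from='checkpoints/example_pretrained.pth',  # placeholder path
)

for name, settings in [('resume', resume_settings), ('finetune', finetune_settings)]:
    print(name, settings)
```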
Note:
If distributed training launched this way keeps failing, you can try: python3 tools/train.py configs/faster_rcnn_r50_fpn_1x.py --gpus 2 --validate

Multi-machine training

slurm
On a cluster managed by slurm, mmdetection can be run with the slurm_train.sh script:
slurm_train.sh:

#!/usr/bin/env bash

set -x

PARTITION=$1
JOB_NAME=$2
CONFIG=$3
WORK_DIR=$4
GPUS=${5:-8}
GPUS_PER_NODE=${GPUS_PER_NODE:-8}
CPUS_PER_TASK=${CPUS_PER_TASK:-5}
SRUN_ARGS=${SRUN_ARGS:-""}
PY_ARGS=${PY_ARGS:-"--validate"}

srun -p ${PARTITION} \
    --job-name=${JOB_NAME} \
    --gres=gpu:${GPUS_PER_NODE} \
    --ntasks=${GPUS} \
    --ntasks-per-node=${GPUS_PER_NODE} \
    --cpus-per-task=${CPUS_PER_TASK} \
    --kill-on-bad-exit=1 \
    ${SRUN_ARGS} \
    python -u tools/train.py ${CONFIG} --work_dir=${WORK_DIR} --launcher="slurm" ${PY_ARGS}

Run: ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} [${GPUS}]
For example, to train Mask R-CNN with 16 GPUs on the dev partition:
./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x.py /nfs/xxxx/mask_rcnn_r50_fpn_1x 16

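Custom datasets

For a custom dataset, the simplest approach is to convert it to a dataset format that mmdetection already supports (such as COCO or PASCAL VOC).
5.1. COCO dataset format
As an example, take a custom dataset with 5 classes that has been converted to COCO format.
[1] - Create a new file, mmdet/datasets/custom_dataset.py: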

from .coco import CocoDataset
from .registry import DATASETS


@DATASETS.register_module
class CustomDataset(CocoDataset):
    CLASSES = ('a', 'b', 'c', 'd', 'e')

[2] - Edit mmdet/datasets/__init__.py and add:

from .custom_dataset import CustomDataset

[3] - CustomDataset can then be used in the config file in the same way as CocoDataset.
For example:

# dataset settings
dataset_type = 'CustomDataset'
data_root = 'data/custom/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
data = dict(
    imgs_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/custom_train.json',
        img_prefix=data_root + 'custom_train/',
        img_scale=(1333, 800),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0.5,
        with_mask=False,
        with_crowd=True,
        with_label=True),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/custom_test.json',
        img_prefix=data_root + 'custom_test/',
        img_scale=(1333, 800),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0,
        with_mask=False,
        with_crowd=True,
        with_label=True),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/custom_test.json',
        img_prefix=data_root + 'custom_test/',
        img_scale=(1333, 800),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0,
        with_mask=False,
        with_label=False,
        test_mode=True))

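5.2. Non-COCO dataset formats
If you do not want to convert the annotations of a custom dataset to COCO or PASCAL format, mmdetection supports that as well.
mmdetection defines a simple annotation format, and any dataset can be made compatible with it, either online or offline.
The annotation format is a list of dicts, where each dict corresponds to one image.
[1] - For testing, each dict contains 3 fields: filename (relative path), width, height.
[2] - For training, each dict contains 4 fields: filename (relative path), width, height, ann. ann is a dict with at least 2 fields, bboxes and labels, both of which are numpy arrays. Some datasets provide additional annotations such as crowd/difficult/ignored bboxes; mmdetection represents these with bboxes_ignore and labels_ignore.
For example: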

[
    {
        'filename': 'a.jpg',
        'width': 1280,
        'height': 720,
        'ann': {
            'bboxes': <np.ndarray> (n, 4),
            'labels': <np.ndarray> (n, ),
            'bboxes_ignore': <np.ndarray> (k, 4),
            'labels_ignore': <np.ndarray> (k, ) (optional field)
        }
    },
    ...
]
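
To make the schematic above concrete, here is a small sketch that builds two training records with real numpy arrays. The helper name `make_annotation` and the sample boxes are invented for illustration. The dtypes (float32 boxes, int64 labels) follow what the pascal_voc.py converter shown later in this article produces.

```python
import numpy as np


def make_annotation(filename, width, height, boxes, labels):
    """Build one training record in the list-of-dicts format described above."""
    bboxes = np.array(boxes, dtype=np.float32).reshape(-1, 4)
    labels = np.array(labels, dtype=np.int64).reshape(-1)
    if len(bboxes) != len(labels):
        raise ValueError('each box needs exactly one label')
    return {
        'filename': filename,
        'width': width,
        'height': height,
        'ann': {
            'bboxes': bboxes,                                      # (n, 4)
            'labels': labels,                                      # (n, )
            'bboxes_ignore': np.zeros((0, 4), dtype=np.float32),   # (k, 4), k = 0 here
            'labels_ignore': np.zeros((0, ), dtype=np.int64),      # (k, )
        },
    }


annotations = [
    make_annotation('a.jpg', 1280, 720,
                    [[10, 20, 200, 300], [50, 60, 400, 500]], [1, 2]),
    make_annotation('b.jpg', 640, 480, [], []),  # an image without objects
]

for record in annotations:
    ann = record['ann']
    print(record['filename'], ann['bboxes'].shape, ann['labels'].shape)
```

Reshaping to `(-1, 4)` keeps the box array two-dimensional even when an image has no objects, which matches the `(n, 4)` shape in the schematic.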

There are two ways to handle a custom dataset:
[1] - Online conversion of the annotation format
Define a new Dataset class that inherits from CustomDataset and overrides load_annotations(self, ann_file) and get_ann_info(self, idx), similar to mmdet/datasets/coco.py and mmdet/datasets/voc.py.
[2] - Offline conversion of the annotation format
Convert the custom dataset's annotations to the expected format described above and save them to a pickle or json file, as tools/convert_datasets/pascal_voc.py does. After that, CustomDataset can be used directly.
pascal_voc.py:

import argparse
import os.path as osp
import xml.etree.ElementTree as ET

import mmcv
import numpy as np

from mmdet.core import voc_classes

label_ids = {name: i + 1 for i, name in enumerate(voc_classes())}


def parse_xml(args):
    xml_path, img_path = args
    tree = ET.parse(xml_path)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    bboxes = []
    labels = []
    bboxes_ignore = []
    labels_ignore = []
    for obj in root.findall('object'):
        name = obj.find('name').text
        label = label_ids[name]
        difficult = int(obj.find('difficult').text)
        bnd_box = obj.find('bndbox')
        bbox = [
            int(bnd_box.find('xmin').text),
            int(bnd_box.find('ymin').text),
            int(bnd_box.find('xmax').text),
            int(bnd_box.find('ymax').text)
        ]
        if difficult:
            bboxes_ignore.append(bbox)
            labels_ignore.append(label)
        else:
            bboxes.append(bbox)
            labels.append(label)
    if not bboxes:
        bboxes = np.zeros((0, 4))
        labels = np.zeros((0, ))
    else:
        bboxes = np.array(bboxes, ndmin=2) - 1
        labels = np.array(labels)
    if not bboxes_ignore:
        bboxes_ignore = np.zeros((0, 4))
        labels_ignore = np.zeros((0, ))
    else:
        bboxes_ignore = np.array(bboxes_ignore, ndmin=2) - 1
        labels_ignore = np.array(labels_ignore)
    annotation = {
        'filename': img_path,
        'width': w,
        'height': h,
        'ann': {
            'bboxes': bboxes.astype(np.float32),
            'labels': labels.astype(np.int64),
            'bboxes_ignore': bboxes_ignore.astype(np.float32),
            'labels_ignore': labels_ignore.astype(np.int64)
        }
    }
    return annotation


def cvt_annotations(devkit_path, years, split, out_file):
    if not isinstance(years, list):
        years = [years]
    annotations = []
    for year in years:
        filelist = osp.join(devkit_path, 'VOC{}/ImageSets/Main/{}.txt'.format(
            year, split))
        if not osp.isfile(filelist):
            print('filelist does not exist: {}, skip voc{} {}'.format(
                filelist, year, split))
            return
        img_names = mmcv.list_from_file(filelist)
        xml_paths = [
            osp.join(devkit_path, 'VOC{}/Annotations/{}.xml'.format(
                year, img_name)) for img_name in img_names
        ]
        img_paths = [
            'VOC{}/JPEGImages/{}.jpg'.format(year, img_name)
            for img_name in img_names
        ]
        part_annotations = mmcv.track_progress(parse_xml,
                                               list(zip(xml_paths, img_paths)))
        annotations.extend(part_annotations)
    mmcv.dump(annotations, out_file)
    return annotations


def parse_args():
    parser = argparse.ArgumentParser(
        description='Convert PASCAL VOC annotations to mmdetection format')
    parser.add_argument('devkit_path', help='pascal voc devkit path')
    parser.add_argument('-o', '--out-dir', help='output path')
    args = parser.parse_args()
    return args


def main():
    args = parse_args()
    devkit_path = args.devkit_path
    out_dir = args.out_dir if args.out_dir else devkit_path
    mmcv.mkdir_or_exist(out_dir)

    years = []
    if osp.isdir(osp.join(devkit_path, 'VOC2007')):
        years.append('2007')
    if osp.isdir(osp.join(devkit_path, 'VOC2012')):
        years.append('2012')
    if '2007' in years and '2012' in years:
        years.append(['2007', '2012'])
    if not years:
        raise IOError('The devkit path {} contains neither "VOC2007" nor '
                      '"VOC2012" subfolder'.format(devkit_path))
    for year in years:
        if year == '2007':
            prefix = 'voc07'
        elif year == '2012':
            prefix = 'voc12'
        elif year == ['2007', '2012']:
            prefix = 'voc0712'
        for split in ['train', 'val', 'trainval']:
            dataset_name = prefix + '_' + split
            print('processing {} ...'.format(dataset_name))
            cvt_annotations(devkit_path, year, split,
                            osp.join(out_dir, dataset_name + '.pkl'))
        if not isinstance(year, list):
            dataset_name = prefix + '_test'
            print('processing {} ...'.format(dataset_name))
            cvt_annotations(devkit_path, year, 'test',
                            osp.join(out_dir, dataset_name + '.pkl'))
    print('Done!')


if __name__ == '__main__':
    main()
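
The script above covers the offline route. For the online route, the two methods to override are load_annotations and get_ann_info. The sketch below shows what they might look like for a hypothetical text format in which each line is `filename width height` followed by groups of `x1 y1 x2 y2 label`. To stay runnable without mmdetection installed, the class stands alone instead of inheriting from CustomDataset. The class name, the file format and the sample data are all assumptions made for illustration.

```python
import os
import tempfile

import numpy as np


class TxtAnnotationReader:
    """Stand-in for a CustomDataset subclass; only the two overridden methods are shown."""

    def __init__(self, ann_file):
        self.img_infos = self.load_annotations(ann_file)

    def load_annotations(self, ann_file):
        """Parse the whole annotation file into a list of per-image dicts."""
        img_infos = []
        with open(ann_file) as f:
            for line in f:
                fields = line.split()
                if not fields:
                    continue
                # Remaining fields come in groups of five: x1 y1 x2 y2 label.
                values = np.array([float(v) for v in fields[3:]],
                                  dtype=np.float32).reshape(-1, 5)
                img_infos.append({
                    'filename': fields[0],
                    'width': int(fields[1]),
                    'height': int(fields[2]),
                    'ann': {
                        'bboxes': values[:, :4].astype(np.float32),
                        'labels': values[:, 4].astype(np.int64),
                    },
                })
        return img_infos

    def get_ann_info(self, idx):
        """Return the annotation dict of the idx-th image."""
        return self.img_infos[idx]['ann']


# Usage: write a tiny annotation file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    ann_path = os.path.join(tmp_dir, 'train.txt')
    with open(ann_path, 'w') as f:
        f.write('a.jpg 1280 720 10 20 200 300 1 50 60 400 500 2\n')
        f.write('b.jpg 640 480\n')
    reader = TxtAnnotationReader(ann_path)

print(len(reader.img_infos))
print(reader.get_ann_info(0)['bboxes'])
print(reader.get_ann_info(0)['labels'])
```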

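Main units of model training

Training a detector with mmdetection involves three main units: data loading, model, and iteration pipeline.
6.1. Data loading
mmdetection uses Dataset and DataLoader for data loading with multiple workers.
Dataset returns a dict of data items corresponding to the arguments of the model's forward method.
In object detection, the data may not all be the same size (e.g. image size, gt box size), so mmdetection uses the new DataContainer type from the mmcv library to collect and distribute data of different sizes. See data_container.py.
6.2. Model definition
mmdetection defines four basic, customizable model components:
[1] - backbone: usually an FCN network that extracts feature maps, e.g. ResNet, MobileNet.
[2] - neck: the module between backbones and heads, e.g. FPN, PAFPN.
[3] - head: the module for a specific task, e.g. bbox prediction and mask prediction.
[4] - roi extractor: the module that extracts RoI features from feature maps, e.g. RoI Align.
Using these basic modules, mmdetection provides common detector design frameworks such as SingleStageDetector and TwoStageDetector (illustrated by a figure in the original post).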
6.2.1. Building a backbone module
An example of developing a new component, using MobileNet:
[1] - Create a new file, mmdet/models/backbones/mobilenet.py:

import torch.nn as nn

from ..registry import BACKBONES


@BACKBONES.register_module
class MobileNet(nn.Module):

    def __init__(self, arg1, arg2):
        pass

    def forward(self, x):  # should return a tuple
        pass

[2] - Import the module in mmdet/models/backbones/__init__.py:

from .mobilenet import MobileNet

[3] - Use it in the config file:

model = dict(
    ...
    backbone=dict(
        type='MobileNet',
        arg1=xxx,
        arg2=xxx),
    ...
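
The three steps work because the decorator records the class under its name, and the `type` string in the config is later used to look that class up. The following is a simplified, standalone sketch of that registry idea, not mmdetection's actual implementation. In particular, the `build` method and the `ToyBackbone` class are assumptions made to keep the example short.

```python
class Registry:
    """Maps a class name to the class object so configs can refer to it by string."""

    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, cls):
        """Class decorator: record the class under its own name."""
        if cls.__name__ in self._module_dict:
            raise KeyError(
                '{} is already registered in {}'.format(cls.__name__, self.name))
        self._module_dict[cls.__name__] = cls
        return cls

    def build(self, cfg):
        """Instantiate the class named by cfg['type'], passing the other keys as arguments."""
        args = dict(cfg)  # copy so the caller's config is not modified
        cls = self._module_dict[args.pop('type')]
        return cls(**args)


BACKBONES = Registry('backbone')


@BACKBONES.register_module
class ToyBackbone:
    def __init__(self, arg1, arg2):
        self.arg1 = arg1
        self.arg2 = arg2


backbone_cfg = dict(type='ToyBackbone', arg1=1, arg2=2)
backbone = BACKBONES.build(backbone_cfg)
print(type(backbone).__name__, backbone.arg1, backbone.arg2)
```

This also shows why the import in `__init__.py` matters: the decorator only runs, and the class only becomes visible to the config, once the module that defines it has been imported.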
6.2.2. Building a neck module
With the basic modules and detector design frameworks that mmdetection provides, network models can be defined painlessly through config files.
If you need to implement a new network module, such as the PAFPN (path aggregation FPN) from the paper Path Aggregation Network for Instance Segmentation, two things are required:
[1] - Create a new file, mmdet/models/necks/pafpn.py:

import torch.nn as nn

from ..registry import NECKS


@NECKS.register_module
class PAFPN(nn.Module):

    def __init__(self,
                 in_channels,
                 out_channels,
                 num_outs,
                 start_level=0,
                 end_level=-1,
                 add_extra_convs=False):
        pass

    def forward(self, inputs):
        # implementation is ignored
        pass

[2] - Modify the config file.
Original FPN setting:

neck=dict(
    type='FPN',
    in_channels=[256, 512, 1024, 2048],
    out_channels=256,
    num_outs=5)

Change it to:

neck=dict(
    type='PAFPN',
    in_channels=[256, 512, 1024, 2048],
    out_channels=256,
    num_outs=5)
6.2.3. Defining a new model
To define a new model in mmdetection, inherit from BaseDetector and implement the following abstract methods:
[1] - extract_feat(): given an image batch of shape (n, c, h, w), extracts the feature maps.
[2] - forward_train(): the forward method for training mode.
[3] - simple_test(): single-scale testing without data augmentation.
[4] - aug_test(): testing with data augmentation (multi-scale, flip, etc.).
For a concrete example, see TwoStageDetector.
6.3. Iteration pipeline
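mmdetection uses distributed training for both single-machine and multi-machine environments.
Suppose a server has 8 GPUs: 8 processes are started for training, and each process runs on one GPU.
Each process has its own model, data loader, and optimizer.
Model parameters are synchronized only once, at the start.
After each forward and backward pass, the gradients are allreduced across all GPUs, and the optimizer then updates the model parameters.
Because the gradients are allreduced, the model parameters of all processes remain identical after every iteration.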
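
The claim that the parameters stay identical can be checked with a toy simulation. The sketch below runs two "workers" in plain Python on a one-parameter least-squares problem. It assumes the allreduce averages the gradients, and every name in it is invented for illustration; it does not use any distributed library.

```python
def allreduce_mean(per_worker_grads):
    """Average gradients element-wise; every worker receives the same result."""
    num_workers = len(per_worker_grads)
    return [sum(values) / num_workers for values in zip(*per_worker_grads)]


def local_gradient(params, batch):
    """Gradient of mean((w * x - y) ** 2) with respect to the single weight w."""
    w = params[0]
    return [sum(2 * (w * x - y) * x for x, y in batch) / len(batch)]


def sgd_step(params, grads, lr):
    """Plain SGD update, applied independently by each worker's own optimizer."""
    return [p - lr * g for p, g in zip(params, grads)]


# Both workers start from the same parameters (synchronized once at the start),
# but each sees a different shard of the data, all drawn from y = 2 * x.
workers = [[0.0], [0.0]]
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]

for _ in range(20):
    local_grads = [local_gradient(p, shard) for p, shard in zip(workers, shards)]
    shared_grads = allreduce_mean(local_grads)   # the only communication step
    workers = [sgd_step(p, shared_grads, lr=0.05) for p in workers]

print(workers)
```

The local gradients differ between the workers at every step, yet both apply the same averaged gradient to the same starting point, so no further parameter synchronization is needed.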

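Model testing

7.1. Dataset testing

mmdetection provides test scripts for evaluating accuracy on whole datasets such as COCO and PASCAL VOC, with support for:
[1] - single-GPU testing
[2] - multi-GPU testing
[3] - visualization of detection results.
As follows: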

# single-gpu testing
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]

# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

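Parameters:
[1] - RESULT_FILE: the file to save the output results to, in pickle format. If not specified, the test results are not saved.
[2] - EVAL_METRICS: the items to evaluate on the detection results. Options are proposal_fast, proposal, bbox, segm, keypoints.
[3] - --show: if specified, the detection results are visualized. (Applies to single-GPU testing only.)
For example, assume trained checkpoint files are already available under the checkpoints/ path.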

[1] - Test Faster R-CNN and visualize the results.

python tools/test.py configs/faster_rcnn_r50_fpn_1x.py \
    checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth \
    --show

[2] - Test Mask R-CNN and evaluate bbox and mask AP.

python tools/test.py configs/mask_rcnn_r50_fpn_1x.py \
    checkpoints/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth \
    --out results.pkl --eval bbox segm

[3] - Test Mask R-CNN with 8 GPUs and evaluate bbox and mask AP.

./tools/dist_test.sh configs/mask_rcnn_r50_fpn_1x.py \
    checkpoints/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth \
    8 --out results.pkl --eval bbox segm

7.2. Image testing

#!/usr/bin/python3
# -*- coding: utf-8 -*-
import time

from mmdet.apis import init_detector, inference_detector, show_result

# Config and checkpoint files
config_file = 'configs/cascade_rcnn_r101_fpn_1x.py'
checkpoint_file = 'checkpoints/cascade_rcnn_r101_fpn_1x_20181129-d64ebac7.pth'

# Build the model from the config and the checkpoint
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# Test a single image
img = '/path/to/test.jpg'
# Alternatively: img = mmcv.imread(img), which will only load it once
# (this needs `import mmcv`)
start = time.time()
result = inference_detector(model, img)
print('[INFO]timecost: ', time.time() - start)
show_result(img, result, model.CLASSES)

# Test a list of images
imgs = ['test1.jpg', 'test2.jpg']
for i, result in enumerate(inference_detector(model, imgs)):
    show_result(imgs[i], result, model.CLASSES)
print('[INFO]Done.')
