01_Getting to Know MMDetection

2022. 4. 29. 11:46

What is MMDetection?

MMDetection is an open-source object detection library built on PyTorch.

Modular Design
- A customized object detection framework can be put together easily.
Support of multiple frameworks out of box
- Popular and recent detection frameworks are supported.
High efficiency
- Training is faster than with Detectron2, maskrcnn-benchmark, and SimpleDet.

Maybe because I am still learning, the claim that you can easily build a framework does not ring true for me yet.

MMDetection builds modular and inheritance-based design into its config system.

Using configs makes it convenient to run a variety of experiments.

When running "tools/train.py" or "tools/test.py", you can pass --cfg-options to override values in the config.

Update config keys of dict chains.
- Options are specified by following the order of the dict keys.
- e.g. --cfg-options model.backbone.norm_eval=False
Update keys inside a list of configs.
- Some config dicts are stored as a list.
- e.g. data.train.pipeline is usually a list such as [dict(type='LoadImageFromFile'), ...].
- To change LoadImageFromFile to LoadImageFromWebcam, use:
- --cfg-options data.train.pipeline.0.type=LoadImageFromWebcam
Update values of list/tuples.
- A value such as workflow=[('train', 1)] can be changed with --cfg-options workflow="[(train,1),(val,1)]". The quotation marks are required, and no whitespace is allowed inside them.
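To make the three forms above concrete, here is a minimal pure-Python sketch of how a dotted key could be resolved against a nested config. The function `apply_override` and the trimmed `cfg` are hypothetical names used only for illustration. This is not MMDetection's actual implementation, which also has to parse the value strings given on the command line.

```python
def apply_override(cfg, dotted_key, value):
    """Set the entry addressed by a dotted key such as 'a.b.0.c'.

    Numeric path segments index into lists, mirroring the
    data.train.pipeline.0.type form used with --cfg-options.
    """
    *parents, last = dotted_key.split(".")
    node = cfg
    for part in parents:
        # Walk down one level: lists are indexed by int, dicts by key.
        node = node[int(part)] if isinstance(node, list) else node[part]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value
    return cfg


# Hypothetical, heavily trimmed config used only for illustration.
cfg = {
    "model": {"backbone": {"norm_eval": True}},
    "data": {"train": {"pipeline": [{"type": "LoadImageFromFile"}]}},
    "workflow": [("train", 1)],
}

apply_override(cfg, "model.backbone.norm_eval", False)
apply_override(cfg, "data.train.pipeline.0.type", "LoadImageFromWebcam")
apply_override(cfg, "workflow", [("train", 1), ("val", 1)])
```

Each call corresponds to one of the three cases: a dict chain, a key inside a list, and a whole list value.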

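configs/_base_ contains four basic component types: dataset, model, schedule, and default_runtime.

Configs composed of components from _base_ are called primitive.

_base_ ships with ready-made component files for common methods.

e.g. Faster R-CNN, Mask R-CNN, Cascade R-CNN, RPN, SSD.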

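The authors recommend inheriting from a primitive config.

For example, if you have a modified version based on Faster R-CNN, inherit the existing Faster R-CNN config through the _base_ field and change only what you need.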

_base_ = '../faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
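Conceptually, inheriting through _base_ means the child config only states what differs, and everything else comes from the base. The sketch below shows that idea as a recursive dict merge. `merge_config`, `base`, and `child` are hypothetical names and values for illustration, and the real config loader handles more cases than this.

```python
import copy


def merge_config(base, child):
    """Return a new dict where child values override base values.

    Nested dicts are merged key by key; any other value is replaced outright.
    """
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical fragment standing in for the inherited Faster R-CNN config.
base = {"model": {"type": "FasterRCNN",
                  "roi_head": {"bbox_head": {"num_classes": 80}}}}
# The child config states only the field that changes.
child = {"model": {"roi_head": {"bbox_head": {"num_classes": 3}}}}

merged = merge_config(base, child)
```

Here `merged` keeps the model type from the base while picking up the new class count from the child, and `base` itself is left untouched.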

Config file names follow the convention below.

{model}_[model setting]_{backbone}_{neck}_[norm setting]_[misc]_[gpu x batch_per_gpu]_{schedule}_{dataset}

{xxx} is a required field and [yyy] is optional.

{model}: model type like faster_rcnn, mask_rcnn, etc.
[model setting]: specific setting for some model, like without_semantic for htc, moment for reppoints, etc.
{backbone}: backbone type like r50 (ResNet-50), x101 (ResNeXt-101).
{neck}: neck type like fpn, pafpn, nasfpn, c4.
[norm setting]: bn (Batch Normalization) is used unless specified; other norm layer types could be gn (Group Normalization) or syncbn (Synchronized Batch Normalization). gn-head/gn-neck indicates GN is applied in the head/neck only, while gn-all means GN is applied in the entire model, i.e. backbone, neck, and head.
[misc]: miscellaneous setting/plugins of model, e.g. dconv, gcb, attention, albu, mstrain.
[gpu x batch_per_gpu]: GPUs and samples per GPU, 8x2 is used by default.
{schedule}: training schedule, options are 1x, 2x, 20e, etc. 1x and 2x mean 12 epochs and 24 epochs respectively. 20e is adopted in cascade models and denotes 20 epochs. For 1x/2x, the initial learning rate decays by a factor of 10 at the 8th/16th and 11th/22nd epochs. For 20e, the initial learning rate decays by a factor of 10 at the 16th and 19th epochs.
{dataset}: dataset like coco, cityscapes, voc_0712, wider_face.
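The schedule field packs several numbers into a short tag, so it may help to see them as a small table. The sketch below transcribes the epochs and decay points from the description above. `SCHEDULES` and `lr_after` are hypothetical names, the base learning rate in the usage note is just an illustrative value, and the convention that a decay takes effect once that many epochs are completed is an assumption. Warmup is ignored.

```python
# Schedule tag -> (total epochs, epochs at which the LR is divided by 10),
# transcribed from the naming convention described above.
SCHEDULES = {
    "1x": (12, [8, 11]),
    "2x": (24, [16, 22]),
    "20e": (20, [16, 19]),
}


def lr_after(schedule, base_lr, epochs_done):
    """Learning rate once `epochs_done` epochs have finished.

    Assumption: each decay step applies as soon as that many epochs are done.
    """
    _, steps = SCHEDULES[schedule]
    decays = sum(1 for step in steps if epochs_done >= step)
    return base_lr * (0.1 ** decays)
```

For instance, `lr_after("1x", 0.02, 8)` gives a rate ten times smaller than the starting 0.02, and after 11 epochs it is a hundred times smaller.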

Example:

faster_rcnn_r50_caffe_c4.py

model : faster_rcnn

backbone : r50 (ResNet-50)

neck : c4
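The breakdown above can also be done mechanically. The following is a best-effort sketch, not an official parser: `parse_config_name` and `KNOWN_NECKS` are hypothetical names, the backbone is assumed to look like r50 or x101, and optional fields such as model setting, misc, and dataset are not separated out.

```python
import re

# Neck names listed in the naming convention above.
KNOWN_NECKS = {"fpn", "pafpn", "nasfpn", "c4"}


def parse_config_name(filename):
    """Split a config file name into a few of the convention's fields."""
    stem = filename[:-3] if filename.endswith(".py") else filename
    tokens = stem.split("_")
    # The backbone token (r50, x101, ...) marks where the model name ends.
    b = next(i for i, t in enumerate(tokens) if re.fullmatch(r"[rx]\d+", t))
    rest = tokens[b + 1:]
    neck = next((t for t in rest if t in KNOWN_NECKS), None)
    # Schedule tags look like 1x, 2x, or 20e; None if the name has none.
    schedule = next((t for t in rest if re.fullmatch(r"\d+[xe]", t)), None)
    return {
        "model": "_".join(tokens[:b]),
        "backbone": tokens[b],
        "neck": neck,
        "schedule": schedule,
    }


info = parse_config_name("faster_rcnn_r50_caffe_c4.py")
```

For the file name in the example, `info` holds faster_rcnn as the model, r50 as the backbone, and c4 as the neck, with no schedule because the name does not contain one.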

Source: