32_DenseNet

2022. 4. 11. 23:15

DenseNet

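In this post, we look at a model called DenseNet.

DenseNet is easiest to explain by comparing it with ResNet.

ResNet adds shortcut connections (residual learning) only around individual blocks,

whereas DenseNet connects every layer in a dense block to all of the layers that follow it.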

A block with $L$ layers therefore has $\frac{L(L+1)}{2}$ direct connections.

In the figure's example with $L = 5$, that gives $\frac{5(5+1)}{2} = 15$ connections.
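
The count above can be checked with a minimal sketch. The function name `num_connections` is made up for this illustration. It assumes that layer i receives i inputs: the block input plus the outputs of the i-1 layers before it.

```python
def num_connections(num_layers: int) -> int:
    """Count direct connections in a densely connected block of num_layers layers."""
    # Layer i (1-based) receives i inputs: the block input plus the outputs of layers 1..i-1.
    return sum(i for i in range(1, num_layers + 1))


for L in (1, 5, 10):
    # The explicit sum agrees with the closed form L(L+1)/2.
    assert num_connections(L) == L * (L + 1) // 2
    print(L, num_connections(L))
```

For L = 5 this prints 15, matching the formula.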

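There is one more difference.

ResNet combines feature maps by summing them element-wise,

whereas DenseNet concatenates them along the channel dimension.

Because earlier features are reused directly, each layer can stay narrow, which keeps the amount of computation down

while preserving the feature maps produced by the earlier layers.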
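
To make the difference concrete, here is a small sketch that compares the two ways of combining feature maps. The tensors `x` and `f_x` are random placeholders standing in for a layer's input and output; they are not taken from the original post.

```python
import torch

# Two hypothetical feature maps with the same shape: (batch, channels, height, width).
x = torch.randn(1, 64, 8, 8)
f_x = torch.randn(1, 64, 8, 8)

# ResNet-style shortcut: element-wise sum, so the channel count stays at 64.
summed = x + f_x

# DenseNet-style connection: concatenate along the channel dimension (dim=1).
concatenated = torch.cat([x, f_x], dim=1)

print(summed.shape)        # torch.Size([1, 64, 8, 8])
print(concatenated.shape)  # torch.Size([1, 128, 8, 8])

# The original input is still present, unchanged, in the first 64 channels.
print(torch.equal(concatenated[:, :64], x))  # True
```

After the sum, `x` can no longer be recovered from the result. After concatenation, it is still available to every later layer.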

Each layer has direct access to the gradients from the loss function and to the original input signal,

which makes the network easier to train.

The paper also reports that dense connections have a regularizing effect, which reduces overfitting.

Growth rate

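Because of its structure, DenseNet has to concatenate feature maps over and over.

Each layer therefore needs a suitable number of output channels,

and the paper introduces the growth rate (k) for this purpose: every layer in a dense block adds k feature maps.

The first Conv2d outputs 2k channels,

and the 1x1 Conv2d in each bottleneck layer outputs 4k channels.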
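
The channel bookkeeping inside one dense block can be sketched in plain Python. The helper `dense_block_channels` is a hypothetical name used only here. It assumes k = 32 and a block of 6 bottleneck layers that starts from the 2k channels of the first convolution.

```python
def dense_block_channels(in_channels: int, growth_rate: int, num_layers: int):
    """Return (input, bottleneck, output) channel counts for each layer in one dense block."""
    rows = []
    channels = in_channels
    for _ in range(num_layers):
        bottleneck = 4 * growth_rate   # width after the 1x1 convolution
        out = channels + growth_rate   # k new feature maps are concatenated onto the input
        rows.append((channels, bottleneck, out))
        channels = out
    return rows


k = 32
rows = dense_block_channels(in_channels=2 * k, growth_rate=k, num_layers=6)
for row in rows:
    print(row)
```

The input width grows by k with every layer (64, 96, 128, ...), while the bottleneck width stays fixed at 4k = 128, so the 3x3 convolution always sees the same number of input channels.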

Inside each DenseBlock, the densely connected convolutions proposed in the paper are applied,

and the layers that follow it down-sample the feature maps with a convolution and pooling.
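
The effect of the down-sampling on the spatial size can be traced with the usual output-size formula for convolution and pooling layers. This is a sketch under stated assumptions: a 224x224 input, a stem made of a 7x7 stride-2 convolution and a 3x3 stride-2 max pooling, and three transition layers that each use an unpadded 1x1 convolution followed by 2x2 average pooling. The helper name `out_size` is made up for the example.

```python
def out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output height/width of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 224
size = out_size(size, kernel=7, stride=2, padding=3)  # stem convolution
size = out_size(size, kernel=3, stride=2, padding=1)  # stem max pooling
sizes = [size]
for _ in range(3):  # three transition layers
    size = out_size(size, kernel=1, stride=1, padding=0)  # 1x1 convolution keeps the size
    size = out_size(size, kernel=2, stride=2, padding=0)  # 2x2 average pooling halves it
    sizes.append(size)
print(sizes)  # [56, 28, 14, 7]
```

The dense blocks themselves do not change the spatial size, because their 3x3 convolutions use padding 1 and stride 1.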

To keep the number of channels from growing too large, the channel count is also reduced by a factor of 0.5 at this point.
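
Putting the growth rate and the 0.5 reduction together, the channel count through the whole network can be traced without building any layers. The function `trace_channels` is a hypothetical helper. It assumes k = 32, a 2k-channel stem, and the block sizes [6, 12, 24, 6] used in the code of this post.

```python
def trace_channels(nblocks, growth_rate=32, reduction=0.5):
    """Return the channel count after the stem, each dense block, and each transition layer."""
    channels = 2 * growth_rate                # stem convolution outputs 2k channels
    trace = [("stem", channels)]
    for i, num_layers in enumerate(nblocks):
        channels += growth_rate * num_layers  # each layer adds k channels
        trace.append((f"dense_block_{i}", channels))
        if i < len(nblocks) - 1:              # no transition after the last block
            channels = int(reduction * channels)
            trace.append((f"transition_{i}", channels))
    return trace


trace = trace_channels([6, 12, 24, 6])
for name, channels in trace:
    print(name, channels)
```

With these settings the final dense block ends at 704 channels, which is the input width of the classifier. With a last block of 16 layers instead of 6, it would end at 1024.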

DenseNet architecture

Code

import torch
import torch.nn as nn
from torchsummary import summary


#################################################################################
class BottleNeck(nn.Module):
    def __init__(self, in_channel, growth_rate):
        super(BottleNeck, self).__init__()
        self.residual = nn.Sequential(
            nn.BatchNorm2d(in_channel),
            nn.ReLU(),
            nn.Conv2d(in_channel, growth_rate * 4, kernel_size=1, stride=1, padding=0),
            nn.BatchNorm2d(growth_rate * 4),
            nn.ReLU(),
            nn.Conv2d(growth_rate * 4, growth_rate, kernel_size=3, stride=1, padding=1)
        )

    def forward(self, x):
        return torch.cat([x, self.residual(x)], 1)


class Transition(nn.Module):
    def __init__(self, in_channel, out_channel):
        super(Transition, self).__init__()
        self.down_sample = nn.Sequential(
            nn.BatchNorm2d(in_channel),
            nn.ReLU(),
            # A 1x1 convolution needs no padding; padding=1 would enlarge the feature map.
            nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=1, padding=0),
            nn.AvgPool2d(2, stride=2)
        )

    def forward(self, x):
        return self.down_sample(x)


class CustomDenseNet(nn.Module):
    def __init__(self, nblocks, growth_rate=32, reduction=0.5, num_classes=10, init_weights=True):
        super(CustomDenseNet, self).__init__()

        self.growth_rate = growth_rate
        inner_channel = 2 * growth_rate

        self.conv1 = nn.Sequential(
            nn.Conv2d(3, inner_channel, kernel_size=7, stride=2, padding=3),
            nn.MaxPool2d(3, 2, padding=1)
        )

        self.feature = nn.Sequential()
        for i in range(len(nblocks) - 1):
            self.feature.add_module('dense_block_{}'.format(i), self._make_dense_block(nblocks[i], inner_channel))
            inner_channel += growth_rate * nblocks[i]
            out_channel = int(reduction * inner_channel)  # keep the channel count from growing too large
            self.feature.add_module('transition_layer_{}'.format(i), Transition(inner_channel, out_channel))
            inner_channel = out_channel

        self.feature.add_module('dense_block_{}'.format(len(nblocks)-1),
                                self._make_dense_block(nblocks[len(nblocks) - 1], inner_channel))
        inner_channel += growth_rate * nblocks[len(nblocks) - 1]
        self.feature.add_module('bn', nn.BatchNorm2d(inner_channel))
        self.feature.add_module('relu', nn.ReLU())

        self.avg_pool = nn.AdaptiveAvgPool2d((1, 1))
        self.linear = nn.Linear(inner_channel, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.feature(x)
        x = self.avg_pool(x)
        x = x.view(x.size(0), -1)
        x = self.linear(x)
        return x

    def _make_dense_block(self, nblock, inner_channel):
        dense_block = nn.Sequential()
        for i in range(nblock):
            dense_block.add_module('bottle_neck_layer_{}'.format(i), BottleNeck(inner_channel, self.growth_rate))
            inner_channel += self.growth_rate
        return dense_block


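if __name__ == '__main__':
    # torchsummary builds its dummy input on `device`, so the model must be on the same device.
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    # Block sizes as used in this post; DenseNet-121 in the paper uses [6, 12, 24, 16].
    model = CustomDenseNet([6, 12, 24, 6]).to(device)
    # check model
    summary(model, (3, 224, 224), device=device)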

Source:

https://deep-learning-study.tistory.com/545
