Paper Notes (FCN for Semantic Segmentation): Fully Convolutional Networks for Semantic Segmentation

Date: 2023-05-25 02:39:13

Timeline of semantic segmentation architectures:

FCN

SegNet

Dilated Convolutions

DeepLab (v1 & v2)

RefineNet

PSPNet

Large Kernel Matters

DeepLab v3

The FCN model: fully convolutional networks (CVPR)

Reference:

Fully Convolutional Networks for Semantic Segmentation

Jonathan Long∗ Evan Shelhamer∗ Trevor Darrell

UC Berkeley

FCN is trained end-to-end and pixel-to-pixel. Its advantages are pixel-level prediction and the ability to use supervised pre-training.

Semantic segmentation faces an inherent tension between semantics and location: global information resolves what while local information resolves where.

Convnets are built on translation invariance, because a convolution depends only on relative spatial coordinates.

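The authors point out that a fully connected layer can be viewed as a convolution whose kernel covers the entire input feature map. (This view had in fact already been stated in one of the classic ImageNet papers, originally as a way to handle test images whose size differs from the training size. The fully connected layers are converted into convolutional layers sized to the training patch, so a training patch yields a single vector of class scores, while a larger test image yields, at each spatial location, the probability of each class. For image classification, the outputs at all spatial locations are simply averaged.)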
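The equivalence described above can be checked on a toy case. The following is a minimal sketch in plain Python, assuming a single-channel input and no nonlinearity; the function names `dense` and `dense_as_conv` are hypothetical and chosen only for illustration.

```python
import random

def dense(patch, weights, bias):
    """Fully connected layer: one score per class from a flattened h x w patch."""
    flat = [v for row in patch for v in row]
    return [sum(wv * v for wv, v in zip(w_k, flat)) + b_k
            for w_k, b_k in zip(weights, bias)]

def dense_as_conv(image, weights, bias, h, w):
    """Slide the same weights over the image as h x w kernels (stride 1, no padding).

    Returns maps[k][i][j], the score of class k for the window whose
    top-left corner is at (i, j).
    """
    H, W = len(image), len(image[0])
    maps = []
    for w_k, b_k in zip(weights, bias):
        plane = []
        for i in range(H - h + 1):
            row = []
            for j in range(W - w + 1):
                s = b_k
                for di in range(h):
                    for dj in range(w):
                        s += w_k[di * w + dj] * image[i + di][j + dj]
                row.append(s)
            plane.append(row)
        maps.append(plane)
    return maps

# Toy setup (all sizes are assumptions): 3 x 3 training patch, 2 classes, 5 x 6 test image.
random.seed(0)
h, w, num_classes = 3, 3, 2
weights = [[random.uniform(-1, 1) for _ in range(h * w)] for _ in range(num_classes)]
bias = [0.5, -0.5]
image = [[random.uniform(0, 1) for _ in range(6)] for _ in range(5)]

score_maps = dense_as_conv(image, weights, bias, h, w)   # 2 maps of size 3 x 4
# Classification on a larger image: average each class map over all locations.
averaged = [sum(sum(r) for r in m) / (len(m) * len(m[0])) for m in score_maps]
```

Each entry of `score_maps` equals what `dense` would return on the corresponding 3 x 3 crop, which is why the converted network produces a spatial map of class scores instead of a single vector.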

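This introduces a problem, however: because of strides and pooling, the resulting class heatmap is coarse, subsampled by a factor equal to the stride of the receptive fields of the output units. Some method is therefore needed to bring the resolution back up, and the paper uses deconvolution. In a sense, upsampling with factor f is convolution with a fractional input stride of 1/f. So long as f is integral, a natural way to upsample is therefore backwards convolution (sometimes called deconvolution) with an output stride of f. The deconvolution layer does not have to be fixed (for example, to bilinear interpolation); deconvolution layers with activation functions can learn a nonlinear upsampling. The authors also improve efficiency through whole-image training.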
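To make the upsampling described above concrete, here is a minimal 1D sketch in plain Python of backwards (transposed) convolution with output stride f, using a fixed bilinear kernel. The kernel construction is one common convention and is an assumption here, not something stated in these notes; the function names are hypothetical. In a learned layer the kernel values would be trainable parameters instead of fixed.

```python
def bilinear_kernel_1d(f):
    """1D bilinear interpolation weights for integer upsampling factor f."""
    size = 2 * f - f % 2
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    return [1 - abs(t - center) / factor for t in range(size)]

def transposed_conv_1d(x, kernel, stride):
    """Backwards convolution: each input value scatters a scaled copy of the
    kernel into the output, and consecutive copies are `stride` samples apart."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for t, k in enumerate(kernel):
            out[i * stride + t] += v * k
    return out

def upsample_1d(x, f):
    """Upsample x by factor f and crop the border so the result has len(x) * f samples."""
    kernel = bilinear_kernel_1d(f)
    full = transposed_conv_1d(x, kernel, f)
    crop = (len(kernel) - f) // 2
    return full[crop:crop + len(x) * f]

kernel_2x = bilinear_kernel_1d(2)          # [0.25, 0.75, 0.75, 0.25]
ramp_up = upsample_1d([0.0, 2.0, 4.0], 2)  # interior samples interpolate the ramp linearly
```

The 2D case applies the same idea along both axes, with the 2D kernel being the outer product of two 1D kernels. Samples near the border are attenuated because fewer kernel copies overlap there.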

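Structure of the segmentation network: take ILSVRC classifiers, convert them to dense prediction with a pixel-wise loss and in-network upsampling, and then train a segmentation network by fine-tuning. In addition, skip connections between layers are added to fuse coarse semantic information with local appearance information.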
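As an illustration of the pixel-wise loss mentioned above, the sketch below computes a softmax cross-entropy at every pixel and sums over the image. It is a plain-Python sketch under assumed conventions (scores stored as `scores[k][i][j]`, integer labels, no ignored pixels); the function name is hypothetical.

```python
import math

def pixelwise_cross_entropy(scores, labels):
    """Sum over all pixels of the softmax cross-entropy.

    scores[k][i][j] is the score of class k at pixel (i, j);
    labels[i][j] is the ground-truth class index of that pixel.
    """
    total = 0.0
    for i, row in enumerate(labels):
        for j, target in enumerate(row):
            logits = [plane[i][j] for plane in scores]
            m = max(logits)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(v - m) for v in logits))
            total += log_z - logits[target]
    return total

# Toy case: 3 classes, 2 x 2 image, all scores equal, so every pixel costs log(3).
toy_scores = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
toy_labels = [[0, 1], [2, 0]]
toy_loss = pixelwise_cross_entropy(toy_scores, toy_labels)
```

Dividing the sum by the number of pixels gives a mean loss; the two differ only by a constant scale for a fixed image size.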

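Here FCN-32s upsamples the final prediction directly by a factor of 32, while the 16× and 8× variants (FCN-16s and FCN-8s) additionally use information from pool4, and from both pool4 and pool3, respectively, which recovers detail better.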
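The fusion order of the three variants can be sketched as follows. This is a shape-level illustration in plain Python for a single class, assuming a 64 x 64 input so that pool3, pool4, and the final layer produce 8 x 8, 4 x 4, and 2 x 2 score maps. Nearest-neighbour upsampling stands in for the learned deconvolution, the padding and cropping a real network needs are omitted, and all names are hypothetical.

```python
def upsample_nn(m, f):
    """Nearest-neighbour f-times upsampling (a stand-in for deconvolution)."""
    return [[v for v in row for _ in range(f)] for row in m for _ in range(f)]

def add(a, b):
    """Element-wise sum of two score maps with the same spatial size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def fcn_outputs(score_pool3, score_pool4, score_final):
    # FCN-32s: upsample the stride-32 prediction directly.
    fcn32 = upsample_nn(score_final, 32)
    # FCN-16s: 2x upsample, fuse with pool4 scores, then 16x upsample.
    fused16 = add(upsample_nn(score_final, 2), score_pool4)
    fcn16 = upsample_nn(fused16, 16)
    # FCN-8s: 2x upsample the fused map, fuse with pool3 scores, then 8x upsample.
    fused8 = add(upsample_nn(fused16, 2), score_pool3)
    fcn8 = upsample_nn(fused8, 8)
    return fcn32, fcn16, fcn8

# Constant toy score maps at strides 8, 16, and 32 for a 64 x 64 input.
pool3_scores = [[0.25] * 8 for _ in range(8)]
pool4_scores = [[0.5] * 4 for _ in range(4)]
final_scores = [[1.0] * 2 for _ in range(2)]
fcn32, fcn16, fcn8 = fcn_outputs(pool3_scores, pool4_scores, final_scores)
```

All three outputs come back at the input resolution; they differ in how fine the last fused map was before the final upsampling step.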

Results:
