In the paper "Generating High-Quality Crowd Density Maps using Contextual Pyramid CNNs", Section 3.4 says:

Since, the aim of this work is to estimate high-resolution and high-quality density maps, F-CNN is constructed using a set of convolutional and fractionally-strided convolutional layers. The set of fractionally-strided convolutional layers help us to restore details in the output density maps. The following structure is used for F-CNN: CR(64,9)-CR(32,7)- TR(32)-CR(16,5)-TR(16)-C(1,1), where, C is convolutional layer, R is ReLU layer, T is fractionally-strided convolution layer and the first number inside every brace indicates the number of filters while the second number indicates filter size. Every fractionally-strided convolution layer increases the input resolution by a factor of 2, thereby ensuring that the output resolution is the same as that of input.

I would like to know the details of the fractionally-strided convolution layer.

Ethan
Haha TTpro

1 Answer

Here is an animation of fractionally-strided convolution (from this GitHub project):

In the animation, the dashed white cells are rows and columns of zeros inserted between the input cells (blue). These animations visualize the arithmetic described in the article below:

A guide to convolution arithmetic for deep learning

Here is a quote from the article:

Figure [..] helps understand what fractional strides involve: zeros are inserted between input units, which makes the kernel move around at a slower pace than with unit strides [footnote: doing so is inefficient and real-world implementations avoid useless multiplications by zero, but conceptually it is how the transpose of a strided convolution can be thought of.]
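To make the quoted description concrete, here is a minimal 1-D sketch in plain Python. It is only an illustration, not how any library implements the layer, and the function names `transposed_conv1d` and `zero_insert_conv1d` are made up for this example. The first function uses the direct definition (each input value scatters a scaled copy of the kernel); the second follows the quote: insert zeros between the input units, pad the borders, then slide the flipped kernel with unit stride. Both give the same result.

```python
def transposed_conv1d(x, kernel, stride):
    """Direct definition: each input value scatters a scaled copy of the kernel."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, xi in enumerate(x):
        for j, kj in enumerate(kernel):
            out[i * stride + j] += xi * kj
    return out


def zero_insert_conv1d(x, kernel, stride):
    """Same result via zero insertion followed by a unit-stride convolution."""
    # Insert (stride - 1) zeros between neighbouring input values.
    dilated = []
    for i, xi in enumerate(x):
        dilated.append(xi)
        if i < len(x) - 1:
            dilated.extend([0.0] * (stride - 1))
    # Pad with (kernel size - 1) zeros on both sides.
    pad = [0.0] * (len(kernel) - 1)
    padded = pad + dilated + pad
    # Slide the flipped kernel over the padded signal with stride 1.
    flipped = kernel[::-1]
    n_out = len(padded) - len(kernel) + 1
    return [
        sum(padded[t + m] * flipped[m] for m in range(len(kernel)))
        for t in range(n_out)
    ]


x = [1.0, 2.0, 3.0]
kernel = [1.0, 10.0, 100.0]
a = transposed_conv1d(x, kernel, stride=2)
b = zero_insert_conv1d(x, kernel, stride=2)
print(a)  # [1.0, 10.0, 102.0, 20.0, 203.0, 30.0, 300.0]
print(b)  # same values
```

The second function performs many multiplications by zero, which is the inefficiency the footnote mentions; the first one avoids them.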


Also, here is a post on this site asking "What are deconvolutional layers?" which is the same thing.

And here are two quotes from a post by Paul-Louis Pröve on different types of convolutions:

Transposed Convolutions (a.k.a. deconvolutions or fractionally strided convolutions)

and

Some sources use the name deconvolution, which is inappropriate because it’s not a deconvolution [..] An actual deconvolution reverts the process of a convolution.
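As for the factor-of-2 upsampling mentioned in the passage quoted in the question: along one axis, a transposed convolution with no output padding and no dilation produces `(n - 1) * stride + kernel - 2 * padding` outputs. The quoted passage does not give the kernel size or padding of the T layers, so the values below (kernel 4, stride 2, padding 1) are only an assumed, commonly used combination that doubles the resolution exactly. The helper name `transposed_conv_output_size` is hypothetical.

```python
def transposed_conv_output_size(n, kernel, stride, padding=0):
    """Output length of a transposed convolution along one axis.

    Assumes no dilation and no extra output padding.
    """
    return (n - 1) * stride + kernel - 2 * padding


# Assumed settings (not taken from the paper): kernel=4, stride=2, padding=1.
sizes = [transposed_conv_output_size(n, kernel=4, stride=2, padding=1)
         for n in (8, 16, 32)]
print(sizes)  # [16, 32, 64]
```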

Esmailian