When you build an AI model that outputs images, for example for semantic segmentation, you have to enlarge the image again after convolution has reduced its size. Deconvolution is one of the operations used for this purpose.
The calculation process of convolution is described in detail in Lesson 15, but stride and padding were not explained there. We discuss them here first, because they are necessary for explaining deconvolution.
The following is the example of convolution on a grayscale image shown in Lesson 15.
In the example above, the filter moves to the right and downward by one pixel per calculation. We can also move it by two pixels at a time. The number of pixels the filter moves per step is called the stride. Please see the example below.
If we add null pixels around an image before convolution, the resulting image will be larger. The width of this border of null pixels is called the padding. The following is an example of convolution with a stride of 3 and a padding of 2.
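As a rough illustration of how stride and padding change the result size, here is a minimal NumPy sketch of single-channel convolution. It assumes that null pixels have the value 0 and that the filter is applied as a sliding sum of products without flipping; the function name `conv2d` and the 5×5 image and 3×3 filter are hypothetical and are not taken from the lesson's figures.

```python
import numpy as np

def conv2d(image, kernel, stride=1, padding=0):
    """Single-channel convolution (no kernel flip) with stride and zero padding."""
    # Surround the image with `padding` rows/columns of zeros on every side.
    padded = np.pad(image, padding, mode="constant", constant_values=0)
    kh, kw = kernel.shape
    # Number of filter positions along each axis.
    out_h = (padded.shape[0] - kh) // stride + 1
    out_w = (padded.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # The filter's top-left corner moves `stride` pixels per step.
            top, left = i * stride, j * stride
            window = padded[top:top + kh, left:left + kw]
            out[i, j] = np.sum(window * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)  # hypothetical 5x5 image
kernel = np.ones((3, 3))                          # hypothetical 3x3 filter
print(conv2d(image, kernel, stride=1, padding=0).shape)  # (3, 3)
print(conv2d(image, kernel, stride=2, padding=0).shape)  # (2, 2)
print(conv2d(image, kernel, stride=2, padding=1).shape)  # (3, 3)
```

A larger stride shrinks the output, while padding enlarges it.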
Now, we are going to see what deconvolution is.
Unlike convolution, deconvolution expands the input data before the calculation.
As we have already seen, we can change the output image size by choosing the filter size, stride, and padding appropriately. Deconvolution is the reverse of convolution with regard to the relation between these parameters (filter size, stride, and padding) and the output image size. Note, however, that it is not simply the inverse of the convolution calculation: it restores the image size, not the original pixel values.
Let's look back at the example above. The input parameters were:
Input Image Size : ×
Filter Size : ×
Stride : ×
Padding : ×
The output size is × .
Therefore, if you execute deconvolution with
Input Image Size : ×
Filter Size : ×
Stride : ×
Padding : ×
as input parameters, the output size is, × .
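The size relation can be written as two small helper functions. This is only a sketch: the names `conv_output_size` and `deconv_output_size` are hypothetical, images and filters are assumed to be square, and the deconvolution formula is derived from the four steps described below rather than quoted from the lesson. The sizes used (3×3 input, 3×3 filter) are illustrative values, not those of the example above.

```python
def conv_output_size(n, k, stride, padding):
    """Side length after convolving an n x n image with a k x k filter."""
    return (n + 2 * padding - k) // stride + 1

def deconv_output_size(n, k, stride, padding):
    """Side length after deconvolving an n x n image with a k x k filter."""
    return (n - 1) * stride + k - 2 * padding

# Hypothetical 3x3 input and 3x3 filter.
for stride, padding in [(1, 0), (2, 0), (2, 1)]:
    big = deconv_output_size(3, 3, stride, padding)
    small = conv_output_size(big, 3, stride, padding)
    print(stride, padding, big, small)
# prints:
# 1 0 5 3
# 2 0 7 3
# 2 1 5 3
```

In each case, convolving the enlarged image with the same parameters returns to the original side length of 3.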
Now, let's go through the calculation process in detail.
(1) Adding null pixels between pixels according to the stride
Insert null pixels between adjacent pixels of the input data. The number of null pixels in each gap is the stride minus one. With this, the same calculation as convolution with a stride of 1 produces an output image of the expected size.
(2) Adding null pixels around the input image according to the filter size
Add null pixels around the input data. The number of null pixels added on each side is the filter size minus one. With this, even if the filter size is larger than one, the same calculation as convolution with a stride of 1 produces an output image of the expected size.
(3) Deleting null pixels around the input image according to the padding
Delete null pixels around the input data. The number of pixels deleted from each side is the padding value. Padding in convolution means adding null pixels to the input data, so in deconvolution, deleting null pixels from the input data gives the expected output image size.
(4) Conducting convolution with a stride of 1
Apply convolution to the input data created in steps (1) to (3). Be sure to use 1 as the stride value.
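The four steps can be sketched in NumPy as follows. This is an illustration under stated assumptions, not a reference implementation: null pixels are taken to be 0, the convolution in step (4) is a sliding sum of products without flipping the filter, and the name `deconv2d` and the 3×3 test data are hypothetical. Deep learning libraries may orient the filter differently.

```python
import numpy as np

def deconv2d(image, kernel, stride=1, padding=0):
    """Single-channel deconvolution following steps (1) to (4)."""
    h, w = image.shape
    kh, kw = kernel.shape
    # (1) Insert (stride - 1) zeros between adjacent pixels.
    dilated = np.zeros(((h - 1) * stride + 1, (w - 1) * stride + 1), dtype=float)
    dilated[::stride, ::stride] = image
    # (2) Add (filter size - 1) zeros on every side.
    expanded = np.pad(dilated, ((kh - 1, kh - 1), (kw - 1, kw - 1)), mode="constant")
    # (3) Delete `padding` rows and columns from every side.
    if padding > 0:
        expanded = expanded[padding:-padding, padding:-padding]
    # (4) Convolve with a stride of 1.
    out_h = expanded.shape[0] - kh + 1
    out_w = expanded.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(expanded[i:i + kh, j:j + kw] * kernel)
    return out

image = np.ones((3, 3))   # hypothetical 3x3 input
kernel = np.ones((3, 3))  # hypothetical 3x3 filter
print(deconv2d(image, kernel, stride=1, padding=0).shape)  # (5, 5)
print(deconv2d(image, kernel, stride=2, padding=0).shape)  # (7, 7)
print(deconv2d(image, kernel, stride=2, padding=1).shape)  # (5, 5)
```

The explicit loops are kept for readability; they mirror the step-by-step description rather than aiming for speed.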
A text-only explanation may be hard to follow, so let's look at an example.
Suppose the input image size is × and the filter size is × . The following is an example.
Let's see how the process works for several combinations of stride and padding values.
♦ Stride=1 , Padding=0
->
♦ Stride=2 , Padding=0
->
♦ Stride=2 , Padding=1
->
If you apply convolution with the same parameters to the output image of each deconvolution example, the output image size will be × . This confirms the earlier statement that deconvolution is the reverse of convolution with regard to the relation between the parameters (filter size, stride, and padding) and the output image size.
The procedure for multi-channel images is the same as for convolution. You can get a multi-channel output image by using two or more filter groups.
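For the multi-channel case, a possible sketch is shown below. It assumes that each filter group holds one filter per input channel and produces one output channel by summing over the input channels, as in ordinary multi-channel convolution. The array layouts `(channels, height, width)` and `(groups, channels, height, width)`, and the name `deconv2d_multi`, are assumptions made for illustration.

```python
import numpy as np

def deconv2d_multi(image, filters, stride=1, padding=0):
    """Multi-channel deconvolution: one output channel per filter group."""
    c_in, h, w = image.shape
    c_out, c_in_f, kh, kw = filters.shape
    if c_in != c_in_f:
        raise ValueError("each filter group needs one filter per input channel")
    # Insert (stride - 1) zeros between pixels in every channel.
    dilated = np.zeros((c_in, (h - 1) * stride + 1, (w - 1) * stride + 1), dtype=float)
    dilated[:, ::stride, ::stride] = image
    # Add (filter size - 1) zeros around every channel, then remove the padding.
    expanded = np.pad(dilated, ((0, 0), (kh - 1, kh - 1), (kw - 1, kw - 1)), mode="constant")
    if padding > 0:
        expanded = expanded[:, padding:-padding, padding:-padding]
    out_h = expanded.shape[1] - kh + 1
    out_w = expanded.shape[2] - kw + 1
    out = np.zeros((c_out, out_h, out_w), dtype=float)
    for o in range(c_out):
        for i in range(out_h):
            for j in range(out_w):
                # Sum of products over all input channels for filter group `o`.
                window = expanded[:, i:i + kh, j:j + kw]
                out[o, i, j] = np.sum(window * filters[o])
    return out

rgb = np.ones((3, 4, 4))         # hypothetical 3-channel 4x4 input
filters = np.ones((2, 3, 3, 3))  # hypothetical: 2 filter groups of 3x3 filters
print(deconv2d_multi(rgb, filters, stride=2, padding=1).shape)  # (2, 7, 7)
```

With two filter groups, the output has two channels regardless of the number of input channels.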