rschandrastechblog: Why 7*7*256 as the number of units of input dense layer ?

Friday, May 21, 2021

Why 7*7*256 as the number of units of the input dense layer?

This refers to Google's TensorFlow MNIST DCGAN tutorial.

The first Dense layer of the generator is configured with 7*7*256 units, and I was not able to find an explanation for this number in the tutorial.

My impression about this is as follows:

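Remember that we want a 28x28 grayscale image as the output of the generator. That means the required output shape is (None, 28, 28, 1), where the first entry is the batch size. Keras shows it as None because the batch size is not fixed when the model is defined; any number of images can be generated in one call.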

Now note that a Conv2DTranspose layer with strides=(2,2) upsamples its input by a factor of 2: it doubles the height and the width (this holds with padding='same', which is what the tutorial uses). Secondly, the number of filters of a Conv2DTranspose layer becomes the number of output channels, so if I want the output to be grayscale, the number of filters should be 1. Thus, if I want (None, 28, 28, 1) at the output of a Conv2DTranspose layer, the shape of its input should be (None, 14, 14, x). (The number of output channels is decided by the current layer, so x can be any value at the input.)

Suppose I put one more Conv2DTranspose layer with strides=(2,2) before that one. The input to this layer should then be (None, 7, 7, x), where x is the number of channels coming from whatever layer precedes it.
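The backward reasoning above can be sketched in a few lines of plain Python. This is only an illustration, assuming every stride-2 layer exactly doubles height and width (true for padding='same'); the helper name `required_input_size` is made up for this post and is not part of Keras.

```python
def required_input_size(output_size, stride):
    """Spatial size needed at the input of a transposed conv (padding='same')
    so that its output has `output_size` pixels along that axis."""
    if output_size % stride != 0:
        raise ValueError(f"{output_size} is not divisible by stride {stride}")
    return output_size // stride

# Walk from the generator output back towards its input.
size = 28
sizes = [size]
for stride in (2, 2):  # the two upsampling Conv2DTranspose layers
    size = required_input_size(size, stride)
    sizes.append(size)

print(sizes)  # [28, 14, 7]
```

Two halvings take 28 down to 7, which is where the 7x7 in the Dense layer comes from.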

In general, if a batch of images of size (h, w) is input to a Conv2DTranspose layer with strides=(2,2) and padding='same', its output will have shape (batch_size, 2*h, 2*w, no_of_filters).
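As a rough check of this rule, here is a small pure-Python sketch of the shape arithmetic. The function name `conv2d_transpose_output_shape` is hypothetical, the 5x5 default kernel is an assumption, and the 'valid' branch reflects my understanding of how Keras sizes the output when no output padding is given, so treat it as an approximation rather than the library's definition.

```python
def conv2d_transpose_output_shape(input_shape, filters, strides,
                                  padding="same", kernel_size=(5, 5)):
    """Estimate the output shape of a Conv2DTranspose layer (channels last)."""
    batch, h, w, _in_channels = input_shape  # input channels do not matter

    def out_len(n, stride, kernel):
        if padding == "same":
            return n * stride                            # pure upsampling
        if padding == "valid":
            return n * stride + max(kernel - stride, 0)  # kernel overhang added
        raise ValueError(f"unknown padding: {padding}")

    return (batch,
            out_len(h, strides[0], kernel_size[0]),
            out_len(w, strides[1], kernel_size[1]),
            filters)                                     # filters become channels

print(conv2d_transpose_output_shape((None, 7, 7, 128), filters=64, strides=(2, 2)))
# (None, 14, 14, 64)
```

Note that the exact doubling only holds for padding='same'; with padding='valid' the same 7x7 input would come out as 17x17 in this sketch.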

The Google tutorial puts one more Conv2DTranspose layer before these two [but with strides=(1,1), so it has no upsampling effect], and a Dense layer before that. These layers do no upsampling, so the spatial shape at that point stays 7x7; 7x7 is the image size there. The first Dense layer's output is flat, so if it has 7*7*x units, we can always reshape it into a (7, 7, x) image.
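To see the whole chain at once, the sketch below traces the shapes from the Dense layer to the final image in plain Python. The layer names are made up, and the filter counts 128 and 64 are my recollection of the tutorial rather than something stated above, so take them as assumptions; only the strides and the final single filter matter for the argument.

```python
# (layer name, filters, stride) for the transposed convolutions.
# Filter counts 128 and 64 are assumed; names are hypothetical.
conv_layers = [
    ("conv_t_1", 128, 1),  # strides=(1,1): no upsampling
    ("conv_t_2", 64, 2),   # strides=(2,2): 7 -> 14
    ("conv_t_3", 1, 2),    # strides=(2,2): 14 -> 28, one grayscale channel
]

def trace_generator_shapes(start_hw=7, start_channels=256):
    """List (layer name, output shape) pairs, assuming padding='same'."""
    units = start_hw * start_hw * start_channels
    shapes = [
        ("dense", (None, units)),                                 # flat vector
        ("reshape", (None, start_hw, start_hw, start_channels)),  # small image
    ]
    h = w = start_hw
    for name, filters, stride in conv_layers:
        h, w = h * stride, w * stride  # spatial size scales by the stride
        shapes.append((name, (None, h, w, filters)))
    return shapes

shapes = trace_generator_shapes()
for name, shape in shapes:
    print(name, shape)
```

The Dense layer ends up with 7*7*256 = 12544 units, and the last line of the trace is the (None, 28, 28, 1) image we wanted.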

This is the reasoning behind the 7*7*x units of the first Dense layer. The value 256 looks like an arbitrary choice for x, which they might have arrived at empirically, I guess.
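To get a feel for what choosing x costs, the following sketch lists the size of the first Dense layer for a few values of x. The noise vector length of 100 is an assumption on my part (it is not mentioned above), bias terms are ignored, and `first_dense_size` is a made-up helper.

```python
NOISE_DIM = 100  # assumed length of the generator's input noise vector

def first_dense_size(channels, height=7, width=7, noise_dim=NOISE_DIM):
    """Return (units, weights) for a Dense layer feeding a (7, 7, x) reshape."""
    units = height * width * channels
    weights = noise_dim * units  # one weight per (input, unit) pair, no bias
    return units, weights

table = {x: first_dense_size(x) for x in (64, 128, 256, 512)}
for x, (units, weights) in table.items():
    print(f"x={x}  units={units}  weights={weights}")
```

Any of these values reshapes cleanly to (7, 7, x); a larger x just gives the following layers more channels to work with, at the price of more weights.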

Another question on Stack Overflow

https://stackoverflow.com/questions/66844444/tensorflow-tutorial-dcgan-models-for-different-size-images

My answer to a question on SO

https://stackoverflow.com/questions/56081975/output-dimension-of-reshape-layer