Hi folks! This is a little post where I'm leaving a bunch of single-line notes on things I've found useful in TensorFlow for image-based tasks. I mostly have no idea what I'm doing, but I've stumbled on some things that work well for me. Maybe you'll find them useful!
- The RMSPropOptimizer seems to be very good for a wide range of tasks; 0.001 seems to be a good default learning rate.
- Deeper networks need more training rounds to converge. Anecdotally, doubling the number of training rounds per layer (1000, 2000, 4000, etc.) seems to work well.
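I find an optimizer easier to trust once I've poked at its update rule, so here's what RMSProp does per parameter, sketched in plain numpy. The decay and epsilon values are what I believe TensorFlow's defaults to be; treat them as assumptions, not gospel:

```python
import numpy as np

def rmsprop_step(w, grad, cache, lr=0.001, decay=0.9, eps=1e-10):
    # keep a running average of squared gradients...
    cache = decay * cache + (1 - decay) * grad ** 2
    # ...and scale each step by it, so every parameter gets a similar-sized step
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# minimise f(w) = w^2, starting from w = 5.0
w, cache = 5.0, 0.0
for _ in range(10000):
    w, cache = rmsprop_step(w, 2 * w, cache)
print(abs(w) < 0.1)  # w has converged toward the minimum at 0
```

The per-parameter scaling is why it works across such a range of tasks: you don't have to hand-tune the learning rate to the gradient magnitudes of your particular network.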
- The inception module is a very good tool for image discrimination tasks (e.g. recognition, the discriminator in a GAN, the encoder in a VAE). Most of the implementations on the internet are complicated; here's mine.
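The core idea is small enough to sketch: run a few branches in parallel and concatenate their outputs along the channel axis. Here's a rough numpy sketch of just that structure; the 3x3/5x5 spatial branches are replaced with 1x1 stand-ins to keep it short, and the branch widths are made up:

```python
import numpy as np

def conv1x1(x, out_ch):
    # a 1x1 convolution is just a per-pixel matmul over the channel axis
    in_ch = x.shape[-1]
    w = np.random.randn(in_ch, out_ch) * 0.1
    return x @ w

def inception_block(x):
    # hypothetical branch widths; a real module tunes these per layer
    b1 = conv1x1(x, 16)   # 1x1 branch
    b2 = conv1x1(x, 16)   # stand-in for the 1x1 -> 3x3 branch
    b3 = conv1x1(x, 8)    # stand-in for the 1x1 -> 5x5 branch
    b4 = conv1x1(x, 8)    # stand-in for the pool -> 1x1 branch
    # the key idea: parallel branches, concatenated along channels
    return np.concatenate([b1, b2, b3, b4], axis=-1)

x = np.random.randn(1, 28, 28, 3)
y = inception_block(x)
print(y.shape)  # (1, 28, 28, 48)
```

The network gets to learn which filter size matters at each depth instead of you picking one, which is (I think) why it discriminates so well.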
- Directly connecting dense layers to un-downsampled convolutional layers will cause a computational-complexity/memory explosion. As a rule of thumb, a dense layer should have no more than 2048 inputs.
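To see why, a quick back-of-the-envelope (the layer sizes here are made up for illustration):

```python
# weight count of a dense layer = inputs * units (ignoring biases)
h, w, ch, units = 64, 64, 32, 1024

undownsampled = h * w * ch       # flattening 64x64x32 -> 131072 inputs
print(undownsampled * units)     # 134217728 weights, ~512 MiB as float32

downsampled = 8 * 8 * ch         # pool down to 8x8 first -> 2048 inputs
print(downsampled * units)       # 2097152 weights, ~8 MiB as float32
```

Downsampling by 8x in each spatial dimension cuts the dense layer's weight matrix by 64x, which is the difference between fitting on a GPU and not.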
- If you're building a GAN or VAE, DCGAN has a very good decoder architecture; here's my implementation.
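The shape of that decoder is easy to sketch: project the latent vector to a small spatial grid, then repeatedly double the resolution with stride-2 transposed convolutions while halving the channel count, ending with an RGB projection. The widths below are illustrative, not the exact paper values:

```python
# DCGAN-style decoder shape progression (illustrative channel widths):
# project z to a 4x4 grid, then double resolution / halve channels each layer
size, ch = 4, 512
layers = [(size, ch)]
while size < 32:
    size, ch = size * 2, ch // 2
    layers.append((size, ch))
layers.append((64, 3))  # final stride-2 transposed conv straight to RGB
print(layers)           # [(4, 512), (8, 256), (16, 128), (32, 64), (64, 3)]
```

Each tuple is (spatial size, channels); the decoder trades channel depth for spatial resolution one layer at a time, which tends to give much cleaner images than upsampling in one big jump.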
- When you're building a GAN or VAE, it's pretty normal for all the output to look exactly the same for the first few hundred generations. Don't worry; give it some time.
- Visualise everything you can (losses, weights, generated images); TensorBoard makes this cheap.
- tf.nn.sparse_softmax_cross_entropy_with_logits is the right loss metric when your classification problem has mutually exclusive classes (e.g. MNIST); it takes integer class labels rather than one-hot vectors.
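If it helps to see what that op actually computes, here's the same loss in numpy: softmax over the logits, then the negative log-probability of the integer label:

```python
import numpy as np

def sparse_softmax_xent(logits, labels):
    # subtract the max for numerical stability (softmax is shift-invariant)
    logits = logits - logits.max(axis=-1, keepdims=True)
    # log-softmax, then pick out the log-probability of each true class
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels]

logits = np.array([[2.0, 0.5, 0.1],   # example 0: class 0 is favoured
                   [0.2, 3.0, 0.3]])  # example 1: class 1 is favoured
labels = np.array([0, 1])             # integer class ids, not one-hot
losses = sparse_softmax_xent(logits, labels)
print(losses)  # small losses, since both predictions favour the true class
```

The "sparse" in the name just means the labels are integer ids; the non-sparse variant expects one-hot probability vectors instead.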
Please hit me up on Twitter, @penelopezone, if any of this strikes you as wrong!