This book starts by setting up a Python virtual environment with the deep learning framework TensorFlow and then introduces TensorFlow's fundamental concepts. Before moving on to computer vision, you will learn about neural networks and related topics such as loss functions, gradient descent optimization, activation functions, and how backpropagation is used to train multi-layer perceptrons.
To understand how a Convolutional Neural Network (CNN) is applied to computer vision problems, you first need to learn the basic convolution operation. You will learn how a CNN differs from a multi-layer perceptron, with a thorough discussion of the building blocks of the CNN architecture, such as kernel size, stride, padding, and pooling. Finally, you will learn how to build a small CNN model.
Next, you will learn about popular CNN architectures such as AlexNet, VGGNet, Inception, and ResNet, along with object detection algorithms such as R-CNN, SSD, and YOLO. The book concludes with a chapter on sequential models, where you will learn about RNNs, GRUs, and LSTMs, their architectures, and their applications in machine translation, image and video captioning, and video classification.