Build your first Neural Network for Image Restoration

Computer Graphics Jan 7, 2021

Alongside its RTX 3000 series graphics cards, Nvidia has heavily promoted DLSS 2.0 (Deep Learning Super Sampling), which uses artificial intelligence and machine learning to produce an image that looks like a higher-resolution image, without the rendering overhead. Nvidia’s algorithm learns from tens of thousands of rendered image sequences that were created using a supercomputer. That training lets the algorithm produce similarly beautiful images without requiring the graphics card to work as hard.

Playing Watch Dogs: Legion and Cyberpunk 2077 on GeForce NOW, I found that the games look really good with DLSS on and drop fewer frames than with DLSS off.

Simply put, DLSS renders frames at a lower resolution, say 1440p, and upscales them with a model that Nvidia has trained exhaustively to generate better-looking games. Recovering a high-resolution image from a low-resolution input in this way is called Super Resolution (SR).

SR-CNN Model Architecture

In this (introductory) tutorial, we build an SR-CNN model that learns an end-to-end mapping from low-resolution to high-resolution images. Once trained, it can be used to improve the quality of low-resolution images.

The Dataset:

The Set5 and Set14 image sets, taken from the MATLAB code on this URL.

Preprocessing is outside the scope of this tutorial, so we will not dive deep into it. Below is a short summary of the steps involved:

  1. Produce low-resolution versions of these images by resizing the images, both downwards and upwards, using OpenCV and bilinear interpolation.
  2. Crop the images for training.
  3. Colour space conversions.

The SR-CNN consists of the following operations:

  1. Feature extraction: Extracts a set of feature maps from the upscaled Low Resolution image.
  2. Non-linear mapping: Maps the feature maps representing Low Resolution to High Resolution patches.
  3. Reconstruction: Produces the High Resolution image from High Resolution patches.

Let's have a quick look at our model:

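from keras.layers import Conv2D
from keras.models import Sequential
from keras.optimizers import Adadelta

def src_cnn_model():
    SR_CNN = Sequential()
    # Feature extraction: 9x9 filters over a single-channel input of any size.
    SR_CNN.add(Conv2D(filters=128, kernel_size=(9, 9), kernel_initializer='glorot_uniform',
                      activation='relu', padding='valid', use_bias=True, input_shape=(None, None, 1)))
    # Non-linear mapping: low-resolution features to high-resolution features.
    SR_CNN.add(Conv2D(filters=64, kernel_size=(3, 3), kernel_initializer='glorot_uniform',
                      activation='relu', padding='same', use_bias=True))
    # Reconstruction: a single linear output channel.
    SR_CNN.add(Conv2D(filters=1, kernel_size=(5, 5), kernel_initializer='glorot_uniform',
                      activation='linear', padding='valid', use_bias=True))
    # Adadelta with a small learning rate; the loss is pixel-wise mean squared error.
    adadelta = Adadelta(learning_rate=0.0003)
    SR_CNN.compile(optimizer=adadelta, loss='mean_squared_error', metrics=['mean_squared_error'])
    return SR_CNN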
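
One consequence of the 'valid' padding in the first and last layers is that the output is slightly smaller than the input. The sketch below works through that arithmetic, and the parameter count, in plain Python. The helper names and the 33-pixel example patch size are assumptions chosen for illustration; only the kernel sizes, filter counts and padding modes come from the model above.

```python
def conv_output_size(size, kernel, padding):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


# (kernel size, filters, padding) for the three layers of the model above.
layers = [(9, 128, "valid"), (3, 64, "same"), (5, 1, "valid")]

input_size = 33  # arbitrary example patch size (assumption)
size, channels, total_params = input_size, 1, 0
for kernel, filters, padding in layers:
    size = conv_output_size(size, kernel, padding)
    total_params += conv_params(kernel, channels, filters)
    channels = filters

border = (input_size - size) // 2
print(size, border, total_params)  # prints: 21 6 85889
```

Each spatial dimension shrinks by 12 pixels (8 from the 9x9 layer, 4 from the 5x5 layer), so before comparing the network's output with a reference image, 6 pixels need to be cropped from every side of the reference.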

Check out the repository below for the full code:

Jay Sinha / SRCNN-Blog

To evaluate the performance of this model, we will be using three image quality metrics: Peak Signal to Noise Ratio (PSNR), Mean Squared Error (MSE), and the Structural Similarity Index (SSIM).

To save time on training, we import pre-trained weights for the above model from this repository and test it. The comparison screenshot shows how the evaluation metrics change between the degraded image and the image generated by the SR-CNN model: MSE, for example, drops from 121.81 to 47.36.


As we can see, the reconstructed image takes far less time and far fewer computational resources to generate, so this approach can substantially improve real-time gaming performance, in both detail and frame rate.