Train your first Autoencoder for reconstructing a noisy image

Deep-Learning Apr 27, 2021

Most of the data available today is unlabelled. Acquiring labelled data for a learning problem often requires a skilled human annotator or a physical experiment. While analysing labelled data is fairly straightforward and mostly comes down to choosing the right approach and building and testing an appropriate model, unlabelled datasets pose many challenges depending on their size, type, depth, and so on.

Photo by gebhartyler / Unsplash

Unlabelled datasets (paired with some labelled data) are at the core of semi-supervised learning. Semi-supervised learning can improve a model's accuracy by having it learn representations from the unlabelled data while the labelled data serves as ground truth. Learning such representations is called representation learning: a set of techniques that allows a system to automatically discover, from raw data, the representations needed for feature detection or classification.

What are Autoencoders?

Autoencoders are an unsupervised learning technique in which a neural network is used for representation learning: a bottleneck in the network forces a compressed knowledge representation of the original input. The aim of an autoencoder is to learn a representation (an encoding) for a set of data, typically for dimensionality reduction, by training the network to reconstruct its input while ignoring signal “noise”.
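
To make the idea of a bottleneck concrete, here is a minimal sketch of a linear autoencoder written in plain NumPy rather than Keras. Everything in it (the toy data, the names w_enc and w_dec, the learning rate and step count) is an assumption chosen for illustration and is not part of the tutorial model further down.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points in 4-D that lie close to a 2-D plane
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 4))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 4))

# Linear autoencoder: 4 inputs -> 2-unit bottleneck -> 4 outputs
w_enc = 0.1 * rng.normal(size=(4, 2))
w_dec = 0.1 * rng.normal(size=(2, 4))


def reconstruction_error(x, w_enc, w_dec):
    """Mean squared error between x and its reconstruction."""
    code = x @ w_enc      # compressed representation
    recon = code @ w_dec  # attempt to rebuild the input
    return float(np.mean((recon - x) ** 2))


initial_error = reconstruction_error(data, w_enc, w_dec)

# Plain gradient descent on the squared reconstruction error
learning_rate = 0.01
for _ in range(2000):
    code = data @ w_enc
    recon = code @ w_dec
    diff = (recon - data) / len(data)
    grad_dec = code.T @ diff
    grad_enc = data.T @ (diff @ w_dec.T)
    w_dec -= learning_rate * grad_dec
    w_enc -= learning_rate * grad_enc

final_error = reconstruction_error(data, w_enc, w_dec)
print(initial_error, final_error)
```

The network only ever sees the data through two numbers per sample, so the only way to lower the reconstruction error is to find a two-dimensional description that captures most of the structure in the four-dimensional input.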

Applications of Autoencoders

  1. Dimensionality reduction
  2. Image processing
  3. Principal component analysis
  4. Machine Translation
  5. Information retrieval
  6. Anomaly detection

Tutorial:

In this tutorial, we will build an Autoencoder that recovers clean images from a set of noisy MNIST images. We will use Keras to build the model and MNIST as our dataset, and we will train the model on Kaggle with GPU acceleration enabled.

Let's get into it!

Step 1: Make helper functions to...

  1. Display 8 images
  2. Add random noise to MNIST images
  3. Preprocess the dataset by normalising and reshaping it

import matplotlib.pyplot as plt
import numpy as np


def display(array1, array2):
    """Plot 8 random images from array1 (top row) above their counterparts in array2."""
    n = 8

    # Use the same random indices for both arrays so the pairs line up
    indices = np.random.randint(len(array1), size=n)
    images1 = array1[indices, :]
    images2 = array2[indices, :]

    plt.figure(figsize=(20, 4))
    for i, (image1, image2) in enumerate(zip(images1, images2)):
        # Top row: image from array1
        ax = plt.subplot(2, n, i + 1)
        plt.imshow(image1.reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

        # Bottom row: matching image from array2
        ax = plt.subplot(2, n, i + 1 + n)
        plt.imshow(image2.reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

    plt.show()


def make_some_noise(array):
    """Add Gaussian noise to the images and clip the result back to [0, 1]."""
    noise_factor = 0.35
    noisy_array = array + noise_factor * np.random.normal(
        loc=0.0, scale=1.0, size=array.shape
    )

    return np.clip(noisy_array, 0.0, 1.0)


def preprocess_dataset(array):
    """Scale pixel values to [0, 1] and reshape to (num_images, 28, 28, 1)."""
    array = array.astype("float32") / 255.0
    array = np.reshape(array, (len(array), 28, 28, 1))
    return array
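
Because make_some_noise draws from NumPy's global random state, each run produces different noisy images. If you want repeatable experiments, one option is a seeded variant. The sketch below is an assumption on my part: add_gaussian_noise and its seed parameter are hypothetical names that do not appear in the original notebook, and the mid-grey images are only a stand-in for MNIST.

```python
import numpy as np


def add_gaussian_noise(array, noise_factor=0.35, seed=0):
    """Seeded variant of make_some_noise (hypothetical helper, not from the notebook)."""
    rng = np.random.default_rng(seed)
    noisy = array + noise_factor * rng.normal(loc=0.0, scale=1.0, size=array.shape)
    # Clipping keeps every pixel inside the valid [0, 1] range
    return np.clip(noisy, 0.0, 1.0)


# Stand-in for a batch of two normalised 28x28x1 images (all mid-grey)
images = np.full((2, 28, 28, 1), 0.5, dtype="float32")

noisy_a = add_gaussian_noise(images, seed=42)
noisy_b = add_gaussian_noise(images, seed=42)

print(noisy_a.shape, noisy_a.min(), noisy_a.max())
print(np.array_equal(noisy_a, noisy_b))  # same seed, same noise
```

One detail worth knowing: adding float64 noise to a float32 array yields a float64 result, so the noisy images use twice the memory of the clean ones unless you cast them back with astype("float32").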

Step 2: Divide the dataset into Training and Test set

# Assumes the Keras that ships with TensorFlow
from tensorflow.keras.datasets import mnist

# We only use images in this tutorial, so no need to load the labels
(train_data, _), (test_data, _) = mnist.load_data()

# Normalize and reshape the data
train_data = preprocess_dataset(train_data)
test_data = preprocess_dataset(test_data)

# Create "noised" data
noisy_train_data = make_some_noise(train_data)
noisy_test_data = make_some_noise(test_data)

# Display the train data and a version of it with added noise
display(train_data, noisy_train_data)
Images before and after adding noise

Step 3: Build the model

Build the model by connecting the input to the encoder and decoder layers. The encoder consists of Conv2D and MaxPooling2D layers, and the decoder consists of Conv2DTranspose layers followed by a final Conv2D layer.

# Assumes the Keras that ships with TensorFlow
from tensorflow.keras import layers
from tensorflow.keras.models import Model

inputs = layers.Input(shape=(28, 28, 1))

# Encoder: two conv + pool stages shrink 28x28 down to 7x7
x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D((2, 2), padding="same")(x)
x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(x)
x = layers.MaxPooling2D((2, 2), padding="same")(x)

# Decoder: two stride-2 transposed convolutions upsample 7x7 back to 28x28
x = layers.Conv2DTranspose(32, (3, 3), strides=2, activation="relu", padding="same")(x)
x = layers.Conv2DTranspose(32, (3, 3), strides=2, activation="relu", padding="same")(x)
outputs = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)

# Autoencoder
ae = Model(inputs, outputs)
ae.compile(optimizer="adam", loss="binary_crossentropy")
ae.summary()
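
It can help to work out by hand what the summary should report. The sketch below redoes the shape and parameter arithmetic for the layer stack above in plain Python. The helper names (pooled_size, upsampled_size, conv_params) are made up for this illustration, and the formulas assume padding="same" with the default pooling stride, as used in this model.

```python
import math


def pooled_size(size, pool=2):
    """Output size of MaxPooling2D with padding="same" and stride equal to pool."""
    return math.ceil(size / pool)


def upsampled_size(size, stride=2):
    """Output size of Conv2DTranspose with padding="same"."""
    return size * stride


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel (transposed) convolution."""
    return kernel * kernel * in_channels * filters + filters


# Spatial size after each size-changing layer ("same" convolutions keep it unchanged)
sizes = [28]
sizes.append(pooled_size(sizes[-1]))     # first MaxPooling2D
sizes.append(pooled_size(sizes[-1]))     # second MaxPooling2D
sizes.append(upsampled_size(sizes[-1]))  # first Conv2DTranspose
sizes.append(upsampled_size(sizes[-1]))  # second Conv2DTranspose

# Trainable parameters per layer, in the order the layers appear in the model
params = [
    conv_params(3, 1, 32),   # first Conv2D
    conv_params(3, 32, 32),  # second Conv2D
    conv_params(3, 32, 32),  # first Conv2DTranspose
    conv_params(3, 32, 32),  # second Conv2DTranspose
    conv_params(3, 32, 1),   # output Conv2D
]

print(sizes)        # [28, 14, 7, 14, 28]
print(sum(params))  # 28353
```

These numbers should agree with what ae.summary() prints. Note that the 7x7x32 encoded feature map holds 1,568 values, which is more than the 784 input pixels, so the compression in this particular model is spatial rather than a reduction in the raw number of values.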

Step 4: Train the model

ae.fit(
    x=noisy_train_data,
    y=train_data,
    epochs=100,
    batch_size=128,
    shuffle=True,
    validation_data=(noisy_test_data, test_data),
)

Step 5: Test model performance

predictions = ae.predict(noisy_test_data)
display(noisy_test_data, predictions)
Denoised Prediction of Noisy Images

After training, the training and validation losses were 0.0797 and 0.0794 respectively.
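
To put those numbers in context, it helps to see what binary cross-entropy measures when the targets are pixel intensities rather than hard 0/1 labels. The function below is a NumPy sketch of the per-pixel average, not the Keras implementation itself; the eps clipping constant and the four example pixel values are assumptions for illustration.

```python
import numpy as np


def binary_crossentropy(target, prediction, eps=1e-7):
    """Mean per-pixel binary cross-entropy for values in [0, 1]."""
    # Clip to avoid log(0); eps is an assumed small constant
    prediction = np.clip(prediction, eps, 1.0 - eps)
    losses = -(target * np.log(prediction) + (1.0 - target) * np.log(1.0 - prediction))
    return float(np.mean(losses))


# Four example "pixels": pure black, two greys, pure white
target = np.array([0.0, 0.2, 0.5, 1.0])

perfect = binary_crossentropy(target, target)
worse = binary_crossentropy(target, np.array([0.1, 0.4, 0.3, 0.8]))

print(perfect, worse)
```

Even a perfect reconstruction scores above zero here, because grey pixels (values strictly between 0 and 1) contribute their own entropy to the loss. The loss on real images therefore has a floor above zero, and a small gap between training and validation loss, as seen above, is a more useful signal than the absolute value.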

As the images above show, the model removes most of the noise from the test images.
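
Visual inspection is a good start, but a number makes it easier to compare runs. One common choice for denoising is the peak signal-to-noise ratio (PSNR), which was not used in the original notebook; the sketch below is an assumption on my part and runs on synthetic stand-in arrays so that it works without a trained model.

```python
import numpy as np


def psnr(clean, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the clean image."""
    mse = np.mean((clean - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Synthetic stand-ins: random "clean" images plus heavy and light noise
rng = np.random.default_rng(0)
clean = rng.random((4, 28, 28, 1))
heavy_noise = np.clip(clean + 0.35 * rng.normal(size=clean.shape), 0.0, 1.0)
light_noise = np.clip(clean + 0.05 * rng.normal(size=clean.shape), 0.0, 1.0)

print(psnr(clean, heavy_noise), psnr(clean, light_noise))
```

With the variables from this tutorial, you would compare psnr(test_data, noisy_test_data) against psnr(test_data, predictions); a higher value for the predictions would indicate that the autoencoder has moved the images closer to the clean originals.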

[Optional]

Check out the public notebook:

Image Denoising

This post was inspired by Francois Chollet's blog post:

Building Autoencoders in Keras

[Bonus]

You can check out the article below from Jeremy Jordan for a deep dive into autoencoders:

Introduction to autoencoders.

Cheers!
