Train your first Autoencoder to reconstruct noisy images
Most datasets today contain far more unlabelled than labelled data, because acquiring labels for a learning problem often requires a skilled human annotator or a physical experiment. Analysing labelled data is relatively straightforward: it mostly boils down to choosing the right approach and then building and testing an appropriate model. Unlabelled datasets, on the other hand, pose challenges that depend on their size, type, depth, and so on.
Unlabelled datasets (paired with some labelled data) are at the core of semi-supervised learning, which improves a model's accuracy by having it learn representations of the unlabelled data while using the labelled data as ground truth. This is an instance of Representation Learning: a set of techniques that allows a system to automatically discover, from raw data, the representations needed for feature detection or classification.
What are Autoencoders?
Autoencoders are an unsupervised learning technique that we leverage for representation learning: the network is forced to learn a compressed representation of the original input. The aim of an autoencoder is to learn such a representation for a set of data, typically for dimensionality reduction, by training the network to ignore signal "noise".
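To make the idea concrete, here is a toy linear autoencoder in plain NumPy. It is a sketch only: the random weight matrices are hypothetical stand-ins for what training would learn, and the names are illustrative, not part of the tutorial code below.
import numpy as np

rng = np.random.default_rng(0)
W_enc = rng.normal(size=(4, 2))   # encoder: 4-D input -> 2-D "bottleneck" code
W_dec = rng.normal(size=(2, 4))   # decoder: 2-D code -> 4-D reconstruction

x = rng.normal(size=(1, 4))       # one example input
code = x @ W_enc                  # compressed representation
x_hat = code @ W_dec              # reconstruction of the input
loss = np.mean((x - x_hat) ** 2)  # reconstruction error that training minimises
Training adjusts the encoder and decoder weights to drive this reconstruction error down, which forces the bottleneck code to capture the input's essential structure.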

Applications of Autoencoders
- Dimensionality reduction
- Image processing
- Principal component analysis
- Machine Translation
- Information retrieval
- Anomaly detection
Tutorial:
In this tutorial, we will build an Autoencoder that recovers clean images from a set of noisy MNIST images. We will use Keras to build the model and MNIST as our dataset, and we will train the model on Kaggle with GPU acceleration enabled.
Let's get into it!
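First, the imports that the rest of the code relies on. This is a minimal setup assuming the Keras bundled with TensorFlow; standalone Keras (from keras import ...) works equally well.
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import layers
from tensorflow.keras.models import Model
from tensorflow.keras.datasets import mnist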
Step 1: Make helper functions to...
- Display 8 images
- Add random noise to MNIST images
- Preprocess the dataset by normalising and reshaping it
def display(array1, array2):
    """Display eight random image pairs, one array per row."""
    n = 8
    # Use the same random indices for both arrays so the pairs line up
    indices = np.random.randint(len(array1), size=n)
    images1 = array1[indices, :]
    images2 = array2[indices, :]

    plt.figure(figsize=(20, 4))
    for i, (image1, image2) in enumerate(zip(images1, images2)):
        # Top row: images from the first array
        ax = plt.subplot(2, n, i + 1)
        plt.imshow(image1.reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
        # Bottom row: the corresponding images from the second array
        ax = plt.subplot(2, n, i + 1 + n)
        plt.imshow(image2.reshape(28, 28))
        plt.gray()
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
    plt.show()
def make_some_noise(array):
    """Add Gaussian noise to an image array and clip the result to [0, 1]."""
    noise_factor = 0.35
    noisy_array = array + noise_factor * np.random.normal(
        loc=0.0, scale=1.0, size=array.shape
    )
    return np.clip(noisy_array, 0.0, 1.0)
def preprocess_dataset(array):
    """Scale pixel values to [0, 1] and reshape to (num_images, 28, 28, 1)."""
    array = array.astype("float32") / 255.0
    array = np.reshape(array, (len(array), 28, 28, 1))
    return array
Step 2: Load the Training and Test sets and create noisy versions
# We only use images in this tutorial, so no need to load the labels
(train_data, _), (test_data, _) = mnist.load_data()
# Normalize and reshape the data
train_data = preprocess_dataset(train_data)
test_data = preprocess_dataset(test_data)
# Create "noised" data
noisy_train_data = make_some_noise(train_data)
noisy_test_data = make_some_noise(test_data)
# Display the train data and a version of it with added noise
display(train_data, noisy_train_data)

Step 3: Build the model
Feed the input through the Encoder and Decoder layers. The encoder consists of Conv2D and MaxPooling2D layers; each pooling layer halves the spatial dimensions, taking the 28×28 input down to 14×14 and then 7×7. The decoder consists of two Conv2DTranspose layers with stride 2, which double the dimensions back to 14×14 and then 28×28, followed by a final Conv2D layer with a sigmoid activation that maps the result to a single channel of pixel values in [0, 1].
input = layers.Input(shape=(28, 28, 1))
# Encoder
model = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(input)
model = layers.MaxPooling2D((2, 2), padding="same")(model)
model = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(model)
model = layers.MaxPooling2D((2, 2), padding="same")(model)
# Decoder
model = layers.Conv2DTranspose(32, (3, 3), strides=2, activation="relu", padding="same")(model)
model = layers.Conv2DTranspose(32, (3, 3), strides=2, activation="relu", padding="same")(model)
model = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(model)
# Autoencoder
ae = Model(input, model)
ae.compile(optimizer="adam", loss="binary_crossentropy")
ae.summary()
Step 4: Train the model
ae.fit(
    x=noisy_train_data,
    y=train_data,
    epochs=100,
    batch_size=128,
    shuffle=True,
    validation_data=(noisy_test_data, test_data),
)
Step 5: Test model performance
predictions = ae.predict(noisy_test_data)
display(noisy_test_data, predictions)
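Visual inspection is useful, but we can also quantify performance by computing the loss on the test set with Keras's standard evaluate method. The exact number will vary slightly from run to run.
# Reconstruction loss on the noisy test images
test_loss = ae.evaluate(noisy_test_data, test_data)
print(f"Test loss: {test_loss:.4f}")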

After training, the training and validation losses were 0.0797 and 0.0794, respectively.
As the reconstructions above show, the model recovers clean digits from the noisy test images quite well.
[Optional]
Check out the public notebook:

This post was inspired by Francois Chollet's blog post:

[Bonus]
You can check out the article below from Jeremy Jordan for a deep dive into Autoencoders:

Cheers!