Information

Author(s) Pierre Dupont, Benoit Ronval
Deadline 12/04/2026 23:00:00
Submission limit No limitation


A4.2 - Deep Learning: feedforward networks

During this assignment, you will build neural networks with the help of the Keras Python library, which is built on top of the TensorFlow library. There are two ways to use TensorFlow:

  • Install it on your machine using pip. We recommend TensorFlow version 2.20.0 (which is the version run by Inginious).
  • Use an interactive environment such as Kaggle Notebook (where TensorFlow is already installed).

If you do not have a local GPU or do not want to install TensorFlow locally, we strongly advise you to use Kaggle, where you can access a GPU for 30 hours per week.

If you decide to use Kaggle, you first need to complete all of the following steps to gain access to the GPU:

  1. Create a Kaggle account and log in.
  2. Go to your account settings (icon in the upper right corner).
  3. Add a display name and complete the phone verification process.

You can create a new notebook on Kaggle by clicking on Code (in the left panel), then New Notebook. Once in the notebook, verify that you are using the GPU by clicking on Settings then Accelerator, and making sure that the GPU is selected.

IMPORTANT: Do not forget to stop your session when you are not working on your code, otherwise you will continue to consume your GPU hours (30h/week)!

In the notebook, you can upload files (such as the dataset) by following these steps:

  1. Select Upload in the right panel.
  2. Choose a name for the uploaded file, and select Private for the visibility.
  3. Click on Create at the bottom left.
  4. The file should now be visible in the Input section. You can copy the path to this file by hovering over it and selecting Copy file path.
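
For example, once you have copied a file path from the Input section, you can use it directly in your code. The dataset name below is a placeholder; replace the whole path with the one you copied.

import numpy as np

# Placeholder path: replace "your-dataset-name" with the path copied from the
# Input section of your own notebook.
data = np.load("/kaggle/input/your-dataset-name/p4_2026_svhn_train.npz")
print(data.files)  # lists the arrays stored in the .npz archive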

You can save any file created by your code (such as the fitted models) by following these steps:

  1. Go to Save version -> Quick save (in version type).
  2. In Advanced settings, choose whether to save the output always or only for this version.
  3. Click on Save.
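
As an illustration, any file your code writes to the notebook's working directory (/kaggle/working on Kaggle) is what ends up in the Output section once you save a version; the filename below is only an example.

from pathlib import Path

# Example file: after saving a version, it will appear in the notebook's Output.
Path("/kaggle/working/example.txt").write_text("this file will appear in Output")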

When you re-open the same notebook, you can add the previously saved files as input:

  1. On the right panel, select Add Input and select the previously saved notebook.
  2. The saved files should now be visible in the Input section.

IMPORTANT: the files created on Kaggle are TEMPORARY and will be deleted once you close the notebook! Check this documentation for additional information about this, including alternative procedures.

Your task is to tackle an image recognition problem, for which deep learning has proven to be particularly effective. You will analyze Street View images of house numbers taken from the SVHN dataset, from which we selected a sample for this project. The goal is to predict which digit, between 0 and 9, appears at the center of each image. Each image is a \(32 \times 32 \times 3\) RGB image, and the dataset contains 50,000 train images and 10,000 test images. To make the task more challenging, other digits may be present in the image besides the one at the center!

Sample images from the dataset: https://inginious.info.ucl.ac.be/course/LINFO2262/A4-2/house_numbers.png

The data can be downloaded from Moodle. Once downloaded in a local directory on your disk (or loaded in Kaggle), you can load the data using the following instructions:

import numpy as np
import tensorflow as tf
from tensorflow import keras

x_train = np.load("p4_2026_svhn_train.npz")["images"].transpose(3, 2, 0, 1)
y_train = np.load("p4_2026_svhn_train.npz")["labels"].squeeze()
x_test = np.load("p4_2026_svhn_test.npz")["images"].transpose(3, 2, 0, 1)
y_test = np.load("p4_2026_svhn_test.npz")["labels"].squeeze()

# Convert string labels to integer indices, then one-hot encode them
labels = sorted(list(set(y_train)))
y_train = keras.utils.to_categorical([labels.index(x) for x in y_train])
y_test = keras.utils.to_categorical([labels.index(x) for x in y_test])

After executing the above instructions, x_train and x_test contain the multi-dimensional arrays of the train and test images, respectively. The last 3 lines transform the labels (0, 1, ..., 9) contained in y_train and y_test into binary vectors using one-hot encoding (i.e. label 0 is represented by the binary vector [1,0,0,0,0,0,0,0,0,0], label 1 by [0,1,0,0,0,0,0,0,0,0], etc.).
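
Continuing from the loading code above, a quick sanity check is to print the array shapes: the first dimension should be 50,000 for the train arrays and 10,000 for the test arrays, and each image contains \(32 \times 32 \times 3 = 3072\) values in total.

# Sanity check on the loaded arrays (x_* hold the images, y_* the one-hot labels)
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)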

It is now time to build your first neural network, using the Sequential class of Keras.


Question 1: A first linear neural network

For this question, we ask you to build and to train a neural network with the following specifications:

  • The network contains 2 layers:

    • A flatten layer, which flattens the \(32 \times 32 \times 3\) RGB images into 1D vectors of size \(3072\). This layer has no parameters to train; it just reshapes the input data.
    • A dense output layer with a softmax activation function, so that it can predict the target categorical (i.e. class) variable. The kernel and bias initializers are set to RandomNormal.
  • The network loss will be the categorical cross-entropy loss. You should also specify that the categorical accuracy will be used as the metric to evaluate the model.

  • The network optimizer is the Adam optimizer (an optimized version of the gradient descent procedure) with a learning rate of \(10^{-5}\).

We are here essentially training 10 linear models and then applying a softmax on them. This is not yet a deep neural network.

Implement your neural network in the variable model. Just define and compile the network; don't fit it on the training data (you don't have access to the training set on Inginious, so fitting would generate an error).
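
As an illustration of the Keras calls involved (a minimal sketch, not necessarily the exact model the grader expects), such a network could be defined and compiled as follows. The input shape assumes samples of shape \(32 \times 32 \times 3\); adjust it to x_train.shape[1:] if your arrays use a different axis order.

from tensorflow import keras

# Minimal sketch: Flatten followed by a softmax output layer, with RandomNormal
# initializers and the Adam optimizer (learning rate 1e-5).
init = keras.initializers.RandomNormal()

model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),   # adjust to x_train.shape[1:] if needed
    keras.layers.Flatten(),           # 32 * 32 * 3 = 3072 values per image
    keras.layers.Dense(10, activation="softmax",
                       kernel_initializer=init, bias_initializer=init),
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss="categorical_crossentropy",
    metrics=["categorical_accuracy"],
)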

Question 2: A first linear neural network: model fitting

Fit your model from question 1 on the train data (with the default batch size, i.e. 32). Run 100 epochs to fit your model.

Once your neural network is fitted, save it in a .keras file using the save function of Keras and upload it below.
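
For reference, fitting and saving could look like the sketch below, assuming model is the compiled network from question 1 and the data were loaded as shown earlier (the filename is just an example). When batch_size is not specified, Keras uses the default value of 32.

# Fit for 100 epochs with the default batch size (32), then save the model.
history = model.fit(x_train, y_train, epochs=100)
model.save("q2_model.keras")  # example filename; upload this file below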


Max file size: 4.8 MiB
Allowed extensions: .keras
Question 3: A first linear neural network: performance

How many trainable parameters are contained in the whole network you just built? What is the measured test accuracy of the model you fitted in question 2?

Report your answer in the format: number_param, test_acc (use decimal notation with at least 3 digits for the accuracies, e.g. 0.748, not %).

You must first validate the two previous questions before getting feedback for this one. Note that this question depends on the specific model you uploaded in the previous question. If you change it, you may need to update your answer to this question accordingly.
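
If helpful, both quantities can be obtained locally with standard Keras calls, assuming the test data were loaded as shown earlier and using whatever filename you chose in question 2.

from tensorflow import keras

# Reload the model saved in question 2 (use your own filename).
model = keras.models.load_model("q2_model.keras")

# summary() reports the number of trainable parameters of the whole network.
model.summary()

# evaluate() returns the loss and the categorical accuracy on the test set.
test_loss, test_acc = model.evaluate(x_test, y_test)
print(test_acc)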

Question 4: A non-linear network

Test question [15 points]: This question will be graded after the deadline. You will only receive credit for this question if you answered questions 1 to 3 correctly.

Build a new model by adding a layer before the output layer of your neural net from question 1 (all other elements should be the same as in question 1). This additional layer must be a dense layer with a tanh activation function and should contain 256 units. Its kernel and bias initializers are set to RandomNormal.

Implement your neural network in the variable model. Just define and compile the network; don't fit it on the training data.

Question 5: A non-linear network: model fitting

Test question [10 points]

Fit your model from question 4 on the train data. Run 100 epochs to fit your model. Use the default batch size value as before (i.e. 32).

Once your neural network is fitted, save it in a .keras file using the save function of Keras and upload it below.


Max file size: 23.8 MiB
Allowed extensions: .keras
Question 6: A non-linear network: performance

Test question [10 points]

How many trainable parameters are contained in the whole network you built in question 4? What is the measured test accuracy of the model as fitted in question 5?

Report your answer in the format: number_param, test_acc (use decimal notation for the accuracies, not %).

Question 7: A non-linear network: activation functions

Test question [15 points]

Besides a tanh activation function, other non-linear functions can be implemented in a hidden layer. Let's consider the ReLU activation: use the exact same network architecture as in the previous question, but with ReLU instead of tanh activation in the hidden layer. Which one performs better?

Since there is a lot of randomness involved, different training runs for the same network might yield different results. To get more robust results, perform 10 distinct runs for each model and report the average test accuracies.

Train each model during 100 epochs with default batch size value (i.e. 32).

Report the mean test accuracy of both networks using the format: tanh_mean_acc, relu_mean_acc (use decimal notation, not %).
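
A possible skeleton for this averaging procedure is sketched below. It assumes the data were loaded as shown earlier and that build_model reproduces the architecture from question 4 with a configurable hidden activation; each call to build_model re-initializes the weights, so the runs are independent.

import numpy as np
from tensorflow import keras

def build_model(activation):
    # Sketch assuming the architecture from question 4: one hidden dense layer
    # of 256 units, RandomNormal initializers, and the same compilation settings.
    init = keras.initializers.RandomNormal()
    model = keras.Sequential([
        keras.Input(shape=x_train.shape[1:]),
        keras.layers.Flatten(),
        keras.layers.Dense(256, activation=activation,
                           kernel_initializer=init, bias_initializer=init),
        keras.layers.Dense(10, activation="softmax",
                           kernel_initializer=init, bias_initializer=init),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
                  loss="categorical_crossentropy",
                  metrics=["categorical_accuracy"])
    return model

def mean_test_accuracy(activation, n_runs=10, epochs=100):
    # Average the test accuracy over several independent training runs.
    accs = []
    for _ in range(n_runs):
        model = build_model(activation)
        model.fit(x_train, y_train, epochs=epochs, verbose=0)
        _, acc = model.evaluate(x_test, y_test, verbose=0)
        accs.append(acc)
    return np.mean(accs)

print(mean_test_accuracy("tanh"), mean_test_accuracy("relu"))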

Question 8: Multiple choice

Test question [10 points]

Based on your observations from the previous questions, and possibly some extra experiments, what can you conclude?

Select all valid statements.

Question 9: Hidden units

Test question [15 points]: This question will be graded after the deadline.

Take the neural network with ReLU activation you just defined. Let's study the impact of the number of units in the hidden layer.

Compare the mean accuracy over 10 runs of this network with 256 units (which you already computed in question 7) to those obtained with models with 64 and 1024 hidden units.

Perform 10 distinct runs (training + testing) for each model and average the results. Use 100 epochs to fit each model with default batch size value (i.e. 32).

Don't change anything in your network besides the number of hidden units.

Report the mean test accuracies of the three models using the format: mean_test_acc_64, mean_test_acc_256, mean_test_acc_1024

Question 10: Adding Hidden Layers

Test question [15 points]

We will now investigate whether we can improve the classification accuracy by adding extra layers to the model.

Compare (locally, on your machine) the models from question 9 to models having 2 or 3 hidden layers. For example, the (64, 256) configuration denotes a network with 64 units in the first hidden layer and 256 units in the second hidden layer. Evaluate the following configurations to determine which one has the best test accuracy:

  0. (64, 64)
  1. (64, 256)
  2. (256, 64)
  3. (256, 256)
  4. (1024, 256)
  5. (256, 1024)
  6. (64, 64, 64)
  7. (256, 256, 256)

Keep the same hyper-parameters (e.g. learning rate, ReLU non-linearities, ...) in your network and play only with the number of hidden layers and the number of hidden units.

Hint: There are several configurations (in terms of number of hidden layers and number of hidden units), so it would require a large amount of computation to perform 10 runs for each one. First get some results by averaging performance over only 3 runs (but of course consider several epochs of training and, ideally, monitor convergence). Afterwards, perform 10 runs to discriminate between the most promising models, trained for 100 epochs with the default batch size (i.e. 32).
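
One way to explore these configurations without duplicating code is to build the network from a tuple of hidden-layer sizes, as in the sketch below (same assumptions as before about initializers, optimizer, loss, and the loaded data).

from tensorflow import keras

def build_from_config(hidden_units):
    # hidden_units is a tuple such as (64, 256) or (256, 256, 256); all other
    # settings are kept identical to the previous questions.
    init = keras.initializers.RandomNormal()
    layers = [keras.Input(shape=x_train.shape[1:]), keras.layers.Flatten()]
    for units in hidden_units:
        layers.append(keras.layers.Dense(units, activation="relu",
                                         kernel_initializer=init,
                                         bias_initializer=init))
    layers.append(keras.layers.Dense(10, activation="softmax",
                                     kernel_initializer=init,
                                     bias_initializer=init))
    model = keras.Sequential(layers)
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-5),
                  loss="categorical_crossentropy",
                  metrics=["categorical_accuracy"])
    return model

# Example: the number of parameters of a given configuration (all trainable here)
print(build_from_config((256, 64)).count_params())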

Once you have identified the best configuration in terms of test accuracy, report your answer below using the following format: config, test_acc, num_params where

  • config is a configuration index (e.g. 0 for \((64, 64)\), 1 for \((64, 256)\), ...) as listed above.
  • test_acc is the test accuracy of this model trained for 100 epochs (use at least 3 digits in decimal format),
  • num_params is the number of trainable parameters of this model.

Example of a well-formatted (but not necessarily correct) answer, assuming \((256, 64)\) would be the best configuration: 2, 0.335, 123456.

Question 11: Adding Hidden Layers (continued)

Test question [10 points]

Based on the experiments you made for the previous questions, select all valid statements.