Information

Author(s): Pierre Dupont, Benoit Ronval
Deadline: 12/04/2026 23:00:00
Submission limit: No limitation


A4.3 - Deep Learning: convolutional networks

With the traditional neural networks used in the previous tasks, the image, i.e. \(3\) 2D matrices of pixels (one matrix for each color channel), was flattened into a 1D vector in the very first layer. By doing this, we lose a lot of spatial information... This is where Convolutional Neural Networks come in!

CNNs are inspired by the neural architecture of the visual cortex and have therefore proven particularly effective for many image recognition tasks. Individual neurons receive information from a restricted region of the image, and the network is designed so that the filter positions overlap to cover the entire picture. Using this principle, features of the image are learned before the information is flattened into a vector. This vector is then used for classification by dense layers, as you experimented with in the previous tasks.

https://inginious.info.ucl.ac.be/course/LINFO2262/A4-3/cnn.png

You can load the data with the following code snippet. Note that the order of the data axes is changed (compared to the previous task) to fit the expected input of Keras Conv2D layers.

import numpy as np
import keras

# Load the images and labels; the transpose moves the sample axis to the front
# so that the arrays have shape (n_samples, height, width, channels)
x_train = np.load("p4_2026_svhn_train.npz")["images"].transpose(3, 0, 1, 2)
y_train = np.load("p4_2026_svhn_train.npz")["labels"].squeeze()
x_test = np.load("p4_2026_svhn_test.npz")["images"].transpose(3, 0, 1, 2)
y_test = np.load("p4_2026_svhn_test.npz")["labels"].squeeze()

# Convert the string labels to integer indices, then one-hot encode them
labels = sorted(list(set(y_train)))
y_train = keras.utils.to_categorical([labels.index(x) for x in y_train])
y_test = keras.utils.to_categorical([labels.index(x) for x in y_test])
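
As a quick sanity check, you can inspect the resulting shapes. The sizes in the comments below are only an illustration, assuming the usual SVHN 32x32 RGB format.

print(x_train.shape)  # expected (n_train, 32, 32, 3), channels last
print(y_train.shape)  # expected (n_train, n_classes) after one-hot encoding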

Question 1: A Convolutional Neural Network

Graded question with feedback [20 points]

In this question, we ask you to build a model closely matching the above picture.

The model must have:

  • 3 convolutional blocks with, respectively, 16, 32, and 64 filters; each block containing:

    • A Conv2D layer with a ReLU activation.
    • A Max Pooling layer with a pool size = (2, 2).
  • A dense layer with 64 hidden units.

  • A dense output layer with a softmax activation.

  • The network loss should be the categorical cross-entropy loss. You should also specify the categorical accuracy as the metric used to evaluate the model.

  • The network optimizer is the Adam optimizer (an optimized version of the gradient descent procedure) with a learning rate = \(10^{-5}\).

You are free to set the values of the other meta-parameters (kernel size, initializers, ...).

Implement your neural network in the variable model. Just define and compile the network; do not fit it on the training data.
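
As an illustration only, here is a minimal sketch of such a model. The 3x3 kernel size, the 32x32x3 input shape and the default initializers are assumptions, not requirements; the number of output units is taken from the one-hot encoded labels loaded above.

import keras
from keras import layers

# Minimal sketch of the architecture described above (kernel size is an assumption)
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),  # assumed SVHN image shape after the transpose above
    layers.Conv2D(16, kernel_size=(3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(y_train.shape[1], activation="softmax"),  # one output unit per class
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss="categorical_crossentropy",
    metrics=["categorical_accuracy"],
)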

Question 2: Model fitting

Test question [50 points]: This question will be graded after the deadline. You will only receive credit for this question if you answered question 1 correctly.

Fit the model you defined in Question 1 on the training data and evaluate its test set accuracy. To get a better view of its performance, also analyze the confusion matrix between the true class labels and the predicted class labels.
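
A possible sketch of this step is given below; the use of scikit-learn's confusion_matrix is an assumption (any equivalent tool works), and the fitting settings follow the constraints listed further down.

import numpy as np
from sklearn.metrics import confusion_matrix

# Fit with the imposed settings: default batch size (32) and exactly 100 epochs
model.fit(x_train, y_train, batch_size=32, epochs=100)

# Test-set loss and categorical accuracy
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test accuracy:", test_acc)

# Confusion matrix between true and predicted class indices
y_pred = np.argmax(model.predict(x_test), axis=1)
y_true = np.argmax(y_test, axis=1)
print(confusion_matrix(y_true, y_pred))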

Next, try to improve the test accuracy by adapting some or all of the following meta-parameters. Note that the test set is used here to evaluate the impact of the meta-parameter choices. This means that the so-called test set rather serves as a validation set in this case.

Here are the meta-parameters you should study:

  • the number of convolutional blocks (one could use fewer or more blocks than in the model defined in question 1),
  • the kernel size of each Conv2D layer,
  • the pool size of each Max Pooling layer,
  • whether or not padding should be used,
  • whether or not one could improve the test accuracy by weighting the different classes at training time (looking at the impact on the confusion matrix computed on the test set could be instructive; a sketch is given after this list),
  • which initializer to consider for the kernel in each Conv2D layer,
  • which regularization, if any, should be used when learning the parameters of each Conv2D layer,
  • the number of hidden units (instead of 64 used in the model of question 1) in the Dense Hidden Layer (you must stick to only 1 hidden layer in this part of the model, consistently with the image above),
  • which Dropout or SpatialDropout rate(s), if any, should be used.
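
For the class-weighting item above, one possible sketch is to derive per-class weights from the training labels and pass them to fit; the use of scikit-learn's compute_class_weight is an assumption, any equivalent computation works.

import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Derive balanced per-class weights from the integer class indices of the training labels
train_classes = np.argmax(y_train, axis=1)
weights = compute_class_weight("balanced", classes=np.unique(train_classes), y=train_classes)
class_weight = dict(enumerate(weights))

# Pass the weights at training time (still the default batch size and 100 epochs)
model.fit(x_train, y_train, batch_size=32, epochs=100, class_weight=class_weight)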

In all cases, you must also satisfy all the following constraints:

  • your final model size (in MiB) should not be larger than the max file size reported below. Otherwise, you will not be able to submit it!
  • consider only ReLU activation functions (except for the final output layer).
  • always use the default batch size = 32 when training your models.
  • always fit your model for exactly 100 epochs.

Note that evaluating the impact of all the meta-parameters listed may take quite some time. Furthermore, the results may fluctuate from one run (of 100 epochs) to the next, due to random initialization and the local optimum found by the optimizer. Ideally, one should reproduce the results over, e.g., 10 runs to get a sense of the real impact of fine-tuning these meta-parameters. Use your understanding of the role of each meta-parameter, and possibly some preliminary tests with a few runs, to focus on the most promising configurations.

Submit your best model once you are satisfied with the obtained test accuracy.

Once your best neural network is fitted for 100 epochs on the training set, save it in a .keras file using the save function of Keras and upload it below.
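
For instance (the file name below is arbitrary):

# Save the fitted model in the native Keras format
model.save("best_cnn.keras")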

Your score will be proportional to the obtained test accuracy.

Apart from reshaping the data as indicated above, do not perform any pre-processing. Inginious evaluates your neural network on the reshaped test set.


Max file size: 238.4 MiB
Allowed extensions: .keras
Question 3: Multiple choice

Test question [30 points]: This question will be graded after the deadline. You will only receive credit for this question if you answered question 1 correctly.

Select all valid statements.