Information

Author(s): Pierre Dupont, Benoit Ronval
Deadline: 09/03/2025 23:00:00
Submission limit: No limit


A2.3 - Linear Discriminant Implementations

In this task, you will implement the perceptron and Widrow-Hoff algorithms and evaluate them on the waveform dataset.


Question 1: Perceptron with margin

Graded question with feedback [20 points]

Implement in Python the perceptron with margin algorithm as described in the lecture slides.

Specifically, we ask that the learning examples be considered in order. Note that, following Python indexing, this order is x[0], x[1], ..., then back to x[0], and so on. Note also that the indexing convention may differ slightly in the lecture slides, but the logic is exactly the same. For simplicity, no shuffling of the training examples is considered here.

Your function will be tested on (subsets of) the data available in task A2-2, reduced here to a binary classification problem. The x input is a matrix containing the same input features. The y output (passed as a separate parameter) is a binary vector with value = 1 (representing class C1) or value = 0 (representing class C0).

Note that, in the perceptron algorithm, each y value is normally represented as 1 or -1. Therefore, the first step of the perceptron implementation should be to transform y to match this convention. This can be implemented by the line y = 2 * y - 1, already included in the code below.

The specification of the function is as follows (pay attention to the type of each parameter):

def perceptron(x, y, b, w_init, eta, epoch, decay=False):
    """
    Learn a binary classifier: single-example perceptron with margin
    Inputs:    x : a feature matrix containing an example on each row [pandas DataFrame of shape n x d]
               y : a vector with the class (either 1 or 0) of each example  [numpy array of size n]
               b : a margin [float]
               w_init : a vector with the initial weight values (intercept in w_init[0]) [numpy array of size d+1]
               eta : an (initial) learning rate [float]
               epoch : the maximal number of iterations (1 epoch = 1 iteration
                       of the "repeat" loop in the lecture slides) [int]
               decay : a boolean [default=False], when True the learning rate at iteration k should be equal to eta/k. When False,
                       the learning rate should be equal to eta and should remain constant.
    Output:    A weight vector [list or numpy array of size d+1] (intercept in w[0])
    """

Replace ... with your own Python code below.
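
For reference, here is a minimal sketch of one possible implementation. It assumes that one epoch is a full in-order pass over the examples, that an update is triggered whenever y[i] * w^T x[i] <= b (with the intercept handled by prepending a constant 1 feature), and that the loop may stop early once a full pass triggers no update; check these assumptions against the lecture slides before relying on them.

import numpy as np

def perceptron(x, y, b, w_init, eta, epoch, decay=False):
    # Sketch only, not the reference solution (see assumptions above).
    xs = np.hstack([np.ones((len(x), 1)), np.asarray(x)])  # n x (d+1), intercept column first
    y = 2 * np.asarray(y) - 1                              # map {0, 1} to {-1, +1}
    w = np.asarray(w_init, dtype=float).copy()
    for k in range(1, epoch + 1):
        rate = eta / k if decay else eta                   # eta/k decay or constant rate
        updated = False
        for i in range(len(xs)):                           # examples taken in order
            if y[i] * (w @ xs[i]) <= b:                    # margin violated
                w = w + rate * y[i] * xs[i]                # perceptron-with-margin update
                updated = True
        if not updated:                                    # no update in a full pass: converged
            break
    return w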

Question 2: Predicting using a linear discriminant

Graded question with feedback [20 points]

Implement in Python the function predict, which predicts the labels of the examples contained in the data matrix x according to the weight vector w.

Note that, even though you have to use 1 or -1 in your implementation of the perceptron, the predict function must output the class labels according to their original representation: either 1 or 0.

The specification of the function is as follows:

def predict(w, x):
    """
    Predicts the class labels of the data matrix x
    Inputs:    w a vector with the weight values defining a linear discriminant (intercept in w[0]) [list or numpy array of size d+1]
               x a feature matrix containing an example on each row [pandas DataFrame of shape n x d]
    Output:    A vector of labels (1 or 0) [numpy array of size n]
    """

Replace pass with your own Python code below.
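
A minimal sketch, assuming the decision rule is the sign of w^T x (with the intercept prepended as a constant 1 feature) and that a tie w^T x = 0 is mapped to class 0:

import numpy as np

def predict(w, x):
    # Sketch only: class 1 if w^T x > 0, class 0 otherwise.
    xs = np.hstack([np.ones((len(x), 1)), np.asarray(x)])  # n x (d+1), intercept column first
    return (xs @ np.asarray(w) > 0).astype(int)            # labels back in {1, 0}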

Question 3: Widrow-Hoff Algorithm

Test question [30 points]

Implement in Python the Widrow-Hoff (or LMS) algorithm as described in the lecture slides.

Specifically, we ask that the learning examples be considered in order. Note that, following Python indexing, this order is x[0], x[1], ..., then back to x[0], and so on. Note also that the indexing convention may differ slightly in the lecture slides, but the logic is exactly the same. For simplicity, no shuffling of the training examples is considered here.

Your function will be tested on (subsets of) the data available in the task A2-2, reduced here to a binary classification problem.

The signature of the lms(...) function is identical to the signature of the perceptron(...) function, but the interpretation of some of its parameters (or their default value) may differ.
The specification of the function is as follows (pay attention to the type of each parameter):

def lms(x, y, b, w_init, eta, epoch, decay=True):
    """
    Learn a binary classifier: single-example Widrow-Hoff (LMS) algorithm
    Inputs:    x : a feature matrix containing an example on each row [pandas DataFrame of shape n x d]
               y : a vector with the class (either 1 or 0) of each example  [numpy array of size n]
               b : equality constraint value (same value for each training point) [float]
               w_init : a vector with the initial weight values (intercept in w_init[0]) [numpy array of size d+1]
               eta : an (initial) learning rate [float]
               epoch : the maximal number of iterations (1 epoch = 1 iteration
                       of the "repeat" loop in the lecture slides) [int]
               decay : a boolean [default=True], when True the learning rate at iteration k should be equal to eta/k. When False,
                       the learning rate should be equal to eta and should remain constant.
    Output:    A weight vector [list or numpy array of size d+1] (intercept in w[0])
    """

Replace ... with your own Python code below.
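
One possible sketch, under the assumption that examples are sign-normalised (class 0 examples negated) so that each one gets the same target value b, and that the update w <- w + rate * (b - w^T x') * x' is applied to every example in order, misclassified or not:

import numpy as np

def lms(x, y, b, w_init, eta, epoch, decay=True):
    # Sketch only, not the reference solution (see assumptions above).
    xs = np.hstack([np.ones((len(x), 1)), np.asarray(x)])  # n x (d+1), intercept column first
    y = 2 * np.asarray(y) - 1                              # map {0, 1} to {-1, +1}
    w = np.asarray(w_init, dtype=float).copy()
    for k in range(1, epoch + 1):
        rate = eta / k if decay else eta                   # eta/k decay by default
        for i in range(len(xs)):                           # examples taken in order
            xi = y[i] * xs[i]                              # sign-normalised example
            w = w + rate * (b - w @ xi) * xi               # move w^T xi toward the target b
    return w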

Question 4: Analysis

Test question [30 points]: This question will be graded after the deadline. You will only receive credit for this question if you answered questions 1 and 2 correctly.

Run this algorithm on the training set until convergence is observed or a maximal number of iterations is reached. How does the choice of the margin and/or the learning rate influence the number of iterations performed? And the accuracy measured on the test set?

What conclusions can you draw from your experiments? To obtain sound results, you should repeat your experiments for various initializations of the model.
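
A hypothetical experiment loop could look like the sketch below. The names x_train, y_train, x_test and y_test are placeholders for a train/test split of the A2-2 data, and the margin and learning-rate grids are illustrative values, not prescribed ones:

import numpy as np

# Hypothetical harness: x_train, y_train, x_test, y_test are placeholder names.
rng = np.random.default_rng(0)
for b in (0.0, 0.1, 1.0):                        # candidate margins (illustrative)
    for eta in (0.01, 0.1, 1.0):                 # candidate learning rates (illustrative)
        accs = []
        for _ in range(10):                      # several random initializations
            w0 = rng.normal(size=x_train.shape[1] + 1)
            w = perceptron(x_train, y_train, b, w0, eta, epoch=100)
            accs.append(np.mean(predict(w, x_test) == y_test))
        print(f"b={b}, eta={eta}: mean test accuracy {np.mean(accs):.3f} "
              f"(std {np.std(accs):.3f})")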

Select all valid statements.