Information

Author(s) Alexander Gerniers, Pierre Dupont
Deadline No deadline
Submission limit No limitation

Getting Started


Welcome to the Machine Learning course!

You are about to go on a journey that will shape you into skillful data scientists, able to navigate through high-dimensional spaces, explore mysterious forests, invoke separating hyperplanes and defeat overfitting!


You will learn how to apply common machine learning algorithms such as decision trees, support vector machines and of course the inescapable neural networks, among many others. At the end of the semester, you will use your freshly acquired skills to solve a practical and complex learning task during a real machine learning competition.


Software

The Python programming language will be your faithful companion during these assignments (check out this tutorial or this other tutorial, if you are unfamiliar with Python). Several libraries will be used throughout the course:

  • NumPy is the most commonly used Python library for scientific computing.
  • pandas will be used to represent data in the form of data frames, a structure which allows you to easily manipulate your data, create plots, etc.
  • scikit-learn contains Python implementations of a large number of standard machine learning algorithms.
  • Additional libraries may also be introduced in some of the later assignments.
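
To give a feel for how these three libraries fit together, here is a minimal sketch. The toy data, the column names (`x1`, `x2`, `label`) and the choice of a decision tree are assumptions made purely for illustration; they are not taken from any assignment.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# NumPy: toy data invented for this illustration (two features, binary label).
features = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
labels = np.array([0, 0, 1, 1])

# pandas: wrap the array in a data frame with named columns.
df = pd.DataFrame(features, columns=["x1", "x2"])
df["label"] = labels

# scikit-learn: fit a decision tree on the feature columns.
model = DecisionTreeClassifier(random_state=0)
model.fit(df[["x1", "x2"]], df["label"])

# Predict on the training data and show the predictions next to the inputs.
predictions = model.predict(df[["x1", "x2"]])
print(df.assign(prediction=predictions))
```

Predicting on the training data, as done here, only shows that the pieces connect; it says nothing about how well a model generalizes.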

These libraries can easily be installed using Python's package installer:

pip3 install -U numpy
pip3 install -U pandas
pip3 install -U scikit-learn
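
Once the installation has finished, a quick check such as the following sketch can confirm that the libraries are importable. The helper name `check_libraries` is hypothetical and chosen for this example; note that scikit-learn is imported under the name `sklearn`.

```python
import importlib

# Import names of the libraries listed above (scikit-learn is imported as "sklearn").
LIBRARIES = ["numpy", "pandas", "sklearn"]


def check_libraries(names):
    """Map each import name to its version string, or to None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Fall back to "unknown" for modules that expose no __version__ attribute.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in check_libraries(LIBRARIES).items():
        status = version if version is not None else "NOT INSTALLED"
        print(f"{name}: {status}")
```

If a library is reported as not installed, re-running the corresponding pip3 command is a reasonable first step.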

The assignments

This course is divided into 5 assignments. Assignments 1 to 4 will train you in different machine learning topics. Each one contains two parts:

  • A theoretical part (typically 30 % of the grade of each assignment). These tasks contain theoretical questions about the topics covered in the lectures, and don't involve any coding. The grades and feedback for these tasks are only available after the submission deadline.

  • A practical part (typically 70 % of the grade of each assignment), in which you will use Python to build machine learning models on actual data. These tasks contain two types of questions:

    • First, there will be some warm-up questions, for which you will get real-time feedback. These questions don't count in the final grade, but will be very useful in order to make sure you understand how things work and have the skills to go further.
    • Once you have correctly answered every warm-up question, you will be able to move on to the test questions, which are graded. For these questions, you will not get real-time feedback on the result. You will only get limited feedback upon submission, for example if you do not respect the requested format or if your code contains syntax errors (within the limits of the checks implemented on INGInious). The grades for these questions will only be available after the submission deadline.

The 5th assignment will have a different format. It is a machine learning competition, during which you will be free to implement your best model to solve a challenging learning task.

Please note that, as these projects are graded, plagiarism is strictly forbidden. We ask you to agree to the anti-plagiarism statement below before starting the first project. Plagiarism concerns specifically the sharing of code (fragments) or of any answer that you are supposed to provide on INGInious. You are nevertheless more than welcome to share questions you might have about the tasks you have to solve. The Moodle student forum of the course is the perfect platform to submit your own questions and to suggest answers to the questions submitted by others.

Have fun!


Anti-Plagiarism Statement

I hereby certify that:

  • The material (text, results, code...) that I will submit for this project comes from my own work. My submissions will not contain material from the work of other students. Re-use of publicly available code fragments (e.g. from Stack Overflow) or of output from generative AI models is allowed, provided any such use is explicitly cited in a comment in each function of my submitted code.
  • I will not distribute any material related to these projects, in person or on any repository (Discord, GitHub, Bitbucket, Facebook groups, etc.) accessible to anybody, even after the deadlines.
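
As an illustration only, a comment citing re-used material could look like the sketch below. The statement does not prescribe an exact format, so the wording of the comments, the placeholders in angle brackets and both function names are assumptions made for this example.

```python
import math


def euclidean_distance(point_a, point_b):
    """Return the Euclidean distance between two points of equal dimension."""
    # Attribution (illustrative placeholder, not a real reference):
    # adapted from a Stack Overflow answer, <URL of the answer>, retrieved <date>.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point_a, point_b)))


def normalize(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    # Attribution (illustrative placeholder, not a real reference):
    # first draft produced with a generative AI model (<model name>), then edited.
    low, high = min(values), max(values)
    if high == low:
        # All values are identical: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]
```

The comment sits inside each function that re-uses external material, so the attribution stays attached to the code even if the function is moved.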

Any violation of the above statements will be considered as cheating or plagiarism and will be reported as such to the President of the Jury.