
Information

Authors: Pierre Dupont, Benoit Ronval
Deadline: 23/02/2025 23:00:00
Submission limit: unlimited


A1.1 - Decision Tree Learning: theory

This task will be graded after the deadline


Question 1: Attribute selection

Suppose you have the following training set with four boolean attributes x1, x2, x3 and x4 and a boolean output y.

[Training set: https://inginious.info.ucl.ac.be/course/LINFO2262/A1-1/q1_B.png]

What is the tree learned by CART (without any pruning mechanism) from this training set?

You should be able to construct it from a general understanding of the algorithm, without explicitly computing every step.

Question 2: Attribute selection (continued)

Is there another binary decision tree that perfectly classifies the same training examples and is shallower than the one produced by CART?

Question 3: Drop of impurity

Suppose you have a training set with 20 positive and 20 negative examples.

Compute the drop of impurity for the following two splits, performed at the root of the tree:

  • [4+, 14-] (left)  [16+, 6-] (right)
  • [8+, 18-] (left)  [12+, 2-] (right)

Give your answer in the following format: drop_first, drop_second

When rounding, give at least 3 decimals. For example, 0.452.
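
For reference, the drop of impurity of a split is the impurity of the parent node minus the weighted average impurity of its children. Below is a minimal Python sketch of this computation, assuming the Gini index (CART's classical impurity measure):

    # Drop of impurity: parent impurity minus the weighted impurity of the
    # children, assuming the Gini index as the impurity measure.

    def gini(pos, neg):
        """Gini impurity of a node with `pos` positive and `neg` negative examples."""
        p = pos / (pos + neg)
        return 2 * p * (1 - p)  # equivalent to 1 - p**2 - (1 - p)**2

    def drop_of_impurity(parent, left, right):
        """Each argument is a (pos, neg) pair of class counts."""
        n_left, n_right = sum(left), sum(right)
        n = n_left + n_right
        return (gini(*parent)
                - n_left / n * gini(*left)
                - n_right / n * gini(*right))

    # The two candidate splits of the [20+, 20-] root:
    print(round(drop_of_impurity((20, 20), (4, 14), (16, 6)), 3))
    print(round(drop_of_impurity((20, 20), (8, 18), (12, 2)), 3))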

Question 4: Drop of impurity (continued)

Based on your answer in the previous question, which split will be chosen by CART?

Beware: you will only receive credit for this question if you answered the previous one correctly.

Question 5: Drop of impurity (continued)

Suppose now that mistakes on the positive examples are about 5 times as costly as mistakes on the negative ones. One way of dealing with such a cost imbalance is to replicate each positive example 5 times. The splits become:

  • [20+, 14-] (left)  [80+, 6-] (right)
  • [40+, 18-] (left)  [60+, 2-] (right)

What is their drop of impurity?

Give your answer in the following format: drop_first, drop_second

When rounding, give at least 3 decimals. For example, 0.452.
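
Under the same Gini assumption, the replication trick only changes the class counts fed to the drop_of_impurity helper sketched above (the root now holds [100+, 20-]):

    # Reusing gini / drop_of_impurity from the previous sketch; each positive
    # count is multiplied by 5, so the parent becomes (100, 20).
    print(round(drop_of_impurity((100, 20), (20, 14), (80, 6)), 3))
    print(round(drop_of_impurity((100, 20), (40, 18), (60, 2)), 3))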

Question 6: Drop of impurity (continued)

Consequently, which split will be performed by CART?

Beware: you will only receive credit for this question if you answered the previous one correctly.

Question 7: Continuous attributes

Consider a classification problem based only on 2 continuous attributes (the instance space is the plane R²). CART incorporates these attributes by defining threshold-based boolean attributes. In the induced tree, each node corresponds to a particular decision boundary splitting the examples into two regions.

  • What is the shape (in the instance space) of the decision boundaries learned by CART?
  • Into how many regions is the instance space divided before pruning?
  • Does it depend on the attribute values of the training examples?
  • Does it depend on the number of classes?

Select all valid sentences
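
As background on how such threshold-based tests can be generated, here is a hypothetical sketch; the midpoint placement of candidate thresholds is an assumption for illustration, not something stated in this question:

    # Sketch: candidate thresholds for a continuous attribute, placed at the
    # midpoints between consecutive distinct values seen in the training set.
    # Each candidate t yields a boolean test of the form x_j <= t.

    def candidate_thresholds(values):
        distinct = sorted(set(values))
        return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    # Hypothetical attribute values, for illustration only:
    print(candidate_thresholds([0.3, 1.1, 1.1, 2.4, 3.0]))  # [0.7, 1.75, 2.7]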