DECISION TREES
Overview
Decision Trees are supervised learning algorithms used for both classification and regression. They build a model that predicts target values by learning simple decision rules inferred from the data features; the rules are organized in a tree-like structure.
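As a concrete starting point, here is a minimal sketch of fitting a decision tree classifier with scikit-learn (assumed available); the dataset and hyperparameters are arbitrary choices for illustration:

    # Minimal sketch: train a decision tree classifier on the Iris dataset
    # (an illustrative choice; any tabular dataset works).
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
    clf.fit(X_train, y_train)
    print("Test accuracy:", clf.score(X_test, y_test))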
Tree Components
- Root Node - Top decision point, covering the full dataset
- Internal Nodes - Intermediate decision points that each test a feature
- Leaf Nodes - Terminal nodes holding the final predictions
- Branches - Paths corresponding to the outcomes of a test
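These components map naturally onto a small recursive data structure. The sketch below shows one illustrative way to represent them in Python (the Node class and its field names are hypothetical, not from any library):

    from dataclasses import dataclass
    from typing import Any, Optional

    @dataclass
    class Node:
        # Leaf nodes carry a prediction; internal nodes carry a split rule.
        prediction: Optional[Any] = None    # set only on leaf nodes
        feature: Optional[int] = None       # feature index tested (internal nodes)
        threshold: Optional[float] = None   # split threshold (internal nodes)
        left: Optional["Node"] = None       # branch taken when value <= threshold
        right: Optional["Node"] = None      # branch taken when value > threshold

        def is_leaf(self) -> bool:
            return self.prediction is not None

The root is simply the topmost Node, and each left/right link is a branch.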
Information Gain
Information Gain = H(S) - Σ (|S_v| / |S|) * H(S_v)
Entropy: H(S) = -Σ p_i * log2(p_i)
Gini Impurity: Gini(S) = 1 - Σ (p_i)²

Algorithm:
1. Calculate the impurity of the dataset
2. For each candidate feature, calculate the weighted impurity after splitting on it
3. Choose the feature with the highest information gain
4. Recursively build subtrees on the resulting partitions until a stopping criterion is met (e.g., the node is pure or a maximum depth is reached)
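These formulas translate directly into code. Below is a minimal sketch in Python using NumPy (an assumption; the function names entropy, gini, and information_gain are illustrative, not from any particular library):

    import numpy as np

    def entropy(labels):
        # H(S) = -Σ p_i * log2(p_i), summed over the classes present in S
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def gini(labels):
        # Gini(S) = 1 - Σ (p_i)²
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - np.sum(p ** 2)

    def information_gain(labels, subsets, impurity=entropy):
        # IG = H(S) - Σ (|S_v| / |S|) * H(S_v), where the subsets S_v
        # are the partitions produced by splitting on a feature.
        n = len(labels)
        weighted = sum(len(s) / n * impurity(s) for s in subsets)
        return impurity(labels) - weighted

    # Example: splitting [A, A, B, B] into two pure halves recovers
    # the full 1 bit of entropy as information gain.
    labels = np.array(["A", "A", "B", "B"])
    subsets = [np.array(["A", "A"]), np.array(["B", "B"])]
    print(information_gain(labels, subsets))  # 1.0

Gini and entropy usually yield similar trees; Gini is slightly cheaper to compute (no logarithm), which is why CART defaults to it, while ID3 and C4.5 use entropy-based information gain.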
Advantages
- Easy to interpret and visualize
- No feature scaling or normalization needed
- Can handle missing values (in some implementations, e.g., C4.5)
- Performs implicit feature selection, since uninformative features are never chosen for splits