Naive Bayes

#What it does

  • Based on [[Bayes’ Theorem]]
  • It assumes that features are conditionally independent given the class label.
  • Simplifies computation and often performs well in practice, despite the naive assumption.
$$ P(C|X) = \frac{P(X|C) \cdot P(C)}{P(X)} $$
  • $P(C|X)$: Posterior probability (probability of the class given the features).
  • $P(X|C)$: Likelihood (probability of the features given the class).
  • $P(C)$: Prior probability (probability of the class occurring).
  • $P(X)$: Evidence (probability of the features).
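
A quick worked example of the theorem as a minimal Python sketch. The spam-filter framing and all probabilities are made-up illustrative values, not real statistics:

```python
# Toy application of Bayes' theorem: P(spam | the word "free" appears).
# All numbers are invented for illustration.

p_spam = 0.3              # P(C): prior probability that an email is spam
p_free_given_spam = 0.6   # P(X|C): likelihood of "free" appearing in spam
p_free_given_ham = 0.1    # likelihood of "free" appearing in non-spam

# P(X): evidence, expanded via the law of total probability
p_free = p_free_given_spam * p_spam + p_free_given_ham * (1 - p_spam)

# P(C|X): posterior via Bayes' theorem
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(f"P(spam | 'free') = {p_spam_given_free:.2f}")  # 0.72
```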

#How it works

  • Three common variants:
    • Gaussian (continuous features)
    • Multinomial (count/frequency features, e.g. word counts)
    • Bernoulli (binary features)
  • If a feature value never appears in the training data for a class, its likelihood (and therefore the whole posterior) for that class becomes zero; this is addressed with Laplace smoothing.
  • Classification picks the class with the highest posterior probability (the MAP decision rule), as shown in the sketch after this list.
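
The sketch below ties these pieces together: a minimal from-scratch multinomial Naive Bayes with Laplace smoothing, fit on a toy word-count corpus. The data, function names, and `alpha=1.0` default are assumptions made for illustration, not a reference implementation:

```python
import numpy as np

def fit(X, y, alpha=1.0):
    """X: (n_docs, n_words) count matrix; y: class labels.
    Returns classes, log-priors, and per-class log-likelihoods."""
    classes = np.unique(y)
    log_prior = np.log(np.array([np.mean(y == c) for c in classes]))
    log_likelihood = []
    for c in classes:
        counts = X[y == c].sum(axis=0)  # total count of each word in class c
        # Laplace smoothing: adding alpha keeps words unseen in a class
        # from collapsing that class's posterior to zero
        probs = (counts + alpha) / (counts.sum() + alpha * X.shape[1])
        log_likelihood.append(np.log(probs))
    return classes, log_prior, np.array(log_likelihood)

def predict(X, classes, log_prior, log_likelihood):
    # log P(C|X) is proportional to log P(C) + sum_i x_i * log P(word_i|C);
    # the argmax over classes gives the MAP prediction
    scores = X @ log_likelihood.T + log_prior
    return classes[np.argmax(scores, axis=1)]

# Toy corpus: columns are counts of the words ["free", "win", "meeting"]
X = np.array([[3, 2, 0], [2, 1, 0], [0, 0, 3], [0, 1, 2]])
y = np.array(["spam", "spam", "ham", "ham"])

classes, log_prior, log_lik = fit(X, y)
print(predict(np.array([[1, 1, 0]]), classes, log_prior, log_lik))  # ['spam']
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and $P(X)$ can be dropped entirely since it is the same for every class.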

#Preconditions

  • Features should be approximately conditionally independent given the class; this rarely holds exactly in the real world, but the model is often robust to violations.
  • Requires minimal training data and has few parameters to tune.

#Evaluation

#Advantages

  • Fast and efficient
  • Simple to implement
  • Performs well with small data
  • Handles high-dimensional data

#Limitations

  • Assumes independence of features
  • Assigns zero probability to feature values unseen in training (mitigated by Laplace smoothing)