A Comparison of Event Models for Naive Bayes Text Classification

McCallum & Nigam 1998

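Contrasts two event models for naive Bayes text classification. Both assume words are independent given the class:

  • multi-variate Bernoulli model -- binary word features (each vocabulary word is present or absent; counts are ignored)
  • multinomial model -- unigram language model w/ integer word counts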
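
To make the difference concrete, here is a minimal sketch of how each event model scores a document given one class. The function names and the toy probabilities are made up for illustration and are not from the paper. Parameter estimation and smoothing are left out, and the multinomial score drops the length and multinomial-coefficient terms, which do not depend on the class.

```python
import math
from collections import Counter


def bernoulli_log_likelihood(doc_tokens, word_probs):
    """log P(doc | class) under the multi-variate Bernoulli event model.

    word_probs maps each vocabulary word to P(word present | class).
    Every vocabulary word contributes a term: log p if it appears in the
    document, log(1 - p) if it does not. Repeated occurrences are ignored.
    """
    present = set(doc_tokens)
    total = 0.0
    for word, p in word_probs.items():
        total += math.log(p) if word in present else math.log(1.0 - p)
    return total


def multinomial_log_likelihood(doc_tokens, word_probs):
    """log P(doc | class) under the multinomial event model, up to
    class-independent terms.

    word_probs maps each vocabulary word to P(word | class) and sums to 1.
    Each occurrence of a word adds log p, so counts matter; words that do
    not appear in the document contribute nothing.
    """
    counts = Counter(t for t in doc_tokens if t in word_probs)
    return sum(n * math.log(word_probs[w]) for w, n in counts.items())


# Toy parameters for a single class (illustrative values only).
bernoulli_probs = {"ball": 0.5, "game": 0.5, "vote": 0.5}
multinomial_probs = {"ball": 0.5, "game": 0.25, "vote": 0.25}

doc = ["ball", "ball", "game"]
print(bernoulli_log_likelihood(doc, bernoulli_probs))
print(multinomial_log_likelihood(doc, multinomial_probs))
```

Repeating "ball" a third time would leave the Bernoulli score unchanged but lower the multinomial score by another log 0.5, while the absent word "vote" affects only the Bernoulli score.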

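The paper examines how these two models differ and when each works better. In short, the authors prefer the multinomial model: the multi-variate Bernoulli model does well with small vocabularies, but the multinomial model usually does better as the vocabulary grows.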