Topic: Are characters such as ",", ".", ";", etc. ignored?

Anders L Madsen

Characters such as ",", ".", ";", etc. are ignored when creating a Boolean model from unstructured data (as in the mail classification example), unless the <-wpunct> option is specified. When that option is specified, these characters are treated as part of the word that precedes them, so "offer," and "offer" count as two different words.

Whitespace should be used as the word separator when building a Boolean model.
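
To make the difference concrete, here is a minimal Python sketch of the two behaviours described above. It is not HUGIN's implementation: the names `tokenize`, `boolean_features`, `keep_punct` and the `PUNCT` character set are all hypothetical and chosen only for illustration, and the exact set of characters HUGIN ignores is an assumption, since the post names only ",", "." and ";".

```python
# Illustrative sketch only; this does not call or reproduce HUGIN code.
# PUNCT is an assumed set of punctuation characters to ignore.
PUNCT = ",.;:!?"


def tokenize(text, keep_punct=False):
    """Split text on whitespace; drop punctuation unless keep_punct is True."""
    words = text.split()  # whitespace is the only word separator
    if keep_punct:
        # Mimics the effect described for <-wpunct>: punctuation stays
        # attached to the word that precedes it.
        return words
    table = str.maketrans("", "", PUNCT)
    stripped = (w.translate(table) for w in words)
    return [w for w in stripped if w]  # discard tokens that were only punctuation


def boolean_features(text, vocabulary, keep_punct=False):
    """Map each vocabulary word to True if it occurs in the text."""
    present = set(tokenize(text, keep_punct))
    return {word: word in present for word in vocabulary}


if __name__ == "__main__":
    mail = "Cheap pills, buy now."
    print(tokenize(mail))
    print(tokenize(mail, keep_punct=True))
    print(boolean_features(mail, ["pills", "pills,", "now"]))
```

With punctuation ignored, "pills," and "pills" map to the same word. With it kept, they become separate words, which usually makes the vocabulary larger and each Boolean feature sparser.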