Topic: Are characters such as ”,”, ”.”, ”;”, etc. ignored?

Anders L Madsen (HUGIN Expert)

Characters such as ”,”, ”.”, ”;”, etc. are ignored when creating a Boolean model from unstructured data (as in the mail classification example) unless the <-wpunct> option is specified. If the option is specified, each such character is treated as part of the word that precedes it.

Whitespace should be used as the word separator when building a Boolean model.
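
To make the difference concrete, here is a rough Python sketch of the behaviour described above. It is not HUGIN code and does not call any HUGIN API: the names tokenize, boolean_features and keep_punct are made up for illustration, keep_punct only mimics the effect of the <-wpunct> option, and the sketch assumes that "ignored" means the punctuation characters are simply removed from each word and that Python's string.punctuation is a reasonable stand-in for the affected characters.

```python
import string


def tokenize(text, keep_punct=False):
    """Split text on whitespace (hypothetical helper, not part of HUGIN).

    keep_punct=False: punctuation characters are removed from every word.
    keep_punct=True:  punctuation stays attached to the word it follows,
                      mimicking the effect described for <-wpunct>.
    """
    words = text.split()  # whitespace is the only word separator
    if keep_punct:
        return words
    # Assumption: "ignored" means the characters are deleted from the word.
    table = str.maketrans("", "", string.punctuation)
    stripped = [w.translate(table) for w in words]
    return [w for w in stripped if w]  # drop tokens that were pure punctuation


def boolean_features(text, vocabulary, keep_punct=False):
    """Map each vocabulary word to True/False: does it occur in text?"""
    present = set(tokenize(text, keep_punct))
    return {word: word in present for word in vocabulary}


if __name__ == "__main__":
    mail = "Free offer, act now."
    vocab = ["offer", "offer,", "now"]
    print(boolean_features(mail, vocab))
    print(boolean_features(mail, vocab, keep_punct=True))
```

In this sketch the word "offer" is found in the mail only when punctuation is ignored. With keep_punct=True the token is "offer," (comma included), so a vocabulary entry without the comma no longer matches.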