forum.hugin.com

User Forums => Patterns & Predictions => FAQ => Topic started by: Anders L Madsen on August 22, 2007, 08:49:49

Title: Are characters such as ",", ".", ";", etc. ignored?
Post by: Anders L Madsen on August 22, 2007, 08:49:49
Characters such as ",", ".", and ";" are ignored when creating a Boolean model from unstructured data (as in the mail classification example), unless the <-wpunct> option is specified. When that option is specified, these characters are treated as part of the word that precedes them.

Whitespace should be used as the word separator when building a Boolean model.
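
To make the difference concrete, here is a small Python sketch of the behaviour described above. It is only an illustration, not the tool's actual code: the function name tokenize and the keep_punct flag are hypothetical stand-ins for running with and without <-wpunct>, and the set of ignored characters is assumed to be just ",", "." and ";" (the post does not give the full list). It also assumes that ignored characters are dropped wherever they occur in a word.

```python
# Illustrative sketch only; not the tool's implementation.
# Assumption: the ignored characters are exactly ",", "." and ";".
IGNORED_CHARS = ",.;"


def tokenize(text, keep_punct=False):
    """Split text into words using whitespace as the only separator.

    keep_punct is a hypothetical stand-in for the <-wpunct> option:
    when True, punctuation stays attached to the preceding word;
    when False, the characters in IGNORED_CHARS are dropped.
    """
    words = text.split()  # splits on any run of whitespace
    if keep_punct:
        return words
    # Remove the ignored characters and discard words that become empty.
    table = str.maketrans("", "", IGNORED_CHARS)
    stripped = (word.translate(table) for word in words)
    return [word for word in stripped if word]


def boolean_features(text, vocabulary, keep_punct=False):
    """Map each vocabulary word to True if it occurs in the text."""
    present = set(tokenize(text, keep_punct))
    return {word: word in present for word in vocabulary}
```

With the input "Hello, world. Buy now; cheap", tokenize returns ['Hello', 'world', 'Buy', 'now', 'cheap'] by default and ['Hello,', 'world.', 'Buy', 'now;', 'cheap'] with keep_punct=True. In the second case "Hello," and "Hello" would count as two different words in the model.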