Topic: Analysing AIC, BIC and LL scores

LetyR

Analysing AIC, BIC and LL scores
« on: February 06, 2015, 21:52:04 »
Hello everyone,

I am using Hugin Lite for academic purposes and I have some questions, probably because I am not familiar with Bayesian networks. My doubts concern the analysis of the results obtained after training the model: when I test the network with a test set (of examples that don't belong to the training set), how should I interpret the AIC, BIC and log-likelihood scores? When are they consistent or good enough? Are they meaningful on their own, or only when compared with the scores of another model? And when comparing with another model, how do I choose the best one?
For example, if I got

AIC = -1225.1667153249246
BIC = -1315.0450867415138
LogLikelihood = -1156.1667153249246

can I deduce something (good or bad) about my network?

Another part of the analysis is the ROC curve: what role does it play in evaluating the quality of the model?

If someone can help or suggest a book, guide or similar resource, that would be great.

Thank you,

L.

Anders L Madsen

Re: Analysing AIC, BIC and LL scores
« Reply #1 on: February 24, 2015, 14:34:54 »
Quote
My doubts concern the analysis of the results obtained after training the model: when I test the network with a test set (of examples that don't belong to the training set), how should I interpret the AIC, BIC and log-likelihood scores? When are they consistent or good enough? Are they meaningful on their own, or only when compared with the scores of another model? And when comparing with another model, how do I choose the best one?

The AIC, BIC and log-likelihood (LL) scores are criteria for selecting among a set of candidate models for a data set. The LL measures how well a model fits the data, and it can be increased simply by making the model more complex. So the LL on its own should only be used to compare models of the same complexity. Both the BIC and AIC scores are the LL minus a penalty for model complexity; the size of the penalty differs between BIC and AIC.

Use these scores to choose among candidate models as representations of the same data set. The higher the score, the better the model represents the data. (Some textbooks define AIC and BIC with the opposite sign, in which case lower is better.)
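To make the penalty concrete, here is a minimal Python sketch. It assumes the scores are defined as AIC = LL - k and BIC = LL - (k/2) ln N, where k is the number of free parameters and N is the number of cases. The thread does not state these formulas; they are an assumption, chosen because they match the "higher is better" convention and reproduce the numbers in the question. The values k = 69 and N = 100 are likewise inferred from the posted scores, not given by the poster, and the function names are made up for this example.

```python
import math

def aic_score(log_likelihood, num_params):
    # Assumed convention: LL minus one unit of penalty per free parameter.
    return log_likelihood - num_params

def bic_score(log_likelihood, num_params, num_cases):
    # Assumed convention: the penalty grows with the log of the number of cases.
    return log_likelihood - 0.5 * num_params * math.log(num_cases)

# LL taken from the question; k and n are inferred, not stated in the thread.
ll = -1156.1667153249246
k = 69
n = 100

print("AIC:", aic_score(ll, k))
print("BIC:", bic_score(ll, k, n))
```

Under these assumptions both scores come out equal to the posted values up to floating-point rounding. To choose between candidate models, compute the same score for each model on the same data and keep the model with the highest value.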

Quote
can I deduce something (good or bad) about my network?
No, not as far as I know. The scores are relative: use them to compare models and select the model with the highest score.

Quote
Another part of the analysis is the ROC curve: what role does it play in evaluating the quality of the model?
The ROC curve can be used to assess the performance of a classification model, i.e., a model that is used to assign a class label to each instance in a set. The curve shows the true positive rate as a function of the false positive rate, and the area under it (AUC) summarises classification performance in a single number.
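As a rough illustration of how such a curve is built, the following Python sketch sweeps a decision threshold over the model's scores for the positive class and records the (false positive rate, true positive rate) pair at each step. This is not HUGIN's implementation. The function names and the six labels and scores are toy values invented for the example, and the code assumes both classes occur in the test set.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by lowering a decision threshold.

    labels: 1 for the positive class, 0 for the negative class.
    scores: the model's probability (or other score) for the positive class.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    # Visit the cases from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last case in a group of tied scores.
        is_last = i == len(ranked) - 1
        if is_last or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy test set (invented values): true labels and predicted scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(labels, scores)
print(curve)
print("AUC:", auc(curve))
```

An AUC of 1.0 means every positive case is ranked above every negative one, while 0.5 corresponds to random ranking. Unlike AIC, BIC and LL, this measure can be read on its own.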

You can find some introductory material on these concepts on Wikipedia.
HUGIN EXPERT A/S