16

**Patterns & Predictions / End of Life**

« **on:** September 21, 2011, 14:53:00 »

This product has reached end of life.



17

Using the HUGIN .NET API it is possible to make use of HUGIN functionality in Matlab. The appropriate HUGIN dll file should be loaded as an assembly into Matlab.

The following example illustrates how to load the StudFarm example from the Samples/oobn directory of the HUGIN installation and print the belief in node John.offspring:


```
ghapi = NET.addAssembly('C:\Program Files\Hugin Expert\Hugin Researcher 7.4\HDE7.4CS\Lib\hugincs-7.4-2.0-x64.dll');
cc = HAPI.ClassCollection();
cc.ParseClasses('C:\Program Files\Hugin Expert\Hugin Researcher 7.4\Samples\oobn\StudFarm.net', HAPI.DefaultClassParseListener);
c = cc.GetClassByName('StudFarm');
d = c.CreateDomain();
d.Compile();
d.GetNodeByName('John.offspring').GetBelief(0)

ans =

    0.2734
```

Notice that all classes of the example are stored as a class collection in a single file.

18

The *adaptation* algorithm processes the cases individually. It assumes that the conditional distributions (one for each parent configuration) follow the Dirichlet distribution. When a new case arrives, it computes the updated distribution. However, if the new case is incomplete, the updated distribution is in general a mixture of Dirichlet distributions. In order to avoid exponential growth of the number of terms of the distribution, the updated distribution is replaced by a single Dirichlet distribution that has the same means and average variance (of the Dirichlet parameters) as the correct updated distribution.

The *EM* algorithm, on the other hand, considers all cases before it updates the CPTs. This is repeated until the CPTs converge to a stable maximum (stability is determined by a sufficiently small change of the log-likelihood between two iterations).

If all cases are complete, there is usually no difference between the methods -- with one exception: the adaptation algorithm needs a valid starting distribution (this means that experience counts must be positive), whereas the EM algorithm is happy to start with experience counts equal to zero.

The adaptation algorithm is primarily intended for continuously updating the CPT parameters when a system is already online (for example, it is useful if the parameters can change over time).

The EM algorithm will generally give the best results, but it is also more costly to run (since it processes all cases, and often does so several times). It is therefore not suitable if you want online updating of CPT parameters.

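The moment-matching step of the adaptation algorithm can be sketched in a few lines. The sketch below is an illustration of the idea, not HUGIN's implementation: it collapses a mixture of Dirichlet distributions into a single Dirichlet with the same means and the same average variance, via an equivalent sample size derived from those moments.

```python
def dirichlet_moments(alpha):
    """Mean and variance of each coordinate of a Dirichlet(alpha)."""
    s = sum(alpha)
    means = [a / s for a in alpha]
    variances = [m * (1 - m) / (s + 1) for m in means]
    return means, variances

def collapse_mixture(weights, alphas):
    """Replace a mixture of Dirichlet distributions with a single
    Dirichlet that has the same means and the same average variance."""
    n = len(alphas[0])
    mix_mean = [0.0] * n
    mix_second = [0.0] * n  # E[p_i^2] under the mixture
    for w, alpha in zip(weights, alphas):
        means, variances = dirichlet_moments(alpha)
        for i in range(n):
            mix_mean[i] += w * means[i]
            mix_second[i] += w * (variances[i] + means[i] ** 2)
    mix_var = [s2 - m * m for s2, m in zip(mix_second, mix_mean)]
    # Choose the equivalent sample size s so that the average variance
    # of Dirichlet(s * mix_mean), i.e. avg_i m_i(1 - m_i)/(s + 1),
    # equals the mixture's average variance.
    s = sum(m * (1 - m) for m in mix_mean) / sum(mix_var) - 1
    return [s * m for m in mix_mean]
```

A degenerate "mixture" of one component is returned unchanged, and the means of any mixture are preserved exactly; only the spread is approximated.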

19

Quote

Unfortunately I am having some issues running hugin on my ubuntu 10.04 system:

```
anders@ubuntu:bin$ ./hugin
Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class COM.hugin.HAPI.Native.HAPI
        at COM.hugin.HAPI.NetworkModel.getNativeID(NetworkModel.java:527)
        at COM.hugin.HGUI.HuginGUIUtils.setRunningAndValidateAPIVersion(HuginGUIUtils.java:83)
        at COM.hugin.HGUI.Hugin.<init>(Hugin.java:193)
        at COM.hugin.HGUI.Hugin.main(Hugin.java:322)
...
```


The README file in the HUGIN Lite distribution contains more details.

20

Using the HUGIN .NET API it is possible to make use of HUGIN functionality in Matlab. The appropriate HUGIN *dll* file should be loaded as an *assembly* into Matlab.

The following example illustrates how to load the ChestClinic example from the *Samples* directory of the HUGIN installation and print the belief in node *L*:


```
ghapi = NET.addAssembly('C:\Program Files\Hugin Expert\Hugin Researcher 7.4\HDE7.4CS\Lib\hugincs-7.4-2.0-x64.dll');
d = HAPI.Domain('C:\Program Files\Hugin Expert\Hugin Researcher 7.4\Samples\ChestClinic.net', HAPI.DefaultClassParseListener);
L = d.GetNodeByName('L');
d.Compile();
L.GetBelief(0)

ans =

    0.0550

L.GetBelief(1)

ans =

    0.9450
```

The next code fragment shows how to select a state of a node, propagate the evidence and show the posterior beliefs in node *L*:

```
X = d.GetNodeByName('X');
X.SelectState(0);
d.Propagate(HAPI.Equilibrium.H_EQUILIBRIUM_SUM, HAPI.EvidenceMode.H_EVIDENCE_MODE_NORMAL);
L.GetBelief(0)

ans =

    0.4887

L.GetBelief(1)

ans =

    0.5113
```

21

Yes, the HUGIN GUI allows the user to save and load structural constraints between nodes. Constraints can be saved to and loaded from a file in the “Structure Constraints” step of the Learning Wizard. Use the “save network with constraints” button to save constraints in a network specification file and the “import model information” button to load constraints from a network specification file.

For instance, if you want to disallow a certain direction on a link, then you would use the “No Arrow Constraint Tool” to define a constraint that node X is not allowed to be a parent of node Y. You can add multiple constraints in one operation by selecting a set of nodes and pressing the right mouse button on another node.

The constraints are saved in a HUGIN .net file, which can be loaded in the “Structure Constraints” step of the Learning Wizard. The constraints are defined as attributes on nodes in the .net file. For instance, the “HR_Constraint_X” attribute in this definition of node D:

```
node D
{
    label = "D";
    position = (210 110);
    states = ("yes" "no");
    HR_Constraint_X = "NoArrow";
}
```

specifies that node D is not allowed to be a parent of X.


22

There are important differences related to influence diagrams between HUGIN 6.9 and HUGIN 7.x.

In HUGIN 7.0 we introduced "LIMIDs" (Lauritzen & Nilsson, 2001). The support for LIMIDs changes the semantics of the information arcs in the diagram and the solution algorithm.

In the traditional influence diagram (HUGIN 6.9) we assume the decision maker to be non-forgetting and use the Jensen, Jensen & Dittmer (1994) algorithm to solve the diagram. This implies that the decision maker is assumed to recall all past observations and decisions. By this assumption, some information arcs are assumed present. E.g., for the last decision, all observations made prior to the first decision are assumed known due to non-forgetting. The diagram is solved by solving for each decision in reverse time ordering.

In the LIMID, all information arcs should be explicitly drawn in the diagram. There is no assumption about perfect recall (non-forgetting). Thus, an information arc from node X into the first decision does not imply that we assume an information arc from node X into any later decision to be implicitly present. Hence, we may model that the decision maker is forgetful.

Also, the solution algorithm has changed. The solution algorithm for LIMIDs is Single Policy Updating, where we iteratively solve for each decision (in reverse time order, if an ordering is present). The user has to press the "SPU" button in the toolbar after compiling the network in order to compute updated decision policies.

To change your model into a traditional influence diagram (if you want), you will have to add information arcs to the diagram.


23

This is a network for predicting the risk of ship-ship collisions. The network is described in "Risk-Based Ship Design: Methods, tools and applications", Papanikolaou, Editor, (2009).

The network was developed by Peter Friis Hansen et al.


24

The main objective of this project is to develop a cost-effective vaccination strategy for poultry production, thereby reducing the colonization of Campylobacter in both parent and broiler flocks.

Vaccination is one of the few measures that can be applied to reduce the colonization of Campylobacter in free-range organic poultry. The project aims to identify a vaccination strategy based on reduction, since risk assessment studies have shown that a 2-log reduction of colonization in poultry can reduce the risk of human infection by a factor of 30.

Different candidate vaccines will be optimized to obtain a cost-effective production of low-risk poultry if feasible.

The CamVac project is supported by The Danish Council for Strategic Research.

Visit project web site http://www.camvac.dk

Visit HUGIN project web site http://camvac.hugin.dk


25

For interval nodes, the values specified for state i and state i + 1 are the left and right endpoints of the interval denoted by state i. The dividing point between two neighboring intervals belongs to the interval to the right of the dividing point, so intervals have the form [0; 1[ (closed on the left, open on the right). For example, the values 0, 1, 5, 10 define the three intervals [0; 1[, [1; 5[, and [5; 10[.

26

Here is an example that shows how to construct a simple OOBN model using the HUGIN ActiveX server API:

```
Sub oobn_example()
    Dim ClassColl As ClassCollection
    Dim hClass As Class, hMainClass As Class
    Dim NodeA As Node, InputNode As Node, OutputNode As Node, NodeB As Node
    Dim instanceNode As Node

    On Error GoTo errorhandler

    ' Create class collection
    Set ClassColl = GetNewClassCollection

    ' Create class
    Set hClass = ClassColl.GetNewClass()
    hClass.name = "hClass"
    MsgBox ("Class created!")

    ' Create some nodes
    Set NodeA = hClass.GetNewNode(hCategoryChance, hKindDiscrete)
    NodeA.name = "NodeA"
    Set InputNode = hClass.GetNewNode(hCategoryChance, hKindDiscrete)
    InputNode.name = "Input"
    InputNode.AddToInputs
    Set OutputNode = hClass.GetNewNode(hCategoryChance, hKindDiscrete)
    OutputNode.name = "OutputNode"
    OutputNode.AddToOutputs
    MsgBox ("Nodes created!")

    ' Add structure: Input -> A -> Output
    Call NodeA.AddParent(InputNode)
    Call OutputNode.AddParent(NodeA)
    MsgBox ("Structure created!")

    ' Create main class
    Set hMainClass = ClassColl.GetNewClass()
    hMainClass.name = "main_class"

    ' Create instance of hClass in the main class
    Set instanceNode = hMainClass.AddNewInstance(hClass)

    ' Create node in hMainClass
    Set NodeB = hMainClass.GetNewNode(hCategoryChance, hKindDiscrete)
    NodeB.name = "NodeB"

    ' Link NodeB to instanceNode.Input
    Call instanceNode.SetInput(InputNode, NodeB)
    MsgBox ("Main class created!")

    ' Save class collection
    ClassColl.SaveAsNet ("C:\oobn_example.oobn")
    Exit Sub

errorhandler:
    MsgBox ("Error: " & Err.Description)
    Exit Sub
End Sub
```

27

The HUGIN API is thread-safe and can be used in multithreaded applications. You may find more information on multiprocessing and multithreaded applications in section 1.8 of the HUGIN (C) API Reference Manual.

The first part of section 1.8 reads:

*The HUGIN API can be used safely in a multithreaded application. The major obstacle to thread-safety is shared data—for example, global variables. The only global variable in the HUGIN API is the error code variable. When the HUGIN API is used in a multithreaded application, an error code variable is maintained for each thread.*

The section also includes some advice on the use of HUGIN in multithreaded applications:

*The most common usage of the HUGIN API in a multithreaded application will most likely be to have one or more dedicated threads to process their own domains (e.g., insert and propagate evidence, and retrieve new beliefs). In this scenario, there is no need (and is also unnecessarily inefficient) to protect each node or domain by a mutex (mutual exclusion) variable, since only one thread has access to the domain. However, if there is a need for two threads to access a common domain, a mutex must be explicitly used.*

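The shared-domain pattern recommended above can be sketched generically. The snippet below is a Python illustration, not HUGIN API code; the `select_state`, `propagate`, and `get_belief` names are placeholders for whatever the real API provides. The point it shows is that when two threads access a common domain, every logical operation goes through a single mutex.

```python
import threading

class GuardedDomain:
    """Wrap a domain shared by several threads behind one mutex.
    The wrapped object is a stand-in; with the real HUGIN API you would
    wrap the domain object obtained from the API in the same way."""

    def __init__(self, domain):
        self._domain = domain
        self._lock = threading.Lock()

    def insert_and_propagate(self, node_name, state):
        # One critical section per logical operation, so evidence
        # insertion and propagation never interleave across threads.
        with self._lock:
            self._domain.select_state(node_name, state)
            self._domain.propagate()

    def belief(self, node_name, state):
        with self._lock:
            return self._domain.get_belief(node_name, state)
```

When each thread owns its own domain, as the manual recommends for the common case, no wrapper is needed at all.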

28

The structure learning algorithms (the PC and NPC algorithms) use the results of tests of the form "Are X and Y independent given S?", where S is a set of variables of size less than or equal to 3, to construct a directed graph. In a specific test, we only use the cases that have values for all the variables involved (that is, X, Y, and the variables in S).

The EM algorithm uses inference to *guess* the values of the unobserved variables (that is, the conditional distribution for an unobserved variable given the observed variables of the case is used as the "guess" for the unobserved variable). The initial distribution (formed from the priors specified for the variables of the network) is used in the first iteration of the EM algorithm. As the final step of an iteration of the EM algorithm, new CPTs for all variables are computed, and these CPTs are used for inference in the next iteration.

This EM algorithm is also described in Cowell et al. The adaptation algorithm assumes Dirichlet priors formed from the prior distribution and the experience count. Moreover, in order to avoid exponential growth in the number of terms of the updated distribution, an approximation is used: The approximation is a Dirichlet distribution with the same means and variance as the true distribution. This means that the order in which the cases are processed affects the final distribution (because it affects the approximations performed). [The actual order of the cases does not matter for the EM algorithm.]

For further information on Adaptation, see the book by Cowell et al (1999). Or see the original paper by Spiegelhalter and Lauritzen from 1990. [Full citation details can be found in the Hugin API Reference Manual.]

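The case-selection rule used by the independence tests above can be stated in a few lines. In this sketch (an illustration only), cases are dictionaries and a missing value is represented as `None`; neither detail reflects HUGIN's actual data format.

```python
def usable_cases(cases, x, y, s):
    """Keep only the cases that have a value for every variable used in
    the test "Are x and y independent given s?" -- that is, x, y, and
    all conditioning variables in s. Missing values are None."""
    needed = [x, y] + list(s)
    return [case for case in cases
            if all(case.get(v) is not None for v in needed)]
```

Each test therefore runs on its own subset of the data; two tests involving different variables may use different (overlapping) sets of cases.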

29

Quote

How do you determine what the prior experience values should be if you derive your prior CPTs directly from expert elicitation (and not from a dataset, in which case the prior experience values would correspond to the number of data points)?

The experience counts you assign to the expert knowledge determine the weight of the prior CPTs in the learning process. For example, if you provide an experience count for a given parent configuration equal to the number of data cases with that configuration, then the prior CPT and the distribution determined by the data (for that configuration) have equal weight.
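Numerically, the weighting works like a Dirichlet-style update of a single CPT row. The following sketch illustrates the arithmetic only; it is not HUGIN's internal code.

```python
def updated_cpt_row(prior_probs, prior_count, data_counts):
    """Combine one prior CPT row (one parent configuration) that has
    experience count `prior_count` with the observed state counts for
    that same configuration. Returns the posterior-mean probabilities."""
    n_data = sum(data_counts)
    total = prior_count + n_data
    return [(prior_count * p + c) / total
            for p, c in zip(prior_probs, data_counts)]
```

Setting `prior_count` equal to the number of matching cases gives the prior row and the empirical frequencies exactly equal weight, as described above.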

30

Quote

How does HUGIN treat missing data in parameter learning? When I include missing data in the data input files and use the EM algorithm to learn the CPTs, I get non-integer experience values. When I remove the rows with missing data, I get only integer experience values. How does HUGIN extrapolate to calculate these non-integer experience values?

At the start of an iteration, we have CPTs for all nodes in the network (the CPTs are updated at the end of each iteration, so the CPTs for the next iteration will be better). An incomplete case is entered as evidence in the network, and the evidence is propagated.

If not all parents of a node in the case are instantiated, we get a probability distribution over the parents. Each probability in this distribution is the contribution of the case to the experience count of the corresponding parent configuration. This is where the non-integer counts arise (if all cases are complete, the probabilities will all be 0 or 1).
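The resulting fractional counts can be reproduced with a toy accumulator. The sketch below (an illustration, not HUGIN code) adds each case's posterior distribution over the parent configurations to the experience counts; complete cases contribute probabilities of 0 or 1, which is why fully observed data yields integer counts.

```python
from collections import defaultdict

def accumulate_experience(case_posteriors):
    """Each entry is one case's posterior distribution over the parent
    configurations; its probabilities are added as fractional counts."""
    counts = defaultdict(float)
    for posterior in case_posteriors:
        for config, p in posterior.items():
            counts[config] += p
    return dict(counts)
```

Two complete cases plus one incomplete case whose posterior splits 0.7/0.3 over the parent states give experience counts of 1.7 and 1.3, matching the non-integer values observed in the question.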