Topic: CPT size is small but "Hugin ran out of memory". A large number of nodes?

Offline Sophie

I have a Bayesian network with 255 nodes; each node has 7 states.
8 nodes have no parents; each of the other 247 nodes has exactly 2 parents, so its CPT has 7^3 = 343 entries.
The total CPT size for the Bayesian network is 84777 entries. Since a "double" occupies 8 bytes of memory, the total size is 84777 * 8 = 678216 bytes, i.e. 662.3203125 KB. My computer has 1 GB of RAM. Why do I get the following error when I compile the network: "Hugin ran out of memory while carrying out some operation"? Is there an error in my memory calculation? I have also changed the triangulation method, and I have modified the virtual machine parameters "-Xmx1024m -Xss8192 -XX:MaxPermSize=256m", but I get the same error. I attach the network.

Let n be the number of root nodes (n = 8 in the case above). I also want to use this model with larger values of n, with each node having n+1 states. (With n nodes in the first level (root nodes), the Bayesian network has 2^n - 1 nodes: 2^n - n - 1 nodes have 2 parents and a CPT of size (n+1)^3, and n nodes have no parents.) Each node computes the maximum of its parents (see the attached file). How can I change the structure of the network? What can I do? My goal is to calculate the probability that a root node is in a given state when I enter evidence on one or more non-root nodes. Thanks!
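The counts above can be checked with a short sketch. The function name and its parameters are illustrative only and are not part of Hugin; the number of states is kept as a separate argument because the attached network uses 7 states with n = 8, while the generalization uses n+1 states.

```python
def network_footprint(n, states, bytes_per_entry=8):
    """Node count and summed CPT size for the subset-max network described above.

    n      -- number of root nodes
    states -- number of states per node
    Returns (total_nodes, total_cpt_entries, total_cpt_bytes).
    """
    total_nodes = 2**n - 1            # one node per non-empty subset of the roots
    child_nodes = total_nodes - n     # nodes with exactly 2 parents
    # A root CPT has `states` entries; a 2-parent CPT has states**3 entries.
    entries = n * states + child_nodes * states**3
    return total_nodes, entries, entries * bytes_per_entry


nodes, entries, size_bytes = network_footprint(n=8, states=7)
print(nodes, entries, size_bytes, size_bytes / 1024)
```

This only sums the CPT tables as specified in the model; it says nothing about the memory needed after compilation.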

Note: I am using Hugin Researcher Version 6.7.
« Last Edit: September 21, 2007, 17:32:14 by Sophie »

Offline Frank Jensen

  • HUGIN Expert
It is not just the sizes of the CPTs that affect the total clique size of the junction tree.  Typically, the structure of the network is the most important factor.  If the network structure is not sufficiently sparse, then the junction tree tends to have very big cliques.

Your network is "layered", and many (if not all) pairs of nodes in a given layer have a common child.  Due to the "moralization" step of the compilation process, this causes links to be created between the nodes of the pairs, resulting in very big cliques.
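As a minimal illustration of moralization, the sketch below "marries" the parents of every node in a toy DAG. The toy network (three roots, one child per pair) is hypothetical and much smaller than the attached network; the `moralize` helper is not a Hugin function.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edges of the moral graph of a DAG.

    parents -- dict mapping each node to the list of its parents.
    """
    edges = set()
    for child, pars in parents.items():
        # Keep every parent-child link, dropping its direction.
        for p in pars:
            edges.add(frozenset((p, child)))
        # Link every pair of parents that share this child.
        for a, b in combinations(pars, 2):
            edges.add(frozenset((a, b)))
    return edges


# Hypothetical toy layer: every pair of roots has a common child.
toy = {
    "A": [], "B": [], "C": [],
    "AB": ["A", "B"], "AC": ["A", "C"], "BC": ["B", "C"],
}
moral = moralize(toy)
print(sorted(tuple(sorted(e)) for e in moral))
```

In the toy example the three roots end up pairwise connected although the DAG has no link between them. When the same thing happens in every layer of a larger network, the connected groups grow into large cliques.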

The optimal triangulation of your network has a total clique size of 372870841 (times 8 bytes = approx. 3GB).
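The memory figure follows from simple arithmetic, sketched below. The helper names are illustrative; the only inputs taken from the thread are the total clique size, 8 bytes per entry, and 7 states per node.

```python
BYTES_PER_DOUBLE = 8


def clique_table_entries(num_nodes_in_clique, states_per_node=7):
    """Entries in the table of a clique whose nodes all have the same number of states."""
    return states_per_node ** num_nodes_in_clique


def table_bytes(entries):
    """Memory needed to store `entries` double-precision numbers."""
    return entries * BYTES_PER_DOUBLE


total_bytes = table_bytes(372870841)   # total clique size quoted above
total_gib = total_bytes / 2**30
print(total_bytes, round(total_gib, 2))

# A single clique of ten 7-state nodes already has 7**10 entries.
print(clique_table_entries(10))
```

Clique tables grow exponentially with the number of nodes in the clique, which is why the total can be thousands of times larger than the sum of the CPT sizes.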

It seems to me that your network represents 8 variables and all subsets (except the empty set) of these variables, and the network computes maxima of these subsets.  Do you need to explicitly represent all those subsets?  Or, in other words: Do you need to specify (as evidence) what the maximum of a given subset is?  And for all subsets?

Offline Frank Jensen

If you don't know in advance which subsets will be needed, then you basically need to represent all subsets.  At the least, for each subset, there must exist a node that represents that subset or a superset of the subset.  In the latter case, the evidence would be represented as "multi-state" evidence (that is, you select more than one state as being possible for the node).
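One way to picture multi-state evidence on a superset node is as a 0/1 vector over its states. The sketch below is an interpretation and not Hugin's API: it assumes states are numbered 0 to num_states-1 and relies only on the fact that the maximum of a superset is at least the maximum of any of its subsets.

```python
def superset_evidence(num_states, subset_max):
    """0/1 vector marking which states of max(superset) remain possible
    once max(subset) is known to equal subset_max.

    Assumes states are numbered 0 .. num_states-1 (an assumption made
    here for illustration only).
    """
    if not 0 <= subset_max < num_states:
        raise ValueError("subset_max must be a valid state index")
    # max(superset) >= max(subset), so every state below subset_max is ruled out.
    return [1 if s >= subset_max else 0 for s in range(num_states)]


evidence = superset_evidence(num_states=7, subset_max=4)
print(evidence)
```

Such evidence is weaker than observing the subset maximum itself, since it only excludes the states that have become impossible.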

However, as you have discovered, building one network that takes into account all possible subsets that you might want to specify evidence for results in a very big junction tree (with respect to total clique size).

Another approach would be to build the network when the evidence is known.  Since the network could be automatically constructed, this is most likely the most efficient approach.
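A rough sketch of such an automatic construction follows. It is one possible construction under stated assumptions, not the method used by Hugin or by the attached file: roots are named X0..X(n-1), and each observed subset is turned into a chain of two-parent max nodes so that no node has more than 2 parents. All names are hypothetical.

```python
def build_evidence_network(n, observed_subsets):
    """Build only the nodes needed for the subsets that carry evidence.

    n                -- number of root nodes, named X0 .. X(n-1)
    observed_subsets -- iterable of sets of root indices
    Returns a dict mapping each node name to the list of its parents.
    """
    parents = {f"X{i}": [] for i in range(n)}
    for subset in observed_subsets:
        members = sorted(subset)
        current = f"X{members[0]}"
        # Chain of binary max nodes: max(X_a, X_b), then max(previous, X_c), ...
        for k in range(1, len(members)):
            node = "max_" + "_".join(str(j) for j in members[: k + 1])
            # setdefault lets chains with a common prefix share nodes.
            parents.setdefault(node, [current, f"X{members[k]}"])
            current = node
    return parents


net = build_evidence_network(8, [{0, 1, 2}, {3, 4}])
for node, pars in net.items():
    print(node, pars)
```

With evidence on only two subsets, this example has 11 nodes instead of 255. The last node of each chain is the one that would receive the evidence; a subset with a single member needs no extra node because the evidence goes directly on the root.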

The compilation process generates a log of the actions taken.  The total clique size of the junction tree is reported in that log. Select "Information" in the "View" menu.  This shows the "Network log".  You can find the total clique size by entering "Total" in the search field and clicking "Search".