Author Topic: How may likelihood evidence be used?  (Read 21499 times)

Offline Anders L Madsen

  • HUGIN Expert
  • Hero Member
  • Posts: 2295
How may likelihood evidence be used?
« on: April 02, 2007, 11:52:27 »
Let A be the node that you wish to enter (likelihood) evidence for. Suppose you added a new node B as a child of A and specified the conditional probability table P(B|A) as follows:
        a1     a2
  b1    0.3    0.4
  b2    0.7    0.6
Then entering the observation B=b1 in the modified net is equivalent to entering the likelihood (0.3, 0.4) for A in the original net.
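
To see the equivalence numerically, here is a minimal Python sketch; the prior for A is an assumed example value, and any prior gives the same agreement:

```python
import numpy as np

# Assumed prior for A = (a1, a2); any prior gives the same agreement.
p_A = np.array([0.5, 0.5])

# P(B|A) from the table above: rows are b1, b2; columns are a1, a2.
p_B_given_A = np.array([[0.3, 0.4],
                        [0.7, 0.6]])

# Method 1: hard finding B = b1 in the extended net.
# P(A | B=b1) is proportional to P(B=b1 | A) * P(A).
post1 = p_B_given_A[0] * p_A
post1 = post1 / post1.sum()

# Method 2: likelihood (0.3, 0.4) entered directly on A in the original net.
likelihood = np.array([0.3, 0.4])
post2 = likelihood * p_A
post2 = post2 / post2.sum()

print(post1)  # [0.42857143 0.57142857]
print(post2)  # identical
```
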
This feature can be used for inexact observations. Suppose A represents something that you cannot observe with 100% certainty, and B represents your observation of A. If there is a 10% risk of making a wrong observation, then P(B|A) would be:
        a1     a2
  b1    0.9    0.1
  b2    0.1    0.9
If B is part of the net, then you would enter either B=b1 or B=b2 according to your actual observation. If B is not part of the net, you would instead enter either (0.9, 0.1) or (0.1, 0.9) as likelihood for A.
Another use of likelihoods: to make inference pretending that some "root" node had some other prior distribution. This is done by specifying a likelihood equal to the quotient of the desired prior and the original prior. (This trick, of course, only works when division by zero is not involved.)
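
A minimal Python sketch of this trick, with assumed original and desired priors:

```python
import numpy as np

# Assumed original and desired priors for a root node with two states.
original_prior = np.array([0.8, 0.2])
desired_prior = np.array([0.5, 0.5])

# Likelihood = desired prior / original prior
# (only valid when the original prior has no zero entries).
likelihood = desired_prior / original_prior

# Entering this likelihood reweights the original prior to the desired one.
posterior = likelihood * original_prior
posterior = posterior / posterior.sum()
print(posterior)  # [0.5 0.5]
```
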
HUGIN EXPERT A/S

Offline joost

  • Newbie
  • Posts: 7
Re: How may likelihood evidence be used?
« Reply #1 on: May 16, 2007, 14:12:11 »
Can likelihood evidence also be interpreted in the following manner?
Let A be the parent node and B the child node, both interval nodes. Suppose I have evidence on B from a population of experiments; I then want to enter a distribution (rather than a single observation) over the intervals of B and see how this changes the distribution on A.

Offline Anders L Madsen

  • HUGIN Expert
  • Hero Member
  • Posts: 2295
Re: How may likelihood evidence be used?
« Reply #2 on: May 18, 2007, 08:39:19 »
The inference engine does not attach any interpretation to the states of discrete nodes. Thus, entering likelihood evidence on an interval node will produce the same behaviour as in the example described above. Does this answer your question?
HUGIN EXPERT A/S

Offline Gary

  • Newbie
  • Posts: 7
Re: How may likelihood evidence be used?
« Reply #3 on: May 21, 2007, 13:46:17 »


Joost - likelihood evidence can be quite tricky to interpret, but I think that, if you are careful, what you say is correct. In practice I don't use likelihood evidence and prefer to transform it into a finding with an appropriate likelihood table. Some while ago I wrote some notes on this process and have included them below - I hope they help (some detail may have been lost in the conversion from MS Word).

Gary


Likelihood evidence?

There are several ways to enter new information into an established belief system. Most commonly, for variables that are observed to be in specific discrete states, the information is entered as a finding and often called evidence. The marginal probability of a state that corresponds to a finding is fixed at unity – consequently all the complementary states (other states for the same variable) have probabilities fixed at zero.

Likelihood is an alternative form of new information which is sometimes difficult to interpret – in practice it is simply shorthand for real evidence. Consequently it is always possible to operate belief systems without entering likelihoods – but this may involve a slightly different network structure.

We may illustrate the relationship between likelihood information and evidence with a simple three node network.  Consider two variables A and B which have a relationship that is represented by a conditional probability table p(B|A) i.e. B depends on A. The form of the relationship is unimportant but might be constructed from a ‘model’ and some table generator tools. The prior probabilities for A, p(A), are equally unimportant.

In many real situations a direct observation of B is used to establish belief concerning the state of the parent variable A – this is classical inference. In the probabilistic scheme this involves Bayes’ theorem, the conditional probability table p(B|A) and the prior information p(A). In a belief system we would enter new information as a finding for the observed state of B and the ‘updated’ probability for variable A (i.e. following propagation) would express subsequent beliefs (posterior information).

However, observations are not always this simple. In many cases an observation does not give definitive information about the variable under investigation. In most cases there are uncertainties surrounding the experimental technique, and there is a small probability that the real state of the variable in question is different from the state indicated by any particular observation. If we know the small probabilities that describe 'failures' in the experimental method, we can express the relationship between the state of the observable variable, B, and the state indicated by an experimental observation. This relationship can be quantified as a conditional probability table, p(B'|B), where B' is a variable that represents the experimental observation. B' has states identical to those of B, and the conditional probability table p(B'|B) is square. The diagonal elements of p(B'|B) will be close to unity if the experimental technique is a reliable one, i.e. if by observing that B is in a particular state we can be confident that it is actually in that state.
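
As a sketch, such a table is easy to construct; the following Python snippet builds a hypothetical p(B'|B) for a four-state variable, assuming a 10% chance of error spread evenly over the three wrong states:

```python
import numpy as np

# Hypothetical p(B'|B) for a four-state variable, assuming a 10% chance
# that an observation indicates a wrong state (spread evenly over the
# three wrong states). Rows index B', columns index B; columns sum to 1.
n = 4
error = 0.10
p_Bp_given_B = np.full((n, n), error / (n - 1))
np.fill_diagonal(p_Bp_given_B, 1.0 - error)
print(p_Bp_given_B)
```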

A belief system that explicitly includes the experimental observation of B has an additional node B’ and a cpt p(B’|B) that is usually called a ‘likelihood’ function. In this three node network new information concerning B has the form of a finding for B’. When evidence is entered for B’ the marginal probabilities for B and A are updated according to the usual laws for propagating evidence.

The normal rule for propagating evidence entered at B’ uses Bayes’ theorem

p(B|B’) = p(B’|B) p(B) / p(B’)

Although the table for p(B|B') has not been specified explicitly within the belief system, the right hand side of this expression gives a prescription for computing updates in terms of components that are explicitly included in the structure. When there is evidence concerning B', i.e. B' is in a definite state B' = b, the computation of p(B|B'=b) is particularly simple. On the right hand side, the elements of one row of p(B' = b | B) multiply the elements of the marginal probability p(B) to give the corresponding elements of the posterior marginal probability (the denominator p(B' = b) is a constant used only for normalization; it is an element of the marginal at B' prior to adding the new information). The new evidence can then be propagated onwards through the network to update the probabilities for A etc.
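
This computation can be written out directly. In the Python sketch below, the marginal p(B) and the table p(B'|B) are assumed example numbers:

```python
import numpy as np

# Assumed marginal p(B) and table p(B'|B) for a four-state variable
# (rows index states of B', columns index states of B).
p_B = np.array([0.25, 0.25, 0.25, 0.25])
p_Bp_given_B = np.array([[0.7, 0.1, 0.1, 0.1],
                         [0.1, 0.7, 0.1, 0.1],
                         [0.1, 0.1, 0.7, 0.1],
                         [0.1, 0.1, 0.1, 0.7]])

# Observation B' = b: take the row p(B'=b | B) and multiply element-wise.
b = 1                           # index of the observed state of B'
row = p_Bp_given_B[b]           # p(B'=b | Bi) for each state i of B
post_B = row * p_B              # numerator of Bayes' theorem
post_B = post_B / post_B.sum()  # p(B'=b) is just the normalising constant
print(post_B)                   # [0.1 0.7 0.1 0.1]
```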

This update process

p(Bi | B’ = b) ~ p(B’ = b | Bi) x p(Bi)
is particularly common, and it is included as shorthand in several belief modelling tools (Bi is a state of variable B). In this shorthand, rather than entering the finding B' = b for a third node B' that represents the actual result of an observation, the corresponding elements p(B' = b | Bi) (i.e. the conditional probabilities of observing the state b while the variable in question is in each of its possible states) are entered directly at variable B. This is called likelihood information, as the entries correspond to one row of the likelihood table for the B' variable. The belief system is updated (including the node where the likelihood is entered) as if the corresponding auxiliary node and cpt existed (this is not the only belief system operation where additional nodes are useful).
Thus, when updating the belief system with uncertain measurements, there is a choice between entering likelihood information directly and adding a new network node with a likelihood table so that the information can be entered as a finding.
Additionally it is possible to extend this interpretation of likelihood: the likelihood table p(B’|B) can be considered in terms of the frequencies of observations in a large ensemble of independent events. In this case the analysis presented above is unchanged but the interpretation is slightly different.
In a simple example we can consider a variable A with a uniform prior and four interval states [0-1], [1-2], [2-3] and [3-4]. A dependent variable B = 2A has four interval states [0-2], [2-4], [4-6] and [6-8]. We describe a likelihood function for measurement of B by a table p(B'|B), where B' is the measurement variable with states identical to those of B.

[Table p(B'|B) and figure not reproduced; from the text below, the row of p(B'|B) for B' = [2-4] is (0.1, 0.7, 0.1, 0.1).]

The figure shows the three node network and a corresponding two node network (with nodes A2, B2 that are identical to A, B, but without a node corresponding to B'). The networks are shown in a situation where the finding B' = [2-4] has been entered into the three node network and the likelihood (0.1, 0.7, 0.1, 0.1) has been entered on B2 in the two node network. The posterior beliefs concerning B, A and B2, A2 are identical.