Topics - Anders L Madsen

FAQ / What is the performance of structure learning?
« on: August 14, 2008, 08:58:18  »
The data used for learning is assumed to be a sample of cases generated by a process that can be described as a Bayesian network. The DAG structure of this Bayesian network is referred to as the graph underlying the data-generating process. Structure learning is the task of identifying the graph of the data-generating process or an equivalent graph, where two DAGs are equivalent if they induce the same set of (conditional) dependence and independence statements.

Structure learning in HUGIN software is implemented using the PC and NPC algorithms. The two algorithms are constraint-based approaches with the NPC algorithm being an extension of the PC algorithm with a Necessary Path Condition.

A constraint-based structure learning algorithm identifies the structure of the underlying graph by performing statistical tests for (conditional) independence between each pair of variables. In principle, the independence of any pair of variables given any subset of the other variables is tested.

The Necessary Path Condition is applied to determine the validity of an independence statement derived by the statistical tests. The NPC algorithm is usually slower than the PC algorithm.

The HUGIN implementation of the NPC/PC algorithm performs statistical tests conditional on subsets of size 0, 1, 2, and 3. It stops testing for independence between a pair of variables once an independence statement has been found. The implementation also uses the structure of the graph to guide the selection of the test to perform next. This is referred to as the PC* algorithm in the literature.
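The testing scheme described above can be illustrated with a minimal sketch of the skeleton phase of a PC-style algorithm. This is not the HUGIN implementation: the function name pc_skeleton, the is_independent callable (standing in for a statistical test on data), and the choice of candidate conditioning variables are all assumptions made for illustration.

```python
from itertools import combinations


def pc_skeleton(variables, is_independent, max_cond_size=3):
    """Return the undirected skeleton as a set of frozenset edges.

    is_independent(x, y, cond) is a hypothetical test: it returns True
    when x and y are judged independent given the set cond.
    """
    # Start from the complete undirected graph.
    adjacent = {v: set(variables) - {v} for v in variables}
    for size in range(max_cond_size + 1):
        for x, y in combinations(variables, 2):
            if y not in adjacent[x]:
                continue  # edge already removed by an earlier test
            # Simplification: condition only on current neighbours of x or y,
            # so the graph found so far guides which tests are tried.
            candidates = sorted((adjacent[x] | adjacent[y]) - {x, y})
            for cond in combinations(candidates, size):
                if is_independent(x, y, set(cond)):
                    adjacent[x].discard(y)
                    adjacent[y].discard(x)
                    break  # stop testing this pair once independence is found
    return {frozenset((x, y)) for x in variables for y in adjacent[x]}


def chain_oracle(x, y, cond):
    """Toy oracle for the chain A -> B -> C: A and C are independent given B."""
    return {x, y} == {"A", "C"} and "B" in cond


if __name__ == "__main__":
    print(pc_skeleton(["A", "B", "C"], chain_oracle))
```

With the toy oracle, the edge between A and C is removed at conditioning-set size 1, and the remaining two edges form the skeleton of the chain.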

The sparser the underlying graph, the fewer tests have to be performed (as the algorithm stops testing a pair once an independence statement has been found). If the graph is very dense, a large number of independence tests have to fail before an edge is kept. This may take a significant amount of time.
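To give a feel for the dense case, the following sketch counts the largest number of tests that could be needed when no independence is ever found and every subset of size 0 to 3 of the remaining variables is tried for every pair. The function names are hypothetical, and the count is only an upper bound under those assumptions; an implementation that uses the graph to select tests will normally perform far fewer.

```python
from math import comb


def max_tests_per_pair(n_variables, max_cond_size=3):
    """Number of conditioning sets of size 0..max_cond_size for one pair."""
    others = n_variables - 2  # variables other than the pair itself
    return sum(comb(others, k) for k in range(max_cond_size + 1))


def max_tests_total(n_variables, max_cond_size=3):
    """Upper bound on tests over all pairs when every test fails."""
    return comb(n_variables, 2) * max_tests_per_pair(n_variables, max_cond_size)


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, max_tests_total(n))
```

The bound grows roughly with the fifth power of the number of variables, which is why dense graphs over many variables can take a long time.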

In general, it is very difficult to say what running times the user should expect, as they depend on the actual data and the hardware. We know of users who have performed learning on data sets with hundreds of variables and thousands of cases.

FAQ / How do I export the table for a node with an expression?
« on: March 05, 2008, 23:08:40  »
One way to export the content of a table for a node with an expression is to temporarily "switch to manual" specification of the table, export the content of the table, and subsequently "switch to expressions".

The content of the table should be generated using "Show as Table" prior to switching to manual specification.

Bayesian Networks and Influence Diagrams / Information on the book
« on: November 01, 2007, 11:32:34  »
Information on the book can be found at

HiTS/ISAC / Project description
« on: August 22, 2007, 09:10:59  »
We are proud to announce that HUGIN is part of HiTS/ISAC - Highway to security: Secure interoperability of intelligence services. HiTS/ISAC is a preparatory action on 'Enhancement of the European industrial potential in the field of Security Research 2004-2006' (PARS).

The vision of the proposed project HiTS/ISAC is a more secure Europe through prevention of terrorism and organized crime. It addresses the interoperability of intelligence services to exchange information on suspicious activities in order to enable information analysis and fusion from different sources.

Superior situation awareness and cross-border interoperability are key enablers, leading to new technical and operational methods to work, train and co-operate across Europe. Today, information on suspicious activities in databases at law-enforcement authorities is distributed across Europe. The information is not easily available to other authorities in Europe, especially not "on-line".

The objective of HiTS/ISAC is to enable information analysis and fusion from many different sources, through secure cross-border on-line group co-operation between authorities, in order to detect and provide early warnings for suspicious activities, be it communication between suspected criminals, or anomalous movement of persons, goods or money, etc. HiTS/ISAC will develop a problem solving environment and demonstrate it in a virtual operations room which can be established anywhere, at any time. Tools and processes will be developed and implemented, and demonstrated using realistic scenarios.

The consortium is composed of industries and SMEs covering 9 countries, of which 4 are new Member States.

Project leader: Saab AB, SV

The consortium is composed of 12 partners from 10 countries. The project runs for 18 months starting June 2006.

BIOTRACER / Project description
« on: August 22, 2007, 09:09:39  »
We are proud to announce that HUGIN is part of BIOTRACER - Improved bio-traceability of unintended microorganisms and their substances in food and feed chains. BIOTRACER is an IP project under the 6th framework.

All food and feed producers (including water-bottling companies) within the European Union are required to provide records to food safety authorities upon request, but there is no common standard for the format or in what media these records must be produced.

In addition, current methods for microbial analysis of food-borne pathogens are time-consuming, have limited sample throughput, and do not provide all of the information needed.

Prediction of how pathogens in food and feed would spread in a given situation is essential to control potential threats to public health. Thus, it is important to develop fast and simple methods for tracking microbial pathogens in the complex environment of the food chain.

The objective of BIOTRACER is to develop methods to trace the course of food and feed contamination.

Using a total food chain approach, BIOTRACER will develop recommendations to control any risk through the integration of novel genomic and metabolomic data resulting in a better understanding of the physiology of the micro-organisms, combining these with advances in predictive food-based microbiological models.

Contribution to Advancing Proven Scientific Methods
  • Collection of physiological data on virulence, gene expression persistence, pathology and metabolite composition of micro-organisms that might enter into the food chain.
  • Application of whole genome micro-arrays and polymerase chain reaction-type methods to the fast-tracking of pathogens in the food and feed chains.
  • New quantitative food chain modelling systems using many combinations of current methods.
  • Computation of risk assessment for pathogens in animal products.

Dissemination and Exploitation
BIOTRACER has brought together experts from microbiology, software and database development, risk assessment, legislation and standards, as well as food retailers, to develop novel frontier technologies and exploit them to track and trace micro-organisms in selected food and feed chains, and will model the behaviour of the pathogens in these environments.

In order to assure a wide impact on food safety and quality, dissemination and technology transfer issues are at the core of the BIOTRACER strategy. This information will be available through regional, national and European Union databases and will promote a better understanding of how microbes are transmitted within Europe.

Basic project information
Full project title: Improved bio-traceability of unintended micro organisms and their substances in food and feed chains
Duration: 48 months
Starting year: 2007
EU funding: EUR 11 million
FP6 instrument used: Integrated Project
Partners: 47
Project coordinator:
Jeffrey Hoorfar
Danish Institute for Food and Veterinary Research, Copenhagen Denmark

Project website:

FAQ / Does PHPP support database connectivity?
« on: August 22, 2007, 08:53:29  »
Yes. The README file distributed with the installation specifies how to use the database connectivity of PHPP.

FAQ / Can PHPP handle different types of data in the same
« on: August 22, 2007, 08:53:08  »
Boolean, continuous and discrete valued entities may be represented in the data set from which a model is constructed. The PHPP will represent a continuous valued entity using discretization.
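The source does not say which discretization method PHPP uses. As one common possibility, the sketch below shows equal-width binning, where the observed range of a continuous entity is cut into a fixed number of intervals and each value is replaced by the index of its interval. The function name and the choice of equal-width intervals are assumptions for illustration only.

```python
def equal_width_bins(values, n_bins):
    """Map each continuous value to an interval index in 0..n_bins-1.

    Assumed scheme: n_bins intervals of equal width over [min, max].
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # all values identical: a single interval
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


if __name__ == "__main__":
    print(equal_width_bins([0.0, 2.5, 5.0, 7.5, 10.0], 4))
```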

FAQ / How do I make adjustments to a PHPP model?
« on: August 22, 2007, 08:52:33  »
The HUGIN GUI may be used to adjust a model constructed using PHPP. PHPP uses the underlying HUGIN Decision Engine as its score engine. The models built by PHPP are stored in HUGIN network files.

In addition, the <ph -update> command allows the user to update a model based on a new set of data.

FAQ / How do I display the model?
« on: August 22, 2007, 08:52:06  »
The HUGIN GUI may be used to display a model constructed using PHPP. PHPP uses the underlying HUGIN Decision Engine as its score engine. The models built by PHPP are stored in HUGIN network files.

FAQ / Can I apply PHPP on a partly specified model structure?
« on: August 22, 2007, 08:51:38  »
A model constructed with the HUGIN GUI and saved as a HUGIN network file can be given as input to PHPP using the <ph -update> command.

FAQ / How do I find the most informative case?
« on: August 22, 2007, 08:51:10  »
The most informative case is the case which reduces the entropy of the target variable the most. The entropy may be interpreted as a measure of the degree of chaos in a probability distribution. Since a PHPP model contains a unique target variable, the most informative case is the case which reduces the entropy of the posterior distribution of the target variable the most.

The <ph -voicase> command computes a measure of how well a single case predicts the target variable. The measure computed by this command is referred to as the value-of-information score of the case. The higher the score, the better the case predicts the target variable. Finding the most informative case is then a question of identifying the case with the highest value-of-information score.
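The idea of entropy reduction can be illustrated with a small sketch. It does not reproduce the score computed by <ph -voicase>, whose exact definition is not given here; it simply takes the reduction in Shannon entropy from the prior to the posterior distribution of the target variable as an assumed score. The function names and the example distributions are hypothetical.

```python
from math import log2


def entropy(distribution):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * log2(p) for p in distribution if p > 0)


def entropy_reduction(prior, posterior):
    """Assumed score: how much a case lowers the entropy of the target."""
    return entropy(prior) - entropy(posterior)


def most_informative(prior, posteriors_by_case):
    """Return the case whose posterior reduces the entropy the most."""
    return max(
        posteriors_by_case,
        key=lambda case: entropy_reduction(prior, posteriors_by_case[case]),
    )


if __name__ == "__main__":
    prior = [0.5, 0.5]
    # Hypothetical posteriors of a two-state target after entering each case.
    posteriors = {"case_1": [0.6, 0.4], "case_2": [0.9, 0.1]}
    print(most_informative(prior, posteriors))
```

A uniform distribution over two states has an entropy of one bit, and a distribution that puts all mass on one state has an entropy of zero, so a case that moves the target close to certainty scores highest.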

FAQ / What kind of models does PHPP construct?
« on: August 22, 2007, 08:50:26  »
PHPP supports the construction of Naive Bayes, Tree Augmented Naive Bayes, and Hierarchical Naive Bayes models.

Using PHPP it is possible to construct models containing a mixture of Boolean, qualitative and quantitative variables.

Each model will have a unique target variable. The target variable may be considered as the hypothesis variable or classification variable when the model is to be used for classification. The target variable may have multiple states.

FAQ / Are characters such as ",", ".", ";", etc ignored?
« on: August 22, 2007, 08:49:49  »
Characters such as ",", ".", ";", etc. are ignored when creating a Boolean model from unstructured data, as in the example of mail classification, unless the <-wpunct> option is specified. In this case the aforementioned characters are assumed to be part of the word preceding the character.

White spaces should be used as word separators when building a Boolean model.
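The behaviour described above can be sketched as follows. This is an illustration only, not PHPP code: the function names, the exact set of punctuation characters (the source says "etc"), and the keep_punctuation flag standing in for the <-wpunct> option are assumptions.

```python
# Assumed set of ignored characters; the full set PHPP uses is not stated here.
PUNCTUATION = ",.;"
_DELETE_PUNCTUATION = str.maketrans("", "", PUNCTUATION)


def tokenize(text, keep_punctuation=False):
    """Split text on white space; drop punctuation unless asked to keep it."""
    words = text.split()
    if keep_punctuation:
        # Punctuation stays attached to the word preceding it.
        return words
    cleaned = (word.translate(_DELETE_PUNCTUATION) for word in words)
    return [word for word in cleaned if word]


def boolean_features(text, vocabulary, keep_punctuation=False):
    """One Boolean value per vocabulary word: does it occur in the text?"""
    present = set(tokenize(text, keep_punctuation))
    return {word: word in present for word in vocabulary}


if __name__ == "__main__":
    mail = "Hello, world. Buy now;"
    print(tokenize(mail))
    print(tokenize(mail, keep_punctuation=True))
    print(boolean_features(mail, ["Buy", "cheap"]))
```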

PHPP implements n-fold cross validation.

The proper algorithm to use for evaluating the performance of a model may, however, depend on the domain in which the model is to be used and what the model is to be used for.
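As a reminder of what n-fold cross validation involves, here is a minimal sketch of how the cases of a data set can be divided into folds, each held out once for testing while the rest are used for training. The function name and the round-robin assignment of cases to folds are assumptions; how PHPP forms its folds is not described here.

```python
def n_fold_splits(n_cases, n_folds):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    if not 1 < n_folds <= n_cases:
        raise ValueError("n_folds must be between 2 and n_cases")
    indices = list(range(n_cases))
    # Round-robin assignment: case i goes to fold i % n_folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        yield train, held_out


if __name__ == "__main__":
    for train, test in n_fold_splits(10, 5):
        print("train:", train, "test:", test)
```

Each case appears in exactly one test fold, so every case is used for evaluation once and for training in all other rounds.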

Additional algorithms to consider for model evaluation are receiver operating characteristic curves (ROC) and area-under-curve of ROC to name a few. Later versions of PHPP will implement such algorithms.
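For readers who want to compute the area under the ROC curve themselves in the meantime, a minimal sketch follows. It uses the standard pairwise interpretation of the area: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name is hypothetical and the code is not part of PHPP.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels and real-valued scores."""
    positives = [s for label, s in zip(labels, scores) if label]
    negatives = [s for label, s in zip(labels, scores) if not label]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    # Count pairs where the positive case outranks the negative case.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```

The pairwise form is quadratic in the number of cases, which is fine for small evaluations; a rank-based computation would be preferable for large data sets.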

If you would like to construct a HUGIN case file for a particular case in a HUGIN data file, then you can use the <dat2hcs> command. Assume you have a HUGIN data file named datafile.csv with 10000 cases and would like to construct a HUGIN case file named case_10.hcs for the case with index 10 in datafile.csv.

The <dat2hcs> command takes as an argument a model that is constructed from datafile.csv. The command is

dat2hcs datafile.csv 10

where contains the model.
