how_to_develop_a_bn

Differences

This shows you the differences between two versions of the page.

how_to_develop_a_bn [2018/12/18 17:15] – [Developing a Bayesian Network] stritiha
how_to_develop_a_bn [2023/04/21 15:30] (current) – external edit 127.0.0.1
Line 29: Line 29:
 However, many functionalities are also available in other software packages. For an overview, see the review by [[https://doi.org/10.1016/j.envsoft.2016.07.007|Pérez-Miñana (2016)]].
  
-In order to map ES, we link BNs to spatial data using a specialized online application (gBay.ethz.ch) developed at ETH-PLUS within the framework of the ECOPOTENTIAL project.
  
 ==== 3. Defining the nodes ====
Line 71: Line 71:
 === 5.1 Defining node states in Netica ===
 To modify the states of a node in [[https://www.norsys.com/WebHelp/NETICA.htm|Netica]], double-click on the node to open the node properties window. Click on “Description” and select “States”. Then, simply enter the names of the states in the box below. For continuous nodes, the states are defined through discretization intervals. Select “Discretization” and enter the threshold values between the states.
-{{:states_netica.png?600|}}
+{{:states_netica.png?600|Defining states for discrete and continuous nodes in Netica}}
  
 // Defining states for discrete and continuous nodes in Netica.//
Line 79: Line 79:
 In [[https://www.norsys.com/WebHelp/NETICA.htm|Netica]], the CPTs can be filled either manually (right-click on the node -> Table) or through equations. To use an equation, open the node properties, select Equation, and enter the equation in the box. Then, click on   to calculate the CPT based on the equation.
  
-{{:equations.png?600|}}
+{{:equations.png?600|Examples of a deterministic (left) and probabilistic (right) equation in Netica.}}
  
 // Examples of a deterministic (left) and probabilistic (right) equation in Netica.//
Line 87: Line 87:
 When node states are binary or ordered, this problem can be reduced by using various interpolation methods ([[http://nora.nerc.ac.uk/id/eprint/9461/1/N009461BO.pdf|Cain 2001]], [[https://arxiv.org/abs/cs/0411034|Das 2004]]). For example, if a child node C has three states (“low”, “medium”, “high”), and three parents Pa with the same three states, we can elicit three probability distributions of C: where all three parents are in state “high”, all in state “medium”, and all in state “low”. In addition, we need the relative weight of each parent (w1, w2, w3, which should sum to one). Then, all the other rows of the CPT can be calculated by interpolation.
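
The interpolation described above can be sketched in Python. This is a minimal illustration, assuming a simple weighted-sum scheme in the spirit of Das (2004): each row of the CPT is the weighted average of the elicited distributions that correspond to the state of each parent. The elicited probabilities in ''ANCHORS'' and the function names are hypothetical, and the weights are only illustrative; Cain (2001) describes a different interpolation procedure that is not reproduced here.

```python
from itertools import product

STATES = ("low", "medium", "high")

# Hypothetical elicited distributions P(C | all parents in the same state),
# given as probabilities for C = low, medium, high.
ANCHORS = {
    "low": (0.80, 0.15, 0.05),
    "medium": (0.20, 0.60, 0.20),
    "high": (0.05, 0.15, 0.80),
}

# Illustrative relative weights of the three parents (must sum to one).
WEIGHTS = (0.17, 0.50, 0.33)


def interpolate_row(parent_states, weights=WEIGHTS, anchors=ANCHORS):
    """Return P(C | parent_states) as a weighted sum of the elicited rows."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("parent weights must sum to one")
    row = [0.0] * len(STATES)
    for weight, state in zip(weights, parent_states):
        for k, prob in enumerate(anchors[state]):
            row[k] += weight * prob
    return tuple(row)


def build_cpt():
    """Fill every row of the CPT (3 parents x 3 states = 27 rows)."""
    return {cfg: interpolate_row(cfg) for cfg in product(STATES, repeat=len(WEIGHTS))}


if __name__ == "__main__":
    for cfg, row in build_cpt().items():
        print(cfg, [round(p, 4) for p in row])
```

With this scheme, the rows where all parents share a state reproduce the elicited distributions exactly, and every other row stays a valid probability distribution because the weights sum to one.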
  
-Distributions of continuous variables can also be elicited from experts. One useful approach is the four-point estimation method ([[https://doi.org/10.1111/j.1539-6924.2009.01337.x|Speirs-Bridge et al. 2010]]), where we ask experts for the expected value of the node for a specific combination of parents, the expected upper and lower bounds of possible values, and their confidence in their estimate. Using this information, we can estimate a probability distribution (e.g. a normal or triangular distribution). An example of this approach is available in the avalanche protection case study.
+//Example of interpolation in a CPT, where the weights of the parents are w1 = 0.17, w2 = 0.5, w3 = 0.33. Elicited probabilities are shown in bold, while all other probabilities in the table are calculated using interpolation, as shown below.//
+{{:interpolation_table.png|}}
+
+
+Distributions of continuous variables can also be elicited from experts. One useful approach is the four-point estimation method ([[https://doi.org/10.1111/j.1539-6924.2009.01337.x|Speirs-Bridge et al. 2010]]), where we ask experts for the expected value of the node for a specific combination of parents, the expected upper and lower bounds of possible values, and their confidence in their estimate. Using this information, we can estimate a probability distribution (e.g. a normal or triangular distribution). An example of this approach is available in the [[Avalanche protection in Davos, Switzerland|avalanche protection case study]].
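
One possible way to turn a four-point estimate into a normal distribution is sketched below. It is only an illustration under stated assumptions: the expert's best estimate is taken as the mean, the interval between the lower and upper bound is treated as a central interval that contains the true value with the stated confidence, and the distribution is assumed symmetric. The function name and the numbers in the usage example are hypothetical.

```python
from statistics import NormalDist


def normal_from_four_points(lower, best, upper, confidence):
    """Fit a normal distribution to an elicited (lower, best, upper, confidence).

    Assumption: [lower, upper] is a central interval covering `confidence`
    of the probability mass, so its half-width equals z * sigma.
    """
    if not lower < upper:
        raise ValueError("lower bound must be below upper bound")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    sigma = (upper - lower) / (2.0 * z)
    return NormalDist(mu=best, sigma=sigma)


if __name__ == "__main__":
    # Hypothetical elicitation: bounds 10 and 30, best estimate 20, 80 % confidence.
    dist = normal_from_four_points(10.0, 20.0, 30.0, 0.80)
    print(dist.mean, dist.stdev)
```

When the best estimate is not centred between the bounds, a symmetric normal distribution no longer matches the stated confidence exactly, and a triangular or other skewed distribution may be a better fit.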
    
 In some cases, experts find it easier to deal with categories rather than continuous variables, and it may be useful to translate continuous nodes to discrete classes using fuzzy logic.
Line 96: Line 100:
  
 === 6.2 Linking remote sensing proxies to the state of the ecosystem ===
-To map ecosystem services, proxies of ecosystem structure are often derived from remote sensing (e.g. land cover classifications or LiDAR-based measurements of vegetation cover). However, these remote sensing products often include some uncertainty due to measurement errors or misclassifications. To make these uncertainties explicit, we can create separate nodes representing the observed value and the actual state of the variable. The observation is caused by the actual state, not vice-versa, and defining the structure of the network based on this causality helps to define conditional probabilities (see avalanche protection case study).
+To map ecosystem services, proxies of ecosystem structure are often derived from remote sensing (e.g. land cover classifications or LiDAR-based measurements of vegetation cover). However, these remote sensing products often include some uncertainty due to measurement errors or misclassifications. To make these uncertainties explicit, we can create separate nodes representing the observed value and the actual state of the variable. The observation is caused by the actual state, not vice-versa, and defining the structure of the network based on this causality helps to define conditional probabilities (see [[Avalanche protection in Davos, Switzerland|avalanche protection case study]]).
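
The relation between an actual-state node and an observation node can be illustrated with a small Bayes' rule calculation. The sketch below is hypothetical: the land cover classes, the prior, and the classification accuracies are invented for illustration and do not come from any case study. The CPT of the observation node is written in the causal direction, P(observed | actual), and the BN then infers the actual state from the observation.

```python
def posterior_actual(prior, p_obs_given_actual, observed):
    """Return P(actual | observed) from P(actual) and P(observed | actual)."""
    joint = {
        actual: prior[actual] * p_obs_given_actual[actual][observed]
        for actual in prior
    }
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {actual: value / total for actual, value in joint.items()}


# Hypothetical prior over the actual land cover.
PRIOR = {"forest": 0.3, "non_forest": 0.7}

# Hypothetical CPT of the observation node: rows are actual states,
# columns are the classes reported by the remote sensing product.
P_OBS_GIVEN_ACTUAL = {
    "forest": {"forest": 0.9, "non_forest": 0.1},
    "non_forest": {"forest": 0.2, "non_forest": 0.8},
}

if __name__ == "__main__":
    print(posterior_actual(PRIOR, P_OBS_GIVEN_ACTUAL, "forest"))
```

Defining the observation node this way lets accuracy figures such as a confusion matrix be entered directly as conditional probabilities, if such figures are available for the product.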
  
 === 6.3 From existing empirical models ===
 Often, some parts of the network have already been extensively researched and empirical or process-based models are available in the literature. In this case, the model can be incorporated in the BN in the form of probabilistic equations. This usually means that the probability distribution of the child node is a normal distribution, where the mean is a function of its parents, and the standard deviation is derived from the reported uncertainty in the model. Other types of distributions can also be used.
  
-For an example of how an empirical model can be incorporated in a BN, see the avalanche protection case study.
+For an example of how an empirical model can be incorporated in a BN, see the [[Avalanche protection in Davos, Switzerland|avalanche protection case study]].
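
To show how a probabilistic equation with a normal distribution becomes a row of a discretized CPT, here is a minimal Python sketch. The linear model, its coefficients, and the thresholds are hypothetical, and the code is an illustration of the idea rather than the way Netica evaluates equations internally.

```python
from statistics import NormalDist


def cpt_row(mean, sd, thresholds):
    """Probability of each child state for a normal child distribution.

    `thresholds` are the ascending boundaries between child states; the
    first and last states absorb the lower and upper tails.
    """
    dist = NormalDist(mean, sd)
    cdf = [0.0] + [dist.cdf(t) for t in thresholds] + [1.0]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


def child_mean(parent_value, intercept=2.0, slope=0.5):
    """Hypothetical empirical model: the mean is a linear function of the parent."""
    return intercept + slope * parent_value


if __name__ == "__main__":
    # One CPT row for a parent value of 10, with a reported model SD of 1.5
    # and child states separated at 5 and 8 (all values are illustrative).
    print(cpt_row(child_mean(10.0), 1.5, [5.0, 8.0]))
```

Repeating the calculation for a representative value of each parent state gives the full CPT; finer discretization of the parents reduces the error introduced by using a single representative value per state.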
  
 === 6.4 Learning from data or simulations ===
Line 108: Line 112:
 To “learn” from data in [[https://www.norsys.com/WebHelp/NETICA.htm|Netica]], create a text file where the column names match the names of the nodes in the network, and rows represent cases (i.e. observations, plots, measurements). Go to Cases -> Learn -> Incorp Case File (to derive CPTs by simply counting the cases) or Learn Using EM (to use the expectation maximisation algorithm, e.g. in case of missing data). To use learning only for the CPT of one node, select the node before performing the learning.
  
-Learning from simulations was used to populate one of the nodes in the avalanche protection network, while in-situ data were used to quantify some nodes in the BN of ecosystem services in the Wadden Sea.
+Learning from simulations was used to populate one of the nodes in the [[Avalanche protection in Davos, Switzerland|avalanche protection network]], while in-situ data were used to quantify some nodes in the BN of ecosystem services in the Wadden Sea.
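
The idea of deriving a CPT by counting cases can be sketched outside Netica as follows. This is a simplified illustration, not Netica's learning algorithm: it computes relative frequencies per parent configuration, skips cases with missing values instead of using EM, and offers an optional pseudo-count as an assumed smoothing choice. The node names and cases are hypothetical.

```python
from collections import Counter, defaultdict


def learn_cpt(cases, child, parents, child_states, pseudo_count=0.0):
    """Estimate P(child | parents) by counting complete cases.

    `cases` is a list of dicts whose keys match the node names, as in a
    case file where column names match the nodes of the network.
    """
    counts = defaultdict(Counter)
    for case in cases:
        values = [case.get(name) for name in parents] + [case.get(child)]
        if any(value is None for value in values):
            continue  # incomplete case: plain counting cannot use it
        counts[tuple(values[:-1])][values[-1]] += 1

    cpt = {}
    for config, counter in counts.items():
        total = sum(counter.values()) + pseudo_count * len(child_states)
        cpt[config] = {
            state: (counter[state] + pseudo_count) / total
            for state in child_states
        }
    return cpt


if __name__ == "__main__":
    # Hypothetical cases with one parent ("cover") and one child ("protection").
    cases = [
        {"cover": "high", "protection": "high"},
        {"cover": "high", "protection": "high"},
        {"cover": "high", "protection": "high"},
        {"cover": "high", "protection": "low"},
        {"cover": "low", "protection": "low"},
        {"cover": "low", "protection": None},
    ]
    print(learn_cpt(cases, "protection", ["cover"], ["low", "high"]))
```

Parent configurations that never occur in the data get no row in the result, which is one reason to combine learning with expert knowledge or a prior.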
  
 ==== 7. Testing, evaluating, and updating the BN ====
how_to_develop_a_bn.1545149700.txt.gz · Last modified: 2023/04/21 15:30 (external edit)