how_to_develop_a_bn
Differences
This shows you the differences between two versions of the page.
how_to_develop_a_bn [2018/12/18 16:49] – stritiha | how_to_develop_a_bn [2023/04/21 15:30] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 11: | Line 11: | ||
Evidence can be data (e.g. when we know the type of land cover in a pixel) or scenarios (e.g. when we explore what happens in the system when we increase the harvesting rate). When we know the state of a node with 100% certainty, this is called **hard evidence** (the land cover is a forest), while **soft evidence** contains some uncertainty and is in the form of a probability distribution (the land cover is a forest with 70% probability and a grassland with 30% probability). | Evidence can be data (e.g. when we know the type of land cover in a pixel) or scenarios (e.g. when we explore what happens in the system when we increase the harvesting rate). When we know the state of a node with 100% certainty, this is called **hard evidence** (the land cover is a forest), while **soft evidence** contains some uncertainty and is in the form of a probability distribution (the land cover is a forest with 70% probability and a grassland with 30% probability). | ||
+ | {{: | ||
+ | //Example of a Bayesian Network that describes the ecosystem service of recreation. The CPT of node " | ||
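The difference between hard and soft evidence can be illustrated with a minimal sketch. It assumes a hypothetical two-node network (land cover → recreation potential) with invented CPT values, and it treats soft evidence on a root node as simply replacing that node's distribution. BN software may handle soft evidence differently (e.g. as likelihood findings), so this is only an illustration of the idea.

```python
# Hypothetical CPT: P(recreation potential | land cover).
# All numbers are illustrative assumptions, not values from the example network.
CPT = {
    "forest":    {"low": 0.2, "high": 0.8},
    "grassland": {"low": 0.6, "high": 0.4},
}


def propagate(evidence):
    """Return P(child) given a probability distribution over the parent's states."""
    child = {}
    for parent_state, p_parent in evidence.items():
        for child_state, p_child in CPT[parent_state].items():
            child[child_state] = child.get(child_state, 0.0) + p_parent * p_child
    return child


# Hard evidence: the land cover is known to be forest.
hard = propagate({"forest": 1.0})

# Soft evidence: forest with 70% probability, grassland with 30% probability.
soft = propagate({"forest": 0.7, "grassland": 0.3})
```

With hard evidence the child simply takes the "forest" row of the CPT, while with soft evidence it becomes a weighted mixture of the two rows (here about 0.68 for "high").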
==== 1. Model purpose and context ==== | ==== 1. Model purpose and context ==== | ||
Before the construction of the BN begins, it is important to clarify the problem and objectives, as well as the modelling context: | Before the construction of the BN begins, it is important to clarify the problem and objectives, as well as the modelling context: | ||
Line 25: | Line 27: | ||
A free test version of Netica (with a limit of 15 nodes per network) can be downloaded at http:// | A free test version of Netica (with a limit of 15 nodes per network) can be downloaded at http:// | ||
- | However, many functionalities are also available in other software packages. For an overview, see review by Pérez-Miñana (2016). | + | However, many functionalities are also available in other software packages. For an overview, see review by [[https:// |
- | In order to map ES, we link BNs to spatial data using a specialized online application (gBay.ethz.ch) developed at ETH-PLUS within the framework of the ECOPOTENTIAL project. | ||
==== 3. Defining the nodes ==== | ==== 3. Defining the nodes ==== | ||
Line 45: | Line 47: | ||
=== 3.2 Creating nodes in Netica === | === 3.2 Creating nodes in Netica === | ||
- | To create a node in Netica, click on the icon and then place the new node somewhere on the canvas. Double click on the node to open a window with the node properties, where you can change the node name and define it as discrete or continuous. | + | To create a node in [[https:// |
==== 4. Designing the network ==== | ==== 4. Designing the network ==== | ||
Line 69: | Line 71: | ||
=== 5.1 Defining node states in Netica === | === 5.1 Defining node states in Netica === | ||
To modify the states of a node in [[https:// | To modify the states of a node in [[https:// | ||
- | + | {{: | |
+ | |||
+ | // Defining states for discrete and continuous nodes in Netica.// | ||
==== 6. Filling the CPTs ==== | ==== 6. Filling the CPTs ==== | ||
The links between nodes in a BN are represented by conditional probability tables, where a probability distribution of a child node is defined for every combination of states of its parent nodes. Depending on the availability of data or models, various methods can be used to populate CPTs, from expert elicitation to “learning” from data. During this process, we may find it necessary to return to previous steps and redefine the nodes, their states, or the links between them. | The links between nodes in a BN are represented by conditional probability tables, where a probability distribution of a child node is defined for every combination of states of its parent nodes. Depending on the availability of data or models, various methods can be used to populate CPTs, from expert elicitation to “learning” from data. During this process, we may find it necessary to return to previous steps and redefine the nodes, their states, or the links between them. | ||
In [[https:// | In [[https:// | ||
- | + | ||
+ | {{: | ||
+ | |||
+ | // Examples of a deterministic (left) and probabilistic (right) equation in Netica.// | ||
=== 6.1 With experts or stakeholders === | === 6.1 With experts or stakeholders === | ||
- | When data is lacking, the CPTs can be filled manually by experts or by stakeholders. Usually, this means that the experts should specify the probability of each state of the child node given each combination of parent nodes. When a node has many parents with several states, many rows of CPTs need to be filled, which can lead to fatigue and boredom, and it is difficult to ensure consistent distributions (Das 2004). This is why it is important to limit the number of parents, and the number of node states. | + | When data is lacking, the CPTs can be filled manually by experts or by stakeholders. Usually, this means that the experts should specify the probability of each state of the child node given each combination of parent nodes. When a node has many parents with several states, many rows of CPTs need to be filled, which can lead to fatigue and boredom, and it is difficult to ensure consistent distributions ([[https:// |
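To see how quickly the elicitation burden grows, the following sketch enumerates the rows of a CPT for a child node with three hypothetical parents. The node names and states are invented for illustration.

```python
from itertools import product

# Hypothetical parent nodes and their states (illustrative only).
parents = {
    "land_cover": ["forest", "grassland", "settlement"],
    "accessibility": ["low", "medium", "high"],
    "slope": ["flat", "steep"],
}
child_states = ["low", "medium", "high"]


def cpt_rows(parent_states):
    """Return every combination of parent states, i.e. one entry per CPT row."""
    return list(product(*parent_states.values()))


rows = cpt_rows(parents)
n_rows = len(rows)                            # 3 * 3 * 2 combinations
n_probabilities = n_rows * len(child_states)  # one probability per cell
```

In this example the expert would have to fill 18 rows (54 probabilities). Adding a fourth parent with three states would triple that number.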
- | When node states are binary or ordered, this problem can be reduced by using various interpolation methods (Cain 2001, Das 2004). For example, if a child node C has three states (“low”, “medium”, | + | When node states are binary or ordered, this problem can be reduced by using various interpolation methods ([[http:// |
- | Distributions of continuous variables can also be elicited from experts. One useful approach is the four-point estimation method (Speirs-Bridge et al. 2010), where we ask experts for the expected value of the node for a specific combination of parents, the expected upper and lower bounds of possible values, and their confidence in their estimate. Using this information, | + | //Example of interpolation in a CPT, where the weights of the parents are w1 = 0.17, w2 = 0.5, w3 = 0.33. Elicited probabilities are shown in bold, while all other probabilities in the table are calculated using interpolation, |
+ | {{: | ||
+ | |||
+ | |||
+ | Distributions of continuous variables can also be elicited from experts. One useful approach is the four-point estimation method ([[https:// | ||
In some cases, experts find it easier to deal with categories rather than continuous variables, and it may be useful to translate continuous nodes to discrete classes using fuzzy logic. | In some cases, experts find it easier to deal with categories rather than continuous variables, and it may be useful to translate continuous nodes to discrete classes using fuzzy logic. | ||
Line 89: | Line 100: | ||
=== 6.2 Linking remote sensing proxies to the state of the ecosystem === | === 6.2 Linking remote sensing proxies to the state of the ecosystem === | ||
- | To map ecosystem services, proxies of ecosystem structure are often derived from remote sensing (e.g. land cover classifications or LiDAR-based measurements of vegetation cover). However, these remote sensing products often include some uncertainty due to measurement errors or misclassifications. | + | To map ecosystem services, proxies of ecosystem structure are often derived from remote sensing (e.g. land cover classifications or LiDAR-based measurements of vegetation cover). However, these remote sensing products often include some uncertainty due to measurement errors or misclassifications. |
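One possible way to represent misclassification uncertainty in a BN is sketched below. It assumes that a confusion matrix from an accuracy assessment is available and converts it into a CPT for a "mapped land cover" node conditional on the "true land cover" node. The counts are invented, and this is an illustrative assumption rather than the procedure used in the case studies.

```python
# Hypothetical confusion matrix: counts of validation pixels,
# indexed as confusion[true_class][mapped_class]. Values are invented.
confusion = {
    "forest":    {"forest": 80, "grassland": 20},
    "grassland": {"forest": 10, "grassland": 90},
}


def confusion_to_cpt(matrix):
    """Normalise each row so that it gives P(mapped class | true class)."""
    cpt = {}
    for true_class, counts in matrix.items():
        total = sum(counts.values())
        cpt[true_class] = {mapped: n / total for mapped, n in counts.items()}
    return cpt


cpt = confusion_to_cpt(confusion)
```

Each row of the resulting table sums to 1 and can be entered as the CPT of the remote sensing proxy node.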
=== 6.3 From existing empirical models === | === 6.3 From existing empirical models === | ||
Often, some parts of the network have already been extensively researched and empirical or process-based models are available in literature. In this case, the model can be incorporated in the BN in the form of probabilistic equations. This usually means that the probability distribution of the child node is a normal distribution, | Often, some parts of the network have already been extensively researched and empirical or process-based models are available in literature. In this case, the model can be incorporated in the BN in the form of probabilistic equations. This usually means that the probability distribution of the child node is a normal distribution, | ||
- | For an example of how an empirical model can be incorporated in a BN, see the avalanche protection case study. | + | For an example of how an empirical model can be incorporated in a BN, see the [[Avalanche protection in Davos, Switzerland|avalanche protection case study]]. |
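The idea of a probabilistic equation can be sketched as follows. The example assumes a hypothetical linear empirical model for the mean of the child node and a fixed standard deviation, and it discretises the resulting normal distribution into the states of the child node. The coefficients, bin edges, and function names are invented for illustration.

```python
from math import erf, inf, sqrt


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of Normal(mean, sd)."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def discretise_normal(mean, sd, edges):
    """Probability of each bin; the first and last bins are open-ended."""
    bounds = [-inf, *edges, inf]
    return [
        normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ]


def child_mean(parent_value):
    """Hypothetical empirical model: expected child value for a parent value."""
    return 2.0 + 0.5 * parent_value


# One CPT row: child states are (<2, 2 to 4, >4), parent value is 2.0.
row = discretise_normal(child_mean(2.0), sd=1.0, edges=[2.0, 4.0])
```

Repeating the last step for a representative value of every parent state yields the full CPT of the child node.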
=== 6.4 Learning from data or simulations === | === 6.4 Learning from data or simulations === | ||
Line 101: | Line 112: | ||
To “learn” from data in [[https:// | To “learn” from data in [[https:// | ||
- | Learning from simulations was used to populate one of the nodes in the avalanche protection network, while in-situ data were used to quantify some nodes in the BN of ecosystem services in the Wadden Sea. | + | Learning from simulations was used to populate one of the nodes in the [[Avalanche protection in Davos, Switzerland|avalanche protection network]], while in-situ data were used to quantify some nodes in the BN of ecosystem services in the Wadden Sea. |
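The simplest form of learning from data is counting how often each child state occurs for each parent state. The sketch below assumes a single parent, complete cases, and one pseudo-count per cell as a uniform prior. It illustrates the general principle and is not a description of the algorithm implemented in any particular software package. The case data are invented.

```python
from collections import Counter


def learn_cpt(cases, parent_states, child_states, pseudo_count=1.0):
    """Estimate P(child | parent) from (parent, child) cases by counting."""
    counts = Counter(cases)
    cpt = {}
    for p in parent_states:
        # Row total includes the pseudo-counts so that each row sums to 1.
        total = sum(counts[(p, c)] for c in child_states)
        total += pseudo_count * len(child_states)
        cpt[p] = {c: (counts[(p, c)] + pseudo_count) / total for c in child_states}
    return cpt


# Hypothetical observations or simulation outputs: (land cover, recreation).
cases = (
    [("forest", "high")] * 7
    + [("forest", "low")] * 1
    + [("grassland", "low")] * 3
    + [("grassland", "high")] * 1
)

cpt = learn_cpt(cases, ["forest", "grassland"], ["low", "high"])
```

The pseudo-count keeps parent combinations with few or no cases from producing probabilities of exactly 0 or 1. Its influence shrinks as more cases are added.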
==== 7. Testing, evaluating, and updating the BN ==== | ==== 7. Testing, evaluating, and updating the BN ==== | ||
Line 109: | Line 120: | ||
Sensitivity analysis is a useful tool that determines the influence of individual variables on the outputs, which can help us decide where more data is necessary. When new data becomes available, it can be used to update the conditional probability distributions in an iterative process. | Sensitivity analysis is a useful tool that determines the influence of individual variables on the outputs, which can help us decide where more data is necessary. When new data becomes available, it can be used to update the conditional probability distributions in an iterative process. | ||
- | In Bayesian Network modelling, sensitivity analysis is often used to evaluate the influence of variables in the modelled system on the posterior probability distribution of a node of interest (Uusitalo 2007, Marcot 2012). Sensitivity to findings can be measured by the reduction in uncertainty (e.g. entropy or variance) in the target node due to a finding on another node. Entropy reduction is expressed by the measure of mutual information (Kjaerulff and Madsen 2013): | + | In Bayesian Network modelling, sensitivity analysis is often used to evaluate the influence of variables in the modelled system on the posterior probability distribution of a node of interest ([[https:// |
$ I(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = \sum_Y P(Y) \sum_X P(X|Y) \log \frac{P(X,Y)}{P(X)P(Y)} $ |
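As a minimal sketch, mutual information can be computed directly from a joint probability table of the target node X and the finding node Y. The joint distributions below are invented. In practice they would be obtained from the compiled network.

```python
from math import log2


def mutual_information(joint):
    """I(X,Y) in bits, from a dict mapping (x, y) to the joint probability."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p  # marginal P(X)
        py[y] = py.get(y, 0.0) + p  # marginal P(Y)
    return sum(
        p * log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0.0
    )


# Independent nodes: a finding on Y does not reduce uncertainty in X.
independent = {("a", "c"): 0.25, ("a", "d"): 0.25, ("b", "c"): 0.25, ("b", "d"): 0.25}

# Perfectly dependent nodes: a finding on Y removes all uncertainty in X.
dependent = {("a", "c"): 0.5, ("b", "d"): 0.5}

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

The first case gives zero bits and the second gives one bit, which equals the full entropy of X. Nodes with higher mutual information are those where new data would reduce uncertainty in the target node the most.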
how_to_develop_a_bn.1545148176.txt.gz · Last modified: 2023/04/21 15:30 (external edit)