1. Upload a BN
In the first step, the Bayesian network (BN) we want to run should be uploaded in the .dne or .net format (as created in Netica or another BN software). This is done by using “choose file” and selecting the network file, or by dragging and dropping the network file into the application. Then, we click “Proceed”.
Once a network has been uploaded, it is visualized in the GUI, displaying the nodes with their states (for continuous nodes, the upper bound of each interval is displayed) and the links between them. The nodes can be moved around.
2. Configure the run
2.1 Select target nodes
When you hover the mouse over a node, additional options appear. A node can be selected as a target node, which means that the output of running the network will contain the posterior probability distribution of this node. When we select a node as a target node, a target icon appears in its upper right corner. You may want to save the configuration of the run (including which nodes are target nodes, etc.) in case you want to run it again later. You can also upload a previously saved configuration.
2.2 Set non-spatial evidence
By clicking on the state of a node, we set that state as hard evidence, which is then shown in bold. This is useful for non-spatial evidence, that is, for nodes which have the same state over the whole study area, such as an agricultural policy.
3. Spatial inputs
3.1 Raster
In the raster mode of gBay, the input data as well as outputs are in raster (.tif) format. The input rasters correspond to nodes that we have data on. We can upload input rasters by simply dragging the .tif file from our file explorer to the corresponding node in gBay. When data is added to a node, the colour of the node changes, and the name of the .tif file is displayed.
The input rasters should all have the same extent, cell size, and spatial reference (coordinate system).
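The alignment requirement above can be checked before uploading. The following is a minimal sketch, assuming the extent, cell size, and spatial reference of each raster have already been read with a GIS library of your choice; `GridInfo` and `mismatched_grids` are hypothetical names used for illustration and are not part of gBay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridInfo:
    """Hypothetical container for the grid properties of one input raster."""
    extent: tuple     # (xmin, ymin, xmax, ymax)
    cell_size: tuple  # (cell width, cell height)
    crs: str          # spatial reference, e.g. an EPSG code


def mismatched_grids(grids):
    """Return the names of rasters whose grid differs from the first one.

    `grids` maps a raster name to its GridInfo. The comparison is exact,
    so floating-point noise in extents will be reported as a mismatch.
    """
    names = list(grids)
    reference = grids[names[0]]
    return [name for name in names[1:] if grids[name] != reference]


grids = {
    "landcover.tif": GridInfo((0, 0, 100, 100), (10, 10), "EPSG:32632"),
    "slope.tif": GridInfo((0, 0, 100, 100), (10, 10), "EPSG:32632"),
    "soil.tif": GridInfo((0, 0, 100, 100), (25, 25), "EPSG:32632"),
}
problems = mismatched_grids(grids)  # ["soil.tif"]: cell size differs
```

Rasters reported by such a check would need to be resampled or reprojected onto the common grid before use.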
When the input data represents hard evidence (we know the state of the node at each pixel, with 100% certainty, e.g. we know that the land cover of a pixel is forest), then the input raster has one band, where the value of each pixel corresponds to the number of a state of the node. For continuous nodes, the value of the pixel represents the real value (e.g. forest cover of 75%).
When we use soft evidence (a probability distribution for each pixel), the input raster should have as many bands as the number of states of the corresponding node. Each band represents the probability of a state (e.g. the probability that the land cover of a pixel is forest). The values of all bands should sum up to 100.
If soft evidence is given (to a root or intermediate node), the effect is identical to Netica's “calibrate…” function (in the “Enter findings” menu).
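As an illustration of the soft-evidence rule (one band per state, with the bands summing to 100 at every pixel), here is a minimal sketch in plain Python. It assumes the bands have already been read into nested lists; `check_soft_evidence` is a hypothetical helper, not a gBay function, and it does not cover any NODATA convention for rasters.

```python
def check_soft_evidence(bands, total=100.0, tol=1e-6):
    """Return (row, col) positions whose band values do not sum to `total`.

    `bands` is a list of 2-D grids (lists of rows), one grid per node state.
    """
    n_rows = len(bands[0])
    n_cols = len(bands[0][0])
    bad = []
    for r in range(n_rows):
        for c in range(n_cols):
            pixel_sum = sum(band[r][c] for band in bands)
            if abs(pixel_sum - total) > tol:
                bad.append((r, c))
    return bad


# Three states on a 1 x 2 grid: the first pixel sums to 100, the second to 90.
forest = [[70.0, 50.0]]
crop = [[20.0, 30.0]]
urban = [[10.0, 10.0]]
bad_pixels = check_soft_evidence([forest, crop, urban])  # [(0, 1)]
```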
3.2 Vector
When using vector data (zipped .shp files or .gdb ESRI geodatabase files), the input nodes are represented in the attribute table of the dataset.
For hard evidence, the attribute table should have a column corresponding to the name of the input node, with values corresponding to the states of the node. A value of 0 means NODATA.
For soft evidence, each state of the input node should be represented by a column of the attribute table, with values representing the probabilities of the states (which should sum up to 100). The column names should be the node name, followed by two underscores, an 's' (from state) and the number of the state. If all values are 0, it means NODATA.
Example: to set soft evidence on node 'lu_t0', with three states, the attribute table should contain the columns:
lu_t0__s1, lu_t0__s2, lu_t0__s3
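The naming rule and the NODATA convention for soft evidence can be sketched as follows. This is a minimal illustration, assuming each feature's attributes are available as a dictionary; the function names are hypothetical and not part of gBay.

```python
def soft_evidence_columns(node, n_states):
    """Build the column names: node name, two underscores, 's', state number."""
    return [f"{node}__s{i}" for i in range(1, n_states + 1)]


def classify_row(row, node, n_states, tol=1e-6):
    """Classify one attribute-table row as 'nodata', 'ok', or 'invalid'.

    All-zero probabilities mean NODATA; otherwise the values should sum to 100.
    """
    values = [row[col] for col in soft_evidence_columns(node, n_states)]
    if all(v == 0 for v in values):
        return "nodata"
    return "ok" if abs(sum(values) - 100.0) <= tol else "invalid"


columns = soft_evidence_columns("lu_t0", 3)
# ['lu_t0__s1', 'lu_t0__s2', 'lu_t0__s3']

status = classify_row({"lu_t0__s1": 60, "lu_t0__s2": 30, "lu_t0__s3": 10}, "lu_t0", 3)
# 'ok'
```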
The vector file can be added to the network by dragging it to the box labelled “Vector file”. Please note that gBay does not modify the geometry of the vector file, but simply performs inference on each object, using information from the attribute table.
4. Run the network
Once you have set up the network, selected the target nodes, and uploaded the spatial inputs, you can click on “Run” to run the network. gBay will use your spatial data to perform inference in the BN for each pixel or feature, and produce an output of the results.
Any potential errors or warnings will appear as pop-ups. If you want to see the progress of the processing, select the option to “Show console”.
If you have a complex network and are running it with a large spatial dataset, this may take some time. In this case, enter your email address; you will receive a notification there when the process is completed, along with a link for downloading the data.
5. Outputs
5.1 Raster
The output of a gBay run is a posterior probability distribution raster of each target node (named target_node.tif). This raster has one band for each state of the target node, where the value represents the probability of the state. In addition, the last band of the raster represents the most probable state.
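The band layout described above can be illustrated for a single pixel. This is a minimal sketch, not gBay's implementation; it assumes states are numbered from 1 and that ties go to the lowest-numbered state, neither of which is stated in this guide.

```python
def most_probable_state(probabilities):
    """Return the 1-based number of the state with the highest probability.

    Assumption: on ties, the lowest-numbered state is returned.
    """
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return best_index + 1


def output_bands(probabilities):
    """One value per state, followed by the most probable state as last value."""
    return list(probabilities) + [most_probable_state(probabilities)]


pixel = output_bands([15.0, 60.0, 25.0])
# [15.0, 60.0, 25.0, 2]
```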
For each target node, some metrics of the posterior probability distribution are also calculated and stored in an additional raster file called target_node_stats.tif. For discrete target nodes, this file contains one band with Shannon’s evenness index of the posterior probability distribution:
\(J = H'/H_{max}\), where \(H' = -\sum_{i=1}^{N} p_i \log_2 p_i\), \(H_{max} = \log_2 N\), \(p_i\) is the probability of state \(i\) and \(N\) is the number of states. (We use \(\log_2\) to reflect the common use in information theory.)
The index is a standardized measure of entropy (can be compared between nodes with different numbers of states) and expresses uncertainty. It has values between 0 and 1, where 1 denotes a uniform distribution between all possible states (maximum uncertainty), and 0 denotes complete certainty that the output node is in a specific state.
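The evenness index defined above can be computed for one pixel or feature as follows. This is a minimal sketch of the formula, not gBay's own code; normalizing the input so that probabilities may be given on either a 0 to 1 or a 0 to 100 scale, and returning 0 for a single-state node, are assumptions made here.

```python
import math


def shannon_evenness(probabilities):
    """Shannon's evenness J = H' / H_max of a discrete distribution.

    Terms with zero probability contribute nothing to the entropy.
    """
    n = len(probabilities)
    if n < 2:
        return 0.0  # assumption: a single-state node carries no uncertainty
    total = sum(probabilities)
    entropy = 0.0
    for p in probabilities:
        q = p / total  # normalize so the values sum to 1
        if q > 0:
            entropy -= q * math.log2(q)
    return entropy / math.log2(n)


uniform = shannon_evenness([25, 25, 25, 25])  # 1.0, maximum uncertainty
certain = shannon_evenness([100, 0, 0])       # 0.0, complete certainty
```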
If the target node is continuous, the stats output contains bands with the following values:
1. Evenness index
2. Mean
3. Median
4. Standard deviation
Overview of gBay spatial inputs and outputs
Input format | Input file | Input values | Output
---|---|---|---
Raster | One .tif file per input node | Hard evidence: value = node state (discrete nodes) or continuous value (continuous nodes). Soft evidence: one band per state, value = probability of state. | target_node.tif: one band per state (value = probability of state) and an additional last band (value = most likely state). target_node_stats.tif: band 1 = evenness index; only for continuous nodes: band 2 = mean, band 3 = median, band 4 = standard deviation.
Vector | One .shp file or geodatabase .gdb (reads the attribute table) | Hard evidence: column of the attribute table with the same name as the input node, value = node state (discrete nodes) or continuous value (continuous nodes). Soft evidence: columns with node name and state number (lu_t0__s1, lu_t0__s2, lu_t0__s3), value = probability of state. | Same geometry as input; attribute table with a column for each state of the target node (value = probability of state) and an additional column (value = most likely state).