In a Bayesian network (BN), how a node of interest is affected by the observation at another node is a main concern, especially in backward inference. This challenge necessitates the proposed global sensitivity analysis (GSA) for BN, which calculates the Sobol’ sensitivity index to quantify the contribution of an observation node toward the uncertainty of the node of interest. In backward inference, a low sensitivity index indicates that the observation cannot reduce the uncertainty of the node of interest, so that a more appropriate observation node providing higher sensitivity index should be measured. This GSA for BN confronts two challenges. First, the computation of the Sobol’ index requires a deterministic function while the BN is a stochastic model. This paper uses an auxiliary variable method to convert the path between two nodes in the BN to a deterministic function, thus making the Sobol’ index computation feasible. Second, the computation of the Sobol’ index can be expensive, especially if the model inputs are correlated, which is common in a BN. This paper uses an efficient algorithm proposed by the authors to directly estimate the Sobol’ index from input–output samples of the prior distribution of the BN, thus making the proposed GSA for BN computationally affordable. This paper also extends this algorithm so that the uncertainty reduction of the node of interest at given observation value can be estimated. This estimate purely uses the prior distribution samples, thus providing quantitative guidance for effective observation and updating.
During the past 30 years, the Bayesian network (BN) has become a key method for representation and reasoning under uncertainty in the fields of engineering [1,2], machine learning [3,4], artificial intelligence [5,6], etc. In a BN, random variables are denoted by nodes and their dependence relationships are denoted by directed arcs (arrows). An arc (arrow) between two nodes indicates that the distribution of the downstream node (child node) is conditioned on the value of the upstream node (parent node), i.e., the arc is associated with a conditional probability distribution (CPD). The entire BN represents the joint probability distribution of the random variables. As long as the BN has been established, two tasks can be pursued: (1) calculate the distribution of a downstream node based on the given observation of an upstream node, i.e., forward propagation and (2) estimate the posterior distribution of an upstream node at the given observation of a downstream node, i.e., backward inference. Usually, the purpose of backward inference is to reduce the uncertainty of the variable (node) of interest, as shown in Fig. 1.
We can see that both the forward propagation and the backward inference concern that whether fixing a node would affect the distribution of another node. And this concern actually requires a sensitivity analysis: in forward propagation, if the node of interest has a low sensitivity with respect to an input node (an upstreaming node which causes uncertainty in the node of interest), we can be simply fix this input node to achieve dimension reduction without affecting the resultant distribution of the node of interest; and in backward propagation, if the node of interest has a low sensitivity with respect to an observation node, then this observation node cannot effectively reduce the uncertainty of the node of interest (as shown in Fig. 1(b)); thus, we should switch to observe a more appropriate node leading to higher sensitivity. Note that this sensitivity analysis is desired before conducting the forward propagation and/or the backward inference.
where means all the model inputs other than , and means the variance, and means the mean value. For the numerator of the middle term of Eq. (1), denotes the mean value of across different values of at given value of , and means the variance across different values of . Similarly, for the numerator of the right term of Eq. (1), denotes the variance of across different values of at given value of , and means the variance across different values of . Furthermore, the middle term and the right term are equal due to the law of total variance .
quantifies the contribution of input by itself to the uncertainty in output . Specifically, the last term of Eq. (1) also indicates that is the average ratio of the output variance reduction by fixing . For example, means that the variance of will reduce by 30% on average after fixing .
The proposed GSA for BN aims to calculate the first-order Sobol’ index to quantity the contribution of node toward the uncertainty of the node of interest . (As mentioned earlier, denotes an input node in the case of forward propagation, and an observation node in the case backward inference). However, this GSA for a BN confronts two challenges of feasibility and affordability. First, the computation of the Sobol’ index requires a deterministic function . The term “deterministic function” means that if all the inputs are given specific values, the function outputs a single value. It does not matter whether the function is closed-form or implicit. But a unique input to the model gives a unique value of the output, and the variance-based Sobol’ index  needs this assumption. In a previous paper, we have discussed the need for a deterministic function in the Sobol index computation . Our paper emphasizes the term deterministic function to contrast from a “stochastic” prediction; in the latter case, the function output is probabilistic (i.e., it has many possible realizations) even if all inputs are fixed. A simple stochastic model is a Gaussian variable if we consider and as model inputs and as model output. The BN is a stochastic model, i.e., the node of interest still has a distribution even at given values of other nodes. And the required deterministic function mapping (and some other variables) to the node of interest is unestablished. Proof of the existence and the establishment of this deterministic function needs to be solved. Note that usually the Sobol’ index is applied to deterministic function of continuous input/output, so that this paper only considers the continuous BN where all the nodes are continuous variables.
Second, the computation of the Sobol’ index can be expensive even if the deterministic function is established. Based on Eq. (1), computing analytically is nontrivial, since in Eq. (1) indicates a multidimensional integral. Computing by Monte Carlo simulation (MCS) directly is also expensive. The numerator in Eq. (1) leads to a double-loop MCS : the inner loop computes the mean value of using random samples of ; and the outer loop computes by iterating the inner loop times at different values of . Various algorithms have been proposed to reduce the cost in the computation of the first-order Sobol’ index, categorized into analytical methods [17–19] and Monte Carlo methods [16,20–23]. However, all of these algorithms assume uncorrelated model inputs, while the nodes in a BN are usually correlated. Saltelli's paper  in 2002 mentioned that there is no alternative to the expensive double-loop MCS to compute with correlated model inputs. This hurdle in computational cost is also to be solved.
The rest of the paper is organized as follows: Section 2 uses the auxiliary variable method to convert the path between node and node to a deterministic function, thus making the Sobol’ index computation feasible for a BN. Section 3.1 introduces an efficient algorithm to directly estimate the Sobol’ index from Monte Carlo samples of the prior distribution of the BN, so that the proposed GSA for the BN is affordable. The resultant index is the averaged variance reduction ratio (VRR) of the node of interest across all possible values of the observation node. If a specific value of the observation node is given, the proposed method is extended to give a better estimate of variance reduction ratio at the given observation value. Section 4 illustrates the proposed method using two examples, including a time-independent static BN and a time-dependent dynamic BN.
Note: the sensitivity analysis in this paper is based on “current knowledge,” which means the joint distribution of the Bayesian network. This paper uses the prior distribution samples to conduct the sensitivity analysis since that is our current knowledge before any observation and updating. Of course, after collecting an observation and subsequent calculation of the posterior distribution, we can also use the samples from the posterior distribution to calculate the posterior sensitivity index.
Feasibility of Global Sensitivity Analysis for a Bayesian Network
Auxiliary Variable Method.
The auxiliary variable method was developed by Sankararaman and Mahadevan  to quantitatively distinguish the contributions of aleatory natural variability and epistemic distribution parameter uncertainty in a random variable . The distribution of is conditioned on the value of its distribution parameter . The uncertainty about the value of is represented by a probability density , and this uncertainty is epistemic. The aleatory uncertainty of is denoted by a conditional distribution , which changes as the value of changes.
Equation (2) is a deterministic function , since a sample of and a sample of lead to a single value of . At a given value of , the sample of and the sample of have a one-to-one mapping, i.e., a single value of is determined once the value of is decided. Thus the natural variability in is represented by . This standard uniform random variable , which is the CDF value of , is named as the auxiliary variable.
Although the use of auxiliary variable is a standard procedure in sampling random variables, generally it is used implicitly and only the resultant samples of the random variables are recorded and utilized. However, if we use this auxiliary variable explicitly, it makes the GSA for BN possible. This will be explained in the remainder of this section.
Deterministic Function for a Directed Path in Bayesian Network.
The auxiliary variable method have been extended to any variable whose distribution is conditioned on other variables [10,26], i.e., to any CPD in the BN. Assume that the distribution of a random variable depends on the value of two other random variables and by a CPD . Then the variability in can be captured by a single auxiliary variable , which is the CDF value of . Thus, the uncertainty in variable is caused by two components: (1) the uncertainty due to the parent nodes ; and (2) the uncertainty expressed by the CPD at given values of and . The introduced auxiliary variable captures the later part. (With reference to Sec. 2.1, and are analogous to distribution parameters for the random variable ). As shown in Fig. 2 , and constitute a simple BN. The introduced auxiliary variable converts to be a deterministic node, which means the value of is fixed once the value of its parent nodes is given. Finally, this auxiliary variable build a deterministic function , where is the inverse CDF of the CPD .
The auxiliary variable method can be further extended to a directed path in a BN. Here, is the node of interest, and the objective is to compute the sensitivity index of to decide whether is sensitive to it. An example of the directed path is in Fig. 3. As shown in Fig. 4, by introducing auxiliary variables to each CPD in this directed path, a deterministic function mapping to is established
where for is the inverse CDF of the CPD , and is the auxiliary variable introduced for this CPD, and represents the parent nodes of that are not in this path (Note that another notation is used later, which means all the parents node of , i.e., in Fig. 4. The inputs of Eq. (3) are for , thus Eq. (3) can be also denoted as a deterministic function .
Deterministic Function for a Trail in Bayesian Network.
The directed path from to in Sec. 2.2 requires that all the arcs are directed toward , which may NOT be satisfied. It is not usually that some arrows are toward . In backward inference, the node of interest is usually an upstream node (probably root node) while the observation node is usually a downstream node. Thus, the path between them turns to be , where all the arrows are toward . In this case, the function will be still missing.
To solve this problem, let consider a general case named “trail.” A trail (where the arc “ ” is still directed, either “ ” or “ ”) only requires all the adjacent nodes in the path are connected by arcs, regardless of the direction of the arcs. The deterministic function established in Eq. (3) for the directed path can be also extended to the trail based on the theorem of arc reversal .
Theorem 1 Arc Reversal. Given that there is an arcfrom nodeto node, but no other directed path fromto, arccan be replaced by arc. Afterward, both nodes inherit each other's parent nodes.
This theorem is illustrated in Fig. 5. Here, indicates the parent nodes of , and indicates the parent nodes of . In addition, are the nodes which are the parents of but not the parents of , and correspondingly are the nodes which are the parents of but not the parents of ; and are the shared parents of and . Figure 5 shows that after reversing the arc between and , extra arcs and are also derived based on Ref.  and added the new BN to guarantee that the new BN after arc reversal is mathematically equivalent to the original BN. The CPDs also need to be redefined, and the derivation of the new CPDs can be also found in Ref. . However, note that the proposed method in this paper do NOT need to derive these new CPDs. The main focus of this section is to illustrate the possibility of arc reversal and prove the existence of the deterministic function mapping to even if the path between them is undirected.
With respect to the trail between and , Theorem 1 proves that the arc between two adjacent nodes and ( so that they are adjacent) can be reversed, as long as there is no other directed path from to . If all the arcs toward can be reversed, this trail will be converted to a directed path from to so that a deterministic function mapping to exists based on Eq. (3). In Fig. 3, the trail can be converted to a directed path by reserving the arc ; then, a deterministic function mapping to can be constructed using auxiliary variables.
The arc reversal makes the sensitivity analysis regarding the backward Bayesian inference possible. In this case, we aim to compute the Sobol’ index of observation node regarding an inference node of interest so that we can quantify the variance reduction of by fixing node at an observation value. The existing path in the Bayesian network is , then the arc reversal enables us to reverse all the arcs to build the required deterministic function.
It is arc reversal that makes the sensitivity analysis regarding Bayesian inference possible. In this case, we aim to compute the sensitivity of root node with respect to the observation node so that we can quantify the uncertainty reduction of by fixing node at an observation value. We need a directed path from to . However, the existing path in the Bayesian network is always directed from root node toward the observation , so that we must reverse all the arcs to build the required deterministic function.
Global Sensitivity for a Path in the Bayesian Network.
For two nodes and in the BN, Secs. 2.1–2.3 explained the possibility to build a deterministic function mapping to via: (1) the directed path or trail between them, (2) the theorem of arc reversal, and (3) the introduction of auxiliary variables. Thus, we can conduct the GSA on Eq. (3) and compute the first-order Sobol’ index for . As explained below in Eq. (1), is the average ratio of the reduced variance of by fixing . In backward inference, a low value indicates that measuring to reduce the uncertainty of is just a waste of effort, and another node of higher sensitivity index would be more appropriate.
However, the computation of is nontrivial. First, building the deterministic function explicitly can be complicated for both directed and trail. For the directed path , the effort to build the deterministic function in Eq. (3) becomes intensive if the path is long so that many nodes and auxiliary variables will be involved. For the trail, more efforts to modify the structure of the BN and derive new CPDs are required. In backward inference, usually the observation node is downstream and the node of interest is the upstream; thus, the corresponding path is . To build the required deterministic function mapping to , all arcs in the path need to be reversed, which brings intensive computational efforts.
Second, even with the deterministic function established, calculating the sensitivity index also needs intensive effort. The inputs of the deterministic function include the nodes in the BN, so the correlation between them is very common. As mentioned in Sec. 1, since current efficient algorithms for Sobol’ index usually require uncorrelated inputs, the expensive double-loop MCS is the only choice.
Actually, the purpose of this section is to mathematically prove the existence of a deterministic function between two nodes (regardless of the computational effort), so that the Sobol’ index, which requires the deterministic function, is applicable to quantify the sensitivity for the Bayesian network. The Sobol’ index computation will be realized by an algorithm proposed by the authors. Details of this algorithm can be found in Ref. , and a brief introduction is given in Sec. 3.1. This algorithm directly extracts the first-order Sobol’ index from the input–output samples. Using this algorithm, the Sobol’ index of the observation node quantifying its contribution toward the uncertainty of the node of interest can be computed purely using their samples generated from the joint prior distribution of the BN, where the computation of the inference is NOT needed. In detail, we generate samples of the root nodes from their prior distributions and then generate samples of nonroot nodes based on the samples of their parent nodes and the CPDs. Therefore, explicitly establishing the deterministic function is not necessary, and the expensive double-loop MCS method is also avoided. Section 3.2 extends this algorithm to a better variance reduction estimate at a specific observation value.
Sobol’ Index Computation and Variance Reduction at Given Observation
Directly Estimate the First-Order Sobol’ Index From Input–Output Samples.
Consider a deterministic function of where . This algorithm is regarding first-order Sobol’ index expression of , whose numerator implies an expensive double-loop Monte Carlo simulation.
First we illustrate stratified sampling, which generates samples in equal probability intervals to represent the distribution of a random variable . Figure 6(a) shows one strategy  of stratified sampling: (1) divide the CDF of into intervals such that these intervals have the same length and (2) generate one sample (the red dots in Fig. 6(a), and ) from each CDF interval and obtain samples of (the green dots in Fig. 6) by CDF inversion , where is the inverse CDF of . If we take the bounds of these intervals of the CDF as the inputs of , the sampling space of is actually divided into equally probable intervals , as shown in Fig. 6, and is actually a random sample generated within . In this paper we also denote .
where is an unknown but existing point in . Note that now is the variance of given and is the variance of given .
where then Eq. (11) can be realized by the following steps:
Generate random samples of ;
Obtain corresponding values of by evaluating , and estimate using all samples of ;
Divide the domain of into equally probable intervals, as shown in Fig. 6;
Assign the samples of into divided intervals based on one-to-one mapping between the samples of and samples of ;
Estimate as the sampling variance of in each interval;
Estimate as the sampling mean of in step 5;
The first-order index is .
The steps to realize the proposed method are also illustrated in Fig. 7, where samples in different equally probable intervals are represented in different colors. Step 1 in Fig. 7 shows that the proposed algorithm can directly estimate from input–output samples without rerunning the underlying model. Moreover, steps 3 and 4 show that the samples of are not used in calculating the index for ; therefore, the calculation of purely depends on the samples of and , and can be achieved even if the samples of are missing. In addition, the derivation of Eq. (11) does not assume uncorrelated inputs; thus, this algorithm can handle both correlated and uncorrelated inputs. These three advantages make this algorithm suitable for the GSA of a BN using the samples generated from the joint prior distribution.
One question is how many samples are needed to reach desirable accuracy of the computed index. This question has been addressed in an earlier paper  by the authors. A heuristic rule is: (1) at least 50 intervals and (2) each interval contains at least 50 samples. To guarantee this condition, we simply need to generate 2500 samples.
Variance Reduction Assessment by the Proposed Algorithm.
Specifically, the desired first-order index can be calculated by considering the as “” and as “” in Fig. 7.
The resultant index is the average ratio of the variance reduction of by fixing at its observation; and this is an average over all possible values of . This is an informative estimate before the forward propagation or backward inference if the value of the observation is NOT known.
Section 3.1 introduces a new algorithm to efficiently estimate the first-order Sobol’ index from Monte Carlo samples. In the proposed algorithm, the calculation of the first-order Sobol’ index of an input does not need the samples of other inputs. This advantage makes it an ideal algorithm for the GSA of a BN, where generating prior samples are much easier than building the underlying deterministic function in discussed in Sec. 2. The proposed algorithm can also handle correlated inputs, so that it is suitable for a BN where the nodes are generally correlated.
Section 3.2 described the implementation of the proposed algorithm to the BN. Equation (12) can be used to calculate the average ratio of the variance reduction of a node of interest over all possible observations of another node ; Eq. (13) provides a better estimate of variance reduction ratio if the value of the observation is known.
In Sec. 4, two numerical examples are provided to illustrate the proposed GSA method for the BN. The first example is a time-independent static BN, and the second example is a time-dependent dynamic BN.
Structural Dynamics Problem.
A structural dynamics problem provided by Sandia National Laboratories is used to illustrate the proposed method, and more details on this problem can be found in Refs. [29–32]. As shown in Fig. 8, the system of interest contains three mass–spring–damper components in series; and these components are mounted on a beam supported by a hinge at one end and a spring at the other end; and a sinusoidal force input is applied on the beam.
This system has three model parameters of spring stiffnesses and they are assumed to have unknown true values to be calibrated. The prior distribution of is assumed to be Gaussian with a coefficient of variation of 10% and mean values of , , and .
The quantity to be measured for model calibration is the maximum acceleration in the third mass . A computational model based on finite element analysis has been provided by Sandia National Laboratories .
To improve the computational efficiency, a Gaussian process (GP) [33,34] surrogate model is constructed to replace the expensive dynamics computational model. The prediction of the GP model is a Gaussian distribution ; thus, a CPD is given by the GP model.
The observation variable is denoted as and we have where is the measurement error with a zero-mean Gaussian distribution . Thus, another CPD is given by the measurement error. In this example, is another parameter to be calibrated and we assign a noninformative uniform prior distribution to it.
A BN is established for this model calibration problem, as shown in Fig. 9. In this example, we are interested in: (1) calculating the first-order Sobol’ index of the observation variable quantifying its contribution toward the uncertainty in each calibration parameters of interest and (2) predicting the VRR of the calibration parameters at a given observation.
As samples are generated from the joint prior distribution of this network, the first-order Sobol’ indices of are obtained by considering the calibration parameter as and the observation variable as in Eq. (12). The results are listed in Table 1. From this table, we conclude that the variance of will reduce by 50% on average due to calibration; the variance reduction of is 11% on average; while the variance of and will not be reduced significantly by calibration. This is a very valuable insight, which is also supported by Fig. 10. Thus, if we want to reduce the uncertainty in , we need to observe another quantity. In the latter computation of VRR at specific observations of , we focus on and .
Table 1 shows the average VRR. Now assume that the specific observed value of is known (a synthetic data point). Based on Eq. (13), we predict the VRR of and for this specific observation, as shown in Table 2, where the “by inference” method mean we finish the Bayesian inference and compute the VRR based on the posterior distributions. In this example, the Bayesian inference is finished by the rejection sampling algorithm  which results in samples from the posterior distributions of the calibration parameters. Figure 10 shows the probability density functions of these posterior distributions at data point of . We recalculate the VRR by comparing the variances of the posterior samples and the prior samples. As shown in Table 2, our earlier predictions are close to the sample-based results. However, the proposed method only uses the samples from the prior distribution, and no actual Bayesian inference effort is required. In a personal computer of Intel Core i7-4710HQ CPU, the proposed method spends less than 1 s to calculate the sensitivity index at a data point, while the other method by inference spends about 20 s.
In summary, this example verified the effectiveness of the proposed method to predict the variance reduction ratio before conducting the Bayesian inference. Thus, the proposed method provides valuable guidance for selecting observation nodes; for example, in the subsequent updating, nodes such as and cannot be updated by observing data.
Example of a Dynamic Bayesian Network.
This example applies the proposed method to a mathematical example of a dynamic BN, as shown in Fig. 11. The CPDs of this dynamic BN are as follows.
The root node has an unknown true value to be calibrated, so that is a static node and . The prior distribution of is . and are two dynamic state variables, and their states are to be tracked. At the CPD of the child node is ; at the CPD of is , thus the distribution of depends on its previous value and the value of . is the child node of and its CPD is . In this problem, the observation node is and its CPD is , i.e., the value of plus a measurement error of zero mean Gaussian distribution. This example considers the first 30 steps of this dynamic BN. Assuming the true value of is 2.5, the synthetic data of the observation node is generated at each step, as shown in Fig. 12.
A widely used particle filter method named sequential importance resampling (SIR) algorithm  is applied in this example to track the state variables. Here a particle is a sample from the joint distribution of the state variables. This SIR algorithm propagates the particles of the posterior distribution at time step to time step to obtain the particles of the prior distribution of time step . The likelihoods of these particles are calculated and normalized as the weights for them. Then the particles are resampled based on the weight terms and the resultant new particles represent the posterior joint distribution of the state variables in time step . Details of this SIR algorithm can be found in Ref.  and the Appendix.
The number of particles in this example is 50,000. The mean value and 95% bounds of the posterior distributions of the state variables are shown in Fig. 13. The uncertainty of reduces and its posterior distribution approximates to its true value 2.5, but this uncertainty reduction is not significant after step 20.
At each step, before the calculation of the likelihoods and the particle resampling, we apply the proposed method using the particles of the prior joint distribution of the state variables. The VRR of each state variable is predicted by the proposed method of Eq. (13) using the prior samples of the state variables. This VRR is also calculated by the prior/posterior samples at each step. Figure 14 shows that the results from these two methods are consistent so that the proposed method is verified. Note that the proposed method uses the prior samples and the observation data, while the sample-based method needs both the prior and posterior sample. In other words, the proposed method can be applied before Bayesian inference, but the sample-based method happens after the Bayesian inference has been done.
In this example, the CPD of is so the uncertainty of will not be enlarged in the propagation from time step to time step . However, its uncertainty is reduced by the updating in each time step. Figure 13 shows that this uncertainty reduction is significant for the first five times steps so that the VRR in Fig. 14 has a large value before time step 5; this uncertainty reduction is negligible after time step 20 so that the value of VRR is closer to zero after time step 20.
In comparison, Fig. 13 shows that the uncertainty in the posterior distributions of and are increasing, even if their VRR values in Fig. 14 are always significant. The reason is that the uncertainties of and are enlarged in the propagation from time step to time step , so their prior distributions at have more uncertainty than the posterior distribution at The uncertainty in the prior at is reduced by the updating, but the posterior uncertainty at may not be smaller than the posterior uncertainty at if the uncertainty reduction by the updating cannot outperform the uncertainty enlargement by the propagation. Note that the VRR in Fig. 14 is the variance reduction with respect to the prior/posterior distribution at the same time step, not the variance reduction for adjacent posterior distributions.
In summary, this example extended the proposed sensitivity analysis to a dynamic BN and verified its validity. The proposed method is capable of predicting the variance reduction of each state variable before updating.
In a BN, how a node of interest is affected by fixing another node at some value is of prominent interest. The proposed GSA for BN calculates the first-order Sobol’ index to quantity the sensitivity of the node of interest with respect to an input/observation node . In forward propagation where is an upstream node, a low index indicates that is not sensitive to the value of so that we can simply fix at a deterministic value. In backward inference where is a downstream node, a low sensitivity index indicates that the uncertainty of cannot be reduced by observing ; thus, we should observe another node of higher Sobol’ index in order to reduce its uncertainty.
The proposed GSA for BN is developed in two steps. First, an auxiliary variable method is used to convert the path between node and node to a deterministic function thus making the Sobol’ index computation feasible for a BN. If the path from to is not a directed path form, the theorem of arc reversal is used to transform it to the desired directed path so that the auxiliary variable method can still be used to build the deterministic function. Second, this paper uses an efficient algorithm to directly estimate the Sobol’ index from the samples of the prior distribution of the BN, so that the proposed GSA for the BN is computationally affordable. The resultant Sobol’ index is the average variance reduction ratio across all possible observations of . This paper also extend the algorithm to give a better estimate of the uncertainty reduction of the node of interest when the value of the observation is known, thus providing an informative guidance before the updating.
The limitation of the proposed method at present is that currently it only considers a single observation; thus, an extension to the case of multiple observations needs to be addressed in future work.
The authors thank Dr. Zhen Hu for valuable discussions.
The Air Force Office of Scientific Research (Grant No. FA9550-15-1-0018).
Appendix: Introduction to Particle Filter
Particle filter is a general algorithm to track the evolution of the state variables in a dynamic Bayesian network (DBN). In this section, capital letters denote random variables; lower-case letters denote particles, where the superscript indicates that it is the th particle. The subscripts of letters indicate the time step. Thus, the state variables at time step are denoted as , as shown in the simple DBN of Fig. 15.
where is the vector of noise terms in the measurement function.
where is a delta function at .
In other words, the new state of the th particle at time step is sampled from a distribution which takes the current state and the observation as parameters.
In addition, the initial state are sampled from the joint prior distribution of the state variables, and the initial weight for each particle is .
In practice, iterations of Eqs. (A4) and (A5) over time step may lead to particle degeneracy problem, i.e., only a few particles have significant weights but most particles have negligible weights. In that case, we are assigning most computational efforts to the particles of nonsignificant contribution to the posterior distribution. This degeneracy problem can be solved by resampling, i.e., generating a new set of particles based on the particles of . The new particles represent the same posterior distribution as the former particles.
The simplest strategy of resampling is generating new resampled particles based on the discrete approximation shown in Eq. (A3), and the weight of each new particle is set as again. This resampling is bootstrapping process of iterations, and each iteration selects one particle from current particles with replacement. In an iteration, the probability that a particle is selected is proportional to its weight.