In this section, we describe the improved sensor selection method. We formulate sensor selection as a POMDP whose objective is the CRLB of the target localization error, which addresses the problem that the observers (e.g., sensor nodes) cannot reliably identify the actual underlying target states. Our method extends the POMDP framework by integrating the T-FoT approach to handle an unknown target dynamic model.

POMDP framework based on T-FoT

The core idea of the POMDP is to choose the optimal selection command by minimizing a cost function or maximizing a reward function. At time step k, the POMDP can be defined as

$$\begin{aligned} \psi = \{S, F(\cdot ; C_{k}), Z^s, g(\cdot \mid X_k, s), \mu(s; \cdot)\} \end{aligned}$$

(8)

where S is a finite set of sensor selection commands, \(Z^s\) is a finite set of observations under the command set S, \(g(\cdot \mid X_k, s)\) is the measurement model conditioned on the command \(s \in S\) and the target state, \(F(\cdot ; C_{k})\) is the estimated T-FoT at time k, and \(\mu(s; \cdot)\) is the objective function for executing an action command \(s \in S\).

At the core of our POMDP framework, the objective function \(\mu(s; \cdot)\) is defined as the CRLB \(u_\text{lb}(s_{k}; \hat{X}_{k+1})\) of the pseudo-localization error of the target conditioned on the measurements from the activated sensors, which in turn depends on the selection command s (see Sect. 3.2). Here, the estimated/predicted state \(\hat{X}_{k+1} = F(k+1; C_{k})\) is obtained from the estimated T-FoT [19] (see Sect. 3.3) rather than from a Markov-jump model, which traditional methods require as indispensable prior information. This is the key difference between our approach and existing POMDP approaches [6, 16].

Typically, the sensor selection needs to meet a specific constraint. In this paper, we consider two practical constraints: either the number of sensors to be selected is fixed, or the selected sensors must meet a specified CRLB using the minimum number of sensors. For these two cases, the optimal selection command is given by (9) and (10), respectively.

$$\begin{aligned} &s_{k}^{*} = \mathop{\arg\min}\limits_{s_{k} \in S_{k}} u_{\text{lb}}(s_{k}; \hat{X}_{k+1}) \\ &\text{s.t.}\ \left| s_{k}^{*} \right| = n_{s} \end{aligned}$$

(9)

where \(S_k \subseteq S\) denotes the candidate sensor set at time k, \(|s_{k}^{*}|\) denotes the number of selected sensors, and \(n_s\) is the specified number of sensors to be selected.

$$\begin{aligned} &s_{k}^{*} = \mathop{\arg\min}\limits_{s_{k} \in S_{k}} \left| s_{k} \right| \\ &\text{s.t.}\ u_{\text{lb}}(s_{k}; \hat{X}_{k+1}) \le T_{\text{lb}} \end{aligned}$$

(10)

where \(T_\text{lb}\) is the CRLB threshold that the selected sensors must meet.
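The two selection rules in (9) and (10) can be illustrated with a brute-force search over candidate subsets. The following Python sketch is only an illustration under stated assumptions: the function names `select_fixed_size` and `select_min_count`, the `toy_cost` function and the `info` values are hypothetical stand-ins (the actual objective is the CRLB of Sect. 3.2), and ties among equally small feasible subsets in (10) are broken by lowest cost, which the formulation itself does not prescribe.

```python
from itertools import combinations


def select_fixed_size(candidates, n_s, cost):
    """Rule (9): among all subsets of size n_s, return the one with the smallest cost."""
    return min(combinations(candidates, n_s), key=cost, default=None)


def select_min_count(candidates, threshold, cost):
    """Rule (10): return a smallest subset whose cost does not exceed the threshold.

    Assumption: among feasible subsets of the same size, the lowest-cost one is kept.
    Returns None when no subset is feasible.
    """
    candidates = list(candidates)
    for size in range(1, len(candidates) + 1):
        feasible = [s for s in combinations(candidates, size) if cost(s) <= threshold]
        if feasible:
            return min(feasible, key=cost)
    return None


# Hypothetical per-sensor "information" values; the cost is a toy stand-in for the CRLB.
info = {"a": 1.0, "b": 4.0, "c": 2.0, "d": 0.5}


def toy_cost(subset):
    return 1.0 / sum(info[name] for name in subset)


best_pair = select_fixed_size(info, 2, toy_cost)   # fixed number of sensors
smallest = select_min_count(info, 0.21, toy_cost)  # fewest sensors meeting the threshold
```

Exhaustive enumeration grows combinatorially with the number of candidates, so a sketch of this kind is only practical when the candidate set \(S_k\) is small.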

CRLB with regard to DOA

The CRLB provides a lower bound on the variance of any unbiased estimator of a deterministic parameter under specific measurement conditions, and can therefore be used to evaluate the detection capability of different sensor node subsets.

For an unbiased estimator \(\hat{X}_{k}(Z_{k})\) of a parameter vector \(X_{k}\) based on the measurement vector \(Z_{k}\), the CRLB for the error covariance matrix is defined to be the inverse of the Fisher Information Matrix (FIM), denoted by J, as follows

$$\begin{aligned} E\left\{ [\hat{X}_{k}(Z_{k}) - X_{k}][\hat{X}_{k}(Z_{k}) - X_{k}]^\text{T} \right\} \ge J_{k}^{-1} \triangleq u_\text{lb}(\hat{X}_{k}) \end{aligned}$$

(11)

where \(E\{\cdot\}\) denotes the expectation, and the inequality (11) means that the difference between the error covariance matrix on the left-hand side and \(J_{k}^{-1}\) is positive semi-definite.

Now consider the predicted target state \(\hat{X}_{k} = F(k; C_{k-1})\) obtained from the estimated T-FoT. The n-sensor extension of the DOA measurement function in Eq. (1) is

$$\begin{aligned} \hat{Z}_{k} = H(\hat{X}_{k}) + V_{k} \end{aligned}$$

(12)

where

$$\begin{aligned} H(\hat{X}_{k}) = \left[ \begin{array}{c} \tan^{-1}\left( \frac{\hat{y}_k - y_1}{\hat{x}_k - x_1} \right) \\ \tan^{-1}\left( \frac{\hat{y}_k - y_2}{\hat{x}_k - x_2} \right) \\ \vdots \\ \tan^{-1}\left( \frac{\hat{y}_k - y_n}{\hat{x}_k - x_n} \right) \end{array} \right] \triangleq \boldsymbol{\Theta}. \end{aligned}$$

(13)

Under the premise that \(z_{k}^{1}, \cdots, z_{k}^{n}\) are conditionally independent of each other, the PDF of the collected measurements \(Z_{k} = \left[ z_{k}^{i} \right]_{i=1}^{n} \sim \mathcal{N}(\theta, R_{k})\) can be expressed as

$$\begin{aligned} p(Z_{k}) = \frac{1}{(2\pi)^{n/2} \left| R_{k} \right|^{1/2}} \exp\left[ -\frac{(Z_{k} - \theta)^\text{T} R_{k}^{-1} (Z_{k} - \theta)}{2} \right] \end{aligned}$$

(14)

where \(\theta\) is the mean value of the measurements \(\boldsymbol{\Theta}\) and \(R_{k} = \mathrm{diag}(R_{k}^{1}, R_{k}^{2}, \dots, R_{k}^{n})\).

The FIM is then defined as the negative expectation of the second-order derivatives of the logarithm of the measurement PDF with respect to \(\hat{X}_{k}\)

$$\begin{aligned} J(\hat{X}_{k}) \triangleq -E\left\{ \frac{\partial^{2} \log p(Z_{k})}{\partial \hat{X}_{k}\, \partial \hat{X}_{k}^\text{T}} \right\}. \end{aligned}$$

(15)

Substituting Eq. (14) into Eq. (15), the FIM based on DOA measurements, \(J(\hat{X}_{k})\), can be written as

$$\begin{aligned} J(\hat{X}_{k}) = \bigg[ \frac{\partial H(\hat{X}_{k})}{\partial \hat{X}_{k}} \bigg]^\text{T} R_k^{-1} \bigg[ \frac{\partial H(\hat{X}_{k})}{\partial \hat{X}_{k}} \bigg]. \end{aligned}$$

(16)
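The step leading to Eq. (16) can be sketched as follows, assuming that the measurement mean is \(\theta = H(\hat{X}_{k})\), that \(R_k\) does not depend on \(\hat{X}_{k}\), and using the score (outer-product) form of the FIM, which coincides with the second-derivative form under the usual regularity conditions:

$$\begin{aligned} \frac{\partial \log p(Z_{k})}{\partial \hat{X}_{k}} &= \bigg[ \frac{\partial H(\hat{X}_{k})}{\partial \hat{X}_{k}} \bigg]^\text{T} R_k^{-1} \left( Z_{k} - H(\hat{X}_{k}) \right), \\ J(\hat{X}_{k}) &= E\left\{ \frac{\partial \log p(Z_{k})}{\partial \hat{X}_{k}} \bigg[ \frac{\partial \log p(Z_{k})}{\partial \hat{X}_{k}} \bigg]^\text{T} \right\} \\ &= \bigg[ \frac{\partial H(\hat{X}_{k})}{\partial \hat{X}_{k}} \bigg]^\text{T} R_k^{-1}\, E\left\{ (Z_{k} - \theta)(Z_{k} - \theta)^\text{T} \right\} R_k^{-1} \bigg[ \frac{\partial H(\hat{X}_{k})}{\partial \hat{X}_{k}} \bigg] \\ &= \bigg[ \frac{\partial H(\hat{X}_{k})}{\partial \hat{X}_{k}} \bigg]^\text{T} R_k^{-1} \bigg[ \frac{\partial H(\hat{X}_{k})}{\partial \hat{X}_{k}} \bigg], \end{aligned}$$

where the last equality uses \(E\{ (Z_{k} - \theta)(Z_{k} - \theta)^\text{T} \} = R_{k}\).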

Expanding \(H(\hat{X}_{k})\) and taking its first-order partial derivatives with respect to \(\hat{X}_{k}\) gives

$$\begin{aligned} \bigg[ \frac{\partial H(\hat{X}_{k})}{\partial \hat{X}_k} \bigg] = \left[ \begin{array}{cc} -\frac{\hat{y}_k - y_1^{s_k}}{d_1^2} & \frac{\hat{x}_k - x_1^{s_k}}{d_1^2} \\ -\frac{\hat{y}_k - y_2^{s_k}}{d_2^2} & \frac{\hat{x}_k - x_2^{s_k}}{d_2^2} \\ \vdots & \vdots \\ -\frac{\hat{y}_k - y_n^{s_k}}{d_n^2} & \frac{\hat{x}_k - x_n^{s_k}}{d_n^2} \end{array} \right] \end{aligned}$$

(17)

where \((x_i^{s_k}, y_i^{s_k})\) are the position coordinates of sensor i in the sensor set selected by command \(s_k\), and \(d_i = \sqrt{(\hat{x}_k - x_i^{s_k})^2 + (\hat{y}_k - y_i^{s_k})^2}\) is the distance between sensor i and the predicted target position. Thus, \(J(\hat{X}_{k})\) can be computed as

$$\begin{aligned} J(\hat{X}_k) = \left[ \begin{array}{cc} \sum\limits_{i=1}^{n} \frac{(\hat{y}_k - y_i^{s_k})^2}{R_k^i d_i^4} & \sum\limits_{i=1}^{n} \frac{-(\hat{x}_k - x_i^{s_k})(\hat{y}_k - y_i^{s_k})}{R_k^i d_i^4} \\ \sum\limits_{i=1}^{n} \frac{-(\hat{y}_k - y_i^{s_k})(\hat{x}_k - x_i^{s_k})}{R_k^i d_i^4} & \sum\limits_{i=1}^{n} \frac{(\hat{x}_k - x_i^{s_k})^2}{R_k^i d_i^4} \end{array} \right] \triangleq \left[ \begin{array}{cc} J_{xx} & J_{xy} \\ J_{yx} & J_{yy} \end{array} \right]. \end{aligned}$$

(18)

Finally, the scalar CRLB, i.e., the trace of the inverse of \(J(\hat{X}_{k})\), is given as

$$\begin{aligned} u_\text{lb}(\hat{X}_{k}) = \frac{J_{yy} + J_{xx}}{J_{xx} J_{yy} - J_{xy} J_{yx}}. \end{aligned}$$

(19)
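The computation in Eqs. (16) to (19) can be sketched numerically as follows. This is a minimal NumPy illustration rather than the authors' implementation: the function name `doa_crlb`, its argument layout, and the choice to return infinity when the FIM is (numerically) singular are assumptions made here for the example.

```python
import numpy as np


def doa_crlb(target_xy, sensor_xy, noise_var):
    """Scalar CRLB of Eq. (19) for DOA sensors observing a predicted target position.

    target_xy : (2,) predicted target position (x, y)
    sensor_xy : (n, 2) positions of the selected sensors
    noise_var : scalar or (n,) bearing noise variances R_k^i (rad^2)
    """
    target_xy = np.asarray(target_xy, dtype=float)
    sensor_xy = np.asarray(sensor_xy, dtype=float).reshape(-1, 2)
    n = len(sensor_xy)
    noise_var = np.broadcast_to(np.asarray(noise_var, dtype=float), (n,))

    dx = target_xy[0] - sensor_xy[:, 0]
    dy = target_xy[1] - sensor_xy[:, 1]
    d2 = dx**2 + dy**2                          # squared sensor-target distances d_i^2

    # Jacobian of the bearing function with respect to (x, y), Eq. (17)
    jac = np.column_stack((-dy / d2, dx / d2))

    # FIM, Eq. (16): jac^T R^{-1} jac with diagonal R
    fim = jac.T @ (jac / noise_var[:, None])

    trace = fim[0, 0] + fim[1, 1]
    det = fim[0, 0] * fim[1, 1] - fim[0, 1] * fim[1, 0]
    # Assumption: a (numerically) singular FIM, e.g. one sensor or collinear
    # geometry, is reported as an unbounded error.
    if det <= 1e-12 * trace**2:
        return float("inf")
    return trace / det                          # trace of the inverse FIM, Eq. (19)


# Two sensors at unit range with orthogonal bearings and unit noise variance
bound = doa_crlb((0.0, 0.0), [(1.0, 0.0), (0.0, 1.0)], 1.0)
```

In this toy geometry the FIM is the identity matrix, so the bound equals 2; adding a further sensor can only add information and therefore cannot increase it.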

T-FoT for tracking and prediction

As mentioned before, the T-FoT fits the time series measurements in a sliding time-window up to the current time k, denoted as \([k', k] \triangleq \{k', k'+1, \ldots, k\}\), where \(k' = \max(1, k - T)\) and T is the length of the time-window. Disregarding false and missing data for now, the parameters of the T-FoT at time k can be estimated in the LS sense

$$\begin{aligned} \hat{C}_k = \mathop{\arg\min}\limits_{C} \sum\limits_{t = k'}^{k} \left\| X_t - F_k(t; C) \right\|_{\Sigma_{e_{t}}^{-1}}^2. \end{aligned}$$

(20)

Here, \(X_t\) denotes the position of the target at time t, and the norm is the Mahalanobis distance, i.e.,

$$\begin{aligned} \left\| X_t - \hat{X}_t \right\|_{\Sigma_{e_{t}}^{-1}}^2 = (X_t - \hat{X}_t)^\text{T} \Sigma_{e_{t}}^{-1} (X_t - \hat{X}_t) \end{aligned}$$

(21)

where \(\hat{X}_t\) denotes the estimate of the target state at time t, the fitting error is given as \(e_t = X_t - \hat{X}_t\), and \(\Sigma_{e_t}\) is the covariance of the fitting error, cf. (5).
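The LS fit in (20) can be sketched as follows under two simplifying assumptions that are made here only for illustration: the trajectory function is taken to be a polynomial in time (the order is a free choice), and the fitting-error covariance is taken to be \(\Sigma_{e_t} = \sigma_t^2 I\), so that the Mahalanobis norm reduces to per-sample weights \(1/\sigma_t\). The names `window_start`, `fit_tfot` and `predict_tfot` are hypothetical.

```python
import numpy as np


def window_start(k, T):
    """First time index k' = max(1, k - T) of the sliding window."""
    return max(1, k - T)


def fit_tfot(times, positions, order=2, sigma=None):
    """Weighted LS fit of a polynomial trajectory (assumed T-FoT form).

    times     : (m,) time indices in the window [k', k]
    positions : (m, 2) target positions X_t
    sigma     : optional scalar or (m,) standard deviations sigma_t
    Returns an (order + 1, 2) coefficient array, lowest power first.
    """
    times = np.asarray(times, dtype=float)
    positions = np.asarray(positions, dtype=float)
    sig = np.asarray(1.0 if sigma is None else sigma, dtype=float)
    w = np.broadcast_to(1.0 / sig, times.shape)

    # Design matrix with columns 1, t, t^2, ...
    A = np.vander(times, order + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(A * w[:, None], positions * w[:, None], rcond=None)
    return coeffs


def predict_tfot(coeffs, t):
    """Evaluate the fitted trajectory at an arbitrary (e.g. future) time t."""
    powers = float(t) ** np.arange(coeffs.shape[0])
    return powers @ coeffs


# Toy data: x grows linearly and y quadratically with time
k, T = 6, 5
times = np.arange(window_start(k, T), k + 1)
positions = np.column_stack((1.0 + 2.0 * times, 0.5 * times**2))
coeffs = fit_tfot(times, positions, order=2)
x_pred = predict_tfot(coeffs, k + 1)   # predicted position for time k + 1
```

Because the fitted function is continuous in time, the same coefficients give the one-step prediction used by the selection stage without any Markov transition model.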

Algorithm summary

The proposed sensor selection algorithm is summarized as Algorithm 1. Based on the POMDP framework, the interaction of the tracked target with the sensor selection strategy can be described by the following three steps (see also Fig. 2):

  1. At any time step k, the estimated T-FoT has parameters \(C_k\), which are used to predict the target state \(\hat{X}_{k+1} = F(k+1; C_{k})\) for time \(k+1\).

  2. The sensor network perceives pseudo-observations \(\hat{Z}_{k+1}\) through the known stochastic observation model \(g(\cdot \mid X_{k}, s_{k})\) and the predicted target state \(\hat{X}_{k+1}\). This yields the expression of the objective function \(u_\text{lb}(s_{k}; \hat{X}_{k+1})\).

  3. Find the optimal selection command \(s_k^*\) from the candidate set \(S_k \subseteq S\) by optimizing the objective function \(u_\text{lb}(s_{k}; \hat{X}_{k+1})\) subject to the applicable constraint. Here, the candidate set can be defined as the subset of all sensors that lie within a limited distance of the target.
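The three steps above can be put together in a short end-to-end sketch. It is an illustration under stated assumptions, not Algorithm 1 itself: the T-FoT is approximated by a per-axis polynomial fitted with `np.polyfit`, all sensors share one bearing noise variance, the fixed-size constraint (9) is used, and the subset search is exhaustive. All function and parameter names (`predict_next`, `crlb`, `select_sensors`, `max_range`) are hypothetical.

```python
from itertools import combinations

import numpy as np


def predict_next(times, positions, t_next, order=1):
    """Step 1: fit a polynomial per axis (assumed T-FoT form) and predict at t_next."""
    positions = np.asarray(positions, dtype=float)
    return np.array([
        np.polyval(np.polyfit(times, positions[:, axis], order), t_next)
        for axis in range(2)
    ])


def crlb(target, sensors, noise_var):
    """Step 2: scalar DOA CRLB (trace of the inverse FIM) at the predicted state."""
    dx = target[0] - sensors[:, 0]
    dy = target[1] - sensors[:, 1]
    d2 = dx**2 + dy**2
    jac = np.column_stack((-dy / d2, dx / d2))
    fim = jac.T @ jac / noise_var            # equal noise variance assumed
    tr, det = np.trace(fim), np.linalg.det(fim)
    return tr / det if det > 1e-12 * tr**2 else np.inf


def select_sensors(times, positions, sensors, n_s, max_range, noise_var=1e-4):
    """Step 3: pick the n_s in-range sensors that minimize the predicted CRLB."""
    sensors = np.asarray(sensors, dtype=float)
    x_pred = predict_next(times, positions, times[-1] + 1)
    # Candidate set: sensors within max_range of the predicted target position
    near = np.flatnonzero(np.linalg.norm(sensors - x_pred, axis=1) <= max_range)
    best = min(
        combinations(near, n_s),
        key=lambda s: crlb(x_pred, sensors[list(s)], noise_var),
        default=None,                        # fewer than n_s candidates in range
    )
    return x_pred, None if best is None else tuple(int(i) for i in best)


# Toy scenario: target moving along the x-axis, four sensors (the last one far away)
times = [1, 2, 3, 4]
positions = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
sensors = [(5.0, 1.0), (6.0, 0.0), (5.0, 3.0), (50.0, 50.0)]
x_pred, chosen = select_sensors(times, positions, sensors, n_s=2, max_range=5.0)
```

In this toy scenario the two sensors whose bearings to the predicted position are orthogonal are preferred over a pair lying on the same line of sight, which is the geometric behaviour expected from Eq. (18).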

Fig. 2 POMDP-based sensor selection using the T-FoT approach

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
