# A method of pedestrian flow monitoring based on received signal strength – EURASIP Journal on Wireless Communications and Networking

#### By Zhiyong Yang, Jing Wen and Kaide Huang

Jan 4, 2022

In this paper, the RSS value is utilized to monitor pedestrian flow: the number of people passing through is obtained from the RSS values of the links composing the wireless sensor network. From the perspective of machine learning, this is a supervised multi-class classification problem [25,26,27]. With a small data volume, the SVM handles multi-class classification well [28, 29]. Compared with more complex machine learning algorithms such as KNN, CNN, and RNN, the SVM has two prominent advantages: suitability for classifying small-sample data and low computational complexity. These features make it especially suitable for RSS-based people flow monitoring, so we adopt the SVM to obtain the classification model of this system.

### Wavelet analysis

We use a series of low-pass and high-pass filters to filter the collected signal data. Low-pass and high-pass filtering are both standard filtering methods in data processing: a low-pass filter passes low-frequency signal components and blocks or attenuates high-frequency components above the cutoff set by the system, while a high-pass filter does the opposite, so the filtering method can be chosen according to the requirements. We then carry out wavelet decomposition of the filtered output. Each decomposition step produces two parts whose lengths are halved: the approximate component of the original signal produced by the low-pass filter, and the detailed component produced by the high-pass filter. An RSS signal of length W is downsampled through a low-pass filter and a high-pass filter, yielding the approximation coefficients cA1 and detail coefficients cD1, each of length W/2. The same method is then applied to cA1 to obtain the second-layer approximation cA2 and detail cD2. We repeat this process until the preset decomposition level is reached [30]. The decomposition process is shown in Fig. 6.

The above process of wavelet decomposition can be expressed as the following equation:

$$\begin{aligned} \left\{ \begin{array}{l} c_{j+1,k} = \sum\limits_{n=-\infty}^{+\infty} h_{n-2k}\, c_{j,n} \\ d_{j+1,k} = \sum\limits_{n=-\infty}^{+\infty} g_{n-2k}\, c_{j,n} \end{array} \right. \end{aligned}$$

(1)

where \(j \in \{0,1,2,\ldots\}\) is the decomposition level, \(k\) and \(n\) are translation indices, \(c_{j+1,k}\) and \(d_{j+1,k}\) are the low-frequency approximation coefficients and high-frequency detail coefficients at resolution \(2^{j}\), respectively, and \(h_{n-2k}\) and \(g_{n-2k}\) are the low-pass and high-pass filter coefficients, respectively.
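The filter-bank step in Eq. (1) can be sketched in a few lines of NumPy. For brevity this sketch uses the two-tap Haar filters rather than the DB4 filters used in the paper; `dwt_step` and the toy signal are illustrative names, not part of the authors' system.

```python
import numpy as np

def dwt_step(c):
    """One level of Eq. (1) with Haar filters h = [1, 1]/sqrt(2), g = [1, -1]/sqrt(2).

    c_{j+1,k} = (c_{2k} + c_{2k+1}) / sqrt(2)   # low-pass  -> approximation
    d_{j+1,k} = (c_{2k} - c_{2k+1}) / sqrt(2)   # high-pass -> detail
    The n - 2k index in Eq. (1) is what halves the output length.
    """
    c = np.asarray(c, dtype=float)
    cA = (c[0::2] + c[1::2]) / np.sqrt(2.0)
    cD = (c[0::2] - c[1::2]) / np.sqrt(2.0)
    return cA, cD

signal = np.arange(8.0)          # toy "RSS" trace, W = 8
cA1, cD1 = dwt_step(signal)      # both of length W/2 = 4
cA2, cD2 = dwt_step(cA1)         # second layer, length W/4 = 2
print(len(cA1), len(cD1), len(cA2), len(cD2))  # 4 4 2 2
```

Note that the orthonormal filters preserve the signal energy: the squared coefficients of cA1 and cD1 sum to the squared samples of the input.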

In this paper, the DB4 wavelet is adopted to carry out wavelet decomposition on the received RSS measurement data, and threshold filtering with a threshold of 0.1 is used to denoise the signal. If one person goes through the “gate,” the filtered waveform of the RSS value of one of the effective links can be obtained through 15-layer decomposition, as shown in Fig. 7. When the same number of people pass through the gate, the RSS of the same effective links should, in theory, change little; in practice, because the environment changes over time, the RF signal fluctuates greatly. As can be seen from the waveform after wavelet filtering, the RSS fluctuation of the same effective link under the same conditions tends to be stable, which is conducive to the establishment of the later model.
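The paper states the threshold value 0.1 but not whether hard or soft thresholding is applied. The sketch below assumes soft thresholding, which zeroes small detail coefficients and shrinks the rest toward zero; `soft_threshold` is an illustrative helper, not the authors' code.

```python
import numpy as np

def soft_threshold(coeffs, thr=0.1):
    # Coefficients with |value| <= thr are zeroed; larger ones shrink by thr.
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - thr, 0.0)

# Made-up detail coefficients: small noise-like values disappear,
# larger ones survive with reduced magnitude.
d = np.array([0.05, -0.3, 0.8, -0.08, 0.12])
print(soft_threshold(d))
```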

### SVM classification

The SVM is a method proposed by Vapnik et al. [31] on the basis of statistical learning theory. It ensures the learning machine’s generalization ability by using the principle of structural risk minimization, so that the decision rules can still achieve low expected risk on independent test datasets even with a limited number of training samples. Training an SVM involves finding an optimal hyperplane that partitions the data in a high-dimensional space; the optimal hyperplane maximizes the distance from the nearest sample points to the hyperplane, and these nearest points are called support vectors. SVMs have received considerable attention and have been successfully applied to many classification tasks due to their good generalization ability.

We assume that \(\mathbf{x}_i \in \mathbb{R}^n\ (i = 1,2,\ldots,N)\) are the input vectors, where each point \(\mathbf{x}_i\) belongs to one of two classes denoted by a label \(y_i \in \{-1, +1\}\). In the linearly separable case, the input vectors can be classified by a separating hyperplane \(\mathbf{w}^T\mathbf{x} + b = 0\) satisfying \(y_i(\mathbf{w}^T\mathbf{x}_i + b) > 0\), where \(\mathbf{w}\) is the normal vector that determines the direction of the hyperplane and \(b\) determines the distance between the hyperplane and the origin. Many hyperplanes may satisfy this condition. To construct an optimal hyperplane separating the two classes, we scale \(\mathbf{w}\) and \(b\) so that \(y_i(\mathbf{w}^T\mathbf{x}_i + b) \ge 1\) for \(i = 1,2,\ldots,N\). The distance from the closest points to the hyperplane is then \(1/\|\mathbf{w}\|\), so finding the optimal separating hyperplane requires maximizing \(1/\|\mathbf{w}\|\), which is equivalent to minimizing \(\frac{1}{2}\|\mathbf{w}\|^2\). The original support vector machine solves the following optimization problem:

$$\begin{aligned} \begin{array}{l} \mathop{\min}\limits_{\mathbf{w},b}\ \frac{1}{2}\|\mathbf{w}\|^2 \\ \text{s.t.}\quad y_i\left(\mathbf{w}^T\mathbf{x}_i + b\right) \ge 1,\quad i = 1,2,\ldots,N \end{array} \end{aligned}$$

(2)

However, most data sets in the real world cannot be completely linearly separated. Therefore, in this paper, the soft margin SVM is used to establish the pedestrian classification model. It is implemented by introducing slack variables \(\xi_i \ge 0\), which allow some samples to violate the margin. The optimal classification hyperplane can be obtained by solving the following convex quadratic programming problem:

$$\begin{aligned} &\mathop{\min}\limits_{\mathbf{w},b,\xi_i}\ \frac{1}{2}\|\mathbf{w}\|^2 + C\sum\limits_{i=1}^N \xi_i \\ &\text{s.t.}\quad y_i\left(\mathbf{w}^T\mathbf{x}_i + b\right) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1,2,\ldots,N \end{aligned}$$

(3)

where \(\xi_i\) are positive slack variables and \(C\) is the penalty factor controlling the tradeoff between a maximal margin and a minimal classification error. The parameter \(C\) affects the generalization ability of the SVM: if \(C\) is set too large, the classifier overfits the training data; if it is set too small, the classifier underfits. Applying the Lagrange multiplier method to Formula (3) yields its dual problem, and solving the dual gives the estimated decision function:
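At the optimum of Formula (3), each slack variable equals the hinge loss \(\xi_i = \max(0,\, 1 - y_i(\mathbf{w}^T\mathbf{x}_i + b))\), so the soft-margin objective can be evaluated directly for any candidate \((\mathbf{w}, b)\). The sketch below uses made-up toy data, not the paper's trained model.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    # xi_i = max(0, 1 - y_i (w^T x_i + b)): the slack (hinge loss) of sample i.
    xi = np.maximum(0.0, 1.0 - y * (X @ w + b))
    return 0.5 * float(np.dot(w, w)) + C * float(np.sum(xi))

# Toy data: two well-separated points and one point inside the margin.
X = np.array([[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]])
y = np.array([1.0, -1.0, 1.0])
w = np.array([1.0, 0.0])
b = 0.0
print(soft_margin_objective(w, b, X, y, C=1.0))  # 0.5*||w||^2 + 1*0.5 = 1.0
```

A larger C makes the margin-violating third point dominate the objective, which is exactly the overfitting/underfitting tradeoff described above.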

$$\begin{aligned} &\mathop{\max}\limits_{\alpha}\ \sum\limits_{i=1}^N \alpha_i - \frac{1}{2}\sum\limits_{i=1}^N \sum\limits_{j=1}^N \alpha_i \alpha_j y_i y_j \mathbf{x}_i^T \mathbf{x}_j \\ &\text{s.t.}\quad \sum\limits_{i=1}^N \alpha_i y_i = 0,\quad 0 \le \alpha_i \le C \end{aligned}$$

(4)

where ({{alpha _i}}left( {i = 1,2, ldots ,N} right)) are the nonnegative Lagrangian multipliers. By solving the above problems, the decision function is obtained as follows:

$$\begin{aligned} f(\mathbf{x}) = \mathrm{sgn}\left(\mathbf{w}^T\mathbf{x} + b\right) = \mathrm{sgn}\left(\sum\limits_{i=1}^N \alpha_i y_i \mathbf{x}_i^T \mathbf{x} + b\right) \end{aligned}$$

(5)
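Given the trained multipliers \(\alpha_i\) and bias \(b\), the decision rule (5) is just the sign of a weighted sum over support vectors. A minimal sketch with hand-picked values (all names and numbers are illustrative, not trained results):

```python
import numpy as np

def svm_decision(x, alphas, ys, Xs, b):
    # Eq. (5): f(x) = sgn( sum_i alpha_i * y_i * x_i^T x + b )
    s = sum(a * yi * float(np.dot(xi, x))
            for a, yi, xi in zip(alphas, ys, Xs)) + b
    return 1 if s >= 0 else -1

# Two hand-picked "support vectors", one per class.
Xs = [np.array([1.0, 0.0]), np.array([-1.0, 0.0])]
ys = [1.0, -1.0]
alphas = [1.0, 1.0]
b = 0.0

print(svm_decision(np.array([2.0, 0.0]), alphas, ys, Xs, b))   # 1
print(svm_decision(np.array([-0.5, 1.0]), alphas, ys, Xs, b))  # -1
```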

When the input vectors are not linearly separable, we cannot classify them with a linear classifier. Instead, we use a kernel function to map the input vectors into a higher-dimensional space, turning a nonlinearly separable problem in the low-dimensional space into a linearly separable one in the high-dimensional space, where a classification hyperplane can then be found. The kernel function is defined as follows:

$$\begin{aligned} K\left(\mathbf{x}_i, \mathbf{x}_j\right) = \left\langle \phi\left(\mathbf{x}_i\right), \phi\left(\mathbf{x}_j\right) \right\rangle \end{aligned}$$

(6)

\(\phi\) is a mapping from the input space \(X\) to an inner product feature space. Any function that satisfies Mercer’s condition can serve as a kernel function. In the SVM, different kernel functions generate different models, and the choice of kernel directly determines the classification results and learning ability of the SVM. The four main types of common kernel functions are the linear, polynomial, Gaussian (RBF), and sigmoid kernels. Using an appropriate kernel function \(K(\mathbf{x}_i, \mathbf{x}_j)\), the decision function becomes:

$$\begin{aligned} f(\mathbf{x}) = \mathrm{sgn}\left(\mathbf{w}^T\phi(\mathbf{x}) + b\right) = \mathrm{sgn}\left(\sum\limits_{i=1}^N \alpha_i y_i K\left(\mathbf{x}_i, \mathbf{x}\right) + b\right) \end{aligned}$$

(7)

This paper chooses the linear kernel as the kernel function of the SVM to classify the experimental data. The linear kernel function is defined as follows:

$$\begin{aligned} K\left(\mathbf{x}_i, \mathbf{x}_j\right) = \mathbf{x}_i^T \mathbf{x}_j \end{aligned}$$

(8)
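The linear kernel of Eq. (8) is simply the dot product of the two input vectors; for contrast, the sketch also includes the Gaussian (RBF) kernel mentioned above. Function names and the `gamma` parameter are illustrative, not from the paper.

```python
import numpy as np

def linear_kernel(xi, xj):
    # Eq. (8): K(x_i, x_j) = x_i^T x_j -- the kernel used in this paper.
    return float(np.dot(xi, xj))

def rbf_kernel(xi, xj, gamma=0.5):
    # Gaussian (RBF) kernel, one of the common alternatives listed above.
    return float(np.exp(-gamma * np.sum((xi - xj) ** 2)))

a = np.array([1.0, 2.0])
c = np.array([3.0, -1.0])
print(linear_kernel(a, c))  # 1*3 + 2*(-1) = 1.0
print(rbf_kernel(a, a))     # identical inputs give exp(0) = 1.0
```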

The SVM was originally designed for binary classification, but the pedestrian flow monitoring task in this paper is a multi-class problem, so a multi-class extension is needed. There are roughly two ways to implement multi-class SVM classification:

1. Modifying the support vector machine algorithm so that the multi-class problem is solved “at one time” as a single optimization problem.

2. Constructing and combining multiple binary classifiers. This approach mainly includes the one-against-rest and one-against-one methods. With one-against-rest, M SVMs are required for an M-class problem: for the ith SVM, all samples of the ith class are labeled as the positive class and the remaining samples as the negative class, and a test sample is assigned the label with the maximum output among the M classifiers. With one-against-one, \(M(M-1)/2\) SVMs are required: an SVM is constructed between each pair of classes, and the class with the most votes is chosen as the final class of the test sample.

This paper uses the one-against-rest method to construct a multi-class SVM. The advantage of this method is that for an M-class problem, only M binary SVM classifiers need to be trained, so classification is relatively fast.
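Once the M one-against-rest SVMs are trained, prediction reduces to taking the class with the maximum of the M classifier outputs. A minimal sketch of that decision step, using made-up decision values rather than the paper's trained classifiers:

```python
import numpy as np

def one_vs_rest_predict(decision_values):
    # decision_values[i, m] is the signed output of binary SVM m
    # ("class m vs the rest") for sample i; pick the largest per sample.
    return np.argmax(decision_values, axis=1)

# Made-up outputs of M = 3 classifiers for 2 test samples.
scores = np.array([[0.9, -0.2, -1.1],
                   [-0.4, 0.3, -0.7]])
print(one_vs_rest_predict(scores))  # [0 1]
```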