# Digital Twin-based Quality Management Method for the Assembly Process of Aerospace Products with the Grey-Markov Model and Apriori Algorithm – Chinese Journal of Mechanical Engineering

Aug 11, 2022

### Numerical Prediction Results of Quality Data

#### Data Selection

Taking the assembly process of an aerospace product as an example, the centroid offset data of cabin A of model S1 are collected to verify the proposed key technologies and method. It is known that cabin A of model S1 has completed three batches of assembly since the last production environment change. In each batch 15 products are assembled, and the assembly of the 15th product in the fourth batch is currently underway. If the assembly time of each product is the same, the allowable error of the centroid offset of cabin A is 15 mm. The collected centroid offset data of cabin A of the fourth batch are shown in Table 2.

From the above data, we use 13 centroid offset data items, from A4-1 to A4-13 as raw data, A4-14 as validation data, and A4-15 as predicted data. The original series of centroid offset are obtained:

$${x^{(0)}} = ({x^{(0)}}(1),{x^{(0)}}(2) ldots {x^{(0)}}({13})) = (5.918,5.205,4.393,4.520,5.105,5.818,6.647,7.560,8.467,9.642,10.811,11.898,13.548).$$

(38)

The scale ratio of the original series ({{x}}^{({0})}) for centroid offset is calculated as

$$lambda (t)=(1.137, 1.185, 0.972, 0.885, 0.878, 0.875,0.879, 0.893, 0.878, 0.892, 0.909, 0.878).$$

(39)

According to the Grey prediction model, the tolerant coverage interval of the ratio is obtained as

$$Y=({e}^{frac{-2}{n+1}}, {e}^{frac{2}{n+1}})=(0.867,1.154).$$

(40)

By observing the scale ratio of the centroid offset, it is found that each item of the ratio series falls within the admissible coverage interval, which indicates that the GM(1,1) model is applicable for the original series of centroid offsets.

#### Establishment of Grey Forecasting Model

The original sequence of the centroid offset is accumulated to weaken the volatility and randomness that may exist in the random sequence, and the accumulated sequence of centroid offsets is obtained as

$${x}^{(1)}=(5.918, 11.123, 15.515, 20.036, 25.141,30.959,37.605, 45.165, 53.633,63.275, 74.086, 85.984, 99.532).$$

(41)

According to the Grey model, the cumulative sequence of centroid offsets is processed, and the matrix B and constant vector (varvec{{y}}_{{n}}) are obtained:

$$begin{array}{l} varvec{B} = left[ {begin{array}{*{20}{c}} { – 0.5({X^{( 1 )}}( 1 ) + {x^{( 1 )}}(2))}&1\ { – 0.5({x^{( 1 )}}( 2 ) + {x^{( 1 )}}(3))}&1\ ldots&1\ { – 0.5({x^{( 1 )}}( {12} ) + {x^{( 1 )}}(13))}&1 end{array}} right]\ = left[ {begin{array}{*{20}{c}} { – 0.5(5.918 + 11.123)}&1\ { – 0.5(11.123 + 15.515)}&1\ ldots&1\ { – 0.5(85.984 + 99.532)}&1 end{array}} right]\ = left[ {begin{array}{*{20}{c}} { – 8.520}&1\ { – 13.319}&1\ { – 17.776}&1\ { – 22.588}&1\ { – 28.050}&1\ { – 34.282}&1\ { – 41.385}&1\ { – 49.399}&1\ { – 58.454}&1\ { – 68.681}&1\ { – 80.035}&1\ { – 92.758}&1 end{array}} right] end{array}$$

(42)

$$varvec{y_n} = left[ {begin{array}{*{20}{c}} {{x^{( 0 )}}( 2 )}\ {{x^{( 0 )}}( 3 )}\ ldots \ {{x^{( 0 )}}( {13} )} end{array}} right] = left[ {begin{array}{*{20}{c}} {5.205}\ {4.393}\ {4.520}\ {5.105}\ {5.818}\ {6.647}\ {7.560}\ {8.467}\ {9.642}\ {10.811}\ {11.898}\ {13.548} end{array}} right].$$

(43)

The Grey parameter matrix (hat{ varvec{a}}) is calculated as

$$hat{ varvec{a}} = left[ {begin{array}{*{20}{c}} a\ u end{array}} right] = {({B^text{T}}B)^{ – 1}}{B^text{T}}{X_n} = left[ {begin{array}{*{20}{c}} { – 0.112644}\ {3.009679} end{array}} right].$$

(44)

Substituting (hat{ varvec{a}}) into the prediction model of the centroid offset, the prediction function of ({{hat X}^{(1)}}({text{t}})) is obtained as

$${hat x^{( 1 )}}( {t + 1} ) = left( {{y^{( 0 )}}( 1 ) – frac{u}{a}} right){e^{ – at}} + frac{u}{a} = 32.637{{text{e}}^{0.112644t}} – 26.719.$$

(45)

Expressions ({hat{x}}^{(1)}(t+1)) and ({hat{{x}}}^{({1})}({{t}})) are discretized and restored to the original sequence of centroid offsets, whose prediction sequence is

$${hat x^{(0)}}(t + 1) = {hat x^{(1)}}(t + 1) – {hat x^{(1)}}(t) = 32.637({{text{e}}^{0.112644t}} – {{text{e}}^{0.112644(t – 1)}}).$$

(46)

The predicted serial value of centroid offsets is obtained as

$${hat{x}}^{(0)}=(5.918, 3.891, 4.355, 4.875, 5.456,6.106,6.835, 7.649,8.562, 9.582, 10.725, 12.004, 13.435, 15.037, 16.830).$$

(47)

#### Grey Prediction Model Test

(1) Residual Test

The predicted and original series of centroid offsets are put together, and the residual and relative error of each data item is calculated to obtain the fitting results in Table 3.

The fitting curve of centroid offsets drawn from the above fitting results is shown in Figure 4.

The calculated average relative error is 4.16%, and the fitting accuracy is 95.84%. Because the fitting accuracy is more than 80%, the forecast result passes the residual test.

(2) Correlation Degree Test

According to the above fitting curve, (min _{i = 1}^{14}| {{{hat y}^{(0)}}(i) – {y^{(0)}}(i)} | = 0,max _{i = 1}^{14}| {{{hat y}^{(0)}}(i) – {y^{(0)}}(i)} | = 1.314), and (rho = 0.5). According to Eq. (18), the corresponding correlation coefficients are

$$eta (t) = (1,0.333,0.946,0.650,0.652,0.695,0.778,0.880,0.875,0.916,0.84,0.861,0.853).$$

(48)

The correlation degree between the two sets is 0.794, which is greater than 0.6; hence, the prediction result passes the correlation degree test.

(3) Posterior Deviation Test

The column of residuals extracted from the test results is

$${e}^{(0)}(t)=(0, 1.314, 0.037, -0.354, -0.351, -0.289,-0.188,-0.089, -0.094, 0.060, 0.086, -0.106, 0.113).$$

(49)

We calculate that the variance of the original series ({y}^{(0)}(t)) of the centroid offset is ({{s}_{1}}^{2}=8.284), the variance of the residuals is ({{s}_{2}}^{2}=0.165), and the ratio of the mean variance is (C=0.141). According to the definition, the probability of small error is P = 1.

Referring to Table 1, we can see that the mean variance and small probability error calculated above are in the range of Level I, but the average relative error is in the range of Level II.

#### Case of Grey-Markov Model

Based on the Grey prediction results, it can be found that the minimum residual error of the predicted and actual values is −0.354, and the maximum is 1.314, i.e., the values of residual error are within the range [−0.36, 1.32].

According to the distribution of residual error values, the above large interval is divided into five subintervals: [−0.36, −0.24], [−0.24, −0.12], [−0.12, 0], [0, 0.12], [0.12, 1.32]. Each interval corresponds to a residual error state of the centroid offset, as shown in Table 4.

It is assumed that the predicted result of centroid offset will transfer between States I and V with a certain probability. The state transition diagram of the predicted results is shown in Figure 5. If the current forecasting result is State I, then the next forecasting result has a Z11 probability become State I, a Z12 probability become State II, a Z13 probability become State III, a Z14 probability become State IV, and a Z15 probability become State V.

From the column of residual error values of the centroid offset, we can see that State I occurs three times, twice transferred to State I and once to State II; State IV occurs five times, transferred once each to States I, III, IV, V, and the final State (the residual error of the predicted value at the moment of the final State is left of the interval where State I is located, which we consider as State I). The state transition of residual error is shown in Table 5.

It is assumed that the time interval of the measurement for each two cabins’ centroid offset is one assembly cycle, i.e., the step size of the Markov model, which is the time required for state transition. According to the statistics above, the state transition matrix of the residual error can be obtained as

$${varvec{P}}=left[begin{array}{ccccc}{Z}_{11}& {Z}_{12}& {Z}_{13}& {Z}_{14}& {Z}_{15}\ {Z}_{21}& {Z}_{22}& {Z}_{23}& {Z}_{24}& {Z}_{25}\ {Z}_{31}& {Z}_{32}& {Z}_{33}& {Z}_{34}& {Z}_{35}\ {Z}_{41}& {Z}_{42}& {Z}_{43}& {Z}_{44}& {Z}_{45}\ {Z}_{51}& {Z}_{52}& {Z}_{53}& {Z}_{54}& {Z}_{55}end{array}right]=left[begin{array}{ccccc}frac{{P}_{11}}{{P}_{1}}& frac{{P}_{12}}{{P}_{1}}& frac{{P}_{13}}{{P}_{1}}& frac{{P}_{14}}{{P}_{1}}& frac{{P}_{15}}{{P}_{1}}\ frac{{P}_{21}}{{P}_{2}}& frac{{P}_{22}}{{P}_{2}}& frac{{P}_{23}}{{P}_{2}}& frac{{P}_{24}}{{P}_{2}}& frac{{P}_{25}}{{P}_{2}}\ frac{{P}_{31}}{{P}_{3}}& frac{{P}_{32}}{{P}_{3}}& frac{{P}_{33}}{{P}_{3}}& frac{{P}_{34}}{{P}_{3}}& frac{{P}_{35}}{{P}_{3}}\ frac{{P}_{41}}{{P}_{4}}& frac{{P}_{42}}{{P}_{4}}& frac{{P}_{43}}{{P}_{4}}& frac{{P}_{44}}{{P}_{4}}& frac{{P}_{45}}{{P}_{4}}\ frac{{P}_{51}}{{P}_{5}}& frac{{P}_{52}}{{P}_{5}}& frac{{P}_{53}}{{P}_{5}}& frac{{P}_{54}}{{P}_{5}}& frac{{P}_{55}}{{P}_{5}}end{array}right]=left[begin{array}{ccccc}frac{2}{{3}}& frac{1}{{3}}& {0}& {0}& {0}\ {0}& {0}& {1}& {0}& {0}\ {0}& {0}& frac{1}{{3}}& frac{2}{{3}}& {0}\ frac{2}{{5}}& {0}& frac{1}{{5}}& frac{1}{{5}}& frac{1}{{5}}\ {0}& {0}& {0}& {1}& {0}end{array}right].$$

(50)

According to the state of residual error, the residual error of prediction values for A4–13 belongs to state IV, so the initial state can be considered to be (0, 0, 0, 1, 0). According to the state transition matrix of Eq. (50), the state after one-step transition is (2/5, 0, 1/5, 1/5, 1/5), so the residual error of the prediction value of A4-14 is state I.

We correct the predicted centroid offset value of A4-14 from the Grey model:

$${text{A}}4-14;{text{predicted value}} = 15.037 + frac{1}{2}*(0.24 + 0.36) = 14.737.$$

(51)

The results are compared with those of the Grey prediction model, as shown in Table 6.

It is found that, unlike the Grey prediction model, the Grey-Markov prediction model takes into account the fluctuation of data, and it reduces the relative error of centroid offset prediction from 3.44% to 1.38%, which improves the accuracy of prediction.

Similarly, we calculated the state after two steps of transfer of A4-13 (0, 0, 0, 0, 1, 0) as (0.347, 0.133, 0.107, 0.373, 0.040), so the remaining error of A4-15 prediction is State IV.

We correct the Grey forecast value of A4-15:

$${text{A}}4 – 15{text{ predicted value}} = 16.830 + frac{1}{2}*(0 + 0.12) = 16.890.$$

(52)

Therefore, it forecasts that the value of the centroid offset for cabin A of the 15th product will be 16.890 mm, exceeding the allowable error of 15 mm, and abnormal quality will occur soon.

### Assembly System Status Prediction Results

#### Raw Data Preprocessing

Based on 14 centroid offset data items of the fourth batch of cabin A of model S1 in Table 2 and the predicted data of product 15, we can obtain 15 data items for the fourth batch of cabin A. With the centroid offset data of the first three batches of cabin A of model S1, 60 pieces of centroid offset data are obtained, as shown in Table 7, which are used as part of the sample data to forecast the status of the assembly system by means of statistical process control.

Drawing on the idea of group technology, we collect the data of cabin A of model S2 similarly to those of cabin A of model S1. The other half of the sample data, the centroid offset data of four batches of cabin A of model S2, are shown in Table 8.

All the sample data required by the statistical process are obtained so that T-K control charts can be used to monitor the mean and standard deviation of the centroid data and evaluate the stability of the assembly system.

#### Case of T Control Chart

In the selected case, we know the allowable error range of the centroid offset of cabin A, but we do not know the mean value, which satisfies the condition for the use of the calculation formula of the T statistic in Section 4.2.1. According to the original data in Tables 7 and 8, we first calculate the mean and standard deviation of group i samples, the mean (overline{overline {{X_j}}}) of the first r-1 batches of samples, and then the T statistics, as shown in Table 9.

According to Eq. (30), we set n to 15, and the upper control limit of the T control chart is 3.636, the lower control limit is −3.636, and the center line is 0. We draw the T control chart of the centroid offset of cabin A, as shown in Figure 6, from which it can be found that the mean of the centroid offset for cabin A of models S1 and S2 is within the allowable range.

#### Case of K-control Chart

Similar to the calculation of the T-statistic, we only know the allowable range of the error of the centroid offset, and we do not know the average, which satisfies the condition of calculating the K-statistic as in Section 4.2.2. Based on the original data in Tables 7 and 8, we calculate the mean (overline {S_j^2}) of the variance of the first r-1 batches of sample data for different product types, and then calculate the intermediate variable (lambda _{i,j}^{(r)}) and the K statistic, as shown in Table 10.

According to Table 10 and Eq. (33), the K-control chart of the centroid offset is shown in Figure 7.

Observing the K-control chart, it can be found that the standard deviation of the centroid offset is within the allowable range. Combined with the results of the T-control chart in Section 5.2.2, the assembly system can be considered to be in a controlled state according to the sample data. Since the centroid offset data of the 15th product of the fourth batch are a predicted value, the predicted results based on the T-K control chart indicate that the status of the assembly system at the next moment is controlled.

### Mining Association Rules of Influence Factors for Abnormal Quality Data

Assuming that the assembly process of cabin A in this case has three processes, all kinds of materials required in each process are from the same batch, and all equipment of each station is fixed. According to the six categories of quality influencing factors mentioned above, the fishbone diagram of the influencing factors for the centroid offset of cabin A is shown in Figure 8, which includes 28 factors.

Although 28 factors are identified from the analysis, how these are combined to cause an abnormal centroid offset still must be determined. We take 150 pieces of quality data collected from the assembly process of a certain cabin A as an example, as shown in Additional file 1: Appendix 1.

To keep the centroid offset within the qualified range as much as possible, the ideal error range of centroid offset is determined as within 12 mm after the influencing factors are analyzed. Over 12 mm is considered to be abnormal, which requires attention. The data encoding is shown in Figure 9.

Each encoding consists of three bytes. Byte 1 represents the process to which it belongs, such as processes 1–3. Byte 2 represents the attributes of influencing factors, such as operators and inspectors. Byte 3 represents the instantiation of influencing factors, such as operator A and inspector B. It is worth noting that when bytes 2 and 3 of two encodings are the same, the corresponding objects are the same regardless of whether byte 1 is the same. For example, 1OPE and 2OPE both represent operator E, who performs process 1 in the first case and process 2 in the second. A total of 108 codes appear in Additional file 1: Appendix 1, and their meanings are given in Additional file 2: Appendix 2.

According to the principle of the Apriori algorithm, association rules are mined from 150 pieces of assembly quality data in Additional file 1: Appendix 1 by R system. We set the support degree to 0.16 and the minimum length of item to 2, to obtain 7250 frequent item sets, from which 15 strong association rules with RHS item “AN” are mined, as shown in Figure 10. The depth of color indicates the degree of lifting, and the size of circle indicates the size of support degree.

Strong association rules with RHS item “AN” are arranged in descending order of lifting in Table 11.

Observing the above rules, the lifts of the top 14 rules are greater than 3, which means they are effective. It is found that the probability of abnormal centroid offset is relatively high when the material batch of process 3 is E. Also, when the material batch of process 1 is A, the product model is S1, the material batch of process 2 is C, or the pressure in the cabin is A. Managers can try to avoid the combination of the above rules.

### Mining Association Rules of Influence Factors for Assembly System Status

Because no strong association rule with RHS item “NCT” is found, 30 uncontrolled data items of the assembly system are filtered in 150 collected pieces of data. Mining association rules from this data, we set the support to 0.66, and the minimum length of item set to 2, from which 7250 frequent item sets are obtained. From these, 31 association rules with RHS item “NCT” are mined, as shown in Figure 11.

### System Implementation

The proposed quality management method is achieved through data value prediction, assembly system status prediction, and association rule mining based on the constructed DT model, providing early warning for the assembly shop-floor and avoiding future quality abnormalities. According to the comparison of the results in Table 6, the Grey-Markov model considers the fluctuation of data compared with the general Grey prediction model, thus improving the accuracy of data prediction. A DT-based quality management system for the assembly process of aerospace products is developed, which achieves the monitoring of quality information and quick tracing of quality problems, thereby reducing the occurrence frequency and improving the processing efficiency of quality problems on the shop-floor. Currently, the system has been applied in an aerospace enterprise, and the quality data value of more than seven types of key indicators on a final assembly shop-floor and their trend for different products are monitored and predicted.

The interface for monitoring assembly shop-floor quality based on the DT is shown in Figure 12. The lower left quarter quadrant enables the monitoring of the operating status of the equipment, the bill of arrived materials, and the values of key quality indicators of the current station by clicking the corresponding button. The top-left corner monitors quality data value and change trends for different indicators. The lower-right corner is a pop-up window for early warning if an exception occurs or may occur. The upper-right corner shows the prediction results such as the utilization rate of each station and maximum load area.

The interface for monitoring all of the assembly quality data on the assembly shop-floor is as shown in Figure 13, including the prediction results at the next moment based on the built Grey-Markov model. On the left side, the assembly quality data monitoring for different stations on the shop-floor can be realized, including the loading preparation area, empty cylinder treatment area, and docking area. For a specific station, the quality data are monitored on the right side of the interface, and the warning prompt appears in a red font on the interface for monitoring shop-floor quality data on the left side, and on a red background on the right ride of the interface for monitoring station quality data. Quality abnormality records are automatically generated.

The interface for monitoring assembly shop-floor status is shown in Figure 14. The left side is an assembly quality data list for batches of aerospace product A. In the list, the values of key quality data of historical batches can be viewed, along with collected real-time data and predicted data of the current batch. Moreover, users also can monitor the mean, standard deviation, and T and K statistics of quality data in each batch. Meanwhile, users can view the T-K statistical diagram on the right side to observe whether the current status of the assembly system is controlled. If the T-K statistics of a batch exceed the upper or lower limits, then quality abnormality records will be automatically generated.

When dealing with quality problems, a craftsman or inspector can switch to the interface of managing strong association rules for quality exceptions, as shown in Figure 15. The left side is product BOM, through which quality data and rules are managed. The upper right is a search bar. Select product model, quality data, support, confidence, and other conditions to search and view these rules which are listed in the lower right of the interface, and then the reference methods of processing quality problems can be provided. The list of strong association rules contains the combination of shop-floor resources that lead to abnormal quality data, which are obtained through the proposed approach and imported into the knowledge base in Excel format.