Data collection and preparation
Health-related survey data from BRFSS [22] were used in this study. BRFSS collects data from United States residents on health risk behaviors and chronic health conditions [22], covering lung cancer prevalence and various lung cancer risk factors, such as age, body mass index, smoking frequency, smoking start age, smoking intensity, time since quitting smoking, personal cancer history, family history of cancer, e-cigarette use, asthma history and chronic obstructive pulmonary disease (COPD) history. The data selection flowchart is shown in Fig. 1. The whole survey population (14,043,816 cases) was aged 18 years or older. Of those, 47.39% (6,655,364 cases) were men and 52.61% (7,388,452 cases) were women. During data preprocessing, cases with missing values were excluded, e.g. missing smoking-related factors, gender or lung cancer screening. The elderly population was defined as those aged 65 years and older, following the international age threshold for the elderly in developed countries. In total, 1,367,598 elderly cases were obtained, of which 48.36% (661,370 cases) were men. To analyze the specific characteristics of lung cancer incidence in the elderly, men aged 18 years and older, women aged 18 years and older and the whole population were included for comparison with the elderly. In all, five stratified groups were obtained in this study: men aged 65 years and older (elderly men), women aged 65 years and older (elderly women), men aged 18 years and older (men), women aged 18 years and older (women) and the whole population (all).
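The stratification described above can be sketched as follows. This is a minimal illustration, not the study's preprocessing code: the function name `strata_for`, the `"M"`/`"F"` sex coding and the toy records are assumptions. The last two lines only re-derive the reported percentages from the reported case counts.

```python
from typing import Dict, List


def strata_for(sex: str, age: int) -> List[str]:
    """Return every stratum a respondent belongs to (the groups overlap by design)."""
    if age < 18:
        return []  # assumption: only adults are in the survey population
    groups = ["all", "men" if sex == "M" else "women"]
    if age >= 65:
        groups.append("elderly men" if sex == "M" else "elderly women")
    return groups


# Hypothetical toy records: (sex, age)
records = [("M", 70), ("F", 66), ("M", 40), ("F", 25)]

counts: Dict[str, int] = {}
for sex, age in records:
    for group in strata_for(sex, age):
        counts[group] = counts.get(group, 0) + 1

# Shares recomputed from the case counts reported in the text.
share_men = round(100 * 6_655_364 / 14_043_816, 2)
share_elderly_men = round(100 * 661_370 / 1_367_598, 2)
```

Because an elderly respondent also counts toward the sex-specific adult group and the whole population, the five groups are nested rather than disjoint.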
We also selected environmental data from the US Environmental Protection Agency (EPA) website [23], covering particulate matter (PM), carbon monoxide (CO), lead (Pb), ozone, sulfur dioxide (SO_{2}), nitrogen dioxide (NO_{2}), 24-h average temperature, relative humidity, wind speed, duration of sunshine, precipitation, atmospheric pressure and indoor radon. The environmental data were linked to BRFSS through the collection date, which integrated the two datasets.
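Linking by collection date can be sketched as a left join. The field names (`date`, `pm25`, `no2`), the values and the `env_matched` flag below are hypothetical. The sketch joins on date alone because that is the only key the text mentions.

```python
from datetime import date

# Hypothetical survey records keyed by collection date.
survey = [
    {"id": 1, "date": date(2019, 3, 1), "smoker": True},
    {"id": 2, "date": date(2019, 3, 2), "smoker": False},
    {"id": 3, "date": date(2019, 3, 5), "smoker": True},
]

# Hypothetical daily environmental readings keyed by the same date.
environment = {
    date(2019, 3, 1): {"pm25": 12.0, "no2": 18.5},
    date(2019, 3, 2): {"pm25": 9.4, "no2": 15.1},
}


def link_by_date(survey_rows, env_by_date):
    """Left-join environmental readings onto survey rows by collection date."""
    linked = []
    for row in survey_rows:
        env = env_by_date.get(row["date"])
        merged = dict(row)
        # Keep unmatched rows and mark the gap instead of silently dropping them.
        merged["env_matched"] = env is not None
        merged.update(env or {})
        linked.append(merged)
    return linked


linked = link_by_date(survey, environment)
```

Keeping unmatched rows with an explicit flag makes it visible how many survey records have no environmental reading before any exclusion is applied.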
Data analysis
We adopted a deep Q-network (DQN) model to predict lung cancer intervention strategies and to assess intervention effects for people at high risk of lung cancer. The workflow of this study is shown in Fig. 2. First, we screened people at high risk of lung cancer separately in each of the five stratified groups. Second, DQN models were developed to deduce the lung cancer intervention strategy for each stratification. Third, lung cancer incidences were computed under the corresponding intervention strategy, and intervention effects were deduced through the DQN models. Lastly, we assessed the intervention effects to derive the optimal intervention strategy.
Lung cancer high risk screening
Timely high-risk screening and early intervention [24] might reduce the incidence of lung cancer. We screened risk factors for lung cancer in elderly men and women based on our previous study [21]. In elderly men, smoking frequency and time since quitting (i.e. how long it had been since the respondent last smoked a cigarette) were the top two risk factors for lung cancer [21]. Elderly men at high risk of lung cancer were therefore identified according to these risk factors. In elderly women, time since quitting and having smoked at least 100 cigarettes (i.e. at least 100 cigarettes in the respondent's entire life) were the top risk factors [21], and elderly women at high risk were identified in the same way. We obtained 103,629 high-risk elderly people and developed an intervention simulation to predict the optimal lung cancer intervention strategy in elderly men and women.
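A rule-based screen on these factors might look like the sketch below. The text does not state cut-offs or how the factors were combined, so the 15-year threshold, the `and` combination, and all field names are hypothetical placeholders rather than the study's criteria.

```python
def is_high_risk(person, max_years_since_quit=15.0):
    """Flag an elderly respondent using the group-specific factors named in the text.

    The threshold and the way the factors are combined are hypothetical.
    """
    # Current smokers are encoded with years_since_quit == 0.0 (assumption).
    recently_smoked = person["years_since_quit"] <= max_years_since_quit
    if person["sex"] == "M":
        # Elderly men: smoking frequency and time since quitting.
        return person["smokes_daily"] and recently_smoked
    # Elderly women: time since quitting and at least 100 lifetime cigarettes.
    return person["smoked_100"] and recently_smoked


# Hypothetical toy records.
elderly = [
    {"id": 1, "sex": "M", "smokes_daily": True, "smoked_100": True, "years_since_quit": 0.0},
    {"id": 2, "sex": "M", "smokes_daily": False, "smoked_100": True, "years_since_quit": 30.0},
    {"id": 3, "sex": "F", "smokes_daily": False, "smoked_100": True, "years_since_quit": 5.0},
    {"id": 4, "sex": "F", "smokes_daily": False, "smoked_100": False, "years_since_quit": 0.0},
]

high_risk_ids = [p["id"] for p in elderly if is_high_risk(p)]
```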
Deep Q-networks modelling
DQN is a value-based reinforcement learning algorithm; here a convolutional neural network (CNN) was used to approximate the value functions. The models' inputs were the risk factors of high-risk people obtained from our previous study [21], e.g. smoking frequency, cancer history, asthma history, radiation, e-cigarette use, time since quitting and physical activity. The models' outputs were the optimal intervention strategies, which were deduced from the target value functions. The value functions were trained with the CNN to approach the maximal intervention effect as closely as possible.
We adopted the Q-learning method to develop the networks and compute the loss function, shown in Eq. (1). Q was the value function output by the neural network, which represented the maximum cumulative intervention effect of intervention strategy a from risk state s; Q(s, a; θ_{i}) was the output of the current network; Q_{i} was the target value obtained from the target network; θ_{i} denoted the network parameters at iteration i, so that the loss was the mean squared error between Q_{i} and Q(s, a; θ_{i}); and ρ(s, a) was the probability distribution over risk states s and intervention strategies a.
$$L_{i}(\theta_{i}) = E_{s,a\sim \rho(\cdot)}\left[\left(Q_{i} - Q(s,a;\theta_{i})\right)^{2}\right]$$
(1)
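In practice the expectation in Eq. (1) is approximated by an average over sampled (s, a) pairs. The sketch below assumes such a minibatch; the function name `dqn_loss` and the numbers are illustrative only.

```python
def dqn_loss(targets, predictions):
    """Mean squared difference between targets Q_i and current outputs Q(s, a; theta_i)."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("targets and predictions must be non-empty and equally long")
    squared_errors = [(y - q) ** 2 for y, q in zip(targets, predictions)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical minibatch of three sampled (s, a) pairs.
targets = [1.0, 0.5, 0.0]      # Q_i for each pair
predictions = [0.8, 0.5, 0.4]  # Q(s, a; theta_i) for each pair
loss = dqn_loss(targets, predictions)
```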
We iteratively updated the network weights to minimize the loss function using stochastic gradient descent (SGD), with the gradient shown in Eq. (2). Q(s′, a′; θ_{i-1}) was the target network output; Q(s, a; θ_{i}) was the current network output; r was the intervention effect obtained at the current step; ε was the intervention environment; and γ was the discount factor, which lay between 0 and 1.
$$\nabla_{\theta_{i}} L_{i}(\theta_{i}) = E_{s,a\sim \rho(\cdot);\, s^{\prime}\sim \varepsilon}\left[\left(r + \gamma \mathop{\max}\limits_{a^{\prime}} Q(s^{\prime},a^{\prime};\theta_{i-1}) - Q(s,a;\theta_{i})\right)\nabla_{\theta_{i}} Q(s,a;\theta_{i})\right]$$
(2)
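A single-sample version of this update can be illustrated with a linear value function instead of a CNN. The linear form is an assumption made to keep the gradient explicit: for Q(s, a; θ) = θ[a] · s, the gradient with respect to θ[a] is simply s. The learning rate, features and names are hypothetical.

```python
def q_value(theta, features, action):
    """Linear stand-in for the network: Q(s, a; theta) = theta[action] . features."""
    return sum(w * x for w, x in zip(theta[action], features))


def sgd_step(theta, prev_theta, s, a, r, s_next, gamma=0.9, lr=0.1):
    """Apply one stochastic update to theta in place and return the TD error."""
    n_actions = len(theta)
    # Target uses the previous parameters theta_{i-1}, as in Eq. (2).
    target = r + gamma * max(q_value(prev_theta, s_next, b) for b in range(n_actions))
    td_error = target - q_value(theta, s, a)
    # Minimizing the squared error moves theta[a] along td_error * grad Q = td_error * s.
    theta[a] = [w + lr * td_error * x for w, x in zip(theta[a], s)]
    return td_error


# Two actions, two features, all weights starting at zero (toy values).
theta = [[0.0, 0.0], [0.0, 0.0]]
prev_theta = [row[:] for row in theta]
first_error = sgd_step(theta, prev_theta, s=[1.0, 0.0], a=1, r=1.0, s_next=[0.0, 1.0])
second_error = sgd_step(theta, prev_theta, s=[1.0, 0.0], a=1, r=1.0, s_next=[0.0, 1.0])
```

Repeating the step on the same sample shrinks the error, which is the sense in which the current value function approaches the target.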
Then, through SGD, the current value function was brought as close as possible to the target value function. The target value function Q_{i}, given in Eq. (3), was defined for intervention strategy a and risk state s and was used to deduce the optimal intervention strategy.
$$Q_{i} = E_{s^{\prime}\sim \varepsilon}\left[r + \gamma \mathop{\max}\limits_{a^{\prime}} Q(s^{\prime},a^{\prime};\theta_{i-1}) \mid s,a\right]$$
(3)
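For one sampled transition, the target in Eq. (3) and the strategy it implies can be sketched as below. The four Q-values stand in for the four network outputs; their numbers, the function names and γ = 0.9 are illustrative assumptions.

```python
def bellman_target(reward, next_q_values, gamma):
    """Single-sample target: r + gamma * max over a' of Q(s', a'; theta_{i-1})."""
    return reward + gamma * max(next_q_values)


def greedy_strategy(q_values):
    """Index of the intervention strategy with the largest estimated value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical Q-values of four strategies in the next risk state s'.
next_q = [0.2, 0.5, 0.1, 0.4]
target = bellman_target(reward=1.0, next_q_values=next_q, gamma=0.9)
best = greedy_strategy(next_q)
```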
The rectified linear unit (ReLU) was the activation function in this study and was applied in the convolutional layers. The model consisted of one input layer, three convolutional layers, one fully connected layer and one output layer. We used 32 × 32 input neurons, convolution kernels of 5 × 5, 4 × 4 and 3 × 3 in the three convolutional layers, respectively, and four output neurons. Ten-fold cross-validation was used to evaluate the model: the dataset was randomly divided into ten parts, and in turn nine parts were used for model training and one part for model testing. The models were trained with Python scripts and the PyTorch framework in an Ubuntu environment on the Docker platform. We separately trained five DQN models for elderly men, elderly women, men, women and the whole population. The intervention strategies of these five groups were derived from their respective DQN models.
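The feature-map sizes implied by this architecture can be worked out with the standard convolution size formula. Stride and padding are not reported in the text, so the defaults below (stride 1, no padding) are assumptions, and the resulting sizes hold only under them.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after one convolution: floor((size + 2p - k) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_sizes(input_size, kernels, stride=1, padding=0):
    """Side length of the feature map after each convolutional layer in sequence."""
    sizes = []
    size = input_size
    for kernel in kernels:
        size = conv_output_size(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


# 32 x 32 input with 5 x 5, 4 x 4 and 3 x 3 kernels, as stated in the text.
sizes = feature_map_sizes(32, [5, 4, 3])
```

Under these assumptions the maps shrink from 32 to 28, 25 and 23 per side, so the fully connected layer would receive 23 × 23 values per channel.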
Intervention strategy optimization

(i) Intervention effect prediction
High risk was one risk state of lung cancer occurrence in this study; other risk states included low risk and lung cancer. Once an intervention strategy was applied, the risk state might change, which was termed a risk state transition. The risk state transitions from high risk were high risk to low risk, high risk to lung cancer, and high risk to high risk. We used the probabilities of these risk state transitions to assess the intervention effect of an intervention strategy. Intervention effects were predicted in the same way for each stratification.
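One simple way to obtain such probabilities is to count where high-risk individuals end up after one intervention cycle. The sketch below uses empirical frequencies on invented outcomes; in the study the probabilities were deduced through the DQN models, so this is an illustration of the quantity, not of the method.

```python
from collections import Counter

STATES = ("low risk", "high risk", "lung cancer")


def transition_probabilities(next_states):
    """Empirical probability of each risk state after one cycle, starting from high risk."""
    if not next_states:
        raise ValueError("at least one observed transition is required")
    counts = Counter(next_states)
    total = len(next_states)
    return {state: counts[state] / total for state in STATES}


# Hypothetical outcomes for ten high-risk people under one strategy.
observed = ["low risk"] * 6 + ["high risk"] * 3 + ["lung cancer"] * 1
probs = transition_probabilities(observed)
```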

(ii) Lung cancer intervention assessment
The probabilities of risk state transitions were assessed in the different groups. Fig. 3 describes the risk state transitions from high risk over multiple intervention cycles, where S_{t} was the set of risk states at time t and A_{t} was the set of intervention strategies at time t. We computed the probabilities of risk state transitions from high risk in the different intervention cycles, and we assessed the intervention effects in elderly men and women using lung cancer incidence.
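Propagating a cohort through several intervention cycles can be sketched as repeated application of a transition matrix. The matrix below is invented, is held fixed across cycles (that is, one unchanged strategy), and treats lung cancer as an absorbing state. All three choices are assumptions for illustration.

```python
STATES = ("low risk", "high risk", "lung cancer")

# Hypothetical one-cycle transition probabilities under a single fixed strategy.
TRANSITIONS = {
    "low risk": {"low risk": 0.9, "high risk": 0.1, "lung cancer": 0.0},
    "high risk": {"low risk": 0.3, "high risk": 0.6, "lung cancer": 0.1},
    "lung cancer": {"low risk": 0.0, "high risk": 0.0, "lung cancer": 1.0},
}


def step(distribution):
    """Advance the state distribution by one intervention cycle."""
    return {
        target: sum(distribution[source] * TRANSITIONS[source][target] for source in STATES)
        for target in STATES
    }


def after_cycles(start, n_cycles):
    """State distribution after n_cycles, starting from the given distribution."""
    distribution = dict(start)
    for _ in range(n_cycles):
        distribution = step(distribution)
    return distribution


# Everyone starts in the high-risk state.
start = {"low risk": 0.0, "high risk": 1.0, "lung cancer": 0.0}
two_cycles = after_cycles(start, 2)
```

With an absorbing lung cancer state, the probability mass in that state after n cycles is the cumulative incidence under the chosen strategy.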

(iii) Optimal feedback
Based on the intervention effect assessment, we used the reduction in lung cancer incidence to reflect the effectiveness of an intervention strategy. The intervention strategy that brought a larger reduction in lung cancer incidence than all other strategies was considered the optimal intervention strategy. Otherwise, the intervention strategy was adjusted through a feedback mechanism. The whole process shown in Fig. 2 was repeated and the intervention effect was re-evaluated until the optimal intervention strategy was deduced.
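The selection criterion can be sketched as picking the strategy with the largest drop from a baseline incidence. The strategy labels and incidence values below are invented for illustration and are not results from the study.

```python
def incidence_reductions(baseline, incidence_by_strategy):
    """Absolute reduction in lung cancer incidence for each strategy."""
    return {name: baseline - value for name, value in incidence_by_strategy.items()}


def optimal_strategy(baseline, incidence_by_strategy):
    """Strategy with the largest reduction in incidence relative to the baseline."""
    reductions = incidence_reductions(baseline, incidence_by_strategy)
    return max(reductions, key=reductions.get)


# Hypothetical incidences under three candidate strategies.
baseline_incidence = 0.10
candidates = {"strategy_0": 0.08, "strategy_1": 0.06, "strategy_2": 0.09}
best = optimal_strategy(baseline_incidence, candidates)
```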
Model performance evaluation
To evaluate the models, we adopted ten-fold cross-validation. The accuracy and area under the receiver operating characteristic curve (AUROC) of the five models were computed separately. We then compared the DQN models with support vector machines (SVM), random forest and multiple logistic regression in the five groups.
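The evaluation quantities named above can be sketched with the standard library alone. The fold assignment, the pairwise AUROC formulation and the toy labels and scores are illustrative choices and are not taken from the study's code.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Randomly split sample indices into k folds whose sizes differ by at most one."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auroc(y_true, scores):
    """Probability that a random positive case scores higher than a random negative one."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute AUROC")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


folds = k_fold_indices(25, k=10)

# Hypothetical labels and model scores for one test fold.
y_true = [1, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
fold_accuracy = accuracy(y_true, y_pred)
fold_auroc = auroc(y_true, scores)
```

Each fold serves once as the test set while the other nine are used for training, and the per-fold metrics are then summarized across the ten runs.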