Study population

The Kailuan Study is an ongoing prospective community-based cohort study investigating the risk factors for cardiovascular diseases, cerebrovascular diseases, and other non-communicable diseases; it has been described in detail elsewhere [13, 14]. Briefly, the Kailuan Study was designed and initiated in 2006–2007, and a total of 101,510 participants were enrolled in the baseline survey and followed up biennially. To date, the Kailuan cohort has completed seven cycles of health assessments (2006–2007, 2008–2009, 2010–2011, 2012–2013, 2014–2015, 2016–2017, and 2018–2019). Following a standardized, uniform design, face-to-face questionnaire interviews (demographic characteristics, disease history, lifestyle, etc.), physical examinations (body weight, height, waist circumference, blood pressure, etc.), and laboratory tests (fasting blood glucose [FBG], lipid profile, etc.) were conducted by trained physicians or nurses in every cycle. The study was approved by the ethics committee of Kailuan Hospital. Written informed consent was obtained from all participants before every survey cycle.

The present study was based on the Kailuan Study. Participants were eligible if they had taken part in the first three cycles of physical examinations. The 2010–2011 survey was regarded as the index year, that is, the starting point of follow-up. After excluding participants with a history of ischemic stroke before the third physical examination (2010–2011) and those with missing FBG or triglyceride (TG) data at any of the three examinations, a total of 54,098 participants were included in the analysis (Fig. 1).

Fig. 1 Flow chart for the inclusion of participants in the study

Definition of the TyG index

The TyG index was calculated as ln[TG (mg/dL) × FBG (mg/dL)/2] [13]. Cumulative exposure to the TyG index (cum-TyG) was calculated as the time-weighted sum of the mean TyG index over each pair of consecutive examinations: $(\mathrm{TyG}_{2006} + \mathrm{TyG}_{2008})/2 \times \mathrm{time}_{1-2} + (\mathrm{TyG}_{2008} + \mathrm{TyG}_{2010})/2 \times \mathrm{time}_{2-3}$, where $\mathrm{TyG}_{2006}$, $\mathrm{TyG}_{2008}$, and $\mathrm{TyG}_{2010}$ represent the TyG index at the first, second, and third examinations, and $\mathrm{time}_{1-2}$ and $\mathrm{time}_{2-3}$ represent the participant-specific time intervals between consecutive examinations (in years) [14]. The mean values of $\mathrm{time}_{1-2}$ and $\mathrm{time}_{2-3}$ were 2.07 and 1.97 years, respectively. We then placed the participants into four groups according to the quartile of cum-TyG: Q1 group, < 32.01; Q2 group, 32.01–34.45; Q3 group, 34.45–37.47; and Q4 group, ≥ 37.47.
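The two formulas above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not taken from the study's own code. The sketch also assumes that each quartile's lower bound is inclusive, which the text states only for Q1 and Q4.

```python
import math


def tyg_index(tg_mg_dl: float, fbg_mg_dl: float) -> float:
    """TyG index = ln[TG (mg/dL) * FBG (mg/dL) / 2]."""
    return math.log(tg_mg_dl * fbg_mg_dl / 2)


def cum_tyg(tyg_2006: float, tyg_2008: float, tyg_2010: float,
            time_1_2: float, time_2_3: float) -> float:
    """Time-weighted sum of the mean TyG index between consecutive examinations.

    time_1_2 and time_2_3 are the participant-specific intervals in years.
    """
    return ((tyg_2006 + tyg_2008) / 2 * time_1_2
            + (tyg_2008 + tyg_2010) / 2 * time_2_3)


def cum_tyg_quartile(value: float) -> str:
    """Assign the quartile group using the cut points reported in the text.

    Assumption: the lower bound of each group is inclusive.
    """
    if value < 32.01:
        return "Q1"
    if value < 34.45:
        return "Q2"
    if value < 37.47:
        return "Q3"
    return "Q4"
```

For example, a participant whose TyG index is 8.5 at all three examinations, with the mean intervals of 2.07 and 1.97 years, has a cum-TyG of 8.5 × 2.07 + 8.5 × 1.97 = 34.34 and falls in the Q2 group.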

Previous studies have shown that participants with a high TyG index are at higher risk of ischemic stroke [15]. In the present analysis, a high TyG index was defined as a TyG index above the optimal cut-off value, which was determined using a time-dependent receiver operating characteristic (ROC) curve (Additional file 1: Table S5). The duration of exposure to a high TyG index was defined as the length of time during the study period for which a participant had a high TyG index: 0 years (TyG index below the cut-off value at all three examinations), 2 years (TyG index above the cut-off value at one of the three examinations), 4 years (TyG index above the cut-off value at two of the three examinations), and 6 years (TyG index above the cut-off value at all three examinations).
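The exposure-duration rule amounts to counting the examinations at which the TyG index exceeds the cut-off and assigning 2 years per examination. A minimal Python sketch follows. The function name is hypothetical, and the cut-off in the usage note is a placeholder, not the value reported in Table S5. A value exactly equal to the cut-off is treated as not high, which is an assumption because the text does not address ties.

```python
from typing import Sequence


def high_tyg_exposure_years(tyg_values: Sequence[float], cutoff: float) -> int:
    """Return 0, 2, 4, or 6 years of exposure to a high TyG index.

    tyg_values holds the TyG index at the three examinations.
    Assumption: "high" means strictly greater than the cut-off.
    """
    if len(tyg_values) != 3:
        raise ValueError("expected the TyG index at exactly three examinations")
    n_high = sum(1 for value in tyg_values if value > cutoff)
    return 2 * n_high
```

With a placeholder cut-off of 8.8, a participant with TyG values of 8.5, 9.0, and 9.2 would be assigned 4 years of exposure.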


The outcome of the present study was incident ischemic stroke. We used the International Classification of Diseases, 10th Revision (ICD-10) code I63.x to identify cases of ischemic stroke [16]. Ischemic stroke was diagnosed on the basis of neurological signs, clinical symptoms, and neuroimaging, including computed tomography and magnetic resonance imaging, according to the World Health Organization criteria [17], which were consistently applied across all 11 hospitals. All participants were followed from the index year until the date of ischemic stroke, the date of death, or 31 December 2019, whichever came first.
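The follow-up rule can be sketched in Python as below. This is an illustration only: the function name is hypothetical, the index date is assumed to be the date of the 2010–2011 examination, and counting a same-day stroke and death as an event is an assumption the text does not address.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_END = date(2019, 12, 31)  # administrative end of follow-up, from the text


def follow_up(index_date: date,
              stroke_date: Optional[date] = None,
              death_date: Optional[date] = None,
              study_end: date = STUDY_END) -> Tuple[float, bool]:
    """Return (follow-up time in years, whether ischemic stroke was observed)."""
    # Candidate end points as (date, is_stroke_event). Stroke is listed first
    # so that a same-day tie counts as an event (assumption of this sketch).
    candidates = []
    if stroke_date is not None:
        candidates.append((stroke_date, True))
    if death_date is not None:
        candidates.append((death_date, False))
    candidates.append((study_end, False))
    # min() keeps the first candidate among equal dates.
    end_date, is_event = min(candidates, key=lambda c: c[0])
    years = (end_date - index_date).days / 365.25
    return years, is_event
```

A participant with no recorded stroke or death is censored on 31 December 2019, and a stroke recorded after that date is not counted as an event.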

Data collection and definitions

All measurements were performed in a quiet, temperature-controlled room (22 °C–25 °C). At each physical examination, all participants completed a face-to-face questionnaire interview that collected information on their demographic characteristics (sex, age), personal health history (hypertension, diabetes, and cardiovascular disease [CVD]; use of antihypertensive, hypoglycemic, and lipid-lowering drugs), and lifestyle characteristics (smoking status, alcohol consumption habits, physical exercise habits), as detailed elsewhere [18]. A current smoker was defined as someone who smoked a mean of ≥ 1 cigarette per day during the preceding year, and participants were categorized as non-smokers or current smokers. An alcohol consumer was defined as someone who drank a mean of ≥ 100 mL of alcohol per day for at least the preceding year, and participants were categorized as non-drinkers or current drinkers. Participants were categorized as undertaking physical exercise if they exercised ≥ 3 times per week for ≥ 30 min on each occasion [19].

Participants were asked to wear light clothes and be barefoot for the anthropometric measurements. Body weight and height were measured to the nearest 0.1 kg and 0.1 cm, respectively, by trained physicians following a standardized protocol. Body mass index (BMI) was calculated as weight (kg) divided by height squared (m²). Blood pressure (BP) was measured by experienced physicians on the right arm with the participant seated, using a calibrated mercury sphygmomanometer, after 15 min of rest [20]. At least two BP measurements were made after 5 min of rest, and a further measurement was made if the difference between the two was > 5 mmHg. The mean values were used in the analyses.

Hypertension [21] was defined as a BP ≥ 140/90 mmHg, the use of antihypertensive medication, or a self-reported history of hypertension. Diabetes [22] was defined as an FBG ≥ 7.0 mmol/L, the use of hypoglycemic drugs, or a self-reported history of diabetes. Lipid-lowering drugs were defined as drugs that lower blood lipid levels [23], such as statins, nicotinic acid, and fibric acid derivatives (fibrates).
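The BMI formula and the definitions of hypertension and diabetes given above can be written as simple rules. The Python sketch below uses hypothetical function and parameter names. It reads "≥ 140/90 mmHg" as systolic BP ≥ 140 mmHg or diastolic BP ≥ 90 mmHg, which is the conventional interpretation but is an assumption here.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight (kg) divided by height squared (m^2)."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def has_hypertension(sbp_mmhg: float, dbp_mmhg: float,
                     on_antihypertensive: bool, self_reported: bool) -> bool:
    """BP >= 140/90 mmHg, antihypertensive medication, or self-reported history.

    Assumption: either the systolic or the diastolic threshold is sufficient.
    """
    return (sbp_mmhg >= 140 or dbp_mmhg >= 90
            or on_antihypertensive or self_reported)


def has_diabetes(fbg_mmol_l: float, on_hypoglycemic: bool,
                 self_reported: bool) -> bool:
    """FBG >= 7.0 mmol/L, hypoglycemic drugs, or self-reported history."""
    return fbg_mmol_l >= 7.0 or on_hypoglycemic or self_reported
```

For instance, a participant weighing 70 kg with a height of 175 cm has a BMI of about 22.9 kg/m².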

Blood samples were collected in the morning following an 8- to 12-h overnight fast at each visit. The FBG, TG, low-density lipoprotein-cholesterol (LDL-C), high-density lipoprotein-cholesterol (HDL-C), and high-sensitivity C-reactive protein (hs-CRP) concentrations were measured using a Hitachi 7600 autoanalyzer (Tokyo, Japan) at the central laboratory of Kailuan General Hospital.

Statistical analysis

Continuous, normally distributed data are summarized as mean ± standard deviation (x̄ ± s), and one-way analysis of variance was used for comparisons between multiple groups. Continuous, skewed data are summarized as median and interquartile range (25%, 75%), and the Wilcoxon rank-sum test was used for comparisons between groups. Categorical variables are summarized as number and percentage (%), and the chi-square test was used for comparisons between groups. Differences in baseline characteristics among the four groups were compared with Bonferroni correction.

The cumulative incidence of new-onset ischemic stroke in each group was calculated using the Kaplan–Meier method and compared using the log-rank test. Cox proportional hazards models were used to evaluate the relationships of both the cum-TyG index and the duration of exposure to a high TyG index with ischemic stroke by calculating hazard ratios (HRs) and 95% confidence intervals (CIs). To reduce the effect of confounding, univariate and multivariate Cox regression models were used to identify factors independently associated with ischemic stroke. Variables with a P-value < 0.1 in the univariate analyses were entered into the multivariate Cox regression model. In addition, although the univariate analysis suggested that alcohol consumption was not associated with ischemic stroke, previous studies have found a strong association [24]; we therefore included alcohol consumption in the final analysis. Results of the Cox models are reported as HRs with 95% CIs.

To assess the association of the cum-TyG index with ischemic stroke, three Cox proportional hazards models were fitted, with covariates entered using the enter method. Model 1 was adjusted for age (continuous variable, years) and sex (categorical variable, men or women). Model 2 was further adjusted for LDL-C (continuous variable, mmol/L), HDL-C (continuous variable, mmol/L), BMI (continuous variable, kg/m²), hs-CRP (continuous variable, mg/L), smoking status (categorical variable, smoker or non-smoker), alcohol consumption habits (categorical variable, drinker or non-drinker), physical exercise habits (categorical variable, active or inactive), hypertension (categorical variable, yes or no), diabetes mellitus (categorical variable, yes or no), and the use of lipid-lowering drugs (categorical variable, yes or no). Model 3 was further adjusted for the TyG index at baseline (continuous variable).

The optimal cut-off value of the TyG index for the risk of incident ischemic stroke was determined using time-dependent ROC curve analysis and was identified as the value that maximized the Youden index, calculated as sensitivity + specificity − 1. Overall, few data were missing in the final analysis dataset (< 2%). We used multiple imputation by chained equations to impute missing covariate values [25]; the counts and proportions of missing data for each covariate are presented in Additional file 1: Table S2.
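The Youden index criterion described above can be illustrated with a deliberately simplified Python sketch. It treats the outcome as a fixed binary label, whereas the study used a time-dependent ROC curve that accounts for follow-up time and censoring; that part is not reproduced here. The function name is hypothetical, and a value is classified as high only when it is strictly above the candidate cut-off, which is an assumption of this sketch.

```python
from typing import Sequence, Tuple


def youden_cutoff(values: Sequence[float],
                  events: Sequence[int]) -> Tuple[float, float]:
    """Return (cut-off, Youden index J) maximizing sensitivity + specificity - 1.

    values: TyG index per participant; events: 1 if stroke occurred, else 0.
    Simplification: a static ROC, not the time-dependent ROC used in the study.
    """
    n_pos = sum(1 for e in events if e)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(values)):  # every observed value is a candidate
        true_pos = sum(1 for v, e in zip(values, events) if e and v > c)
        true_neg = sum(1 for v, e in zip(values, events) if not e and v <= c)
        j = true_pos / n_pos + true_neg / n_neg - 1
        if j > best_j:  # keep the lowest cut-off among ties
            best_cutoff, best_j = c, j
    return best_cutoff, best_j
```

On a toy dataset in which every participant with a TyG index above 8.6 has an event and no one else does, the function returns 8.6 as the cut-off with a Youden index of 1.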

The data were also analyzed after stratification by age and sex. To test the robustness of our findings, the following sensitivity analyses were performed: (1) exclusion of individuals who developed ischemic stroke-related endpoints within a year (n = 381); (2) exclusion of participants who were treated with antihypertensive, hypoglycemic, or lipid-lowering medications (n = 11,153); (3) exclusion of participants with abnormal FBG (≥ 7.0 mmol/L) at baseline (n = 4306); and (4) further adjustment for systolic blood pressure (SBP). Because hypertension was already adjusted for in Model 3, SBP was not included in the main models; however, as SBP was strongly associated with ischemic stroke in the univariate analysis, we additionally adjusted for it.

We used SAS version 9.4 (SAS Institute, Cary, NC, USA) and R software (version 4.0.2) for the analyses, and a two-sided P value of < 0.05 was considered to represent statistical significance.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

