Identification of immune infiltration-related genes in HCC patients

According to the criteria for DEGs, we used the DESeq2 algorithm and identified 5010 DEGs between 374 TCGA-HCC samples and 50 normal liver samples. The 5010 identified DEGs and 636 immune infiltration-related genes obtained from the CIBERSORTX database were used for Venn diagram analysis. Through the above analysis, we obtained 89 immune infiltration-related genes in HCC (Fig. 1A). Then, we verified the expression of the 89 immune infiltration-related genes in the GSE25097, GSE87630 and GSE89377 data sets from the GEO database (Fig. 1B, Fig. S1, and Fig. S2). We conducted further enrichment analysis to explore the functions of the selected genes. The genes were significantly enriched in neutrophil chemotaxis, neutrophil migration, the external side of the plasma membrane, tertiary granule lumen, chemokine activity, and chemokine receptor binding (Fig. 1C). Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis showed that viral protein interaction with cytokine and cytokine receptor, cytokine-cytokine receptor interaction, and chemokine signalling pathway were all significantly enriched (Fig. 1D). The complete results of the enrichment analysis are shown in Table S2.

Fig. 1
figure 1

Identification and functional enrichment analysis of immune infiltration-related genes between the TCGA-HCC cohort and normal liver samples. A Venn diagram of the intersection between immune infiltration-related genes and DEGs identified by the DESeq2 algorithm. B Heat map of 89 DEGs related to immune infiltration in the data set GSE25097. Terms of Gene Ontology (GO) enrichment analysis (C) and KEGG pathways (D) related to the 89 immune infiltration-related genes

Construction and assessment of the risk scoring system

First, univariate Cox regression analysis was performed to explore the relationship between the expression levels of 89 immune infiltration-related genes and the OS times of patients in the TCGA-HCC cohort. Using the cut-off value of Cox P < 0.05, 27 potential predictive genes related to OS were screened out (Table S3). Then, LASSO regression analysis was used to refine the gene sets (Fig. 2A, B). Seventeen genes were identified as the most valuable predictive genes, and the risk scoring system was established based on the above formula (Table 1). Kaplan–Meier analysis of the 17 genes is shown in Fig. S3.

Fig. 2
figure 2

Demonstration of DEGs with univariate Cox regression P value < 0.05. A The LASSO regression model of the 27 immune infiltration-related genes performed by Lasso-ten-fold cross-validation. B The coefficient distribution in the LASSO regression model

Table 1 Seventeen immune infiltration-related genes identified by univariate COX regression analysis

To observe the expression of these genes in HCC and normal liver tissues, we further conducted research using immunohistochemical data from the Human Protein Atlas (HPA) database. The results are shown in Fig. 3. The immunohistochemical data of some genes were temporarily unavailable from the HPA database.

Fig. 3
figure 3

Immunohistochemical analysis of HCC and normal liver tissue determined by HPA database. A CCR3; B CD4; C CYP27A1; D DACH1; E IGHM; F ORC1; G RPL10L; H SKA1; I TNFRSF4

The risk score of each patient in the TCGA-HCC data set was calculated based on the expression levels and regression coefficients of the 17 immune infiltration-related genes. The distribution of risk scores in the TCGA-HCC data set is shown in Fig. 4A. According to the median risk score, the patients in the TCGA-HCC cohort were divided into high-risk and low-risk groups. In addition, the survival time distribution indicated that the higher the risk score was, the worse the prognosis (Fig. 4A). Figure 4A also shows the corresponding expression levels of the 17 immune infiltration-related genes. The performance of the risk scoring system according to the time ROC curves in terms of 1-year, 3-year, and 5-year prognoses is shown in Fig. 4B. The areas under the time ROC curves (AUCs) were 0.766, 0.757, and 0.773 for the 1-year, 3-year, and 5-year OS times, respectively, in the TCGA-HCC cohort. Kaplan–Meier analysis and the log-rank test showed that the prognosis of the high-risk group was significantly worse than that of the low-risk group (P < 0.001; Fig. 4C).

Fig. 4
figure 4

The risk score analysis, prognostic performance and survival analysis of the risk scoring model based on the differential expression of the 17 immune infiltration-related genes in TCGA-HCC patients. A The risk score, survival time distributions and gene expression heat map of immune infiltration-related genes in the TCGA-HCC cohort. B The ROC curves of the risk scoring model predicting OS of 1-year, 3-year, and 5-year in the TCGA-HCC cohort. C Kaplan–Meier survival analysis of the OS between the risk groups in the TCGA-HCC cohort

Correlation between the risk score and clinical features

We also analysed the association between the risk score and the clinical features of patients in the TCGA-HCC cohort. We found significant differences between the risk score and the following clinical features (Fig. 5 A–F): histological grade (G1&2 vs. G3&G4, P < 0.001), T stage (T1&T2 vs. T3&T4, P < 0.01), residual tumour (R0 vs. R1&R2, P < 0.01), pathologic stage (stage 1 & stage 2 vs. stage 3&stage 4, P < 0.01), vascular invasion (no vs. yes, P < 0.05) and AFP (≤400 vs. > 400, P < 0.05).

Fig. 5
figure 5

Correlation between clinical features and the immune infiltration risk score in the TCGA-HCC data set. A Histologic grade (G1&2 vs G3&G4), B T stage(T1&T2 vs T3&T4), C Residual tumor (R0 vs R1&R2), D Pathologic stage (Stage1&Stage2 vs Stage3&Stage4), E Vascular invasion (No vs Yes), F AFP(≤400 vs > 400). *P < 0.05, **P < 0.01, ***P < 0.001

Construction and verification of the nomogram

First, we performed univariate and multivariate Cox regression analyses of potential predictors, such as T stage, gender, age, residual tumour, histologic grade, AFP, vascular invasion, tumour status, and risk group, that may affect the prognosis of HCC patients (Table 2). The results showed that T stage, tumour status, and risk group were independent risk factors for OS in HCC patients. The independent predictors, including T stage, tumour status, and risk group, which affect the OS of HCC patients, were incorporated into the nomogram model (Fig. 6A). The C-index of the nomogram model we established was 0.692 (0.664–0.720). Then, we calculated the score of each HCC patient based on the nomogram and evaluated the predictive ability of the nomogram through ROC analysis. In the TCGA-HCC cohort, the nomogram AUCs for the 1-year, 3-year, and 5-year OS rates were 0.755, 0.781, and 0.832, respectively (Fig. 6B). Moreover, we used the calibration curve to evaluate the agreement of the nomogram. Compared with the ideal model, the calibration plots of the nomogram model showed good agreement for the 1-year, 3-year, and 5-year OS rates (Fig. 6C).

Table 2 Univariate and multivariate Cox regression analysis between the clinical features and OS in the TCGA-HCC cohort
Fig. 6
figure 6

Prognostic nomogram for the 1-year, 3-year, and 5-year OS of HCC patients. A The independent risk factors that affect the OS of HCC patients screened by multiple Cox regression were incorporated into the nomogram model. B The ROC curves for predicting the nomogram of 1-year, 3-year, and 5-year OS in the TCGA-HCC cohort. B The nomogram calibration curves for predicting the 1-year, 3-year, and 5-year OS of the TCGA-HCC cohort

GSEA

To reveal the potential impact of immune infiltration-related genes on the occurrence and development of HCC, we performed GSEA on the DEGs between the high-risk group and the low-risk group. GSEA showed that the DEGs between the high-risk group and low-risk group were mainly enriched in several pathways, including disease, matrisome, haemostasis, innate immune system, metabolism of lipids, transport of small molecules, infectious disease, metabolism of amino acids and derivatives, vesicle-mediated transport, and adaptive immune system (Fig. 7). These findings suggested that immune infiltration-related genes may play a potential role in amino acid and lipid metabolism, matrisome and small molecule transportation, immune system regulation, and hepatitis virus infection in HCC.

Fig. 7
figure 7

GSEA analysis of the DEGs between the high-risk group and the low-risk group in the TCGA-HCC cohort. Top 10 terms of GSEA analysis (Reactome disease, NABA matrisome, Hemostasis, Innate immune system, Metabolism of lipids, Transport of small molecules, Infectious disease, Metabolism of amion acids and derivatives, Vesicle mediated transport, Adaptive immune system)

Immune cell infiltration level analysis

We also calculated the correlation between this prognostic model based on patients in the TCGA-HCC cohort and immune cell infiltration. The results showed that the high-risk group showed lower levels of immune cell infiltration, such as B cells (P < 0.01), CD8 T cells (P < 0.001), neutrophils (P < 0.001), DCs (P < 0.001), Tregs (P < 0.01), and NK cells (P < 0.001) (Fig. 8A). Moreover, the risk score was negatively correlated with infiltrating immune cells, including B cells, CD8 T cells, neutrophils, DCs, Tregs, and NK cells (Fig. 8B-G).

Fig. 8
figure 8

Analysis of immune cell infiltration in TCGA-HCC cohort. A The box plot showed the levels of immune cell infiltration between the high-risk group and low-risk group in HCC patients. Scatter plots of correlation between immune cell infiltrations and risk score (B, B cells; C, CD8 T cells; D, Neutrophils; E, DC; F, TReg; G, NK cells). **P < 0.01, ***P < 0.001, ns, not significant

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Disclaimer:

This article is autogenerated using RSS feeds and has not been created or edited by OA JF.

Click here for Source link (https://www.biomedcentral.com/)