Feature extraction of TME signals and construction of ISTMEscore
The designs of our studies were demonstrated in Figs. 1, 2 and Additional file 1: Figure S1. To isolated the TME-related signals from mixed tumor tissues, we first divided RNA-seq data of the training cohort (TCGA LUAD)  into 11 heterogeneous clusters based on NMF (Fig. 3A). Enrichment analysis of clusters’ high NMF weight genes (top 100) was performed to identify clusters with immune or stromal signals. Only one cluster was linked with T cell activation and IFN-γ production (Additional file 11: Table S3, Fig. 3B). We extracted the LDF of this cluster by NMF. Univariate Cox regression indicated that the LDF was a favorable prognostic factor (Cox-P < 0.0001). Therefore, this cluster was defined as “immune activation cluster”. On the other hand, extracellular matrix organization, cell–matrix adhesion and TGF-β signal were enriched in another cluster (Additional file 12: Table S4, Fig. 3B), and its LDF was identified as an adverse prognostic factor (Cox-P < 0.0001), suggesting this cluster was a “stromal activation cluster”. Enrichment results of other 9 clusters were displayed in Additional file 13: Table S5.
We next identified gene signatures from the immune and stromal clusters using our novel workflow (Fig. 2, see “Methods”—“Development of novel immune and stromal scores” for details). A total of 166 genes were identified as TME-related signatures, containing 108 immune- (e.g. CD40LG, CD8A, IFNG, PTPRC, CXCL10) and 58 stromal-related genes (e.g. MMP13, FN1, COL1A1, COL1A2, COL11A1, COL3A1, VEGFA, Additional file 14: Table S6). Novel immune and stromal scores were calculated by immune- and stromal-related signatures using ssGSEA.
ISTMEscore was linked to prognostic, cellular and molecular characteristics
The association of ISTMEscore with prognostic, cellular and molecular patterns were then investigated. Kaplan–Meier curves indicated that our immune scores were significantly associated with better OS, while stromal scores were associated with poorer OS in the training datasets (Fig. 3C, D). We next characterized the association between ISTMEscore and TME cell infiltration (estimated by MCPcounter) in the TCGA LUAD training dataset . All 6 immune cells (T cells, CD8+ T cells, B cells, NK cells, monocytic lineage and myeloid dendritic cells) were positively correlated with the immune scores (Spearman correlation P < 0.0001, Fig. 3E), and the stromal scores were positively correlated with fibroblasts and negatively correlated with myeloid dendritic cells (Spearman correlation P < 0.0001).
Differentially expressed gene analysis was then performed in patients with high-low immune and stromal scores (Up 50% vs. Low 50%). The GSEA of differentially expressed genes suggested that T cell activation was highly enriched in the patients with high immune scores, while leukocyte activation was less enriched in the patients with high stromal scores (Fig. 3F, G).
The correlation between T cell co-stimulatory/inhibitory molecules and our scores were subsequently analyzed. The stromal scores were negatively correlated with CD28 and CD40LG, and positively correlated with PD-1 (Additional file 2: Figure S2), suggesting that the stromal scores were positively associated with exhaustive immunity.
Identification of TME subtypes by ISTMEscore
To explore the roles of immune and stromal activation in tumors, patients were divided into 4 TME subtypes according to the median values of ISTME scores. The patients with high immune and stromal scores were considered as the “HH type”, and high immune and low stromal scores were identified as the “HL type”. The “LH type” patients had low immune and high stromal scores, and the patients with low immune and stromal scores were defined as the “LL type”.
Comprehensive analysis of multi-dimensional landscape of the 4 TME subtypes in LUAD, SKCM and HNSC
The landscape of TME cell infiltration in different TME subtypes
To depict the TME immune landscape of the 4 subtypes, the infiltration of immune cells was calculated using MCPcounter. In the TCGA LUAD, SKCM and HNSC cohorts, the HH and HL patients had significantly high levels of immune cell infiltration, including T cells, CD8+ T cells, cytotoxic lymphocytes, NK cells, B lineages, monocytic lineages and myeloid dendritic cells (Fig. 4A–C). We further assessed the cellular patterns of TME subtypes in other independent cohorts, containing GSE11969 (NSCLC, n = 149) , GSE68465 (LUAD, n = 442) , GSE68571 (LUAD, n = 86) , GSE37745 (NSCLC, n = 196) , GSE50081 (NSCLC, n = 172)  and GSE65904 (SKCM, n = 214) . Consistent with the outcomes in the TCGA cohorts, the HH and HL patients also had high immune cell infiltration (Additional file 3: Figure S3), except GSE68571, due to the lack of mapping genes.
TME subtype was associated with the functional markers of T cells and angiogenesis
Downregulation of co-stimulatory molecules (CD28, CD40LG) and upregulation of co-inhibitory molecules (PD-1, CTLA4, LAG3, TIM-3) were detected in the exhaustive T cells. In the TCGA LUAD dataset, the ratio of co-inhibitory/co-stimulatory molecules (PD-1 + CTLA4 + LAG3 + TIM-3)/(CD28 + CD40LG) was significantly different in the 4 TME subtypes (analysis of variance, ANOVA, P < 0.0001, Fig. 4D). Contrary to the LH patients, the HL patients had the lowest ratio. Similar patterns were also found in the TCGA SKCM (ANOVA, P = 0.0072) and TCGA HNSC datasets (ANOVA, P < 0.0001).
In addition, angiogenic markers, VEGFA levels and ANGPT1/ANGPT2 ratios had significantly different levels among TME subtypes in the TCGA LUAD and TCGA HNSC datasets (ANOVA, P < 0.0001, Fig. 4E, F). The HL patients had low levels of VEGFA and high ANGPT1/ANGPT2 ratios, suggesting inhibited angiogenesis. VEGFA levels were associated with TME subtypes in the TCGA SKCM dataset (ANOVA, P = 0.0004).
TME subtype was associated with distinct TME networks composed of cells, receptors and ligands
In TME, signal transmission between cells was mediated by secreted proteins and direct communications. The TME communication networks were established to characterize cell interactions, as well as receptors and ligands, with Thorsson’s method in the training datasets (Fig. 4G) . The TME networks of the HH type composed of CD8+ T cells, B cells, NK cells and fibroblasts as the center of the subnetworks mediated by integrins and chemokines. The immune cell-centered network was identified in the HL type. The fibroblast-centered networks were observed in the LH type. However, no network was identified in the LL type, suggesting its desert-like TME.
TME subtype was linked to clinicopathological characteristics
According to the above findings, our TME subtypes could represent TME patterns and were associated with immune regulation in LUAD, SKCM and HNSC. The correlations between TME subtypes and clinicopathological characteristics were next analyzed. In the TCGA LUAD dataset, the LH type was concentrated on young patients at pathologic N2–3 and stage III–IV (Table 1), which were related to higher malignancy and poorer survival. In the TCGA HNSC dataset, the LH type was significantly enriched in male patients at pathologic T3–4, which was contrary to HL. In the TCGA SKCM dataset, TME subtype was independent of clinicopathological characteristics.
TME subtype was an independent prognostic factor
The prognostic features (age, gender, pathologic stage and TME subtype) were identified by clinical value and univariate Cox regression (Fig. 5A–C). Pathological T, N and M have collinearity with pathologic stage, thus we only included the pathologic stages in the Cox model. In the TCGA LUAD dataset, multivariate Cox regression analysis indicated that age (HR = 1.02, P = 0.03), pathologic stage (HR = 1.58, P < 0.0001) and TME subtype (HL vs. LH, HR = 0.42, P < 0.0001) were independently prognostic factors (Fig. 5D). In the TCGA SKCM dataset, age (HR = 1.02, P = 0.0006), pathologic stage (HR = 1.36, P = 0.002) and TME subgroup (HL vs. LH, HR = 0.39, P < 0.0001) were independently prognostic factors (Fig. 5E). In the TCGA HNSC dataset, age (HR = 1.02, P = 0.006), gender (male vs. female, HR = 0.76, P = 0.1), pathologic stage (HR = 1.45, P < 0.001) and TME subgroup (HL vs. HH, HR = 0.74, P = 0.168) were independently prognostic factors (Fig. 5F). Combining the 4 prognostic factors using Cox model, DCA revealed that the inclusion of TME subtype had higher potential clinical utility (Fig. 5G–I).
In the Kaplan–Meier curves (Fig. 5J), the HL patients had the best OS in the TCGA LUAD (logrank-P < 0.0001), SKCM (logrank-P < 0.0001) and HNSC (logrank-P = 0.39) datasets. To adjust the impacts of prognostic covariates (age, gender and pathologic stage) on prognosis of TME subtypes, we performed PSM to baseline correction. In the TCGA LUAD dataset, the HL type had significantly favorable OS, while the LH type had unfavorable OS (logrank-P = 0.043, Additional file 4: Figure S4A&B) after PSM (For all subtypes: age, ANOVA, P = 0.986; gender, Chi-squared test, P = 0.418; stage, Chi-squared test, P = 0.228). In the TCGA SKCM dataset, the survival curve was similar to that of the TCGA LUAD dataset (logrank-P = 0.00061, Additional file 4: Figure S4C&D) after PSM (For all subtypes: age, ANOVA, P = 0.882; gender, Chi-squared test, P = 0.43; stage, Chi-squared test, P = 0.796). In the TCGA HNSC dataset, survival curve also showed difference (logrank-P = 0.096, HL vs. LH: P = 0.03, Additional file 4: Figure S4E&F) after PSM (For all subtypes: age, ANOVA, P = 0.622; gender, Chi-squared test, P = 0.931; stage, Chi-squared test, P = 0.471).
To validate our TME subtypes in the other 6 independent datasets [12,13,14,15,16,17], we performed Kaplan–Meier analysis in each dataset. The HL patients were associated with significantly favorable prognosis in all validation datasets (Fig. 5J).
TME subtype was associated with mutation profile and copy number alteration
To determine whether TME subtypes were driven by gene mutations, we analyzed the somatic mutation profile in the 3 cancers from TCGA (Fig. 6A–C). Gene mutations in the 4 TME subtypes were compared using Fisher’s test (Additional file 5: Figure S5A–F). However, there was no duplicated differentially mutated genes in the 3 cancers. The mutation frequencies of traditional driving genes were then compared (Additional file 6: Figure S6A–I). In LUAD and HNSC, the mutation frequency of TP53 was significantly downregulated in the HL type, and the mutation sites of TP53 were concentrated at DNA binding domains (Additional file 7: Figure S7A&B). In SKCM, there were no differently mutated driver genes in the 4 TME subtypes.
Despite the controversy, tumor mutation burden (TMB) was the novel biomarker for ICI therapy according to CheckMate-032 (SCLC) and CheckMate-026 (NSCLC) [41, 42], and its roles in prognosis of patients without ICIs remained unclear. The non-silent mutation counts in the exon region represent relative values of TMB . In this study, the HH and LH types were associated with high non-silent mutation count (ANOVA P < 0.0001, Fig. 6D), which was linked to better OS (logrank-P = 0.055, Fig. 6E) in the TCGA LUAD dataset. In the TCGA SKCM and HNSC datasets, the TME subtype had no significant mutation count that was also linked to prognosis.
The association between CNA burden and prognosis was still unclear. Here, we defined CNA count burden (CNACB) as the total gene counts with CNA in each sample. In the 3 TCGA cohorts, the HL type had lowest CNACB (Fig. 6F), which was an unfavorable prognostic factor in HNSC (logrank-P = 0.0056) and LUAD (logrank-P = 0.1), but a favorable factor in SKCM (logrank-P = 0.1, Fig. 6G).
TME subtype was related to ICI responses
Five independent transcriptomic cohorts containing patients with immunotherapy were used to validate the predictive effects of our TME subtype on ICI responses. In the immunotherapy dataset 1 (Chen et al.) of 53 melanoma cases , there were 5 biopsy timepoints: pre-anti-CTLA4, on-anti-CTLA4, pre-anti-PD-1, on-anti-PD-1 and prog-anti-PD-1. Immune scores were markedly increased in on-anti-CTLA4 (P = 0.078) and on-anti-PD-1 (P = 0.0002) responders (Fig. 7A), while stromal scores were decreased in on-anti-PD-1 therapy (P = 0.0013) responders (Fig. 7B). The HL patients showed higher responses than the LH patients with on-anti-CTLA4/PD-1 therapy (HH vs. HL vs. LH vs. LL: 100% vs. 100% vs. 0% vs. 0%, Fisher’s test P = 0.0012, Fig. 7C). Notably, 41.9% patients experienced transition of TME subtype at the 5 biopsy timepoints (Fig. 7D). There were 8 patients with changes from the LH and LL types before anti-CTLA4 therapy to HH and HL types at later times, and 75% patients were responsive.
The immunotherapy dataset 2 (GSE91061) included 65 melanoma patients with anti-PD-1 therapy . Similar to immunotherapy dataset 1, immune scores were significantly higher in responders with on-anti-PD-1 therapy (P = 0.0049, Fig. 7E), but stromal scores were not significantly lower (Fig. 7F). The LH patients were linked to low responses of on-anti-PD-1 therapy (HH vs. HL vs. LH vs. LL: 54.5% vs. 78.9% vs. 37.5% vs. 44.4%, Fisher’s test P = 0.07, Fig. 7G). Moreover, 29% patients experienced transition of TME subtype before and during treatment (Fig. 7H). Furthermore, 66.7% patients, who changed to LH and LL types, were not responsive, while 66.7% patients, who changed to HH and HL types, were responsive.
Other immunotherapy datasets only included transcriptomic data before treatment. There was no statistical significance between TME subtype and immunotherapy response in these datasets (Fig. 7I–L).
Comparison of ISTMEscore system with other studies
The overlap between our TME gene signatures and existing signatures
To test the robustness of this study, we compared our TME signatures (“Results”—“Feature extraction of TME signals and construction of ISTMEscore”) with existing signatures [44,45,46,47,48,49]. There were 30.6% and 31% overlaps of ISTMEscore signatures with ESTIMATE and MCPcounter, respectively (Additional file 15: Table S7).
Comparison with other algorithms for prediction of prognosis and ICI responses
Since ESTIMATE  and MCPcounter  algorithms also provided TME-related scores, it was necessary to compare these indicators with our ISTMEscore. We first calculated C-index for each score on prognosis prediction in all 9 datasets with prognostic annotation (Additional file 16: Table S8). The top 5 indicators of C-index were our stromal score (mean C-index: 0.587), our immune score (mean rank: 0.58), B lineage infiltration (mean C-index: 0.578), myeloid dendritic cell infiltration (mean C-index: 0.567), ESTIMATE immune score (mean C-index: 0.566). We next applied ROC analysis on ICI responses, and calculated area under receiver (AUC) of ROC in the 5 immunotherapy cohorts (Additional file 16: Table S8). The top 5 indicators of AUC were monocytic lineage infiltration (mean AUC: 0.73462), our immune score (mean AUC: 0.72636), myeloid dendritic cell infiltration (mean AUC: 0.72574), T cell infiltration (mean AUC: 0.72034) and our stromal score (mean AUC: 0.71854).
Moreover, we compared our scores with ESTIMATE scores in GSE9014 , which included 111 arrays of stroma through Laser Capture Microdissected from 53 breast cancer patients. Only our stromal scores were significantly associated with the histologic GRADE (Additional file 8: Figure S8A-D, P < 0.0001). Table of abbreviations was displayed in Additional file 17: Table S9.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.