The study's design flowchart is depicted in Supplementary Fig. 1. GSE98638 represents a scRNA-seq dataset of T cells in HCC, encompassing 6 patients and 5063 cells. Utilizing the "Single" R package, cell subsets were annotated and visualized (Supplementary Fig. 2A). Subsequently, the scRNA-seq data underwent normalization and dimensionality reduction via PCA, UMAP, and t-SNE, leading to the identification of 15 distinct clusters (Supplementary Fig. 2B). A total of 4304 cluster markers were acquired, and a gene expression heat map for each cluster was generated (Supplementary Fig. 2C).
3.2 WGCNAThe content matrix of 22 TIICs in the TCGA-LIHC samples was calculated using the CIBERSORT algorithm. The WGCNA package was utilized for correlation analysis of cluster markers and the immune cell matrix, ensuring that each gene regulation relationship conformed to the scale-free distribution by achieving a scale-free topology fitting index of 0.9 (Fig. 1A, B). Cluster markers were grouped into different modules by creating the module clustering dendrogram (Fig. 1C, D). Subsequent analysis of correlation coefficients revealed that the red module showed the strongest association with CD8 T cells among all immune cell types (Fig. 1E).
Fig. 1WGCNA of cluster markers. A, B The soft threshold was determined according to the scale-free topological fitting index and the average connectivity degree. C Sample cluster map. D Hierarchical clustering map. E Heat map of gene module-immune cell correlation
Functional annotation of GO and enrichment analyses of KEGG were performed for genes related to CD8 T cells. The GO enrichment analysis revealed participation in functions such as T-cell activation, MHC II protein complex, and immune receptor activity (Supplementary Fig. 3A). The KEGG analysis identified genes associated with CD8 T cells participating in Th1 and Th2 cell differentiation and Th17 cell differentiation (Supplementary Fig. 3B).
3.3 Development and validation of prognosis modelsFifty-nine DEGs related to CD8 T cells were identified using a Venn diagram (Fig. 2A). Subsequently, Cox-Lasso regression was performed to select five prognostic genes (IKBKE, ATP1B3, MSC, ADA, BATF), which were used to develop a risk stratification model (Fig. 2B–D). The risk score was calculated as follows: Risk Score = IKBKEexp × 0.002064 + ATP1B3exp × 0.003343 + MSCexp × 0.002037 + ADAexp × 0.001229 + BATFexp × 0.002017. To assess the reliability of our risk model, we employed the TCGA-LIHC cohort as the training set and used the ICGC-LIHC cohort for external validation. The survival heat map indicates that as the risk score increased, a majority of patients experienced shorter survival durations (Fig. 3A, B). Kaplan–Meier analysis revealed a statistically significant decrease in overall survival among high-risk patients (Fig. 3C, D).
Fig. 2Cox-Lasso regression analysis of CD8 T cell-related genes. A The Venn diagram illustrates the overlap between the red module genes of WGCNA and DEGs of the TCGA dataset. B Univariate Cox regression analysis. C–D Lasso regression analysis
Fig. 3Risk scores of the TCGA training set and ICGC test set. A Survival state plots of risk scores for the TCGA training set. B Survival state plots of risk scores for the ICGC test set. C Kaplan–Meier survival curves of risk scores for the TCGA training set. D Kaplan–Meier survival curves of risk scores for the ICGC test set
The risk score exhibited significant correlations with grade, stage, and T stage (Fig. 4A, B). Cox regression analyses revealed that the risk score independently predicted the prognosis of HCC (Fig. 4C, D). A nomogram was constructed by incorporating both risk score and clinical factors (Fig. 4E). The calibration curve and ROC curve substantiated the excellent predictive value of the nomogram (Fig. 4F, G). Based on the ROC curve and C-index analysis of variables, it can be concluded that the risk score possesses superior predictive capability (Fig. 4H, I). Additionally, the Cox regression analysis conducted in the ICGC-LIHC external validation cohort confirmed that the risk score served as an independent prognostic factor (Supplementary Fig. 4A, B). By constructing a nomogram, calibration curve, ROC curve, and C-index, we observed that the risk score exhibited significant predictive value for the prognostic model in the ICGC external cohort (Supplementary Fig. 4C–G).
Fig. 4The model's predictive performance and clinical prognostic value. A, B The correlation between clinical factors and patients at high and low risk. C, D Univariate and multivariate Cox analysis of risk scores. E A nomogram was constructed based on clinical factors and risk scores. F The calibration curve of the nomogram. G The ROC curve of the nomogram. H The ROC curves of factors. I The C-index of factors
3.4 GSVA and immune checkpoints (ICPs)GSEA analysis revealed the chemokine signaling pathway, ECM receptor interaction, and others in the high-risk group (Supplementary Fig. 5A). Conversely, the low-risk group exhibited significant enrichment in beta-alanine metabolism, retinol metabolism, and other pathways (Supplementary Fig. 5B). Additionally, GSVA analysis demonstrated a strong correlation between the high-risk group and pathways such as PI3K-AKT-mTOR signaling, Notch signaling, and IL6-JAK-STAT3 signaling (Fig. 5A). Furthermore, ICPs including PDCD1, CTLA4, LAG3, CD274, etc., were positively correlated with the risk score (Fig. 5B). Furthermore, notable differences were found in immune subtypes (C1, C2, C3, and C4) among the high and low-risk groups (Fig. 5C). These findings further support a relationship between high-risk scores and activation of immune processes.
Fig. 5GSVA enriched pathways and ICPs. A The risk score was subjected to GSVA enrichment analysis. B The correlation between risk scores and ICPs. C Immune subtypes (C1: wound healing, C2: IFN-γ dominant, C3: inflammatory, C4: lymphocyte depleted)
3.5 TMEWe conducted a thorough analysis of TIICs and immune function in HCC utilizing the ssGSEA algorithm. The findings indicated that in the high-risk group, the levels of CD8+ T cells and macrophage infiltration were elevated, and immune functions such as HLA and MHC-1 may be activated (Fig. 6A, B). Additionally, the high-risk group exhibited higher Estimate scores, Immune scores, and Stromal scores but lower Tumor purity compared to the low-risk group (Fig. 6C–F).
Fig. 6Tumor microenvironment. A The heatmap illustrates the correlation between risk score and TME. B Differences in immune cell infiltration levels between high and low risk groups. C-F Differences of ESTIMATE scores, Immune scores, Stromal scores, and Tumor purity in high-risk and low-risk groups. (*P < 0.05, **P < 0.01, ***P < 0.001)
3.6 Immunotherapy evaluationWe employed the TIDE database to examine tumor immune-related factors and observed significant correlations between Exclusion score, CAF, Merck18, MDSC, CD8, IFNG, and risk scores (Fig. 7A, B). Additionally, through the computation of IPS for HCC samples, we found that patients classified in the high-risk group and exhibiting positive CTLA-4 and PD-1 expressions displayed elevated IPS scores (Fig. 7C–F).
Fig.7Differences in immune responses between high-risk and low-risk patients. A Correlation between risk score and various immune indicators in TIDE. B Differences in TIDE immune indicators between high and low-risk groups. C-F Differences in IPS between high and low-risk groups to PD1 and CTLA4 treatment
3.7 Chemotherapy drug sensitivityUpon evaluating the IC50 values of chemotherapy drugs in HCC, the findings reveal that the low-risk group demonstrates lower IC50 values for cytarabine and axitinib (Fig. 8A, B), while the high-risk group exhibits lower IC50 values for dasatinib and gefitinib (Fig. 8C, D). A negative correlation between drug sensitivity and IC50 value was observed.
Fig. 8The sensitivity of chemotherapeutic agents. A, B Chemotherapeutic agents that are sensitive to high-risk groups. C, D Chemotherapeutic agents that are sensitive to low-risk groups
3.8 The expression of risk genes in HCCThe violin map and bubble plot depict the expression of risk genes (IKBKE, ATP1B3, MSC, ADA, and BATF) in distinct T cell clusters (Supplementary Fig. 6A, B). The t-SNE plot reveals that these risk genes exhibit high expression levels in T cells (Supplementary Fig. 6C). Additionally, the risk genes are up-regulated in HCC tissues (Fig. 9A–E), and individuals with elevated expression of risk genes tend to have a poor prognosis (Fig. 9F–J). Furthermore, the expression levels of five risk genes were higher in majority of HCC cell lines compared to L02 normal cells (Fig. 9K–O). Additionally, qRT-PCR analysis of 12 HCC samples validated that the expression levels of five risk genes were elevated in cancerous tissues compared to adjacent normal tissues (Fig. 9P–T). Immunohistochemical staining images of risk gene proteins in HCC tissues and normal liver tissues were acquired from the HPA database. The findings revealed elevated protein expression levels of IKBKE, ATP1B3, and ADA in HCC tissues compared to normal liver tissues, while the protein expression level of BATF was lower than that in normal liver tissues (Supplementary Fig. 7A–D).
Fig. 9The expression and prognosis of risk genes in HCC. A–E The expression levels of IKBKE, ATP1B3, MSC, ADA, and BATF in both normal liver and HCC tissues. F–J The Kaplan–Meier survival curves of IKBKE, ATP1B3, MSC, ADA, and BATF in HCC. K–O Relative expression levels of IKBKE, ATP1B3, MSC, ADA, and BATF in normal cell lines and HCC cell lines via qRT-PCR. P–T Relative expression levels of IKBKE, ATP1B3, MSC, ADA, and BATF were determined in 12 paired normal liver tissues and HCC tissues using qRT-PCR.*P < 0.05, **P < 0.01, ***P < 0.001
Comments (0)