mRNA expression profiles and methylation microarray data (27k and 450k, corrected for batch effects with the ComBat function of the sva package) for TCGA-GBM glioblastoma samples and GTEx samples (brain cortex, n = 290) were downloaded from the Xenabrowser database (https://xenabrowser.net/datapages/) [19]. Clinical information was downloaded with the R package cgdsr. Samples with survival information, RNA-seq data, and methylation data were selected, yielding 114 glioblastoma samples.
2.1.2 CGGA data
Methylation datasets and corresponding survival data for glioblastoma were downloaded from the Chinese Glioma Genome Atlas (CGGA) database (http://www.cgga.org.cn/index.jsp) [20] and served as the validation set. To match the survival time criteria used in The Cancer Genome Atlas (TCGA, https://portal.gdc.cancer.gov/) [21], we selected 83 tumor samples with survival data (overall survival [OS] < 55). Sample survival information is summarized in Supplementary Table 1, sample information is detailed in Table 1, and clinical information statistics for the TCGA and CGGA tumor samples are given in Table 2.
Table 1 Sample information table
Table 2 Clinical information statistics for TCGA and CGGA tumor samples
2.1.3 T cell exhaustion genes
T cell exhaustion-related genes were obtained from four pathways (REACTOME_TNF_SIGNALING, REACTOME_INTERLEUKIN_2_SIGNALING, REACTOME_INTERFERON_GAMMA_SIGNALING, BIOCARTA_TCYTOTOXIC_PATHWAY) downloaded from the MSigDB database and from relevant genes reported in the literature (PMID: 35961204), yielding 192 T cell exhaustion-related genes with expression data in TCGA-GBM. These 192 genes are listed in Supplementary Table 2.
2.2 Identification of methylation sites involved in T cell exhaustion
Using the limma package, we identified T cell exhaustion-related genes that were differentially expressed between tumor and normal tissues (adj. P < 0.05 and |log2FC| > 1). The corr.test function of the psych package was used to compute the correlation between each differentially expressed gene and its corresponding methylation sites. Sites with a significant negative correlation (Pearson P < 0.05 and correlation < 0) were identified as T cell exhaustion-related methylation markers.
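The correlation filter can be illustrated with a minimal Python sketch. It is not the R workflow used in the study: the site names and values are invented, and a permutation test stands in for the t-based P value reported by corr.test.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation P value for Pearson's r (assumed stand-in for a t-test)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def select_negative_sites(expression, methylation_by_site, alpha=0.05):
    """Return {site: (r, p)} for sites with r < 0 and p < alpha."""
    selected = {}
    for site, beta_values in methylation_by_site.items():
        r = pearson_r(expression, beta_values)
        p = permutation_p_value(expression, beta_values)
        if r < 0 and p < alpha:
            selected[site] = (r, p)
    return selected


# Hypothetical data: expression of one gene and beta values of two sites
# measured in the same eight samples.
expression = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.0]
methylation_by_site = {
    "site_A": [0.90, 0.82, 0.75, 0.61, 0.55, 0.47, 0.33, 0.25],  # falls as expression rises
    "site_B": [0.52, 0.31, 0.66, 0.40, 0.58, 0.35, 0.61, 0.44],  # no clear trend
}

markers = select_negative_sites(expression, methylation_by_site)
```

In this toy example only site_A passes both conditions and would be retained as a marker.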
2.3 Identification of T cell exhaustion-related methylation subtypes
The methylation site markers identified in the previous step were used to subtype the cancer samples through consensus clustering with the ConsensusClusterPlus package. The clustering distance was Euclidean, and the clustering algorithm was k-means (KM), with 1000 repetitions to ensure stability. The elbow method was used to aid in determining the number of clusters (K), and survival analysis was performed on the resulting subtypes.
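The idea behind consensus clustering can be sketched as follows: k-means is run repeatedly on random subsamples, and the consensus value of a sample pair is the fraction of runs in which both samples were drawn and fell into the same cluster. The Python code below is a simplified illustration under assumed settings (80% subsampling, 100 repetitions, invented beta values); it does not reproduce ConsensusClusterPlus itself.

```python
import random


def kmeans_labels(points, k, rng, n_iter=50):
    """Plain Lloyd's k-means with Euclidean distance; returns one label per point."""
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


def consensus_matrix(points, k, n_reps=100, subsample=0.8, seed=0):
    """Fraction of resampling runs in which each pair of samples co-clusters."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_reps):
        idx = rng.sample(range(n), max(k, int(subsample * n)))
        labels = kmeans_labels([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical beta values at two marker sites for ten samples:
# the first five are hypomethylated, the last five hypermethylated.
points = [
    (0.10, 0.12), (0.15, 0.10), (0.12, 0.18), (0.08, 0.14), (0.11, 0.09),
    (0.85, 0.90), (0.88, 0.82), (0.91, 0.87), (0.80, 0.86), (0.86, 0.93),
]
consensus = consensus_matrix(points, k=2)
```

Pairs within the same well-separated group obtain consensus values near 1 and pairs from different groups obtain values near 0. With real data, the matrix is computed for a range of K, and the stability of these values guides the choice of cluster number.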
2.4 Immune infiltration assessment
CIBERSORT combined with the LM22 signature matrix was used to estimate the proportions of 22 cell phenotypes in the samples, with the sum of all estimated immune cell type proportions in each sample equaling 1. The ESTIMATE algorithm was used to calculate immune scores, stromal scores, and tumor purity for each tumor sample. Differences in these scores between groups were compared with the Wilcoxon test.
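The between-group comparison can be illustrated with a minimal Python sketch of the two-sample Wilcoxon rank-sum test. The scores are invented, and the sketch assumes the normal approximation without tie or continuity correction, so its P values may differ from those of an exact test on small samples.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (U statistic of group_a, approximate P value). Tied values
    receive average ranks; no tie or continuity correction is applied.
    """
    values = list(group_a) + list(group_b)
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = average_rank
        start = end + 1

    n_a, n_b = len(group_a), len(group_b)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_a, p_value


# Hypothetical immune scores for two groups of five samples each.
low_group = [820.3, 910.7, 1010.1, 760.8, 980.6]
high_group = [1520.0, 1710.5, 1630.2, 1805.9, 1590.4]

u_stat, p_value = rank_sum_test(low_group, high_group)
```

Because every value in the first group is smaller than every value in the second, the U statistic is 0 and the approximate P value falls below 0.05.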
2.5 Construction and validation of the T cell exhaustion methylation (TEXM) signature
On the basis of the TCGA training set's methylation and OS data, we performed univariate Cox regression analysis on the methylation site markers identified in the previous step (samples were divided into high and low methylation groups according to median expression values). Methylation sites significantly associated with OS (P < 0.05) were further analyzed with multivariate Cox regression to construct the TEXM signature model. The formula was as follows:
$$\text{TEXM signature score} = \sum_{i} \left( \text{coef}_{i} \times \text{gene}_{i} \right)$$
where $\text{gene}_{i}$ represents the expression levels of key genes from LASSO regression, and $\text{coef}_{i}$ represents their weight. The same formula was applied to the CGGA validation set to calculate signature scores. The median TEXM signature score was used as the threshold to classify samples into high and low score groups, and Kaplan–Meier curves were plotted to demonstrate the prognostic model's predictive ability.
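The scoring and grouping steps amount to a weighted sum followed by a median split, sketched below in Python. The site names, coefficients, and values are invented for illustration. The assignment of a sample lying exactly on the median to the low group is an assumption, because the text does not specify it.

```python
from statistics import median


def signature_score(values, coefficients):
    """Sum of coefficient x value over the sites in the signature."""
    return sum(coefficients[site] * values[site] for site in coefficients)


def split_by_median(scores):
    """Label a sample 'high' if its score exceeds the cohort median, else 'low'."""
    threshold = median(scores.values())
    return {name: ("high" if score > threshold else "low") for name, score in scores.items()}


# Hypothetical model coefficients and per-sample values.
coefficients = {"site_1": 1.5, "site_2": -2.0}
samples = {
    "S1": {"site_1": 0.8, "site_2": 0.1},
    "S2": {"site_1": 0.2, "site_2": 0.6},
    "S3": {"site_1": 0.5, "site_2": 0.5},
    "S4": {"site_1": 0.9, "site_2": 0.3},
}

scores = {name: signature_score(vals, coefficients) for name, vals in samples.items()}
groups = split_by_median(scores)
```

Here S1 and S4 fall into the high score group and S2 and S3 into the low score group. For a validation cohort, the same coefficients would be reused and only the sample values replaced.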
2.6 Independent prognostic performance and nomogram construction based on the TEXM signature
Univariate and multivariate Cox regression analyses were performed on the clinical characteristics and the TEXM signature scores in both the training and validation sets to determine whether the TEXM signature might serve as an independent prognostic factor. In the training set, the TEXM signature was analyzed across various clinical characteristics, and a nomogram model combining the TEXM signature and clinical factors was constructed to predict cancer progression and guide clinical practice.
2.7 Drug resistance assessment associated with the TEXM signature
The pRRophetic package's pRRopheticPredict function was used to predict the resistance of each sample to two drugs (methotrexate and doxorubicin). The Wilcoxon test was used to analyze differences in drug sensitivity between the high and low TEXM signature score groups.
2.8 Immune analysis associated with the TEXM signature
The Tumor Immune Dysfunction and Exclusion (TIDE) database (http://tide.dfci.harvard.edu/) was used to predict the efficacy of immunotherapy and further characterize differences among TEXM signature groups. The TIP tool (http://biocc.hrbmu.edu.cn/TIP/index.jsp) was used to predict antitumor immune process activity, and differences in this activity among the TEXM signature groups were analyzed.
2.9 Metabolic process analysis associated with the TEXM signature
To explore metabolic processes, we used 470 metabolism-associated biological processes (keyword "metabolic") from the GOBP pathways in the MSigDB database. Enrichment of these processes in tumor samples was calculated with ssGSEA, and differences between the high and low TEXM signature score groups were analyzed with the Wilcoxon test.
2.10 Statistical analysis
Pearson's correlation test was used to evaluate the correlation between two variables. All survival analyses were performed with Cox proportional hazards models, Kaplan–Meier (KM) analysis, and log-rank tests. The statistical significance threshold was set at P < 0.05.
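For readers unfamiliar with the Kaplan–Meier estimator, the product-limit calculation can be sketched in a few lines of Python. The follow-up times and event indicators below are invented, and the function is an illustrative sketch rather than the survival code used in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event (death) was observed at times[i] and 0 if
    the sample was censored at that time.
    """
    curve = []
    survival = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times and event indicators for six samples.
times = [5, 8, 8, 12, 20, 30]
events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(times, events)
```

The estimated survival probability drops only at observed event times. Censored samples leave the risk set without lowering the curve, which is why the sample censored at time 8 still counts as at risk at that time.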