If you don't remember your password, you can reset it by entering your email address and clicking the Reset Password button. You will then receive an email that contains a secure link for resetting your password
If the address matches a valid account an email will be sent to __email__ with instructions for resetting your password
Lung adenocarcinoma (LUAD) is a major health problem and has poor prognosis. Heterogeneity is a central determinant of the treatment outcome, requiring identification of new subclasses of LUAD. Senescence has emerged as a crucial regulator of metastasis and drug response. Ionizing radiation– and doxorubicin-induced senescence-associated genes in lung fibroblasts were used in K-means clustering to identify high- and low-senescence (HS and LS) classes among The Cancer Genome Atlas- LUAD (TCGA-LUAD) patients. The LS group showed significantly poorer survival (P = 0.01) and greater activation of proliferative signaling pathways, proliferation, wound healing, and genetic aberrations (P < 0.05). The TP53 mutation rate was significantly greater in the HS group (P < 0.0001), explaining the phenotype. Also, genome-wide hypomethylation was significantly greater in the LS group than in the HS group. Interestingly, pathway analysis identified silencing of Wnt signaling in the HS group. The machine learning–based recursive feature elimination technique was used to identify a 20-gene senescence signature in TCGA-LUAD samples. The presence of a senescence phenotype with poor survival was validated in an independent patient cohort and a cell-line cohort using unsupervised clustering of samples based on a 20-gene signature. On further analysis, HS cells were more resistant to drugs, particularly histone deacetylase inhibitors. Taken together, this study identified a novel subtype of LUAD with reduced Wnt signaling and high drug resistance.
Lung adenocarcinoma (LUAD) is a common cause of cancer-related deaths worldwide.
These observations indicate that tumors have heterogeneity in terms of senescence status.
In this work, transcriptomic data from senescent lung fibroblast cells were used to identify the senescence subtypes of LUAD. Low-senescence (LS) samples were characterized by significant hypomethylation, greater genetic aberration, and poor prognosis. A classification model to identify senescence levels in LUAD patients and cell lines is proposed based on a machine-learning technique.
Materials and Methods
Lung Fibroblast Senescence Expression Data and the Identification of Senescence-Associated Genes and Patient Data Sets
Data on senescence from WI-38, lung fibroblast cells treated with doxorubicin and ionizing radiation (IR) from the GSE130727 data set, were used. To select senescence-associated genes, the expressions of genes in senescent cells versus proliferating control cells were compared, and a fold-cutoff of >5 was applied. To select the senescence-associated genes with a probable role in cancer, the expression of genes was compared between LUAD and normal lung samples. RNA sequencing (RNA-seq) data associated with clinical parameters, from The Cancer Genome Atlas (TCGA)–Genomic Data Commons, Michigan
Differential Gene Expression and Metascape Analysis
To compare gene expression levels between the two sample groups, t-test with false discovery rate correction was performed. Metascape (https://metascape.org, last accessed September 1, 2021) analysis was performed to identify the pathways associated with a gene group.
K-Means Clustering and Gene Set Enrichment Analysis, Cibersort, and Kaplan-Meier Analysis
Fragments per kilobase of transcript per million mapped reads (FPKM) of senescence genes were normalized, and consensus K-means clustering was performed. Gene set enrichment analysis (GSEA) was performed using default settings. For Kaplan-Meier analysis, Prism software version 8.3 (GraphPad Software, San Diego, CA) was used, and Mantel-Cox statistics were used to calculate hazard ratios and P values. For Cibersort analysis (AlizadehLab, Stanford, CA), RNA-seq data from LUAD samples were used as a mixture file, and the LM22 immune cell type was used as a signature gene file.
Copy Number Variation and Mutation Analysis
Data on copy number variations from LUAD samples were downloaded from the Genomic Data Commons data portal and analyzed using Genomic Identification of Significant Targets in Cancer (GISTIC) software version 2.0 (https://www.genepattern.org/modules/docs/GISTIC_2.0, last accessed September 1, 2021). Cytobands with false discovery rates of ≤0.1 were retained, and a t-test using FPKM values was used to compare the genes in these cytobands between the high- and low-senescence groups (HS and LS). Copy number variations at these focal points were visualized using Circos plotting. To compare broad events in each chromosome arm, the Fisher exact test was performed. Values of <−0.3 and >0.3 were considered as deleted and amplified, respectively. Mutation data in mutation annotation format were downloaded from Genomic Data Commons, and all of the nonsilent variants of mutations were retained. The Fisher exact test was performed to compare all of the genes between the HS and LS groups, and a waterfall plot was plotted for selected genes.
For methylation, TCGA-LUAD Infinium HumanMethylation450 data was used and preprocessed using the method decribed earlier.
Identification of Senescence Gene Signature Using Machine Learning, and Drug Sensitivity Analysis
Given that it is vital to retain the features that contribute the most in the distinction of HS samples from LS samples and to eliminate redundant features, feature selection by recursive feature elimination was performed. Caret software version 6.0 (http://topepo.github.io/caret/index.html, last accessed September 1, 2021) library in statistical software package R version 4.0.3 (https://www.R-project.org, last accessed September 1, 2021) was used to perform recursive feature elimination on a random forest model with fivefold cross-validation repeated 100 times. The top 20 features, based on variable importance returned by the random forest model, were considered for further processing. To determine the predictive ability of the chosen features, two random forest models, one with all features (418) and the other with 20 selected features, were built. A train-test split of 80% to 20% (train: C1 = 309, C2 = 106; test: C1 = 77, C2 = 26) was used in both cases. The selected features were validated by the survival analysis in the TCGA, Michigan, and East Asian cohorts. To obtain the senescence groups, K-means clustering was performed in testing cohorts using the expression of the 20 selected genes. Senescence groups were also identified and verified in cell lines using the expression data from the cancer cell line encyclopedia (CCLE). IC50 values for different cell lines from genomics of drug sensitivity in cancer (GDSC) for different drugs were used to compare the sensitivity of drugs.
Radiation and Chemotherapy Drugs Induce Different Senescence Genes in Lung Fibroblasts
RNA-seq data from WI-38 cells (human lung fibroblasts), associated with treatment with doxorubicin and ionizing radiation (IR), were analyzed to identify the changes in gene expression levels during senescence in lung tissue. As expected, both the IR- and doxorubicin-treated cells showed increased expression of senescence marker CDKN1A compared with that in control cells (Figure 1A). The expression changes with IR and doxorubicin treatment were checked. The expression levels of a higher number of genes were changed in IR-treated cells compared with doxorubicin-treated cells (Figure 1B). Metascape analysis showed that genes up-regulated after IR- and doxorubicin-induced senescence modulated the immune system (Supplemental Figure S1A). However, down-regulated genes in IR-induced senescence regulated the cell cycle, whereas down-regulated genes after doxorubicin-induced senescence did not show specific pathway enrichment (Supplemental Figure S1A). A Venn diagram was used to identify 315 commonly up-regulated and 200 commonly down-regulated genes after IR- and doxorubicin-induced senescence (Figure 1C). These genes were identified as senescence-related genes. Next, the expression levels of senescence genes were compared between the TCGA normal and LUAD samples, and differentially expressed genes were used for network analysis (Supplemental Figure S1, B and C). As seen earlier, senescence-induced genes with differential expression in LUAD compared with normal samples showed immune-modulatory function (Supplemental Figure S1D). In comparison, senescence-inhibited genes with differential expression in LUAD compared with normal samples were associated with cell cycle regulation (Figure 1D), suggesting inhibition of the cell cycle after senescence induction.
Senescence-Associated Genes Identify Two Novel Subclasses of LUAD with the Difference in Senescence Level
To identify tumors with different senescence statuses, K-means consensus clustering was performed using senescence genes differentially expressed in LUAD compared with normal samples. The analysis identified two clusters with variable expression of LUAD-associated senescence genes (Figure 1E and Supplemental Figure S1E). Heatmapping of senescence genes showed that senescence cluster C2 had higher expression of genes repressed during senescence and a lower expression of genes induced during senescence (Supplemental Figure S1E). In contrast, senescence cluster C1 showed a lower expression of senescence-repressed genes and a higher expression of senescence-induced genes (Supplemental Figure S1E). Hence, clusters C1 and C2 were named as the HS and LS groups, respectively (Supplemental Figure S1E). Additionally, the HS patients showed a significantly poor survival rate compared with that in the LS patients (Figure 1F).
High- and Low-Senescence Groups Have Different Immune and Genetic Landscapes
The senescence nature in the HS and LS groups was confirmed using senescence-associated gene sets in GSEA (Figure 2A). GSEA with oncogenic signature identified positive enrichment of various oncogenic pathways in the LS group (Figure 2B). Immune microenvironment plays a key role in cancer development and progression. Also, senescent cells induce the release of a variety of factors, called senescent-associated secretory phenotypes, to attract immune cells to remove senescent cells from the tissue environment.
Therefore, the enrichment of various immune cells in the HS and LS groups was estimated. Cibersort analysis showed that T cells (CD4, CD8, and T-regulatory cells) and neutrophils were significantly more enriched in the HS group, whereas B cells, resting mast cells, and monocytes were differentially infiltrated in the LS group (Supplemental Figure S2). In a recent landmark paper, Thorsson et al
identified the immune subtype of various cancers, including LUAD. In keeping with that, LS patients had significantly enriched C1 and C2 classes of the immune subtypes (Figure 2C). In contrast, HS patients had significantly enriched C3 subtype (Figure 2C) and higher lymphocyte infiltration (Figure 2D). As expected, the LS group showed significantly higher proliferation, wound healing, single-nucleotide variants, indel variations, and nonsilent mutation rate (Figure 2, E–H).
Given that the HS group had overall less genetic variation, the specific genetic differences between the two groups were compared. The TP53 was the most significantly differentially mutated gene between the HS and LS groups (Figure 3A). Intriguingly, the LS group had higher mutation rates of CAD, GRIA4, and SERPINA3 (Figure 3A).
Copy number variations were also less common in the LS group. Broad significant differences in copy number variation were seen in chromosomes 3p, 4p, 4q, 5p, 5q, 11p, 11q, 13q, 15q, 16p, 16q, 19p, 19q, 21q, and 22q (Figure 3B). As expected, the number of focal variations was higher in the LS group (Figure 3B). The LS group also had higher amplification of many important loci harboring genes such as CDC45, SOX2, TERT, and MET. The genes amplified and overexpressed in LS were associated with cell cycle promotion and proliferation (Supplemental Figure S3A and Supplemental Table S1). In comparison, the HS group showed amplification and overexpression of genes associated with cellular transport (Supplemental Figure S3A and Supplemental Table S1). Similarly, the LS group showed higher focal deletion and under-expression of genes associated with cellular transport and cellular movements (Supplemental Figure S3B and Supplemental Table S1). In contrast, the HS group showed deletion and down-regulation of genes associated with cell cycle and proliferation (Supplemental Figure S3B and Supplemental Table S1). These findings may explain the high-proliferation phenotype and poor survival in LS patients.
High- and Low-Senescence Groups Have Significantly Different Epigenetic Alterations
Epigenetic changes play a crucial role in age-related phenomena such as senescence and cancer. To understand the DNA methylation changes in the HS and LS groups, TCGA Infinium Human Methylation 450 data were analyzed. Interestingly, a significantly lower overall methylation level was observed in the LS group compared with that in the HS group (Figure 3C). Also, the overall methylation levels in the normal lung and HS groups were similar (Figure 3C), suggesting the role of DNA methylation in senescence phenotype in LUAD. Next, differential methylation analysis identified 596 hypermethylated and 685 hypomethylated CpGs (Supplemental Figure S4). The correlation analysis identified 66 CpGs (61 genes) with a significant negative and 48 CpGs (41 genes) with a significant positive correlation between methylation and expression (Supplemental Table S2). The CpGs present on CpG islands at the promoters (promoter, 5′-untranslated region, first exon) generally repress the expression, and CpGs from the gene body are associated with activation.
Hence, the CpGs with significant correlation between methylation and expression were classified as island and nonisland CpGs. As expected, CpGs present on promoter islands were primarily associated with repression, and nonisland CpGs were associated with the activation of gene expression (Figure 3D). Furthermore, Metascape analysis showed that the genes suppressed by DNA methylation in HS were enriched in the Wnt pathway, development-related pathways, and senescence-regulating pathways (Figure 3E). This finding validates previous reports suggesting the role of Wnt pathway inhibition in senescence.
Machine Learning–Based Validation of Senescence Phenotype with Drug Resistance in Independent Sample Sets
To identify a senescence signature, feature selection was performed by recursive feature elimination. The analysis identified 20 genes the expression of which could be used for identifying the HS and LS groups without significant accuracy loss (Supplemental Figure S5, A–C). To validate that these 20 genes were sufficient for identifying the senescence groups in LUAD, K-means consensus clustering was performed in an independent cohort (East Asian cohort) of LUAD patients. This analysis identified two clusters, named C1 and C2 (Figure 4A). The C1 group had a high expression levels of 20 senescence genes compared with those in the C2 group (Supplemental Figure S5D). Hence, C1 was labeled as the HS group and C2 as the LS group. Subsequent Kaplan-Meier analysis showed that survival was significantly greater in the HS patients, validating the findings from the TCGA data set (Figure 4B). Furthermore, the HS status of the C1 group was confirmed using GSEA (Supplemental Figure S5E). The HS group also showed lower activation of the retinoblastoma tumor suppressor (RB), polycomb repressive complex (PRC)-2–enhancer of zeste homolog (EZH)-2, and mammalian target of rapamycin (mTOR) signaling pathways (Supplemental Figure S5F). As in the East Asian cohort, the presence of the senescence phenotype was validated in the Michigan cohort. Although the difference in survival between the HS and LS groups was nonsignificant, the trend remained the same (Supplemental Figure S6, A–C).
Next, cell lines from the Cancer Cell Line Encyclopedia (CCLE) data set were classified into HS (C1) and LS (C2) groups using the expression of 20-gene senescence signatures (Figure 4, C and D). The GSEA validated the senescence status of the cell lines in the HS and LS groups (Supplemental Figure S6D). Chemosensitivity data from the genomics of the Genomics of Drug Sensitivity in Cancer database were analyzed to understand drug sensitivity in senescence groups. As expected, HS cells showed significantly higher IC50 values for all drugs combined (Figure 4E). Further investigation of the IC50 values of individual drugs revealed that the histone deacetylase (HDAC) inhibitors belinostat, trichostatin A, CAY10603, and AR42 showed significantly higher IC50 in the HS group compared with the LS group (Figure 4F). Similarly, the LS group was considerably more sensitive to various cyclin-dependent kinase (CDK)-9 inhibitors (Figure 4G). These observations suggest that the senescence nature of tumor is an essential determinant of therapy decision and disease outcome.
Cancer, a leading cause of fatalities across the world, is characterized by its strong proliferative signal. A reduction in tumor growth requires multiple doses of chemotherapy. Over the years, cellular senescence has been identified as another way to push cancer cells into permanent proliferation arrest.
As compared with doxorubicin-regulated genes, IR-regulated genes were highly associated with cell cycle regulation, suggesting a greater impact of IR than of chemotherapy on molecular outcomes. IR- and doxorubicin-regulated genes were used to identify HS and LS groups of patients. HS patients showed better survival and lower activation of proliferative signaling, suggesting that tumors with an HS nature are inherently less aggressive compared with tumors with an LS nature. A lower rate of mutation of the TP53 gene could be involved in the HS nature of HS-group tumors. Additionally, HS patients showed lower genetic aberration, which also supports the lesser aggressiveness of these tumors.
Many reports have suggested that DNA methylation patterns of senescence are a crucial determinant of tumor behavior and hence are associated with cancer and aging.
In the current study, LS patients had significantly greater genome-wide hypomethylation than that of HS patients and the methylation-silenced genes in the HS group were enriched in Wnt signaling. The down-regulation of Wnt signaling is crucial for the onset of senescence.
The data suggest that epigenetic silencing of Wnt signaling is an important factor in the determination of HS phenotype in lung cancer. The presence of senescence phenotype was also validated in two independent cohorts using the senescence gene signature as identified by machine learning–based methods. The patients with HS nature showed better survival and lower activation of proliferative pathways as found in the TCGA data set.
To summarize, this study used RNA-seq data from senescent lung fibroblasts to identify a novel senescence phenotype in LUAD patients. The patients with HS showed a better prognosis and poor activation of mTOR. Additionally, The study used a machine-learning model to retain the minimal number of genes required to identify the senescent phenotype, and further validated it in two independent data sets of LUAD patients and CCLE cell lines. The study also showed that cell lines from the HS group were significantly more resistant to drugs.
We thank The Cancer Genome Atlas for providing patient data.
A: Pathway analysis using the up- and down-regulated genes after ionizing radiation (IR)- and doxorubicin (Dox)-induced senescence, and significantly enriched Gene Ontology terms. The dashed line indicates the significance cutoff [false discovery rate (FDR) of <0.05]. B and C: Expression of genes commonly up-regulated (B) and down-regulated (C) after IR- and Dox-induced senescence, compared in LUAD and normal lung, and FDR and logarithm twofold changes. The dashed lines indicate the significance cutoff. D: Senescence-induced and differentially expressed genes in LUAD compared to normal tissue in the Metascape analysis, and significant terms. E: Heatmap of LUAD-associated senescence genes in C1 and C2 groups as identified by K-means clustering.
A: Genes differentially amplified in HS and LS groups identified by analysis of copy number variations. The genes amplified and overexpressed in the LS group are used as input for Metascape analysis, with the top 25 enriched Gene Ontology (GO) terms. Similarly, genes amplified and overexpressed in the HS group are used as input for Metascape analysis, with enriched GO terms. B: Genes differentially deleted in HS and LS groups identified by copy number variation analysis. The genes deleted and under-expressed in the LS group are used as input for Metascape analysis, with the top 25 enriched GO terms. Similarly, genes deleted and under-expressed in the HS group are used as input for Metascape analysis, with the top 25 enriched GO terms.
A: Receiver operating characteristic analysis comparing the sensitivity and specificity of gene sets in identifying HS and LS groups in The Cancer Genome Atlas (TCGA) data set, showing the areas under the curve (AUC). B:K-means consensus clustering of TCGA patients based on the expression of 20 genes as identified by random forest analysis (RFA). C and D: Heatmaps of the expression of 20 genes as identified by RFA in TCGA samples (C) and in EAS-LUAD (D) HS and LS groups. E and F: The significant gene sets identified in gene set enrichment analysis (GSEA) (E) and EAS-LUAD (F) data using oncogenic signature gene sets are plotted to show the enrichment of higher activation of oncogenic signaling in the HS and LS groups. EZH, enhancer of zeste homolog; FDR, false discovery rate; GO, Gene Ontology; mTOR, mammalian target of rapamycin; NES, normalized enrichment score; PRC, polycomb repressive complex; RB, retinoblastoma tumor suppressor; RF, random forest.
A:K-means consensus clustering of the Michigan cohort based on the expression of 20 genes as identified by random forest analysis (RFA). B: Heatmap showing the expression of 20 genes as identified by RFA in Michigan samples. C:K-means plot of the difference in survival between the HS and LS groups as identified in K-means clustering. The dashed lines indicate median survival. D: Enrichment of the senescence-associated gene sets, showing the enrichment of senescence-associated genes in the HS group. GO, Gene Ontology.
Supported by Government of India Department of Biotechnology grant BT/PR27478/MED/30/1954/2018 (S.S.) and Government of India Science and Engineering Research Board, Department of Science and Technology grant ECR/2018/000528 (S.S.).