Therapeutic decisions in lung cancer critically depend on the determination of histologic types and oncogene mutations. Therefore, tumor samples are subjected to standard histologic and immunohistochemical analyses and examined for relevant mutations using comprehensive molecular diagnostics. In this study, an alternative diagnostic approach for automatic and label-free detection of mutations in lung adenocarcinoma tissue using quantum cascade laser–based infrared imaging is presented. For this purpose, a five-step supervised classification algorithm was developed, which was not only able to detect tissue types and tumor lesions, but also the tumor type and mutation status of adenocarcinomas. Tumor detection was verified on a data set of 214 patient samples with a specificity of 97% and a sensitivity of 95%. Furthermore, histology typing was verified on samples from 203 of the 214 patients with a specificity of 97% and a sensitivity of 94% for adenocarcinoma. The most frequently occurring mutations in adenocarcinoma (KRAS, EGFR, and TP53) were differentiated by this technique. Detection of mutations was verified in 60 patient samples from the data set with a sensitivity and specificity of 95% for each mutation. This demonstrates that quantum cascade laser infrared imaging can be used to analyze morphologic differences as well as molecular changes. Therefore, this single, one-step measurement provides comprehensive diagnostics of lung cancer histology types and most frequent mutations.
Cancer is the second most frequent cause of death in industrialized countries, after cardiovascular diseases. In 2018, approximately 9.6 million deaths were attributed to cancer. Worldwide, 2.1 million people receive a lung cancer diagnosis, and 1.76 million die from it every year (World Health Organization, https://gco.iarc.fr/today/home, last accessed January 22, 2021). Lung tumors are characterized by a high degree of heterogeneity and are divided into numerous types [eg, small-cell lung carcinoma (SCLC), adenocarcinoma, squamous cell carcinoma, neuroendocrine carcinoma, carcinoids, and many rare histologies], which are linked to different prognoses and therapeutic approaches.
If lung cancer is suspected, an X-ray and a subsequent computed tomographic examination of the thorax are performed, followed by tissue sample collection by bronchoscopy, fine-needle aspiration, transthoracic needle aspiration, or surgery. In addition to histologic typing, tumor samples are examined comprehensively for oncogene alterations by ultra-deep next-generation sequencing (NGS). The three most common mutations in adenocarcinomas of the lung, which are by far the most common histologic type, are found in the genes encoding tumor protein 53 (TP53), KRAS proto-oncogene, GTPase (KRAS), and epidermal growth factor receptor (EGFR). The presence of one of these mutations may influence both the patient's prognosis and further therapeutic decisions.
patients, with sensitivity and specificity of >90% compared with diagnostics by pathologists using histologic staining and immunohistochemical methods. In addition to the tumor identification in tissue samples, FTIR imaging may be employed for further analyses, such as glioma grading
was presented. The combination of FTIR imaging and laser capture microdissection (LCM) with subsequent proteomics adds molecular resolution to the spatial resolution provided by hyperspectral data sets. As demonstrated previously, this approach can also be used for biomarker identification.
These results indicate that a single index color image can provide the same biochemical information as several immunohistochemistry (IHC) stains.
Extending the previous work, new infrared (IR) microscopes with tunable quantum cascade lasers (QCLs) as IR sources instead of a globar, and with uncooled microbolometer detectors, were used to make IR imaging feasible for routine diagnostic applications. The reduction of the measurement time using QCL-based IR microscopes on breast cancer tissue microarrays,
Using two Spero-QT IR microscopes (Daylight Solutions, San Diego, CA), tumor lesions and healthy tissue types on whole slices were identified with 96% sensitivity and 100% specificity compared with histopathology.
A comparison of the Spero-QT with the previously used Cary-FTIR system (Agilent, Santa Clara, CA) showed a reduction of measurement time at the same wave number (inverse wavelength, 1/λ) resolution by a factor of 160. Therefore, <30 minutes was required for a whole slice measurement. This corresponds to the time required for histologic staining of fresh-frozen tissue and evaluation by a pathologist. A previous study revealed that QCL-IR imaging can classify changes at the molecular level in colorectal cancer tissue. The recognition of microsatellite stability and instability of cancerous tissue was verified with 100% sensitivity and 93% specificity compared with immunohistochemistry and fragment length analysis.
In this study, a label-free, automated, spatially resolved, and observer-/operator-independent approach using QCL-based IR imaging is presented. This technique was verified on thin sections of 536 formalin-fixed, paraffin-embedded (FFPE) tumor and nontumor lung tissues from 214 patients. Cancerous regions were identified with a sensitivity of 95% and a specificity of 97% compared with histopathology. Furthermore, the tumor type (94% sensitivity and 97% specificity for adenocarcinoma) and adenocarcinoma mutation status (KRAS, EGFR, or TP53 mutation) were determined with a sensitivity and specificity of 95% compared with the NGS gene panel result.
Materials and Methods
Two different sample sets were used in this study (Table 1). The first set (N = 21) was used for training the random forest (RF) classifiers. It included tumor and normal tissue samples from patients diagnosed with adenocarcinoma (n = 10), squamous cell carcinoma (n = 5), neuroendocrine carcinoma (n = 1), SCLC (n = 1), carcinoid (n = 1), pulmonary chondroid hamartoma (n = 1), or other lung diseases (n = 2). A KRAS mutation occurred in four adenocarcinomas, a TP53 mutation in three adenocarcinomas, and an EGFR mutation in three adenocarcinomas. The patients were 50 to 85 years old at specimen collection and had an average age of 68 years. Eleven patients were female, and 10 were male. The second sample set (N = 214) was used to verify the RF classifiers. It included tumor and normal tissue samples from patients diagnosed with adenocarcinoma (n = 170), squamous cell carcinoma (n = 23), neuroendocrine carcinoma (n = 3), SCLC (n = 3), carcinoid (n = 4), pulmonary chondroid hamartoma (n = 5), or other lung diseases (n = 6). KRAS, EGFR, and TP53 mutations were each detected in 20 adenocarcinomas. The remaining adenocarcinomas (n = 110) contained other or no mutations. The patients in the verification set were 41 to 84 years old, with an average age of 68 years. A total of 104 patients were female, and 110 were male.
Table 1. Sample Sets Included in This Study for Training and Verification
The study was approved by the University of Cologne Ethics Committee (registration number 15-116). General informed consent for research was obtained from the patients. All procedures are in accordance with the approved guidelines and regulations for human experimental research.
FFPE lung tissue sections were obtained from the Institute of Pathology, University Hospital Cologne (Cologne, Germany). The samples were collected during surgery and prepared following standardized protocols. Fresh-frozen or FFPE tissue blocks were cut into sections (10 μm thick) and floated onto polyethylene terephthalate membrane frame slides. The management and distribution of the samples were performed by the Institute for Prevention and Occupational Medicine of the German Social Accident Insurance (Ruhr University Bochum, Bochum, Germany). Before the spectral data acquisition, the FFPE samples were dewaxed using established protocols.
For spectral data acquisition, two Spero QT QCL-based microscopes and Chemical Vision software version 3.2 (Daylight Solutions) were used. In addition to the original setup, a purge air diffuser was connected to the sample chamber. Furthermore, the stage was modified so that two slides could be analyzed in a row to reduce the equilibrium time before measurements. The tissue samples were measured with a 4× objective (0.3 numerical aperture), which covers a field of view of 2 × 2 mm2 in a spectral range of 1800 to 948 cm−1 with a spectral resolution of 2 cm−1 in transmission mode. Spero QT operates with an uncooled microbolometer focal plane array detector with 480 × 480 pixels and a pixel size of 4.25 × 4.25 μm.
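The acquisition parameters above fix the size of each hyperspectral tile. As a quick consistency check, the following Python sketch derives the number of spectral points, the field-of-view edge length, and the number of absorbance values per tile from the stated figures. It assumes that both ends of the spectral range are sampled and that the stated 2 cm−1 is the sampling step; neither is spelled out in the text, and all names are illustrative.

```python
# Acquisition parameters quoted in the text.
WN_HIGH, WN_LOW = 1800, 948   # spectral range in cm^-1
STEP = 2                      # assumed sampling step in cm^-1
PIXELS = 480                  # detector is PIXELS x PIXELS
PIXEL_SIZE_UM = 4.25          # pixel edge length in micrometers


def n_spectral_points(high, low, step):
    """Number of sampled wave numbers, counting both end points (assumption)."""
    return (high - low) // step + 1


points = n_spectral_points(WN_HIGH, WN_LOW, STEP)
fov_edge_mm = PIXELS * PIXEL_SIZE_UM / 1000      # edge of one field of view
values_per_tile = PIXELS * PIXELS * points       # absorbance values per tile

print(points)           # 427
print(fov_edge_mm)      # 2.04
print(values_per_tile)  # 98380800
```

Under these assumptions, the 480 × 480 detector at 4.25 μm per pixel gives an edge length of about 2 mm, which matches the stated 2 × 2 mm field of view.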
Data Processing and Analysis
Spectral artifacts from folds and cracks in the tissue were eliminated by quality control based on the integral of the amide I band. Disturbing bands caused by the polyethylene terephthalate membrane or embedding medium (Tissue-Tek, Sakura Finetek, Staufen, Germany) were removed on the basis of the ratios between the integral of the amide I band and the integrals of the regions from 1135 to 1064 cm−1 and from 1800 to 1700 cm−1. After quality control, Mie scattering was corrected using the resonant Mie scattering-extended multiplicative signal correction (RMieS-EMSC) algorithm by Bassan
(RMie_EMSC_v2) with one iteration. Unsupervised classification was performed using k-means or hierarchical cluster analysis (HCA). Supervised and unsupervised classification was performed on unsmoothed data in the fingerprint region from 1760 to 998 cm−1.
Classifier Setup and Spectral Database Generation
The workflow with the RF classifier used for this work was established and described in previous publications.
In this study, five consecutive RF classifiers were generated. Therefore, a spectral database with tissue-specific spectral information for pathologic regions, infiltrated inflammatory cells, necrosis, muscle, connective tissue, alveoli, blood, calcification, pulmonary chondroid hamartoma, and mucus was set up. The databases for the other RF levels contained the spectral signatures for cancerous, necrotic, and inflammatory tissue (second-level RF), adenocarcinoma, squamous cell carcinoma, small-cell lung cancer, carcinoids, and neuroendocrine carcinoma (third-level RF), and adenocarcinoma with KRAS, EGFR, and TP53 mutations (fourth- and fifth-level RF). The pathologic findings per sample were used as ground truth for morphologic detection (first- and second-level RF) as well as for the tumor type identification (third-level RF). For mutation analysis (fourth- and fifth-level RF), the NGS result of the whole tumor per sample was used as ground truth. The first- and second-level classifiers were set up with 50 decision trees and 16 spectral features randomly chosen per decision in the trees. For the other levels, 500 decision trees and 16 spectral features were used. The exact class composition and number of spectra of all five RF classifiers can be seen in Supplemental Tables S1 through S4. Because of the lower signal/noise ratios and baseline effects, the spectral data range was reduced to 1760 to 998 cm−1, so that 382 wave numbers were used for RF training. The RF for lung tissue classification was built from samples of 21 patients. A total of 536 samples from 214 patients for the lung tissue classifier were available for verification. RF classifiers perform implicit feature selection using a small subset of variables. The visualization of this feature selection can be accomplished using the Gini importance, which can be considered as an indicator for the relevance of the features in terms of a relative ranking.
The Gini importance thus provides a relative value for the frequency of use of a certain feature for the split at a node within the decision trees of a model as well as for the overall discrimination value of a feature. The Gini importance plots for each of the trained classifiers detailed by individual training classes are illustrated in Supplemental Figures S1 through S5. All computations were performed using MATLAB R2019a (MathWorks, Natick, MA). The final annotation was provided as index color images and compared with that of the corresponding hematoxylin and eosin (H&E)–stained tissue images. Pathologists at the Pathology Institute, University Hospital Cologne, supplied their histologic reports.
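To make the notion of Gini importance more concrete, the following Python sketch computes the Gini impurity of a node and the impurity decrease achieved by one split. In the usual definition, the Gini importance of a feature is obtained by accumulating such decreases over all nodes that split on that feature and over all trees. This is a generic illustration, not the MATLAB implementation used in the study; the class labels and the split are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease of one split, weighting each child by its size."""
    n = len(parent)
    return (gini(parent)
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))


# Hypothetical node with 4 tumor and 4 connective-tissue spectra, split by
# thresholding the absorbance at one (hypothetical) wave number.
parent = ["tumor"] * 4 + ["connective"] * 4
left = ["tumor"] * 3
right = ["tumor"] * 1 + ["connective"] * 4

decrease = gini_decrease(parent, left, right)
print(round(decrease, 3))  # 0.3
```

A wave number that is chosen often and yields large decreases therefore ranks high in the importance plots, which is why the value is best read as a relative ranking rather than an absolute measure.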
IR Imaging-Guided LCM Workflow
The workflow is based on the one previously described by Großerueschkamp et al.
The respective lung tissue samples were measured using a Spero-QT as usual, and the spectral data were classified during this process. The resulting index color image was used to determine the region of interest. For this analysis, only tumor regions that were incorrectly classified by the fourth or fifth RF classifier (mutation status) were selected as the region of interest. The coordinate transfer was performed using a two-dimensional Helmert transformation based on three reference points. Because the Chemical Vision software does not allow collection of coordinates, the Helmert transformation was done using reference points taken from the respective false color image. The sample was transferred to an LCM microscope (PALM MicroBeam; Zeiss, Jena, Germany), and the coordinates of the reference points were taken. The coordinate transformation was performed in MATLAB. As only tissue pieces of certain shapes and sizes can be lifted and collected by the PALM Zeiss instrument, the region of interest was further subdivided. This resulted in shapes with areas in the range 100 to 50,000 μm2. The coordinates of these shapes were imported to the PALM Robo software version 4.6 and cut using the 5× objective of the instrument. The tissue was collected in NGS incubation buffer for FFPE tissues and stored at −80°C until analysis.
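The coordinate transfer described above can be illustrated with a least-squares fit of a two-dimensional Helmert (similarity) transformation, which models a uniform scale, a rotation, and a translation. The following Python sketch is not the MATLAB code used in the study; the function names and the three reference points are hypothetical and serve only to show the principle.

```python
import math


def fit_helmert_2d(src, dst):
    """Least-squares 2D Helmert fit; returns (a, b, tx, ty) such that
    X = a*x - b*y + tx and Y = b*x + a*y + ty."""
    n = len(src)
    mx = sum(p[0] for p in src) / n
    my = sum(p[1] for p in src) / n
    mX = sum(p[0] for p in dst) / n
    mY = sum(p[1] for p in dst) / n
    num_a = num_b = den = 0.0
    for (x, y), (X, Y) in zip(src, dst):
        xc, yc, Xc, Yc = x - mx, y - my, X - mX, Y - mY  # centered coordinates
        num_a += xc * Xc + yc * Yc
        num_b += xc * Yc - yc * Xc
        den += xc * xc + yc * yc
    a, b = num_a / den, num_b / den
    tx = mX - a * mx + b * my
    ty = mY - b * mx - a * my
    return a, b, tx, ty


def apply_helmert_2d(params, point):
    """Map one (x, y) point from the source to the target frame."""
    a, b, tx, ty = params
    x, y = point
    return a * x - b * y + tx, b * x + a * y + ty


# Hypothetical reference points: image pixels (src) and LCM stage units (dst).
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(10.0, 5.0), (10.0, 7.0), (8.0, 5.0)]

params = fit_helmert_2d(src, dst)
scale = math.hypot(params[0], params[1])                      # uniform scale
rotation_deg = math.degrees(math.atan2(params[1], params[0]))  # rotation angle
mapped = apply_helmert_2d(params, (1.0, 1.0))
```

With three reference points the four parameters are overdetermined, so the residuals at the reference points give a simple check on how well the two coordinate systems agree.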
Mutational analysis of low-input DNA NGS was performed using an Ion AmpliSeq Custom DNA Panel (Thermo Fisher Scientific, Waltham, MA) and the Ion AmpliSeq Library Kit 2.0 (Thermo Fisher Scientific), according to the Ion AmpliSeq Library Preparation User Guide (Thermo Fisher Scientific). After multiplex PCR and adapter ligation, libraries were generated by target enrichment using the Gene Read DNA Library I Core Kit, the Gene Read DNA I Amp Kit (Qiagen, Hilden, Germany), and the NEXTflex DNA Barcodes (Bio Scientific, Phoenix, AZ). For sequencing, 12 pmol/L of the constructed libraries was processed on the MiSeq platform (Illumina, San Diego, CA) with a MiSeq reagent kit V2 (Illumina) with 300 cycles following the manufacturer's recommendations. Data analysis and mutation calling were performed as previously described.
The QIAseq-targeted DNA panel for human lung cancer (NGHS-005X-96) with the GeneRead DNAseq Panel PCR Kit V2 (Qiagen) was used for a subset of samples by preparing libraries using the Gene Read DNA Library I Core Kit and the Gene Read DNA I Amp Kit (Qiagen), according to the manufacturer's protocol. Final library products were quantified, diluted, and pooled in equal amounts. A total of 1.2 pmol/L of the pooled final libraries was sequenced on a NextSeq Sequencer (Illumina) with the NextSeq 500 Mid Output Kit v2 following the manufacturer's recommendations. Refer to Supplemental Table S6 for details of the analyzed regions.
Tumor Identification and Tumor Type Determination in Lung Tissues
FFPE tumor and nontumor lung tissue sections from 235 patients were used for this study. Within this cohort, 180 patients were diagnosed with adenocarcinoma of the lungs and 28 patients were diagnosed with squamous cell carcinoma. The remaining patients had other lung tumors (SCLC, neuroendocrine carcinoma, carcinoid, and pulmonary chondroid hamartoma), metastasis (eg, from colorectal carcinoma), or other lung diseases (pneumonia or chronic obstructive lung disease). The established IR imaging workflow
used for this study is shown in Figure 1. Data acquisition was performed with QCL-based infrared microscopes on unstained, unmodified thin sections of lung tissues. Each pixel of the image is represented by one IR spectrum, which shows an integral of the information of the biochemical composition of the tissue. Therefore, the IR spectrum serves as a fingerprint for morphologic or molecular changes in the tissues. Thus, machine-learning algorithms, such as supervised classifiers, can be used to distinguish between spectra of different tissue types or molecular conditions. The results are presented as index color images, where each color represents a different tissue type or molecular condition. For diagnosis, the pathologist uses histologic methods, such as H&E and IHC staining, as well as NGS gene panels. The results of the mentioned analyses per sample were used as ground truth for the construction of the spectral database for the classifiers. An RF supervised classifier was used in this study because it provides reliable and robust results for the annotation of tissue samples.
To analyze lung tissues, three hierarchical RF classifiers (Figure 2A) were trained using the spectral data of 21 patients. Tumor, chronic obstructive lung disease, pneumonia, and nontumor or nondiseased tissue samples of these patients were included in the training data set. The first RF classifier (Figure 2A) was used to differentiate between different tissue types, such as connective tissue, muscle, and blood, as well as calcification, necrotic tissue, pulmonary chondroid hamartoma (cartilage tumors), and pathologic regions. Subsequently, spectra classified as pathologic were analyzed by a second RF classifier (Figure 2A), which was used to identify inflammatory infiltrates, lymph follicles, slightly necrotic tissue, and tumor regions. A detailed illustration of the separation of lymph follicles and inflammatory infiltrates is shown in Supplemental Figure S6. A third RF classifier (Figure 2A) used the tumor spectra to determine the tumor type. This RF classifier identified the five most common tumor types (adenocarcinoma, squamous cell carcinoma, SCLC, neuroendocrine carcinoma, and carcinoid). The results of the three classifiers are presented in Figure 2 on a lung tissue section with adenocarcinoma. The different tissue types as well as the pathologic region (Figure 2B) were identified more precisely compared with histopathology (Figure 2E). The same applied to the tumor region (Figure 2C) determined by the second RF. Figure 2, D and F (detail of D with matching H&E image, G), illustrates that the tumor type (adenocarcinoma) was determined correctly and homogeneously within the tumor lesions.
For verification, tumor and nontumor samples from 214 patients were available. Of these patients, 208 were diagnosed with a lung tumor (Table 2). Because of the large size of the whole lung tissue thin sections and the alterations that occur during the staining process, pixel-based analysis was not performed. Pixel-based analysis is not relevant for clinical diagnosis, but can be used for an overall annotation of the tissue section. Therefore, statistical analyses were performed using the overall diagnosis of the sections. For tumor identification, all sections with ≥5% pathologic-classified spectra were classified and recognized as tumor samples. Sections with <5% tumor-classified spectra were rated as nontumor samples. With these parameters, tumor identification achieved a sensitivity of 95% and a specificity of 97% compared with histopathology. For verification of tumor type (third RF), tumor samples from 203 patients were accessible. Most of these patients (n = 170) were diagnosed with lung adenocarcinoma, which, with a clinical incidence of approximately 50%, is the most common lung tumor type.
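The section-level evaluation described above can be sketched in a few lines of Python: each section is called tumor or nontumor from its fraction of pathologic-classified spectra, and the calls are then compared with the histopathologic diagnosis. The 5% threshold is taken from the text; the function names and the example counts are hypothetical and do not reproduce the study data.

```python
def is_tumor_section(n_pathologic, n_total, threshold=0.05):
    """Call a section 'tumor' if at least `threshold` of its spectra
    were classified as pathologic."""
    return n_total > 0 and n_pathologic / n_total >= threshold


def sensitivity_specificity(predicted, truth):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical sections: (pathologic spectra, all spectra, tumor per histopathology)
sections = [
    (600, 10000, True),
    (500, 10000, True),    # exactly 5%: counted as tumor
    (100, 10000, True),    # missed tumor (false negative)
    (499, 10000, False),
    (0, 10000, False),
    (700, 10000, False),   # false positive
]

predicted = [is_tumor_section(n_path, n_all) for n_path, n_all, _ in sections]
truth = [is_tumor for _, _, is_tumor in sections]
sensitivity, specificity = sensitivity_specificity(predicted, truth)
```

Evaluating per section rather than per pixel sidesteps the registration problem between the unstained IR image and the stained reference section, at the cost of discarding spatial detail in the statistics.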
Twenty-three patients were diagnosed with squamous cell carcinoma. Only four patients had carcinoids, three had neuroendocrine carcinoma, and three had SCLC. The evaluation of the third classifier was performed using a simple majority vote (Table 3). Therefore, the tumor type could be determined with a sensitivity of 94% and a specificity of 97% for adenocarcinoma and a sensitivity of 96% and a specificity of 96% for squamous cell carcinoma. For carcinoids, neuroendocrine carcinomas, and SCLC tissue samples, a sensitivity and specificity of 100% for tumor type identification were achieved. Because of the low number of samples for verification, the reliability of these values remains questionable for this cohort and should be further addressed.
Table 2. Lung Tissue Sample Data Set for Verification and Training, According to the Tumor Types of the Patients
In addition to the histochemical and immunohistochemical staining for subtyping lung cancer, the Institute of Pathology, University Hospital Cologne, sequenced an NGS gene panel to identify relevant mutations in lung tumor tissues. Previous studies showed that IR imaging can be used for biomarker identification,
otherwise performed by several IHC stainings. Therefore, to add a molecular dimension to the spatial IR resolution, herein, two additional RF classifiers were trained to identify mutations in lung cancer tissues (Figure 3). The most frequent mutations in lung adenocarcinomas within this data set were mutations in KRAS, EGFR, and TP53. Therefore, these three mutations were chosen to build the RF classifiers. For training, the spectral data of four patients with KRAS mutations, three with EGFR mutations, and three with TP53 mutations were used. The structure of these RF classifiers is shown in Figure 3. In the first step (fourth RF), the spectra previously classified as adenocarcinoma on the third level (tumor type identification) were classified either as TP53 or as a combined class that could be either KRAS or EGFR. The fifth RF then subdivides the spectra of this combined class into EGFR- or KRAS-classified spectra. This two-step approach is necessary because the spectra (Figure 4) of the tissues with these two mutations are similar to each other, indicating that both mutated genes activate mitogen-activated protein kinase signaling. The most noticeable differences between the spectral data of lung tissues with these mutations occur within the fingerprint region between 1350 and 1000 cm−1. The similarity is probably based on the fact that the EGFR mutation causes a constant activation of this receptor, which also triggers the KRAS signaling cascade. For more detailed spectral information on the training data set, see Supplemental Figures S7 through S12.
The results of these two RF classifiers to determine the mutation status of lung adenocarcinoma are presented in Figure 5. The index color images of sections of tissue samples with TP53 (Figure 5A), KRAS (Figure 5C), and EGFR (Figure 5E) mutation are shown in comparison to their corresponding H&E staining (Figure 5, B, D, and F; refer to Supplemental Figure S13 for whole tissue slices). The identified tumor regions and the tumor lesions visible based on H&E staining matched well. Furthermore, the three mutations were determined correctly as well as homogeneously within the tumor lesions.
For verification, samples of 60 patients with lung adenocarcinoma (20 tumor samples with KRAS, EGFR, and TP53 mutations each) were used. The evaluation of the fourth and fifth RF classifiers was performed using a simple majority vote. Slices with >50% of the previously classified tumor spectra assigned to one mutation class were rated as positive for this mutation. The threshold was confirmed using receiver operating characteristic curves (Supplemental Figure S14) for both classifiers. The mutation status was determined with sensitivities and specificities of 95% for each mutation compared with the NGS gene panel.
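The two-step mutation call and the section-level majority vote can be sketched as follows. The Python code below is a minimal illustration under stated assumptions: the per-spectrum labels are taken as given (no classifier is trained here), the label strings and function names are hypothetical, and the spectrum counts are invented for the example.

```python
from collections import Counter


def final_label(rf4_label, rf5_label):
    """Combine the two levels: the fourth RF separates TP53 from the combined
    KRAS/EGFR class; the fifth RF resolves the combined class."""
    return "TP53" if rf4_label == "TP53" else rf5_label


def call_mutation(spectrum_labels, min_share=0.5):
    """Return (mutation, shares): the mutation whose share of the tumor
    spectra exceeds `min_share`, or None if no class reaches a majority."""
    counts = Counter(spectrum_labels)
    total = sum(counts.values())
    if total == 0:
        return None, {}
    shares = {label: count / total for label, count in counts.items()}
    best = max(shares, key=shares.get)
    return (best if shares[best] > min_share else None), shares


# Hypothetical per-spectrum results for one section.
labels = (["KRAS"] * 700) + (["EGFR"] * 200) + (["TP53"] * 100)
mutation, shares = call_mutation(labels)
print(mutation)  # KRAS

# A section without a clear majority is not assigned to any mutation class.
ambiguous, _ = call_mutation(["KRAS"] * 400 + ["EGFR"] * 300 + ["TP53"] * 300)
print(ambiguous)  # None
```

Keeping the class shares alongside the final call is useful in practice, because a narrow majority can point to heterogeneity or co-occurring mutations that a single label would hide.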
Clarification of Mutation Classifier Results via IR-Guided LCM
In total, 87% (52 of 60) of the verification data set for the mutation status classifier was identified, with >65% of the tumor spectra assigned to the correct mutation class (Supplemental Table S7). The tumors of three patients (one each with a KRAS, EGFR, and TP53 mutation) were identified correctly, but the respective mutations were misclassified. The cases with EGFR and KRAS mutations were assigned to other mutations. The incorrect TP53 case was classified as KRAS or EGFR (fourth RF). Five cases were correctly classified (50% to 65% of spectra assigned to the correct mutation) but showed heterogeneity with large contributions of other mutations. After a renewed control of the NGS gene panel results, one of these cases (63.71% of tumor spectra classified as KRAS mutated and 36.29% as TP53 mutated) showed not only a KRAS mutation (as formerly assumed), but also a co-occurring mutation within the TP53 gene. The index color image as a result of the fourth RF classifier is shown in Figure 6A in comparison with the corresponding H&E staining of the thin section of lung tissues (Figure 6B). Both mutation classes (KRAS and TP53) were distributed relatively homogeneously within the tumor lesions. This may indicate that the mutations were not locally confined as heterogeneous mutations, but rather co-occurred homogeneously throughout the tumor tissue.
Three of the remaining five noticeable cases still had sufficient lung tissue to repeat the genetic analysis. In this regard, the combined IR imaging LCM workflow presented by Großerueschkamp et al
was used to collect homogeneous tissue samples without prior labeling of the tissue slices. Only areas that were assigned to the incorrect mutation classes were collected. The subsequent performance of the NGS gene panel showed that all previously detected mutations could be confirmed. This led to the conclusion that the incorrectly classified spectra in the three examined patient samples are a false detection of the classifier or may result from further undetected mutations that co-activate both mitogen-activated protein kinase and TP53 signaling.
This study used a label-free and operator-independent approach to identify tumor regions and to determine the tumor type as well as the mutation status of lung tumor and nontumor tissue samples with high sensitivity and specificity. A QCL-based IR imaging workflow for whole-slice lung tissue sections was used, which decreased the measuring time by a factor of 160 in comparison to the previously used Agilent Cary-FTIR system (Supplemental Figure S15). The classification of a large tissue sample could be performed in <30 minutes, which is within the same time range required for histopathology. Compared with previous studies,
a high number of whole slice samples (578 FFPE tissue samples from 235 patients) could be analyzed because of the high sample throughput. In addition, the compact design of the uncooled microbolometer detector of the Spero-QT can make its routine use in clinical settings feasible. Furthermore, the data processing time could be reduced by performing data correction and classification parallel to data acquisition for each field of view individually (480 × 480 pixels at 427 data points). Therefore, computing can be performed on ordinary personal computers, and no additional and expensive high-performance hardware would be needed.
Another important prerequisite for the clinical translation of QCL-based IR imaging is the validation of the classifiers on an independent data set. This not only includes the use of several instruments at different locations but also different operators and samples from different clinics. This ensures that the algorithm is not fitted to artifacts caused, for instance, by a certain preparation method. In the present study, two Spero-QT instruments were used, and data acquisition was performed by seven different operators. In addition, the measurements were performed at two different locations. The results of the presented RF classification and the determination of tumor type and mutation status of lung tissues are independent of the device, of the operator who performs the measurement, and of the device location. To increase the reliability of these results, a larger number of tissue samples from different clinics would be required. Furthermore, the RF classifier could be replaced by deep learning algorithms. This would add spatial information to the existing spectral information so that morphologic aspects of the tissue could also be included in the classification process. As reported by Schuhmacher et al
in regard to analyzing colon cancer samples using a neural network, this is a promising approach for the annotation of tissue samples based on IR spectral data. A further application of deep learning, as demonstrated by several groups, is the evaluation of H&E images. Recently, Kather et al
reported the identification of microsatellite stability or instability in colorectal cancer samples based on H&E images using deep residual learning. Weakly supervised multiple instance learning–based deep learning on whole slide images was performed by Lu et al
on basal cell carcinoma and prostate and breast cancer. Therefore, evaluating these modern weakly supervised classifiers on infrared hyperspectral data sets in advanced studies may be a promising approach.
To further reduce the measuring time, the number of recorded wave numbers can be reduced, or only discrete frequencies can be measured.
The latter, however, can be problematic with regard to the RMie-EMSC correction, as this requires the complete spectrum.
This study showed for the first time that QCL-based IR imaging can be used not only to identify different tissue types and tumor regions with a sensitivity of 95% and a specificity of 97% compared with histopathology, but also to identify spectral markers that allow differentiation of different molecular states. This illustrates that a single IR measurement can be used to obtain information about a sample that would otherwise require several methods and time-consuming procedures (IHC or NGS). Mayerich et al
presented the possibility of mimicking several IHC stains on breast tissue using FTIR imaging. The RF classifier for the determination of mutation status introduced in this study could be verified with a sensitivity and specificity of 95% for adenocarcinoma tissue samples from 60 patients. Only one patient per mutation type (KRAS, EGFR, or TP53) was found to be incorrectly detected compared with the results of the NGS gene panel. In one case where there was heterogeneity in the recognition of the classifier (Figure 6), the presence of both KRAS and TP53 mutations was confirmed using the gene panel. A different approach for the automated determination of lung tumor types and mutations was shown by Coudray et al,
performed a pan-cancer analysis based on H&E staining using a deep learning algorithm. In contrast to the method presented in this study, these approaches only provide probabilities for each image tile, not a spatially resolved assignment. Furthermore, it is not possible to identify different tissue types or spatially resolve tumor regions or mutation patterns.
In addition, the method introduced with this study facilitates the use of an LCM to isolate precisely defined tissue types from untreated and unstained samples. The regions of individual tissue types can be isolated precisely, the material contains little unwanted (contaminating) tissue, and the amount of material required for further analyses can be reduced significantly. Furthermore, this approach can also be used to examine samples that have been obtained by minimally invasive methods and provide only a small amount of material, such as endobronchial ultrasound-guided transbronchial needle aspiration. These samples can be used not only for genome analyses, as shown in this study, but also for proteomic and transcriptomic studies. Cumulative data on a single sample can contribute to a better understanding of the molecular changes occurring in different lung cancer types and thus improve diagnostic and therapeutic approaches to the disease.
Treatment procedures on FFPE tissues lead to changes within the tissue, and thus, influence the spectral data compared with spectra from fresh-frozen tissue. These differences can be seen in bands, which are caused or influenced by lipids. Therefore, two different classifiers must be trained for FFPE and fresh-frozen tissue. The first three RF levels were built similar to the FFPE tissue classifier. Because of the low number of patients with certain mutations, no classifier could be trained to determine the mutation status. This can be addressed in the future by obtaining more fresh-frozen patient samples.
In summary, this study presents a new application for QCL-based IR imaging by showing that both morphologic and molecular alterations can be detected reproducibly by this automatic and label-free method. To increase the reliability of IR imaging, the next step is to conduct studies with larger patient numbers adapted to clinical needs, which will augment acceptability of this method in the medical community.
We thank Thomas Brüning and Thomas Behrens (Institute for Prevention and Occupational Medicine of the German Social Accident Insurance, Ruhr University Bochum) for management and distribution of the samples; and Catharina Vaerst for great efforts in enrolling patients for this study.
N.G. performed the experiments, analyzed spectral data, and wrote the manuscript. F.G. supervised the experiments. R.P. and J.F. performed next-generation sequencing analysis. R.B. supplied the histologic reports and contributed as clinical pathologist. All authors edited the manuscript. T.B. supervised the management and distribution of the samples. J.S. and Y.K. supervised the clinical study. K.G. was the senior author for this study.