Regular article | Articles in Press

A Deep Learning–Based System Trained for Gastrointestinal Stromal Tumor Screening Can Identify Multiple Types of Soft Tissue Tumors

  • Zhu Meng, Guangxi Wang, Fei Su, Yan Liu, Yuxiang Wang, Jing Yang, Jianyuan Luo, Fang Cao, Panpan Zhen, Binhua Huang, Yuxin Yin, Zhicheng Zhao, and Limei Guo

    Affiliations
    • Zhu Meng, Guangxi Wang, Fei Su, Yan Liu, Yuxiang Wang, Jing Yang, Yuxin Yin, Zhicheng Zhao, and Limei Guo: Beijing University of Posts and Telecommunications and Department of Pathology, Peking University Third Hospital, Beijing Key Laboratory of Tumor Systems Biology, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
    • Fei Su and Zhicheng Zhao (additional affiliation): Beijing Key Laboratory of Network System and Network Culture, Beijing, China
    • Jianyuan Luo: Department of Medical Genetics, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
    • Fang Cao: Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Department of Pathology, Peking University Cancer Hospital and Institute, Beijing, China
    • Panpan Zhen: Department of Pathology, Beijing Luhe Hospital, Capital Medical University, Beijing, China
    • Binhua Huang: Department of Pathology, Dongguan Houjie Hospital, Dongguan, China

    Correspondence
    • Zhicheng Zhao, Beijing University of Posts and Telecommunications, No. 10 Xitucheng Rd., Beijing 100876, China
    • Limei Guo, Department of Pathology, Peking University Third Hospital, Peking University Health Science Center, No. 49 Huayuanbei Rd., Beijing 100191, China
Open Access | Published: April 15, 2023 | DOI: https://doi.org/10.1016/j.ajpath.2023.03.012
The accuracy and timeliness of the pathologic diagnosis of soft tissue tumors (STTs) critically affect treatment decisions and patient prognosis. It is therefore crucial to make a preliminary judgment of whether a tumor is benign or malignant from hematoxylin-and-eosin (H&E)–stained images. Here, we present a deep learning–based system, STT-BOX, that uses only H&E images to identify malignant STTs among benign STTs with histopathologic similarity. STT-BOX takes gastrointestinal stromal tumor (GIST) as a baseline for malignant STT evaluation and distinguished GIST from leiomyoma and schwannoma with an area under the curve of 100% in patients from three hospitals, exceeding the accuracy of experienced pathologists. Notably, the system also performed well on six common types of malignant STTs from The Cancer Genome Atlas data set, accurately highlighting the malignant mass lesions. Moreover, without any fine-tuning, STT-BOX was able to distinguish malignant ovarian sex-cord stromal tumors. Our study covers mesenchymal tumors originating from the digestive system, bone and soft tissue, and the reproductive system, and the high accuracy of this transfer validation may reflect the morphologic similarity of the nine types of malignant tumors. Further evaluation in a pan-STT setting is a promising prospect that could obviate the overuse of immunohistochemistry and molecular tests and provide a practical basis for timely clinical treatment selection.
Recently, artificial intelligence (AI) has made great progress in assisting the pathologic diagnosis of epithelial malignancies. Applications of deep learning based on convolutional neural networks (CNNs) in carcinomas of the skin (Esteva et al), breast (Bejnordi et al; Le et al; Li et al; Lin et al; Litjens et al), prostate (da Silva et al; Ström et al), lung (Viswanathan et al), kidney (Bouteldja et al; Hermsen et al), stomach (Park et al), colorectum (Kumar et al; Echle et al; Xu et al), and liver (Kim et al) have improved diagnostic efficiency and accuracy, and have even shown a more objective and forward-looking trend than pathologists. In contrast, AI has scarcely been studied in the diagnosis of soft tissue tumors (STTs). It is challenging for pathologists to judge whether an STT is benign or malignant from hematoxylin-and-eosin (H&E)–stained slides alone. STTs comprise a variety of benign, borderline, and malignant tumors originating from mesenchymal connective tissue, and most consist of spindle cells or contain a spindle cell component (WHO Classification of Soft Tissue and Bone Tumours). Routinely, malignant STTs are distinguished from benign STTs mainly according to the arrangement, atypia, and mitotic activity of tumor cells, the growth pattern at the tumor margin, and secondary changes such as necrosis and hemorrhage. Unfortunately, some malignant STTs resemble benign tumors in growth pattern and morphology (WHO Classification of Soft Tissue and Bone Tumours). Immunohistochemistry and genetic tests are therefore often required for an accurate diagnosis, which inevitably increases the burden on patients and the difficulty of diagnosis in local hospitals (Burns et al).

In the present study, we focused on STTs originating from the digestive system, soft tissue, and bone, as well as mesenchymal tumors of the reproductive system, to establish and test our deep learning–based system, STT-BOX, for distinguishing malignant from benign STTs (Supplemental Figure S1 shows the system interface of STT-BOX).
First, we trained the core model of our system on gastrointestinal stromal tumors (GISTs). GIST is the most common malignant STT of the gastrointestinal tract (Joensuu et al) and carries varying degrees of recurrence and metastasis risk (Papke and Hornick; Joensuu et al). Histopathologically, it is usually difficult to distinguish GISTs from benign tumors with spindle cell morphology, such as leiomyoma and schwannoma, both in gross specimens and in H&E-stained slides. Pathologists must apply a panel of immunohistochemical (IHC) antibodies to distinguish GISTs from benign STTs with similar histopathologic characteristics (Joensuu; Karakas et al). Genetically, most GISTs harbor gain-of-function mutations in either the c-KIT or the platelet-derived growth factor receptor α (PDGFRA) oncogene (Joensuu et al; Karakas et al). Pathologists therefore routinely need a panel of protein biomarkers and molecular detection to reach a definite diagnosis and guide accurate targeted therapy, which poses great challenges to pathology departments that lack diagnostic experience and auxiliary pathologic technology. Considering the achievements of CNNs in recognizing the histopathologic features of carcinomas, we reasoned that a novel CNN-based system may help in the diagnosis of malignant STTs, represented by GIST.

For GIST, by training a variety of CNNs, an effective hierarchical feature representation strategy was proposed so that H&E-stained slides alone were sufficient to identify GIST. Furthermore, our STT-BOX system achieved higher diagnostic accuracy than experienced pathologists. Next, we tested H&E-stained images of six common types of soft tissue sarcoma from The Cancer Genome Atlas (TCGA) data set (https://gdc.cancer.gov, last accessed February 24, 2023), a total of 235 cases from 32 centers, including leiomyosarcoma, dedifferentiated liposarcoma, undifferentiated pleomorphic sarcoma, myxofibrosarcoma, synovial sarcoma, and malignant peripheral nerve sheath tumor (Lazar et al). Our STT-BOX system accurately highlighted the malignant mass lesion in each case. Last, we transferred the established CNN-based GIST model to the reproductive system and tested the effectiveness of the STT-BOX system in diagnosing mesenchymal tumors from different primary locations. Three common types of ovarian sex-cord stromal tumors (SCSTs) were considered: theca cell tumors (TCTs), adult granulosa cell tumors (AGCTs), and Sertoli-Leydig cell tumors (SLCTs) (Young). TCT is benign, whereas AGCT and SLCT are malignant. All three consist of, or contain, a spindle cell component, so their histopathology overlaps considerably (Karnezis et al; Young). Without any training or fine-tuning, our STT-BOX system showed excellent capability and stability in distinguishing benign from malignant SCSTs. Overall, our study demonstrates the potential of the STT-BOX system to distinguish malignant STTs from benign ones using only H&E images. Our code is available online (https://github.com/dreambamboo/STT-BOX-public, last accessed February 24, 2023).

      Materials and Methods

      Ethics Statement and Case Selection

This study was approved by the Institutional Review Board and Ethics Committee Board of the Peking University Third Hospital (Beijing, China). The study comprised four steps (Figure 1A) and enrolled 430 whole-slide images (WSIs) from 386 patients (Supplemental Table S1). In step 1, the system was constructed to differentiate GIST with spindle cell type from benign schwannomas and leiomyomas, using WSIs collected from Peking University Third Hospital (Dataset-P). In step 2, cross-cohort validation was performed on WSIs of GIST with spindle cell type, schwannomas, and leiomyomas from Peking University Cancer Hospital and Institute (Dataset-T) and Beijing Luhe Hospital (Dataset-L). When selecting the GIST cases, the IHC results of CD117, Dog-1, CD34, S100, α-smooth muscle actin, and desmin, as well as gene mutation analysis of c-KIT and PDGFRA, were all known. The tumor size of all GISTs was at least 2 cm. In total, 145 WSIs, 6250 screenshots, and 50,000 patches were used for training, cross-validation, and cross-cohort validation (Supplemental Tables S2 and S3 and Supplemental Figures S2 and S3). Dataset-P was used for training and three-fold cross-validation (Supplemental Table S3); the folds were split at the patient level so that data from patients in the test set never appeared in the training set. Dataset-T and Dataset-L were used to validate the cross-cohort generalization of the system (Supplemental Table S2). In step 3, the system was tested directly, and malignant lesions were highlighted, on the public TCGA data set. The TCGA data set contained 262 cases of soft tissue sarcoma, each with H&E-stained images of frozen and paraffin sections. Two specialist pathologists (Y.L. and Y.W.) reviewed all pathologic reports and images and selected, for each case, the best-qualified paraffin section image containing tumor and peritumoral areas. According to the pathologic report and morphologic observation of each case, these six types of tumors are either spindle cell subtypes or contain spindle cell components. Finally, 235 WSIs were used for testing, including leiomyosarcoma (n = 96), dedifferentiated liposarcoma (n = 54), undifferentiated pleomorphic sarcoma (n = 44), myxofibrosarcoma (n = 22), synovial sarcoma (n = 10), and malignant peripheral nerve sheath tumor (n = 9). In step 4, ovarian TCTs, AGCTs, and SLCTs were identified automatically on Dataset-O without model fine-tuning. In Dataset-P/T/L/O, at least one tissue slide was available for every patient. All slides were fully anonymized, and images were scanned at ×40 magnification (0.12 μm/pixel) by two pathologists (Y.W. and B.H.) with a UNIC scanner (China). The clinicopathologic features of all cases in each cohort are summarized in Supplemental Table S1.
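The patient-level three-fold split described above can be reproduced with a grouped cross-validation utility; the snippet below is a hypothetical sketch (the variable names and data layout are assumptions, not the authors' code):

```python
# Hypothetical sketch of a patient-level three-fold split, so that all WSIs
# from one patient fall into the same fold (as required in step 1).
from sklearn.model_selection import GroupKFold

def make_patient_folds(wsi_paths, labels, patient_ids, n_splits=3):
    """wsi_paths, labels, patient_ids are parallel lists; returns fold index sets."""
    gkf = GroupKFold(n_splits=n_splits)
    folds = []
    for train_idx, test_idx in gkf.split(wsi_paths, labels, groups=patient_ids):
        folds.append({"train": train_idx, "test": test_idx})
    return folds
```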
Figure 1: Overview of the STT-BOX system pipeline. A: Data partition. Step 1: system construction for gastrointestinal stromal tumor (GIST) identification from schwannoma and leiomyoma, where Dataset-P was utilized for three-fold cross-validation. Step 2: system generalization validation with the cross-cohort Dataset-T and Dataset-L. Step 3: soft tissue sarcomas (SARCs) in The Cancer Genome Atlas (TCGA) were used to test the system's capability of indicating lesions in different tissues without targeted training. Step 4: ovarian sex-cord stromal tumors (Dataset-O) were used to measure the performance of our system in discriminating benign and malignant tumors of the reproductive system, even though the core model was trained on the gastrointestinal tract. B: Feature extractor acquisition. The annotations for hematoxylin-and-eosin (H&E)–stained slides referred to multiple immunohistochemical (IHC) slides. The screenshots were taken by pathologists from the tumor areas to directly explore the histologic features of GIST, schwannoma, and leiomyoma. Each screenshot was cropped into eight overlapping patches. The patches were augmented before being input to the convolutional neural networks (CNNs) as training data. The feature extractor was obtained after iterations of CNN training. C: The patches generated training data with different color distributions through random fluctuation, thereby improving the robustness of the feature extractor to color diversity. D: Slide inference without IHC assistance. Our hierarchical feature representation strategy mapped the patch features gradually. WSI, whole-slide image.

      Deep Learning Pipeline for Malignancy Detection

First, unlike conventional deep learning algorithms trained with large amounts of data, the authors built a feature extractor with the limited data from Dataset-P (Figure 1B). Second, the authors proposed a hierarchical feature representation strategy for inference in cross-validation and cross-cohort validation (Figure 1D). Third, without any fine-tuning, the trained STT-BOX model was applied directly to label lesions on soft tissue sarcoma WSIs from the TCGA data set, to measure the performance of the system on tissues and organs not included in the training set. Finally, the STT-BOX system was challenged to distinguish malignant from benign ovarian SCSTs.

      Preprocessing for Training and Validation

A total of 121 patients (62 cases of GIST with spindle cell type, 27 cases of schwannoma, and 32 cases of leiomyoma) were enrolled for training, cross-validation, and cross-cohort validation, comprising 145 WSIs, 6250 screenshots, and 50,000 patches (Supplemental Tables S1–S3, Supplemental Figure S2, and Figure 1A). Each slide was annotated with a single label (GIST, schwannoma, or leiomyoma) (Supplemental Figure S3). The labels were confirmed by pathologists with the assistance of IHC staining and gene mutation analysis. The screenshots were cropped within the tumor areas by the pathologists to fairly measure the morphologic differences among the three types of STTs. Eight patches were then cropped in an overlapping manner from each screenshot. Dataset-P was used for training and three-fold cross-validation (Supplemental Table S3). In addition, Dataset-T and Dataset-L were used to validate the cross-cohort generalization of the system (Supplemental Table S2). In the cross-cohort validation, only data from Dataset-P were used to train the STT-BOX model, whereas data from the other two data sets were used only for testing, without any fine-tuning.
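One way to obtain eight overlapping 512 × 512 patches per screenshot is a 4 × 2 grid whose offsets are spread evenly over the image; the grid layout below is an assumption, since the text specifies only the patch count, size, and overlap:

```python
# Hypothetical sketch: crop a screenshot (~1898 x 878 px) into a 4 x 2 grid of
# eight 512 x 512 patches whose rows and columns overlap, covering the whole image.
import numpy as np

def crop_eight_patches(screenshot, size=512, cols=4, rows=2):
    h, w = screenshot.shape[:2]
    xs = np.linspace(0, w - size, cols).astype(int)  # overlapping column offsets
    ys = np.linspace(0, h - size, rows).astype(int)  # overlapping row offsets
    return [screenshot[y:y + size, x:x + size] for y in ys for x in xs]
```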

      System Construction for Differentiation of GIST (Training)

      Feature Extractor

Figure 1B shows the acquisition process of the feature extractor. The slides in Dataset-P were diagnosed as GIST, schwannoma, or leiomyoma with the aid of IHC. The corresponding screenshots were cropped by pathologists (J.Y. and B.H.) from the tumor regions of interest on the slides. The average size of the screenshots was 1898 × 878 pixels. Every screenshot was cropped into eight patches of 512 × 512 pixels, so that adjacent patches overlapped each other and every part of the screenshot was covered. All patches made up the training set. Data were augmented through random cropping, flipping, rotation, superimposed Gaussian noise, and color transformation before being fed into the CNNs.

      Color Transformation

Hematoxylin and eosin stain the nuclei and cytoplasm of cells blue and red, respectively. External factors, including dye concentration, staining time and temperature, and scanner version, inevitably lead to differences in the color distribution of digital images. Therefore, the color of each input patch was randomly fluctuated to simulate different color distributions. As shown in Figure 1C, the color of an original patch was fluctuated through random changes in hue, saturation, and brightness.
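A torchvision-style augmentation pipeline along these lines might look as follows; the jitter ranges, crop size, and noise level are illustrative assumptions rather than the authors' published settings:

```python
# Hypothetical augmentation sketch: random crop/flip/rotation, color jitter
# (brightness/saturation/hue), and additive Gaussian noise on the tensor.
import torch
from torchvision import transforms

class AddGaussianNoise:
    def __init__(self, std=0.01):
        self.std = std
    def __call__(self, x):  # x is a tensor in [0, 1]
        return (x + torch.randn_like(x) * self.std).clamp(0.0, 1.0)

train_transform = transforms.Compose([
    transforms.RandomCrop(448),                                   # random cropping (size assumed)
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation(degrees=90),
    transforms.ColorJitter(brightness=0.2, saturation=0.2, hue=0.05),  # color fluctuation
    transforms.ToTensor(),
    AddGaussianNoise(std=0.01),
])
```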

      Implementation Details

In the present study, five CNNs, Inception-v3 (Szegedy et al) and ResNet-18, ResNet-34, ResNet-50, and ResNet-101 (He et al), were trained as the feature extractor F. The extractors were initialized with parameters pretrained on the large-scale ImageNet data set (Deng et al) and optimized via stochastic gradient descent during training. The learning rate τ_e was initialized with τ_max = 0.001 and then decreased according to a cosine annealing schedule:

\tau_e = \tau_{\min} + \frac{1}{2}\,(\tau_{\max} - \tau_{\min})\left[1 + \cos\!\left(\frac{e}{100}\,\pi\right)\right],
(1)
where τ_min = 0.00001 and e is the current epoch. The batch size was set to 32. The tools adopted in this work were CUDA (10.0.130), PyTorch (1.2.0) (Paszke et al), Python (3.7.6), NumPy (1.18.1), OpenSlide (Goode et al), and Matplotlib (Hunter). All experiments were conducted on a single Tesla T4 GPU with 16 GB of memory.

      Hierarchical Feature Representation Strategy (Inference)

A hierarchical feature representation strategy was designed for the inference of STT-BOX (Figure 1D). Specifically, hierarchical features (ie, patch-, screenshot-, and slide-level features) were considered (Figure 1D). The patches and screenshots were obtained with the same strategy as during training (Figure 1B). Each screenshot x_ij from the slide X_i was cropped into eight patches, ie, x_ij = {x_ij1, x_ij2, ..., x_ij8}. First, the patch-level features were extracted by the feature extractor F, which was trained on Dataset-P. As shown in Figure 1D, the output feature vector of the feature extractor F (ie, a classification model) was F(x_ijm) = <p_ijm^1, p_ijm^2, p_ijm^3>, where p_ijm^1 indicates the similarity between the input patch x_ijm and GIST, p_ijm^2 the similarity between x_ijm and schwannoma, and p_ijm^3 the similarity between x_ijm and leiomyoma. The output vector satisfied p_ijm^1 + p_ijm^2 + p_ijm^3 = 1. The patch-level output feature vectors were then fused into screenshot-level feature vectors through a voting mode or a mapping mode, and the screenshot-level feature vectors were in turn fused into slide-level feature vectors. Voting: in the voting mode of the STT-BOX system, a screenshot was diagnosed as GIST when more of its patches were diagnosed as GIST than as schwannoma or leiomyoma; the probability assigned to each category of a screenshot was the proportion of patches predicted as that category. The plurality label (ie, the most common prediction) of the screenshots from a slide determined the classification of that slide, and the proportion of screenshots predicted for each category determined the corresponding slide-level probabilities. Mapping: the STT-BOX system also provides a mapping mode. As mentioned above, a patch x_ijm generates a feature vector F(x_ijm) = <p_ijm^1, p_ijm^2, p_ijm^3>, so the feature of the corresponding screenshot can be represented as a matrix of shape (8 × 3) by concatenating the feature vectors of the eight patches. The screenshot-level feature vector F(x_ij) = <p_ij^1, p_ij^2, p_ij^3> of a screenshot x_ij was then mapped by

p_{ij}^{c} = \sum_{m=1}^{8} p_{ijm}^{c},
(2)

where c ∈ {1, 2, 3} is the channel of F(x_ij). Similarly, the slide-level feature vector F(X_i) = <P_i^1, P_i^2, P_i^3> of a slide X_i was mapped by

P_{i}^{c} = \sum_{j=1}^{N} p_{ij}^{c},
(3)

where N is the number of screenshots cropped from the slide X_i. The category with the highest probability was regarded as the final diagnostic category D(X_i) of the input slide X_i:

D(X_i) = \arg\max\left[F(X_i)\right] = \arg\max\left(P_i^1, P_i^2, P_i^3\right),
(4)

where P_i^1, P_i^2, and P_i^3 represent the confidence probabilities that X_i is predicted to be GIST, schwannoma, or leiomyoma, respectively. Voting and mapping are the two modes of the hierarchical feature representation strategy embedded in the STT-BOX system. Through this strategy, the slide diagnosis was obtained quickly without IHC data.
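Both fusion modes reduce to a few array operations; the sketch below is an illustrative reconstruction of Equations 2 to 4 (array shapes and variable names are assumptions):

```python
# Hypothetical sketch of the hierarchical fusion. `patch_probs` holds the softmax
# outputs for one slide with shape (N screenshots, 8 patches, 3 classes);
# channel order: 0 = GIST, 1 = schwannoma, 2 = leiomyoma.
import numpy as np

def mapping_mode(patch_probs):
    screenshot_feat = patch_probs.sum(axis=1)      # Eq. 2: (N, 3)
    slide_feat = screenshot_feat.sum(axis=0)       # Eq. 3: (3,)
    return int(np.argmax(slide_feat)), slide_feat  # Eq. 4

def voting_mode(patch_probs):
    patch_votes = patch_probs.argmax(axis=2)                      # (N, 8)
    shot_votes = np.array([np.bincount(v, minlength=3).argmax()   # plurality per screenshot
                           for v in patch_votes])
    slide_counts = np.bincount(shot_votes, minlength=3)
    return int(slide_counts.argmax()), slide_counts / slide_counts.sum()
```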

      Prediction on Soft Tissue Sarcoma of TCGA

The authors selected 235 WSIs of soft tissue sarcomas from the TCGA data set. Each WSI showed the tumor area and peritumoral tissues, including skeletal muscle, smooth muscle, collagen fiber, nerve, or fat. The aim was to validate the ability of the STT-BOX system to distinguish and outline malignant areas from normal areas in all images. Specifically, each soft tissue sarcoma WSI was cropped into 512 × 512-pixel patches with a stride of 256 pixels before being input to the feature extractor trained on Dataset-P. The system generated a probability p_i (p_i ∈ [0, 1]) for each patch i to represent the similarity of that patch to GIST in the training data. The predictions were spliced according to the cropping locations of the input patches, yielding a heat map that highlighted the malignant lesions. To avoid missing small lesions, the predicted probabilities p_i of the patches belonging to a WSI were sorted, and the M patches with the highest probabilities among those with a foreground ratio r_i% > 20% constituted the region-of-interest set {p_i | r_i% > 20%}. The mean of the predicted probabilities of the regions of interest was regarded as the likelihood P that the WSI contained a region with high similarity to GIST:

P = \frac{1}{M}\sum_{i=1}^{M} \left\{p_i \mid r_i\% > 20\%\right\}.
(5)

When P > 50%, the tested WSI was regarded as a positive sample containing the target area.
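A compact sketch of this whole-slide scoring (Equation 5) is shown below; the value of M is not specified in the text and is treated here as a free parameter:

```python
# Hypothetical sketch of Equation 5: keep patches with >20% tissue foreground,
# take the M highest GIST probabilities, and average them; flag the WSI when P > 0.5.
import numpy as np

def wsi_probability(patch_probs, fg_ratios, M=100, fg_thresh=0.20):
    patch_probs = np.asarray(patch_probs, dtype=float)
    keep = patch_probs[np.asarray(fg_ratios) > fg_thresh]    # foreground filter
    top = np.sort(keep)[-M:] if keep.size > M else keep      # M most GIST-like patches
    P = float(top.mean()) if top.size else 0.0
    return P, P > 0.5
```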

      Prediction on Ovarian SCSTs

To further validate the system's capability of capturing the high-dimensional morphologic features of malignant SCSTs in the ovary, the authors selected three common ovarian SCSTs with spindle cell morphology: TCTs, AGCTs, and SLCTs. Specifically, the WSIs from Dataset-O were cropped into 512 × 512-pixel patches with a stride of 256 pixels before being input to the system with the model trained on Dataset-P. After feature extraction, each patch obtained a probability value p_g representing the similarity between the input patch and GIST patches; the dissimilarity could then be represented by p̄_g = 1 − p_g. The similarity score S between a WSI and GIST was obtained by aggregating the p_g of all corresponding patches and was defined as the ratio of similarity to dissimilarity between the input WSI features and GIST features:

S = \frac{(p_{g1} r_1\% + p_{g2} r_2\% + \cdots + p_{gn} r_n\%)/n}{(\bar{p}_{g1} r_1\% + \bar{p}_{g2} r_2\% + \cdots + \bar{p}_{gn} r_n\%)/n} = \frac{\sum_{i=1}^{n} p_{gi}\, r_i\% / n}{\sum_{i=1}^{n} \bar{p}_{gi}\, r_i\% / n},
(6)

where n is the total number of patches cropped from the corresponding WSI. The more patches in a WSI resembled GIST, the larger S was; conversely, when most tissue in the WSI was unlike GIST, S was relatively small.
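Equation 6 likewise reduces to a foreground-weighted ratio; the sketch below adds a small epsilon to guard against division by zero, which is an implementation assumption:

```python
# Hypothetical sketch of Equation 6: ratio of foreground-weighted GIST similarity
# to dissimilarity over all patches of a WSI.
import numpy as np

def similarity_score(p_g, fg_ratios, eps=1e-8):
    p_g = np.asarray(p_g, dtype=float)            # per-patch GIST probabilities
    w = np.asarray(fg_ratios, dtype=float)        # per-patch foreground ratios r_i%
    num = np.sum(p_g * w)                         # similarity term
    den = np.sum((1.0 - p_g) * w) + eps           # dissimilarity term
    return num / den                              # S > 1 suggests a GIST-like (malignant) WSI
```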

      Results

      Performance Evaluation of the Hierarchical Strategy

Three-fold cross-validation was conducted on Dataset-P. All experiments were performed with finely tuned color transformation to avoid a distribution shift between the training set and the test set (Figure 1C and Supplemental Figure S4). The areas under the curve (AUCs) of the different networks were compared via box plots (Figure 2A). The screenshot- and slide-level features were obtained through the mapping mode of the hierarchical feature representation strategy. Feature representations at different levels capture different fields of view: a patch has the smallest field of view and a slide the largest. Although the networks differed in performance, the average results of all models increased as the field of view expanded. The 15 box plots of Inception-v3 (Szegedy et al) and ResNet-18/34/50/101 (He et al) (Figure 2A), the precision-recall curves (Supplemental Figure S5), and the confusion matrices (Supplemental Figure S6) suggest that the hierarchical feature representation strategy is stable and suitable for slide-level H&E image diagnosis of GIST, schwannoma, and leiomyoma.
Figure 2: Results of cross-validation on Dataset-P. A: Box plots of the three-fold cross-validation on Dataset-P. The three rows show the areas under the curve (AUCs) of gastrointestinal stromal tumor (GIST), schwannoma, and leiomyoma, respectively. The five columns represent the AUCs of five different network structures as the feature extractor. Each plot compares the performance of the feature representation at different levels, including patches, screenshots, and slides. The maximum, minimum, and median values of each box are assigned by the AUCs of the three-fold cross-validation, and the quartile lines represent the means of adjacent AUCs. Regardless of the network structure of the feature extractor, our hierarchical feature representation strategy combined with the slide features achieves the best results. Pink indicates patch; green, screenshot; and yellow, slide. B and C: Comparison of feature extractors with different network structures on Dataset-P. All networks are compared according to the patch-level features without the hierarchical feature representation strategy. The three rows are results of GIST, schwannoma, and leiomyoma, respectively. B: Receiver operating characteristic (ROC) curves of the three-fold cross-validation. Each plot consists of five ROC curves of different networks as feature extractors, and the key difference areas are magnified and highlighted in the center. C: Box plots with AUCs of five different convolutional neural networks (CNNs). The maximum, minimum, and median value (orange line) for each box are assigned by AUCs of the three-fold cross-validation; the quartile lines are the means of the adjacent AUCs; the green points are the mean values of the cross-validation. CNNs perform much better on the diagnoses of GIST and leiomyoma than on schwannoma. ResNet-50 and ResNet-101 predict schwannoma more accurately than ResNet-18, ResNet-34, and Inception-v3. Yellow indicates ResNet-101; orange, ResNet-50; green, ResNet-34; purple, ResNet-18; and blue, Inception-v3.

      Comparison of AI Network Structures

Although many CNNs can be used as feature extractors, subtle differences in classification performance still exist. The receiver operating characteristic curves of the three-fold cross-validation and the average AUCs of the different networks are analyzed in Figure 2. To isolate the performance differences attributable to network structure, only patch-level results were compared. For the classification of GIST, ResNet-101 showed obvious advantages in fold 1 and fold 2 and achieved the best overall performance (Figure 2B). The average performance of ResNet-50 and Inception-v3 was second only to ResNet-101. ResNet-34 performed best in fold 3 but lacked robustness (Figure 2, B and C). Diagnosis of schwannoma was challenging for all five networks, and the results of ResNet-101 and ResNet-50 were notably better than those of the other networks (Figure 2C).
The hierarchical mapping results of the three-fold cross-validation on Dataset-P with different CNNs as feature extractors are shown in Supplemental Table S4. According to the average performance of cross-validation, the accuracy of ResNet-101 and ResNet-50 was notably higher than that of the other networks. Although the average AUC of ResNet-101 was slightly higher, ResNet-50 achieved similar results with approximately half the parameters (Supplemental Figure S7), indicating that the feature extraction speed of ResNet-50 was roughly twice that of ResNet-101. Therefore, considering the efficiency of the CNNs comprehensively, ResNet-50 was selected as the optimal model for subsequent validation and analyses.

      Effect of Color Transformation

After conversion to the uniform hue, saturation, and value (HSV) color space, the mean value of each channel was computed to visually represent the color distribution. The red, blue, and green scatter points in Supplemental Figure S4 represent the channel means of the screenshots corresponding to GIST, schwannoma, and leiomyoma, respectively. The color distributions of the different categories of data in Dataset-T and Dataset-L were intertwined in HSV space and difficult to separate. To improve the robustness of the system to varied color distributions, the data in Dataset-P (ie, the training data) were expanded by random fluctuation of hue, saturation, brightness, and contrast. After 100 epochs of data augmentation, the distribution range of each channel was wider than before (Supplemental Figure S4). The system thereby reduced its dependence on any single, incidental color distribution by learning the characteristics of multiple color distributions of the same image. Supplemental Figure S8 compares the confusion matrices and receiver operating characteristic curves on the cross-cohort Dataset-T and Dataset-L when the models were trained with or without color transformation.
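The per-screenshot channel means used for these scatter plots can be computed directly from the HSV conversion; a minimal sketch (OpenCV is assumed here purely for illustration):

```python
# Hypothetical sketch: mean hue/saturation/value of a screenshot, as used for the
# color-distribution scatter plots (Supplemental Figure S4).
import cv2
import numpy as np

def hsv_channel_means(rgb_image):
    """rgb_image: uint8 array of shape (H, W, 3) in RGB order."""
    hsv = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2HSV)
    return hsv.reshape(-1, 3).mean(axis=0)   # (mean H, mean S, mean V)
```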

      Robust Performance in Two External Cross-Cohort Validations

Dataset-T and Dataset-L were used for cross-cohort validation of the generalization performance of the system. All data in Dataset-P were used to train the ResNet-50 feature extractor. The cases of Dataset-T and Dataset-L came from different hospitals than those of Dataset-P and were therefore stained and imaged with entirely different pipelines, ensuring the authenticity and reliability of the cross-cohort validation. The hierarchical feature representation strategy, in both the voting mode and the mapping mode, was used to determine the predicted label. Supplemental Table S5 shows the performance of the STT-BOX system on the cross-cohort data sets. Voting is an intuitive feature fusion strategy but is susceptible to interference from complex samples. By embedding the confidence probabilities of the feature extractor step by step, mapping increases the weights of simple samples and reduces those of difficult samples, thereby improving feature fusion. Briefly, the features of the large field of view were obtained by averaging the features of the small field of view after regularization. On both data sets, mapping performed slightly better than voting at the screenshot level (Supplemental Table S5), and both voting and mapping showed notable advantages at the slide level compared with the patch level. The robust cross-cohort performance on Dataset-T and Dataset-L demonstrated the feasibility and generalization of the algorithm for the automatic diagnosis of GIST, schwannoma, and leiomyoma. Because the overall performance of the mapping strategy was slightly better than that of the voting strategy, subsequent experiments adopted the mapping strategy as the basis of WSI feature aggregation.

      Comparison of the STT-BOX to Pathologists

Seven pathologists were invited to diagnose the slides in Dataset-T and Dataset-L. Three had 3 to 5 years of experience, and the other four had >10 years of experience. All pathologists diagnosed the slides independently. For an objective and fair comparison, the seven pathologists were required to diagnose using only H&E-stained slides, without auxiliary IHC information. The pathologists also summarized their main diagnostic criteria and discussed possible interpretation criteria for the algorithm (Supplemental Table S6). The receiver operating characteristic curves of the system for the diagnosis of GIST, schwannoma, and leiomyoma, together with the sensitivity and specificity of each pathologist's actual diagnoses, are shown in Figure 3, A and B. The system achieved an AUC of 100% for the diagnosis of GIST and schwannoma in Dataset-T (Figure 3A). Although three pathologists were competitive with the system in the diagnosis of leiomyoma, the average sensitivity was 54.76%. The system achieved an AUC of 100% for the diagnosis of GIST, schwannoma, and leiomyoma in Dataset-L (Figure 3B). For the diagnosis of schwannoma in Dataset-L, although the average specificity of the pathologists' diagnoses reached 82.86%, their average sensitivity was 24.49%. Compared with the pathologists, the STT-BOX system was more stable and discriminative on both data sets.
Figure 3: Comparison of our automatic diagnosis STT-BOX system with seven pathologists on Dataset-T and Dataset-L. A and B: The receiver operating characteristic curves of our system are displayed as blue lines. Each red point is determined by the specificity and sensitivity of a pathologist, and the green diamond is the average performance of the seven pathologists. Our system outperformed most pathologists on Dataset-T (A) and was superior to all pathologists on Dataset-L (B). The yellow lines are the chance lines, which represent the performance of a random classifier. C: Confusion matrices of the seven pathologists and our system. The bottom center matrix is the ideal confusion matrix. The experienced pathologists D/E/F/G diagnosed more accurately than the junior pathologists A/B/C. Our STT-BOX system reached the highest accuracy on both Dataset-T and Dataset-L. AUC, area under the curve; GIST, gastrointestinal stromal tumor.
The confusion matrices in Figure 3C compare the accuracy of the seven pathologists with that of the system. To fairly measure diagnostic bias, the frequencies of diagnosis were normalized by the number of true labels, so that each row of a confusion matrix summed to one; ideally, the confusion matrix is the identity matrix when all diagnoses are correct. Pathologists A/B/C were the junior pathologists with less experience, and their accuracy was far inferior to that of the experienced pathologists D/E/F/G on both Dataset-T and Dataset-L. The accuracy of the system was notably higher than that of the seven pathologists on both data sets. On Dataset-T, the algorithm achieved 79.0% accuracy, comparable to the best accuracy of the seven pathologists. On Dataset-L, the algorithm achieved 90.6% accuracy, whereas the best accuracy among the seven pathologists was 75%. The confusion matrix on Dataset-L of pathologist G, who had >10 years of experience, was a lower triangular matrix: pathologist G was highly sensitive to the diagnosis of GIST and did not miss any case of GIST, but overdiagnosed 100% of schwannomas as GIST. Without IHC information, it appears challenging for pathologists to diagnose GIST, schwannoma, and leiomyoma from H&E-stained slides alone. The STT-BOX system may capture typical features and thus obtained high accuracy. Because the AUCs of the system for the diagnosis of GIST, schwannoma, and leiomyoma on the two cross-cohort data sets were close to 100%, the STT-BOX system may approach the auxiliary diagnostic performance of IHC when an appropriate threshold is selected.
To observe the morphologic features that the system attends to during diagnosis, the authors used Grad-CAM (Selvaraju et al) to visualize the regions of interest of the system when predicting Dataset-L. The diagnostic criteria listed and ranked by the pathologists (Supplemental Table S6) overlapped with the features attended to by the algorithm (Supplemental Figure S9); for example, both regard the density, shape, atypia, and polarity of spindle cells as key reference features. One possible difference is that pathologists, who are keenly aware of the malignant behavior of GIST, are biased toward diagnosing GIST, resulting in higher sensitivity but lower specificity for GIST; in contrast, because confusing leiomyoma with schwannoma has no adverse clinical consequences, the sensitivity or specificity for those diagnoses remained low.

      Prediction Performance on Soft Tissue Sarcoma of TCGA

The STT-BOX system clearly showed satisfactory performance in identifying GIST among two benign STTs with histopathologic similarity, suggesting that the system captures high-dimensional features of GIST. Next, the authors investigated the performance of the system for image recognition in other common sarcomas. After selection from the TCGA data set, 235 WSIs from six types of sarcoma originating in bone and soft tissue were enrolled for validation. Although these tissues were not included in the training data, the system still measured the presence of regions in each WSI that were highly similar to GIST, generating a probability P for each WSI to estimate the likelihood of containing the target areas (Equation 5). The overall diagnostic sensitivity was 93.19% (leiomyosarcoma, 88.54%; dedifferentiated liposarcoma, 98.15%; undifferentiated pleomorphic sarcoma, 93.18%; myxofibrosarcoma, 100%; synovial sarcoma, 100%; and malignant peripheral nerve sheath tumor, 88.89%), showing that the system correctly diagnosed most of the sarcomas despite no targeted training (Supplemental Figure S10). Although the probability P can indicate candidate malignant WSIs, tumor areas, especially suspected tumor residue at the surgical margin, must be mapped in clinical practice to assist pathologists in rapid diagnosis. Thus, heat maps were also generated to show which areas required additional attention. Figure 4 shows 10 example cases of soft tissue sarcoma, including two cases of leiomyosarcoma (Figure 4, A and B), two cases of undifferentiated pleomorphic sarcoma (Figure 4, C and D), two cases of myxofibrosarcoma (Figure 4, E and F), one case of dedifferentiated liposarcoma (Figure 4G), one case of synovial sarcoma (Figure 4H), and two cases of malignant peripheral nerve sheath tumor (Figure 4, I and J). In these 10 cases, all tumor cells, whether in the tumor core or at the invasion edge, are clearly highlighted; even small lesions composed of only a few tumor cells (Figure 4G) were found and highlighted. In addition, areas of hemorrhage, necrosis, and degenerated collagenous stroma are suppressed. In Figure 4, C, E, and H, the nearest distance of tumor cells to the margin, routinely marked with colored ink, can also be observed. Adjacent normal tissues, including muscle, nerve, blood vessel, fiber, adipose, and skin, were left unmarked by the system.
Figure 4: Visualization of soft tissue sarcomas from The Cancer Genome Atlas data set. A–J: Heat maps of 10 representative cases of soft tissue sarcoma. A: In one case of leiomyosarcoma (LMS) that originated in the retroperitoneal area, tumor cells were clearly highlighted, and the edematous stroma (A1) and intratumoral necrotic foci were unmarked (A2). B: In another case of LMS that originated in the pelvis and retroperitoneal area, tumor cells were clearly highlighted, and the intratumoral hemorrhage was unmarked (B1). C: In one undifferentiated pleomorphic sarcoma (UPS) in the left arm, the necrotic foci were unidentified, whereas the tumor cells were clearly delineated (C1). D: In another case of UPS in the left posterior thigh, tumor cells and surrounding adipose tissue can be clearly distinguished (D1). E: In one myxofibrosarcoma (MFS) in the right thigh, the tumor cells were clearly highlighted, whereas the collagen stroma was not outlined. F: In another case of MFS in the right pretibia, tumor cells were clearly outlined, whereas the degenerated stroma among the nodules was unmarked (F1). The invasion edge of the tumor into the neighboring striated muscle tissue can be clearly displayed (F2). G: In one dedifferentiated liposarcoma (DDLPS) in the retroperitoneal area, invasion of tumor cells into the adipose tissue can be observed. Tumor cells millimeters away from the tumor core (G1), scattered in the adipose tissue (G2), and in the intratumoral area (G3) can be clearly highlighted. H: In one synovial sarcoma (SS) in the left foot, the tumor area was clearly delineated, whereas the surrounding skin, muscle, and adipose tissues are unmarked. I: In one malignant peripheral nerve sheath tumor (MPNST) in the right lower back, tumor cells were clearly outlined, whereas the collagenous pseudomembrane was unmarked (I1). In the tumor area, except for small areas of hemorrhage and necrosis, the tumor cells were highlighted (I2). J: In another case of MPNST in the upper lobe of the left lung, except for the vessel and collagen stroma, all tumor cells were clearly outlined. The 10 whole-slide image probabilities P (Equation 5) were all >0.9, indicating that our system believed there were regions similar to gastrointestinal stromal tumor. Our STT-BOX system seemed sensitive enough to define malignant soft tissue tumors.

      Prediction Performance on Ovarian SCSTs

As shown above, the present system demonstrated excellent feature extraction ability and robustness on benign and malignant STTs. To verify its generalization, the authors next transferred the system to the reproductive system, selecting three common ovarian SCSTs with spindle cell morphology and histopathologic similarity. A data set (Dataset-O) containing one type of benign SCST, TCT (10 cases), and two types of malignant SCSTs, AGCT (10 cases) and SLCT (10 cases), was used to test the system (Supplemental Table S1 and Supplemental Figure S11). The system, with its core model trained on Dataset-P and without fine-tuning, was used directly to classify benign and malignant lesions by predicting the similarity of each slide to GIST; slides similar to GIST were regarded as malignant. Figure 5 shows the visualization results with the similarity score S for quantitative comparison: the more tissue in a slide the system judged to contain GIST-like features, the larger S was. Compared with TCT (85% of cases with S < 1), more cases of AGCT (75% with S > 1) and SLCT (90% with S > 1) contained GIST-like features. The AUC on the SCST data set was 88.83%. Supplemental Figure S12 shows the precision-recall curve and the confusion matrix of the prediction. These results indicate that the system trained to predict GIST has the potential to reveal the malignant characteristics of ovarian SCSTs.
Figure 5: Prediction performance on ovarian sex-cord stromal tumors (SCSTs). A: A brief description of the visualization of the diagnosis of ovarian SCSTs. Slides of each patient and the corresponding results are framed together, with hematoxylin-and-eosin–stained slides (left side) and predicted results (right side). In the predicted results, red regions were predicted by our STT-BOX system as similar to gastrointestinal stromal tumor (GIST), whereas blue regions were considered dissimilar to GIST. S below the slides is the degree of malignancy of the slide, obtained by Equation 6. The larger S is, the more similar the slide is to GIST and the more likely it is to be malignant. B: The receiver operating characteristic curve of ovarian SCST malignancy prediction. C: Visualization of the prediction results of theca cell tumor (TCT), adult granulosa cell tumor (AGCT), and Sertoli-Leydig cell tumor (SLCT). A total of 85% of TCT slides were predicted to be dissimilar to GIST, whereas 75% of AGCT slides and 90% of SLCT slides were predicted to be similar to GIST. This prediction is consistent with the tumor nature: TCT is benign, and AGCT and SLCT are malignant. Scale bars = 4000 μm (A and C). AUC, area under the curve; WSI, whole-slide image.

      Discussion

In the present study, we developed the first compact CNN-based hierarchical feature representation tool for distinguishing a variety of malignant and benign mesenchymal tumors originating in the digestive system, soft tissue and bone, and the reproductive system, by comparison with the histopathologic features of GIST.
      Our selection of initial system construction stems from recent advances. Because CNNs can capture numbers of features of tissue structure and cell morphology, deep learning has shown excellent accuracy and robustness in lesion detection, histopathologic diagnosis, and automatic grading of H&E-stained images of common carcinomas.
      For example, network structures, such as Micro-Net
      and Hover-Net,
were specially designed to separate nuclei in multitissue histology images. In addition, some algorithms aim to accurately delineate the shape of glands, especially the separation and instance segmentation of overlapping or adjacent glands.
Collectively, these networks seem not only to extract the features of malignant epithelial components and surrounding stroma at low magnification but also to capture the fine features of cells at high magnification. We therefore asked whether such models could be used for image recognition of tumors from various mesenchymal tissues. Until now, the only breakthrough came from a pan-cancer study in which 206 cases of soft tissue sarcomas in the TCGA data set were selected for training.
      A generalized linear model could recognize both these soft tissue sarcomas and 27 types of carcinomas. This suggested that CNNs could be applied to STTs.
      To adapt to our STT task, CNNs
that have achieved outstanding performance in AI diagnosis of epithelial tumors were adopted and modified. After a finely tuned color transformation, these networks served as feature extractors to capture high-dimensional local features of the tumoral histopathology. We then proposed a hierarchical feature representation strategy to gradually aggregate multiscale information. This hierarchical mapping strategy can be transferred broadly to other AI tasks involving superresolution medical images. Compared with pathologists, our STT-BOX system distinguished GIST from the two other benign tumors with morphologic resemblance. Next, we applied the system to identify malignant lesions of six types of soft tissue sarcomas in the TCGA data set. Our system clearly identified and outlined the tumor areas, which formed a sharp contrast with the peritumoral areas and the surrounding normal tissues. Secondary tumor changes, including intratumoral hemorrhage and necrosis, were not outlined. Collectively, this suggests that our system can discover accurate features of viable tumor cells; even small lesions composed of only a few tumor cells could be highlighted. Such high sensitivity and specificity enable the STT-BOX system to be used more widely in judging benign versus malignant STTs and in evaluating surgical margin specimens and biopsy samples. Finally, without any adjustment, STT-BOX was able to identify benign and malignant images of ovarian SCSTs. Collectively, these findings suggest that the STT-BOX system may effectively and robustly distinguish the characteristics of malignant mesenchymal tumors from those of benign ones in terms of cell polarity, density, and atypia. Although the tumors originated from different organs, the system could still determine the nature of the lesions. GIST may therefore be an appropriate watershed for distinguishing benign from malignant STTs with spindle cell morphology: the more an STT resembles GIST, the more likely it is to have malignant features; conversely, the greater the difference between an STT and GIST, the more likely it is to be benign.
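To make the hierarchical feature representation idea more concrete, the sketch below shows one way a patch-to-screenshot-to-slide aggregation could be wired around a ResNet-50 backbone in PyTorch. It is a simplified illustration under stated assumptions, not the STT-BOX implementation: the mean-pooling aggregation between levels, the two-class linear head, and the input tensor layout are hypothetical choices made for brevity.

```python
# Minimal sketch (assumptions, not the authors' implementation): a ResNet-50
# backbone used as a patch-level feature extractor, with hierarchical mean
# pooling to obtain screenshot-level and slide-level representations.
# Requires torchvision >= 0.13 for the weights enum.
import torch
import torch.nn as nn
from torchvision import models


class HierarchicalSTTClassifier(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
        # Keep all layers up to (and including) global average pooling; drop fc.
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.head = nn.Linear(2048, num_classes)  # assumed GIST vs non-GIST head

    def forward(self, patches):
        # patches: (n_screenshots, n_patches, 3, 224, 224) cropped from one slide
        s, p, c, h, w = patches.shape
        x = self.features(patches.view(s * p, c, h, w)).flatten(1)  # patch features
        screenshot_feat = x.view(s, p, -1).mean(dim=1)              # patch -> screenshot
        slide_feat = screenshot_feat.mean(dim=0, keepdim=True)      # screenshot -> slide
        return self.head(slide_feat)                                # slide-level logits


model = HierarchicalSTTClassifier().eval()
dummy_slide = torch.randn(4, 16, 3, 224, 224)  # 4 screenshots x 16 patches each
with torch.no_grad():
    logits = model(dummy_slide)
print(logits.shape)  # torch.Size([1, 2])
```

In this sketch the aggregation is a simple mean at each level; attention-based or learned pooling could be substituted without changing the overall patch–screenshot–slide hierarchy.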
Our study still has some limitations. The cohort data set used in this study is relatively small compared with those used in AI-assisted pathologic diagnosis studies of carcinoma types with high morbidity and mortality. However, this is more consistent with the overall incidence ratio of epithelial to mesenchymal tumors and with the actual caseload of most pathologists.
The existing results show that even a deep learning model trained on limited samples can offer considerable generalization and robustness. We speculate that a small, well-cropped data set is sufficient to capture the morphologic variation of STTs and the technical variation introduced during histopathologic staining and slide preparation. However, the mechanistic explanation of the detailed features extracted by the algorithm requires further research, and our system needs validation on additional data sets before clinical application. At present, benign and malignant STTs from different primary organs and tissues have been included in our study; however, STTs also include borderline tumors. In addition, further differentiation of the histologic subtypes of malignant STTs has a great impact on clinical treatment strategy. The six sarcoma entities from the TCGA data set differ in genetic characteristics, tissue characteristics, immune microenvironment, and response to immunotherapy.
Our STT-BOX is only a first step toward distinguishing malignant from benign STTs using H&E-stained images alone. In the future, we need to focus on the fine tissue microenvironment and cellular characteristics of these tumors and to further refine and upgrade the STT-BOX system by incorporating genomic and prognostic data. The system also needs to be verified on a larger number of samples and tumor types to make it more stable and generalizable.
In summary, our study realized a system called STT-BOX for diagnosing a wide variety of human mesenchymal tumors. We first designed a deep learning–based system for the automatic diagnosis of GIST from H&E-stained images and then verified its accuracy in additional types of mesenchymal tumors. STT-BOX may greatly improve the pathologic diagnosis of STTs originating in different organs, prevent misdiagnosis by inexperienced pathologists and by departments lacking IHC and genetic testing capabilities, and ultimately support clinical treatment and patient prognosis.

      Supplemental Data

Supplemental Figure S1: The tool interface of STT-BOX. Pathologists can view multiscale hematoxylin-and-eosin (H&E)–stained slides through STT-BOX, and the benign/malignant results of the artificial intelligence (AI) diagnosis can be checked selectively. The AI diagnosis results clearly contrast the malignant degree of the lesions through color changes. STT-BOX thus helps pathologists quickly locate suspected malignant soft tissue tumor (STT) lesion areas using only H&E whole-slide images.
Supplemental Figure S2: Statistics of the number of patches cropped from each slide in the training set (Dataset-P).
Supplemental Figure S3: Visualization examples of prediction results in Dataset-T and Dataset-L. A–C and G–I: Each hematoxylin-and-eosin slide was annotated through immunohistochemical detection as one single label: gastrointestinal stromal tumor (GIST), schwannoma, or leiomyoma. D–F and J–L: The prediction results obtained by the system with ResNet-50 as the feature extractor. According to the color bars, the more confident the system is, the darker the color is.
Supplemental Figure S4: Color distribution in hue, saturation, and value (HSV) color space of Dataset-T, Dataset-L, Dataset-P, and Dataset-P∗, the latter composed of the samples of Dataset-P after color transformation. The mean values of the three channels (HSV) of each sample were calculated, and each sample corresponds to a scatter point in the plots. Red, blue, and green dots indicate that the corresponding samples belong to gastrointestinal stromal tumor (GIST), schwannoma, and leiomyoma, respectively. The first three rows show the color distributions of the samples in Dataset-P, Dataset-T, and Dataset-L, respectively. The last row shows the color distribution of the training samples (Dataset-P) after color transformation. With color transformation, the feature extractor could focus more on semantic features without being impeded by the color distribution of the training data.
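The per-sample color statistics described in this figure can be reproduced in a few lines. The following is a minimal, illustrative sketch rather than the authors' code: the file paths, the grouping by diagnosis, and the choice to plot mean hue against mean saturation are placeholders and assumptions.

```python
# Minimal sketch (illustrative, not the authors' code): compute the mean HSV
# values of each image patch and plot one scatter point per sample, in the
# spirit of Supplemental Figure S4. File paths below are placeholders.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import rgb_to_hsv
from PIL import Image


def mean_hsv(path):
    """Return the (hue, saturation, value) means of one RGB image patch."""
    rgb = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
    return rgb_to_hsv(rgb).reshape(-1, 3).mean(axis=0)


samples = {  # placeholder paths grouped by diagnosis
    "GIST": ["gist_patch_001.png"],
    "schwannoma": ["sch_patch_001.png"],
    "leiomyoma": ["lei_patch_001.png"],
}
colors = {"GIST": "red", "schwannoma": "blue", "leiomyoma": "green"}

for label, paths in samples.items():
    hsv = np.array([mean_hsv(p) for p in paths])
    plt.scatter(hsv[:, 0], hsv[:, 1], c=colors[label], label=label, s=10)
plt.xlabel("mean hue")
plt.ylabel("mean saturation")
plt.legend()
plt.show()
```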
Supplemental Figure S5: Precision-recall (PR) curves of gastrointestinal stromal tumor classification. The five rows list the curves of Inception-v3, ResNet-18, ResNet-34, ResNet-50, and ResNet-101, respectively; the three columns list the curves of the three folds. Each PR plot contains the precision and recall of the patch-, screenshot-, and slide-level feature representations. All screenshot- and slide-level features were captured by the mapping mode of the hierarchical feature representation strategy. Regardless of which of the five convolutional neural networks was selected as the feature extractor, the screenshot-level feature representation achieved a higher AUPRC than the patch-level representation, and, similarly, the slide-level mode with its broader field of view always achieved a higher AUPRC than the screenshot-level mode.
Supplemental Figure S6: Confusion matrices of cross-validation and cross-cohort validation. A: Confusion matrices of three-fold cross-validation on Dataset-P at patch, screenshot, and slide levels. B: Confusion matrices of cross-cohort validation on Dataset-T and Dataset-L at patch, screenshot, and slide levels.
Supplemental Figure S7: Comparison of the performance and parameter counts of the five convolutional neural networks (Inception-v3, ResNet-18, ResNet-34, ResNet-50, and ResNet-101). The red points represent the average performance of cross-validation on Dataset-P. The areas of the circles centered on the red points are proportional to the number of network parameters. The accuracy and area under the curve (AUC) of ResNet-50 and ResNet-101 are similar and higher than those of Inception-v3, ResNet-18, and ResNet-34, whereas ResNet-50 has substantially fewer parameters than ResNet-101. GIST, gastrointestinal stromal tumor.
Supplemental Figure S8: Performance of color transformation. A and B: Confusion matrices and receiver operating characteristic curves in cross-cohort validation on Dataset-T and Dataset-L. A: The inference model was trained without color transformation. B: The inference model was trained with color transformation. The model trained with color transformation shows superior performance, with high average areas under the curve (AUCs) on both data sets. GIST, gastrointestinal stromal tumor; LEI, leiomyoma; SCH, schwannoma.
Supplemental Figure S9: The regions of interest for STT-BOX when diagnosing Dataset-L. These maps were generated with Grad-CAM. A–D: In gastrointestinal stromal tumor cases, bundles of regularly arranged spindle cells are highlighted, in both the long (A–C) and short (D) axis directions. A, C, and D: Dense arrangements of nuclei and cytoplasmic processes appear to be highlighted more readily. A–C: The vacuoles in tumor cells (A–C) and degenerated stroma (B) were not identified by the algorithm. E and F: In schwannoma cases, wavy, regularly arranged spindle cells are clearly identified, and a portion of the long, spindle-shaped, sparsely spaced nuclei is highlighted. G and H: In leiomyoma cases, bundles of spindle cells are highlighted; long, straight cytoplasmic processes appear to be identified more readily. Collectively, the algorithm appears to focus on local region features and on the nuclei and cytoplasm of tumor cells to make its judgments, and then synthesizes the various regions in the whole-slide image to give a quantitative result.
Supplemental Figure S10: The prediction score P of each sample in The Cancer Genome Atlas (TCGA) data set. For the six types of sarcomas in the TCGA data set, the scores P of the samples of each type are listed in descending order. With the threshold for P set to 0.5, the correctly predicted samples are shown in orange in each subfigure, and the incorrectly predicted samples are shown in yellow. The proportions of correctly predicted sarcomas for each type in the TCGA data set are as follows: leiomyosarcoma (LMS), 88.54%; dedifferentiated liposarcoma (DDLPS), 98.15%; undifferentiated pleomorphic sarcoma (UPS), 93.18%; myxofibrosarcoma (MFS), 100%; synovial sarcoma (SS), 100%; and malignant peripheral nerve sheath tumor (MPNST), 88.89%.
Supplemental Figure S11: Data examples in Dataset-O. Dataset-O contained hematoxylin-and-eosin (H&E) slides diagnosed by pathologists as ovarian sex-cord stromal tumors, including theca cell tumors (TCTs), adult granulosa cell tumors (AGCTs), and Sertoli-Leydig cell tumors (SLCTs), with the assistance of immunohistochemistry (IHC; FOXL2, SF-1, α-inhibin, and WT1) and detection of FOXL2 and DICER1 gene mutations. Scale bars = 4000 μm.
Supplemental Figure S12: Precision-recall (PR) curve and confusion matrix of the prediction on ovarian sex-cord stromal tumors. A: PR curve of the prediction on Dataset-O. The AUPRC is 0.93. At point M, the precision is 0.90 and the recall is 0.87. B: The confusion matrix of the prediction on Dataset-O with the threshold corresponding to point M in A.

      References

        • Esteva A.
        • Kuprel B.
        • Novoa R.A.
        • Ko J.
        • Swetter S.M.
        • Blau H.M.
        • Thrun S.
        Dermatologist-level classification of skin cancer with deep neural networks.
        Nature. 2017; 542: 115-118
        • Bejnordi B.E.
        • Veta M.
        • Van Diest P.J.
        • Van Ginneken B.
        • Karssemeijer N.
        • Litjens G.
        • Van Der Laak J.A.
        • Hermsen M.
        • Manson Q.F.
        • Balkenhol M.
        Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer.
        JAMA. 2017; 318: 2199-2210
        • Le H.
        • Gupta R.
        • Hou L.
        • Abousamra S.
        • Fassler D.
        • Torre-Healy L.
        • Moffitt R.A.
        • Kurc T.
        • Samaras D.
        • Batiste R.
        Utilizing automated breast cancer detection to identify spatial distributions of tumor-infiltrating lymphocytes in invasive breast cancer.
        Am J Pathol. 2020; 190: 1491-1504
        • Li J.
        • Mi W.
        • Guo Y.
        • Ren X.
        • Fu H.
        • Zhang T.
        • Zou H.
        • Liang Z.
        Artificial intelligence for histological subtype classification of breast cancer: combining multi-scale feature maps and the recurrent attention model.
        Histopathology. 2022; 80: 836-846
        • Lin H.
        • Chen H.
        • Dou Q.
        • Wang L.
        • Qin J.
        • Heng P.-A.
ScanNet: a fast and dense scanning framework for metastatic breast cancer detection from whole-slide image.
        in: IEEE Winter Conference on Applications of Computer Vision: IEEE. 2018: 539-546
        • Litjens G.
        • Bandi P.
        • Ehteshami Bejnordi B.
        • Geessink O.
        • Balkenhol M.
        • Bult P.
        • Halilovic A.
        • Hermsen M.
        • van de Loo R.
        • Vogels R.
        1399 H&E-stained sentinel lymph node sections of breast cancer patients: the CAMELYON dataset.
        Gigascience. 2018; 7: giy065
        • da Silva L.M.
        • Pereira E.M.
        • Salles P.G.
        • Godrich R.
        • Ceballos R.
        • Kunz J.D.
        • Casson A.
        • Viret J.
        • Chandarlapaty S.
        • Ferreira C.G.
        Independent real-world application of a clinical-grade automated prostate cancer detection system.
        J Pathol. 2021; 254: 147-158
        • Ström P.
        • Kartasalo K.
        • Olsson H.
        • Solorzano L.
        • Delahunt B.
        • Berney D.M.
        • Bostwick D.G.
        • Evans A.J.
        • Grignon D.J.
        • Humphrey P.A.
        Artificial intelligence for diagnosis and grading of prostate cancer in biopsies: a population-based, diagnostic study.
        Lancet Oncol. 2020; 21: 222-232
        • Viswanathan V.S.
        • Toro P.
        • Corredor G.
        • Mukhopadhyay S.
        • Madabhushi A.
        The state of the art for artificial intelligence in lung digital pathology.
        J Pathol. 2022; 257: 413-429
        • Bouteldja N.
        • Hölscher D.L.
        • Klinkhammer B.M.
        • Buelow R.D.
        • Lotz J.
        • Weiss N.
        • Daniel C.
        • Amann K.
        • Boor P.
        Stain-independent deep learning–based analysis of digital kidney histopathology.
        Am J Pathol. 2023; 193: 73-83
        • Hermsen M.
        • Ciompi F.
        • Adefidipe A.
        • Denic A.
        • Dendooven A.
        • Smith B.H.
        • van Midden D.
        • Bräsen J.H.
        • Kers J.
        • Stegall M.D.
        Convolutional neural networks for the evaluation of chronic and inflammatory lesions in kidney transplant biopsies.
        Am J Pathol. 2022; 192: 1418-1432
        • Park J.
        • Jang B.G.
        • Kim Y.W.
        • Park H.
        • Kim B.-H.
        • Kim M.J.
        • Ko H.
        • Gwak J.M.
        • Lee E.J.
        • Chung Y.R.
A prospective validation and observer performance study of a deep learning algorithm for pathologic diagnosis of gastric tumors in endoscopic biopsies.
        Clin Cancer Res. 2021; 27: 719-728
        • Kumar N.
        • Verma R.
        • Chen C.
        • Lu C.
        • Fu P.
        • Willis J.
        • Madabhushi A.
        Computer-extracted features of nuclear morphology in hematoxylin and eosin images distinguish stage II and IV colon tumors.
        J Pathol. 2022; 257: 17-28
        • Echle A.
        • Grabsch H.I.
        • Quirke P.
        • van den Brandt P.A.
        • West N.P.
        • Hutchins G.G.
        • Heij L.R.
        • Tan X.
        • Richman S.D.
        • Krause J.
        Clinical-grade detection of microsatellite instability in colorectal tumors by deep learning.
        Gastroenterology. 2020; 159: 1406-1416.e11
        • Xu H.
        • Cha Y.J.
        • Clemenceau J.R.
        • Choi J.
        • Lee S.H.
        • Kang J.
        • Hwang T.H.
        Spatial analysis of tumor-infiltrating lymphocytes in histological sections using deep learning techniques predicts survival in colorectal carcinoma.
        J Pathol Clin Res. 2022; 8: 327-339
        • Kim Y.J.
        • Jang H.
        • Lee K.
        • Park S.
        • Min S.-G.
        • Hong C.
        • Park J.H.
        • Lee K.
        • Kim J.
        • Hong W.
        PAIP 2019: liver cancer segmentation challenge.
Med Image Anal. 2021; 67: 101854
        • Ywasa Y.
        • Fletcher C.
        • Flucke U.
        WHO Classification of Soft Tissue and Bone Tumours.
        2020
        • Burns J.
        • Brown J.M.
        • Jones K.B.
        • Huang P.H.
        The cancer genome atlas: impact and future directions in sarcoma.
        Surg Oncol Clin. 2022; 31: 559-568
        • Joensuu H.
        • Hohenberger P.
        • Corless C.L.
        Gastrointestinal stromal tumour.
        Lancet. 2013; 382: 973-983
        • Papke Jr., D.J.
        • Hornick J.L.
        Recent developments in gastroesophageal mesenchymal tumours.
        Histopathology. 2021; 78: 171-186
        • Joensuu H.
        • Vehtari A.
        • Riihimäki J.
        • Nishida T.
        • Steigen S.E.
        • Brabec P.
        • Plank L.
        • Nilsson B.
        • Cirilli C.
        • Braconi C.
        Risk of recurrence of gastrointestinal stromal tumour after surgery: an analysis of pooled population-based cohorts.
        Lancet Oncol. 2012; 13: 265-274
        • Joensuu H.
        Risk stratification of patients diagnosed with gastrointestinal stromal tumor.
        Hum Pathol. 2008; 39: 1411-1419
        • Karakas C.
        • Christensen P.
        • Baek D.
        • Jung M.
        • Ro J.Y.
        Dedifferentiated gastrointestinal stromal tumor: recent advances.
        Ann Diagn Pathol. 2019; 39: 118-124
        • Lazar A.J.
        • McLellan M.D.
        • Bailey M.H.
        • Miller C.A.
        • Appelbaum E.L.
        • Cordes M.G.
        • Fronick C.C.
        • Fulton L.A.
        • Fulton R.S.
        • Mardis E.R.
        Comprehensive and integrated genomic characterization of adult soft tissue sarcomas.
        Cell. 2017; 171: 950-965
        • Young R.H.
        Ovarian tumors: a survey of selected advances of note during the life of this journal.
        Hum Pathol. 2020; 95: 169-206
        • Karnezis A.N.
        • Cho K.R.
        • Gilks C.B.
        • Pearce C.L.
        • Huntsman D.G.
        The disparate origins of ovarian cancers: pathogenesis and prevention strategies.
        Nat Rev Cancer. 2017; 17: 65-74
        • Young R.H.
        Reflections on a 40-year experience with a fascinating group of tumors, including comments on the seminal observations of Robert E. Scully, MD.
        Arch Pathol Lab Med. 2018; 142: 1459-1485
        • Szegedy C.
        • Vanhoucke V.
        • Ioffe S.
        • Shlens J.
        • Wojna Z.
        Rethinking the inception architecture for computer vision.
        in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 2818-2826
        • He K.
        • Zhang X.
        • Ren S.
        • Sun J.
        Deep residual learning for image recognition.
        in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 770-778
        • Deng J.
        • Dong W.
        • Socher R.
        • Li L.-J.
        • Li K.
        • Fei-Fei L.
ImageNet: a large-scale hierarchical image database.
        in: IEEE Conference on Computer Vision and Pattern Recognition: IEEE. 2009: 248-255
        • Paszke A.
        • Gross S.
        • Massa F.
        • Lerer A.
        • Bradbury J.
        • Chanan G.
        • Killeen T.
        • Lin Z.
        • Gimelshein N.
        • Antiga L.
PyTorch: an imperative style, high-performance deep learning library.
        Adv Neural Inf Process Syst. 2019; 32
        • Goode A.
        • Gilbert B.
        • Harkes J.
        • Jukic D.
        • Satyanarayanan M.
        OpenSlide: a vendor-neutral software foundation for digital pathology.
        J Pathol Inform. 2013; 4: 27
        • Hunter J.D.
        Matplotlib: a 2D graphics environment.
        Comput Sci Eng. 2007; 9: 90-95
        • Selvaraju R.R.
        • Cogswell M.
        • Das A.
        • Vedantam R.
        • Parikh D.
        • Batra D.
Grad-CAM: visual explanations from deep networks via gradient-based localization.
in: Proceedings of the IEEE International Conference on Computer Vision. 2017: 618-626
        • Van Rijthoven M.
        • Balkenhol M.
        • Siliņa K.
        • Van Der Laak J.
        • Ciompi F.
        HookNet: multi-resolution convolutional neural networks for semantic segmentation in histopathology whole-slide images.
        Med Image Anal. 2021; 68: 101890
        • Raza S.E.A.
        • Cheung L.
        • Shaban M.
        • Graham S.
        • Epstein D.
        • Pelengaris S.
        • Khan M.
        • Rajpoot N.M.
        Micro-Net: a unified model for segmentation of various objects in microscopy images.
        Med Image Anal. 2019; 52: 160-173
        • Graham S.
        • Vu Q.D.
        • Raza S.E.A.
        • Azam A.
        • Tsang Y.W.
        • Kwak J.T.
        • Rajpoot N.
        Hover-Net: simultaneous segmentation and classification of nuclei in multi-tissue histology images.
Med Image Anal. 2019; 58: 101563
        • Sirinukunwattana K.
        • Pluim J.P.
        • Chen H.
        • Qi X.
        • Heng P.-A.
        • Guo Y.B.
        • Wang L.Y.
        • Matuszewski B.J.
        • Bruni E.
        • Sanchez U.
        Gland segmentation in colon histology images: the glas challenge contest.
        Med Image Anal. 2017; 35: 489-502
        • BenTaieb A.
        • Hamarneh G.
        Topology aware fully convolutional networks for histology gland segmentation.
        in: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2016: 460-468
        • Ding H.
        • Pan Z.
        • Cen Q.
        • Li Y.
        • Chen S.
        Multi-scale fully convolutional network for gland segmentation using three-class classification.
        Neurocomputing. 2020; 380: 150-161
        • Chen H.
        • Qi X.
        • Yu L.
        • Dou Q.
        • Qin J.
        • Heng P.-A.
        DCAN: deep contour-aware networks for object instance segmentation from histology images.
        Med Image Anal. 2017; 36: 135-146
        • Yan C.
        • Xu J.
        • Xie J.
        • Cai C.
        • Lu H.
        Prior-aware CNN with multi-task learning for colon images analysis.
        in: IEEE 17th International Symposium on Biomedical Imaging: IEEE. 2020: 254-257
        • Graham S.
        • Chen H.
        • Gamper J.
        • Dou Q.
        • Heng P.-A.
        • Snead D.
        • Tsang Y.W.
        • Rajpoot N.
        MILD-Net: minimal information loss dilated network for gland instance segmentation in colon histology images.
        Med Image Anal. 2019; 52: 199-211
        • Fu Y.
        • Jung A.W.
        • Torne R.V.
        • Gonzalez S.
        • Vöhringer H.
        • Shmatko A.
        • Yates L.R.
        • Jimenez-Linan M.
        • Moore L.
        • Gerstung M.
        Pan-cancer computational histopathology reveals mutations, tumor composition and prognosis.
        Nat Cancer. 2020; 1: 800-810
        • Kannan S.
        • Lock I.
        • Ozenberger B.B.
        • Jones K.B.
        Genetic drivers and cells of origin in sarcomagenesis.
        J Pathol. 2021; 254: 474-493