Quantitative Histomorphometric Features of Prostate Cancer Predict Patients Who Biochemically Recur Following Prostatectomy

Prostate cancer is the most commonly diagnosed cancer in men, accounting for 27% of the new male cancer diagnoses in 2022. If organ-confined, removal of the prostate through radical prostatectomy is considered curative; however, distant metastases may occur, resulting in a poor patient prognosis. This study sought to determine whether quantitative pathomic features of prostate cancer differ in patients who biochemically experience biological recurrence after surgery. Whole-mount prostate histology from 78 patients was analyzed for this study. In total, 614 slides were hematoxylin and eosin stained and digitized to produce whole slide images (WSI). Regions of differing Gleason patterns were digitally annotated by a genitourinary fellowship-trained pathologist, and high-resolution tiles were extracted from each annotated region of interest for further analysis. Individual glands within the prostate were identified using automated image processing algorithms, and histomorphometric features were calculated on a per-tile basis and across WSI and averaged by patients. Tiles were organized into cancer and benign tissues. Logistic regression models were fit to assess the predictive value of the calculated pathomic features across tile groups and WSI; additionally, models using clinical information were used for comparisons. Logistic regression classified each pathomic feature model at accuracies >80% with areas under the curve of 0.82, 0.76, 0.75, and 0.72 for all tiles, cancer only, noncancer only, and across WSI. This was comparable with standard clinical information, Gleason Grade Groups, and CAPRA score, which achieved similar accuracies but areas under the curve of 0.80, 0.77, and 0.70, respectively. This study demonstrates that the use of quantitative pathomic features calculated from digital histology of prostate cancer may provide clinicians with additional information beyond the traditional qualitative pathologist assessment. Further research is warranted to determine possible inclusion in treatment guidance.


Introduction
Prostate cancer (PCa) accounts for approximately one in 4 new male cancer diagnoses, making it the most commonly diagnosed cancer in American men.3][4] There has been a growing interest in pathological studies for developing improved strategies for predicting and preventing the biochemical recurrence of PCa.
Prostate cancer is currently diagnosed using the Gleason grading scale, which assigns a score corresponding to the Gleason grades of the 2 most predominant cell patterns on histopathological assessment.More recently, these patterns have been used to assign patients into 5 Grade Groups (GG) to predict prognosis. 1Low-risk diseases may be managed through active surveillance, whereas clinically significant cancer (GG ≥2, tumor volume ≥0.5 mL, or stage ≥T3) is often treated with therapies such as radical prostatectomy or radiation.Prostatectomy is considered curative if tumors are organ-confined; however, distant metastases may form and result in biochemical recurrence (BCR).
Biochemical PCa recurrence is determined by increasing prostate-specific antigen (PSA) scores after treatment.However, PSA is currently the only validated biomarker for disease recurrence. 2 After surgery, PSA levels should drop to 0 ng/mL.Typically, the cancer is considered recurrent when PSA levels rise above 0.1-0.2ng/mL, although some physicians use higher PSA thresholds or wait to observe the elevated PSA score on 2 or more successive tests. 3,4 recent years, whole slide images (WSI), resulting from the use of microscopes for digitizing glass slides with histology samples, have become increasingly popular.Digital pathology enables fast acquisition, management, and interpretation of histology. 5Additional opportunities have arisen from the use of digital pathology, such as pathological workflows including computational algorithms and applications of artificial intelligence. 6Quantitative histomorphometric approaches to define the features of tumor morphology have previously discriminated between the gland and nuclear shape and architectural and textural features between Gleason grades in prostate cancer histology. 7,8Additionally, recent studies have shown that in benign and less aggressive prostate cancers, local gland orientations are similar to each other, but these orientations differ in aggressive diseases; thus, gland angularity was able to predict PCa BCR. 9 These recent studies, although promising, were limited to tissue microarrays, needle biopsies, or small sections of WSI.Additionally, the process of assigning Gleason grades is subjective and relies on manual grading by experienced pathologists, which can be timeconsuming and result in interobserver variability between regions of high and low-grade cancers. 6This study aims to determine whether quantitative pathomic features of prostate cancer differ in patients who exhibit biochemical recurrence after surgery using WSI and high-resolution tiles taken from pathologist-annotated cancer and noncancer regions.Specifically, we tested the hypothesis that histomorphometric features calculated from digital pathology would be able to predict BCR better than clinicopathological features alone.

Patient Population and Data Acquisition
Data from 78 prospectively recruited patients with biopsy-confirmed PCa undergoing radical prostatectomy between 2014 and 2021 were analyzed for this institutional review boardapproved study.Patients underwent multiparametric magnetic resonance imaging before prostatectomy on a 3T magnetic resonance imaging scanner (General Electric or Siemens Healthineers) using an endorectal coil.Each protocol included T2-weighted imaging, dynamic contrast-enhanced imaging, and diffusion-weighted imaging.
After surgery, patients were monitored with PSA testing according to the standard-of-care practices.Patients were followed up for more than 7 years after surgery, and those with a measured PSA score of at least 0.2 ng/mL at any time point after surgery were considered biochemically recurrent.Inclusion criteria for this study included digitized and annotated histology after radical prostatectomy and at least one postsurgery PSA test.Additionally, patients were excluded if digitized histology was of poor quality (ie, rips, tears, low resolution, etc.).Clinicopathological features and demographic information for the study cohort are summarized in Table 1.

Histopathologic Analysis
Robotic prostatectomy was performed approximately 2 weeks after imaging using the da Vinci system (Intuitive Surgical). 10,11Surgical margin status was not explicitly reported in our patient notes; however, surgical stage, which factors in margins, was included in our analyses.Prostate samples were fixed in formalin overnight and sectioned using custom slicing jigs. 12Prostate masks used to create each patient-specific slicing jig were manually segmented from the patient's T2-weighted image using Analysis of Functional NeuroImages (http://afni.nimh.nih.gov/), 13 3D modeled using 3dSlicer (slicer.org),and imported into Blender 2.75 (https://www.blender.org/)5][16][17] The patient-specific jigs were then 3D printed using a fifth-generation Makerbot (Makerbot Industries; Fig. 1).
Whole-mount tissue sections were paraffin-embedded, sectioned, and hematoxylin and eosin (H&E)-stained in our histology core laboratory.The stained slides were digitally scanned at 40× magnification using either an Olympus (Olympus Corporation) (n = 47 patients, 359 slides) or a Huron (Huron Digital Pathology; n = 32 patients, 256 slides) sliding stage microscope at resolutions of 0.34 or 0.2 microns per pixel, respectively.Due to the large file size that these images had, downsample factors of 8 and 10 for the Olympus and Huron slide scanners were applied, respectively-where the Huron scanner used a larger downsampling factor due to the increased resolution (ie, magnification of 5 or 4, and resolution of 0.04 or 0.02 microns per pixel, respectively).Two slide scanners were used in this analysis due to a recent equipment upgrade.Although this may be a confounding factor in our analyses, we opted to use a greater number of patients and slides rather than limit ourselves to less available data.Slides were downsampled for computational processing, ensuring a comparable downsampling factor between the 2 slide scanners.A total of 614 digitized slides across all patients (mean: 7.9 slides, range: 2-14 per patient) were manually annotated for different Gleason patterns by a genitourinary fellowship-trained pathologist using a stylus on Microsoft Surface Pro 4 (Microsoft) with a preloaded color palette.Annotations from 289 slides were performed on digital images produced on a third lower resolution slide scanner at 40× magnification and 0.85 μm/px (Nikon Metrology).These slides were rescanned, and the annotations were brought into the higher resolution space of the previously aforementioned slide scanners.A flowchart for the slide scanners used and their respective properties can be found in Supplementary Figure S1.The manually annotated tissue classes include seminal vesicles, atrophy, high-grade prostatic intraepithelial neoplasia (HGPIN), Gleason 3 (G3), Gleason 4 non-cribriform glands (NC), Gleason 4 cribriform (to papillary) glands (CG), and Gleason 5 (G5).9][20][21][22] However, G4CG was defined as true papillary, small cribriform (ie, rounded acinar spaces with ≤12 lumens and no solid area), and large cribriform (ie, more sprawling, cribriform to focally solid formations).An example slide with annotations can be found in Figure 1.

Annotation Segmentation
Digital whole-mount mount slides were divided into high-resolution tiles that were 3,000 × 3,000 pixels or 5,100 × 5,100 pixels for the Olympus or Huron microscopes, respectively.The Olympus scanner was originally used for slide scanning, and a tile size of 3,000 × 3,000 pixels was previously found to be sufficient for showing gland morphology.When upgrading to the use of the higher resolution Huron scanner, a larger tile size was chosen to capture a comparable amount of histology per tile.The tiles were named using xy-coordinates corresponding to their location within the WSI.These tiles were stitched back together to recreate the WSI and concurrently create x-and y-coordinate look-up tables.
Annotations previously performed on the lower resolution microscope were aligned to high-resolution WSI using the MATLAB R2021b's (The MathWorks) imregister function.
Each of the possible annotations was isolated into individual masks, including a mask for nonatrophic benign tissues.An additional averaged white image of nontissue was found to remove the background or primarily white spaces from the masks.Annotation masks were divided into regions of interests (ROIs) per lesion.Therefore, ROIs were individually compared with the xy-look-up tables to determine coordinates corresponding to tiles within the ROI.Five randomly selected tiles that were considered more than 50% within the mask after white-space removal were saved into annotation-specific directories.For nonatrophic benign tissues, 15 tiles were selected instead to not only obtain the most representative examples of benign tissue but also a balance between cancer and benign tissue tiles.Additionally, ROIs that were too small to extract 5 tiles from were skipped.A representative tile from each annotation mask is shown in Figure 1.Table 2 breaks down the distribution of patients, slides, and Gleason-annotated histology tiles per slide scanner.

Pathomic Feature Calculation
Both high-resolution tiles and WSI were first downsampled by a factor of 2 to decrease the processing time while maintaining the highest resolution.Images were then processed using custom, in-house MATLAB (Mathworks Inc) pipelines to extract pathological features for quantitative analysis.A color deconvolution algorithm was first applied to segment stroma, epithelium, and lumen based on their corresponding stain optical densities (ie, positive hematoxylin or eosin and background), 23 which visually proved to create better histology masks than using RGB color channels.These features were then filtered and smoothed to remove noise and improve histology segmentations using built-in MATLAB functions from the Image Processing Toolbox.These resulting masks were used to automatically calculate histomorphometric features per gland, including lumen roundness and area; epithelial roundness, area, and wall thickness; and cell fraction (ie, the percent of epithelial cells per total gland area, defined by the area of the epithelium without lumen).Individual lumen and epithelium were numbered using bwlabel, perimeters were defined using bwboundaries, and areas were calculated using regionprops.Epithelial wall thickness was defined as the minimum distance between the inner and outer edges of the gland.Roundness was calculated using the following equation: 4πArea P ermeter 2 .Additionally, overall stromal and epithelial areas were computed using bwarea and normalized to the image area as the tile sizes differed between the 2 microscopes (ie, 3,000 × 3,000 pixels and 5,100 × 5,100 pixels) and variation in prostate size across WSI. Figure 2 shows an example of the pathomic feature calculations across an extracted tile from the Huron scanner.An example WSI from the Huron scanner is shown in Figure 1, and one from the Olympus scanner is demonstrated in Supplementary Figure S2.

Tumor Volume Calculation
In addition to pathomic features, we also calculated the tumor volume and ratio for each patient for additional quantitative information to the model.The patient-specific prostate masks previously described were used to calculate the patient's prostate volume by multiplying the number of voxels present in the mask by the size in the x-, y-, and z-planes.However, WSI was segmented into their individual annotation classes, and the area of each annotation was calculated, as well as the total tissue area per slide.The total area of cancerous regions was summed and normalized to the total tissue area per slide to determine the ratio of cancer to noncancer on a slide.This ratio was then multiplied by the patient's prostate volume to approximate the volume of tumor present.

Statistical Analysis
To evaluate the ability of pathomic features to predict biochemical recurrence, two-tailed, two-sample t tests were used to determine whether individual pathomic features were significantly different between those who did or did not experience biochemical recurrence across the 3 tile groups and WSI.Logistic regressions were fit to assess the ability of the pathomic features, as well as tumor volume and cancer ratio, to predict BCR (PSA ≥0.2 ng/mL postsurgery) using both features calculated across tiles and WSI using SPSS Statistics (IBM, Armani).From the tile-level analysis, a model was fit per tissue type (all tiles, cancer only, and noncancer only).Additionally, models were fit using clinicopathological information, which included age, surgery stage, Grade Group, presurgery PSA, tumor ratio/volume, CAPRA score, and Grade Group alone to compare current gold standard predictors of BCR to pathomic features.Finally, 2 models encompassing all tile pathomic features or WSI pathomic features with clinicopathological features were fitted to determine whether all features together could further predict recurrence.Multiple comparison corrections were performed using the Benjamini-Hochberg false discovery rate (FDR) procedure. 24All logistic regression models were evaluated for performance using receiver operator characteristic (ROC) curves, and performance was quantified by area under the curve (AUC).Kaplan-Meier survival curves were generated on the all-tile pathomic feature group, cribriform presence, 19,20,25 CAPRA score, 26 and Gleason Grade Groups to directly visualize observed survival and determine significance; however, a Cox proportional hazards regression was used to generate hazard ratios.Tile pathomic features were divided into low and high risk for recurrence using a threshold of 0.2 based on the ROC curve.Additionally, CAPRA scores were merged into low, medium, and high risk for recurrence. 27

Results
From our t test analyses of independent pathomic features, across all tile groups and WSI, lumen roundness was found to be significantly different between those who did and did not experience BCR (all P <.05).Additionally, epithelial wall thickness was significantly different in the cancer-only tile group (P <.05).Features that we found to be significant predictors of BCR from the logistic regressions were further examined to determine the relationship between the feature and BCR. Figure 3 compares representative cancer tiles from 2 patients who did or did not develop biochemical recurrence.
Logistic regression classified each of the tile groups at accuracies of 86%, 84%, and 81% for all tiles, cancer, and noncancer, respectively, and area under the ROC curve of >0.70 for the all tiles and cancer groups and 0.82 for the noncancer group.The WSI model performed at an accuracy of 84% with an AUC of 0.72.These were comparable with our clinical feature models that had accuracies of 81%, 86%, and 80% for encompassing clinical features, CAPRA score, and GG only, respectively, with AUCs of 0.78, 0.77, and 0.70.Finally, the models encompassing pathomic and clinical features performed at accuracies of 86% and 84% with AUCs of 0.78 and 0.82 for tile or WSI features, respectively.ROC curves comparing pathomic feature, clinical, and combined models are shown in Figure 4, top.Each model was statistically compared using a randomized permutation test (n = 10,000 permutations; Table 3), and the pathomic feature model had a statistically significant higher AUC compared with the 3 clinical feature models (all P <.01) and a smaller AUC compared with both combined models (both P <.05).Additionally, the combined feature models statistically outperformed all 3 of the clinical feature models (all P <.005).
The Kaplan-Meier analyses, using the log-rank test to determine significance, revealed that no significant differences in time to recurrence existed between Gleason Grade Groups (P =.1; Fig. 4, bottom).Although we found our logistic regression models were comparable across pathomic and clinical feature models, we found that the Kaplan-Meier survival analysis using pathomic features marginally outperformed CAPRA score and cribriform presence; however, all 3 models showed significant time differences to recurrence (P =.0036,.006,and.0039, respectively).Individual hazard ratios for features within models can be found in Table 4.Note that the CAPRA and Grade Group models needed to be simplified for model convergence.Additionally, CAPRA (1) includes scores 0-2, CAPRA (2) includes scores 3-5, and the reference class includes scores 6-10.Similarly, Grade Group (1) includes GG1 and 2, Grade Group (2) includes GG3, and the reference class includes GG4 and 5.
Table 5 shows b values and P values for each feature within each model, where P values indicate the significance of individual features between recurrent and nonrecurrent groups, stratified through our logistic regression models.After FDR correction, these results did not maintain significance, indicating that although no individual feature may predict biochemical recurrence, the combination of features can.Regression tables for the clinical features, Grade Group only, and combined pathomic and clinical feature models can be found in Supplementary Tables S1 and S2.

Discussion
The high rate of prostate cancer recurrence has led to growing interest in predicting biochemical recurrence to identify individuals at high risk for adverse outcomes early and accurately.Of the 78 patients in this study cohort, 16 experienced eventual BCR.This study showed that the use of quantitative pathomic features calculated from digital histology of PCa, as well as tumor volume and ratio, may provide clinicians with better information than the traditional qualitative pathological assessment.Most notably, we show that in areas of noncancer, pathomic features can predict BCR better than Gleason Grade Groups and CAPRA score, indicating that gland morphologies differ in noncancer regions between patients who will recur and those that will not.
The likelihood of recurrence is typically thought to increase depending on the aggressiveness of the cancer, as defined by the Gleason Grade Groups, the current gold standard prognostic indicator of BCR; however, this system is subjective and prone to significant interobserver variability. 17,28The Gleason Grade Group model (ROC AUC = 0.70, P =.1) performed worse than all pathomic features models, as well as the CAPRA and cumulative clinical feature models (ROC AUC range: 0.72-0.87,P <.05).Of note, 7 patients in our cohort graded as GG5, with the worst prognosis, did not experience a recurrence, indicating room for improvement in the current system.Quantitative features of prostate cancer histology may aid clinicians in predicting a patient's risk of biochemical recurrence compared with a qualitative analysis of gland morphology.This may be especially highlighted due to the increased performance of cribriform presence survival analysis compared with Grade Groups, which has an obvious difference in gland morphology compared with other patterns.
Machine and deep-learning approaches are becoming increasingly popular in digital pathology studies, especially regarding automated grading of cancers to reduce the time burden on pathologists.Machine learning algorithms use computationally derived metrics to assess information similarly to human intervention, whereas deep-learning models use deeper features that are extracted throughout the process and eliminate human intervention.A previous deep-learning study showed luminal features to be prognostic of BCR. 29 We showed that in addition to luminal features, epithelial and stromal features can also stratify patients by BCR risk.A recent machine learning study applied a model to 16 clinical features to predict BCR. 30 This model achieved an accuracy of 97%; however, we show in our analyses that adding quantitative pathomic features to clinicopathological reports could further improve predictive power.
In this study, we tested the hypothesis that pathomic features of prostate cancer can predict patients who eventually exhibit biochemical recurrence after radical prostatectomy.Of the 8 features calculated with our pathomic feature calculator, lumen roundness was found to be a significant predictor of BCR in all pathomic feature models.Epithelial wall thickness was significantly different within the cancer tile group.These findings suggest that the variation in gland architecture across tiles and WSI may be related to prostate cancer aggressiveness.
Although the results of this study are promising, several limitations exist to note.This study uses a relatively small patient cohort compared with previous prostate cancer studies; thus, a larger cohort of patients may show different pathological characteristics.The small number of patients in this cohort who experienced eventual biochemical recurrence limited our ability to create a test dataset or perform multiple comparison corrections to assess our models' performances thoroughly.Future studies assessing these features as predictors of BCR should look to larger datasets with greater representation of BCR to fully evaluate these pathomic features.Additionally, this was a single-center study with one pathologist's annotations of the data.Future studies may seek external validation and comparison of additional pathologist annotations to increase the generalizability of the models.Finally, a subset of our annotations was completed on whole-mount samples that were digitized using lower resolution sliding stage microscope and were later rescanned at a higher resolution.Efforts to align and rescale the low-resolution annotations to a higher resolution may have introduced minor alignment differences.
We demonstrate in a cohort of 78 patients who underwent surgery for prostate cancer that those who had an increase in lumen roundness by prostate histology showed an increased risk for biochemical recurrence after prostatectomy.The use of quantitative histomorphometric feature models calculated from the digital histology of PCa was comparable with the traditional qualitative classification of patients defined by clinicopathological information and Gleason Grade Groups alone.Furthermore, combining the quantitative features with standard clinical features performed best, indicating that the addition of pathomic features is superior to the information from the qualitative reports alone.Further research is warranted to determine possible inclusion in treatment guidance.Additional studies should probe the inclusion of machine and deep-learning applications to further predict the risk of recurrence.Top: ROC curves for each logistic regression model with AUCs and 95% CI.Pathomic feature models were generated for noncancer, cancer, all tiles, and WSI.Clinical feature models included a general model encompassing clinicopathological information, CAPRA score, and Grade Groups.Combined feature models included tile or WSI pathomic features with clinicopathological information.Bottom: Kaplan-Meier survival analyses were conducted to compare survival of pathomic features, cribriform presence, CAPRA score, and Gleason Grade Groups (P values calculated using log-rank test).In our patient cohort, higher grade cancers were not significantly more likely to recur.R1, low-risk; R2, highrisk; Cr1, cribriform glands absent; Cr2, cribriform glands present; C1, CAPRA 0-2; C2, CAPRA 3-5; C3, CAPRA 6-10; G1-5, Gleason Grade Group 1-5.Lab Invest.Author manuscript; available in PMC 2024 February 16.

Figure 1 .
Figure 1.Schematic representation of the annotation and tile extraction process, with second-order feature segmentations across a WSI.Top left: T2-weighted MR image used to model the prostate slicing jig.Custom prostate slicing jigs allow the prostate to be sliced to match the slice thicknesses of the MR image.Top right: whole-mount samples were stained, digitized, and annotated by a pathologist.Annotations were color-coded by class to extract representative tiles from each of the annotation classes: atrophy, HGPIN, G3, G4NC, G4CG, G5, and Seminal Vesicles (not pictured).Bottom: pathomic features are calculated across WSI and feature maps are overlaid on the original image.

Figure 2 .
Figure 2. Pathomic feature segmentations.A representative Gleason 3 tile from the Huron microscope with pathomic feature maps.Calculated features include lumen roundness and area; cell fraction; epithelial roundness, area, and wall thickness.Calculated values are overlaid on the respective glands.Units of area maps are in mm 2 , and thickness in mm.Roundness and cell fraction are unitless.

Figure 3 .
Figure 3.Significant feature predictors of BCR with examples from cancer tiles.Top left: the patient who did not experience biochemical recurrence had regions of papillary to cribriform glands that had been previously associated with BCR.Bottom left: the patient who did experience BCR had regions of low-risk G3 cancer.Middle: feature maps overlaid on tiles.

Table 2
Breakdown of patients, slides, and tiles per scanner Lab Invest.Author manuscript; available in PMC 2024 February 16.

Table 3
Randomized permutation test results comparing statistical differences between each tested model

Table 4
Kaplan-Meier and Cox proportional hazards regression models to determine the relative risk of eventual biochemical recurrenceHazard Ratios and within group P values were calculated using the Cox proportional hazards model and corresponding Wald test.The 4 main groups' significance was determined using KaplaneMeier survival analysis and corresponding log-rank test (denoted with an *).