The MechSens trial (ClinicalTrials.gov ID NCT04128566) recruited participants with a unilateral ACL injury that had occurred 2–10 years prior to enrollment, without concomitant or other knee injuries, and participants without any knee injury (controls). All participants were generally healthy and physically active (> 2 h of moderate-intensity aerobic activity per week), had a body mass index < 35 kg/m² and no MRI contraindications (pacemaker, neurostimulator, pregnancy, metal splinters or metal prosthesis); further details on the demographics have been described previously [13, 14]. The MechSens trial aimed to recruit 96 participants across two age groups (20–30 and 40–60 years), with each age group comprising 12 women and 12 men with ACL injury, and 12 women and 12 men without knee injury. Of 89 participants who were recruited between January 2020 and December 2022, 85 had complete MRI baseline data (20–30 years ACL injury: n = 14/9 women/men, mean ± standard deviation age: 26.1 ± 3.1 years, BMI: 23.8 ± 2.8 kg/m², time since injury: 5.2 ± 2.8 years; 20–30 years controls: n = 12/12, age: 26.6 ± 3.0 years, BMI: 22.9 ± 2.0 kg/m²; 40–60 years ACL injury: n = 10/4, age: 51.5 ± 5.5 years, BMI: 24.7 ± 2.6 kg/m², time since injury: 5.6 ± 2.7 years; 40–60 years controls: n = 12/12, age: 49.0 ± 6.1 years, BMI: 24.7 ± 4.3 kg/m²) [13, 14]. Because the collection and processing of 2-year follow-up data are still ongoing, the current study only used baseline data. The MechSens trial was approved by the regional ethics board (Ethics Committee Northwest and Central Switzerland EKNZ 2019–01315) and conducted in accordance with the Declaration of Helsinki. All participants provided written informed consent.
Imaging and manual segmentation
The MechSens imaging and manual analysis protocol has been described in detail [14]. The imaging protocol included a double-oblique coronal qDESS sequence (in-plane resolution: 0.31 × 0.31 mm, acquisition matrix: 512 × 512, number of slices: 64–72, slice thickness: 1.5 mm, repetition/echo time: 17 ms/4.85 ms, flip angle: 15°, acquisition time: 9–12 min) that was acquired separately for both knees and covered the entire knee joint (Fig. 1). The qDESS was the only imaging sequence from the protocol [14] used for analyzing cartilage T2 and morphometry in the current study (Fig. 1) [7]. MRI scans of both knees were acquired after 20–30 min of sitting without weight-bearing using a 3 T Prisma scanner and a 15-channel Tx/Rx knee coil (both Siemens Healthineers, Erlangen, Germany). The software for the qDESS sequence was developed at the Department of Radiology, University Hospital Basel, Basel, Switzerland and was installed on the scanner (software version: VE11C) based on a research agreement with the manufacturer.
Fig. 1 Illustration showing qDESS MRIs, cartilage T2 maps and the segmentation agreement. The top row shows the first echo (a), the second echo (b), and the cartilage T2 map together with the color code map (c, in ms). The middle row shows the 1st echo of the qDESS MRI (d), the manual segmentation (e), and the automated segmentation (f) in the knee with the highest segmentation agreement. The bottom row shows the 1st echo of the qDESS MRI (g), the manual segmentation (h), and the automated segmentation (i) in the knee with the lowest segmentation agreement. (DSC: Dice Similarity Coefficient)
Manual segmentation of the cartilages in the weight-bearing femorotibial joint (MT & LT: medial & lateral tibia and cMF & cLF: central medial & lateral femoral condyle) was performed by experienced readers (Chondrometrics GmbH, Freilassing, Germany) blinded to injury status, sex, and age [14]. All segmentations were quality-controlled by an expert reader and corrections were made as needed.
Automated cartilage segmentation
The CNN-based automated cartilage segmentation was based on a U-Net architecture previously adapted and validated for morphometric analysis of 3D gradient echo MRI [17, 18] and for cartilage T2 analysis of 2D MESE MRI [19, 20]. Because no separate set of qDESS MRIs was available for training the U-Net, two separate U-Nets were trained, one from both knees of 42 odd-numbered MechSens participants (training set: n = 70 knees from 35 participants; validation set: n = 14 knees from 7 participants) and one from both knees of 42 even-numbered MechSens participants (training set: n = 70 knees from 35 participants; validation set: n = 14 knees from 7 participants). Both U-Nets were trained on all MRI slices of both knees of the respective training sets. Left knees were flipped horizontally to ensure consistent location of medial and lateral cartilages in the coronal images. To account for the two qDESS echoes, the first convolutional layer of the U-Nets had two input channels. The training was configured to use five labels with a weight of 0.22 for each of the four femorotibial cartilages, and a weight of 0.12 for the background (weights determined empirically). Training used a weighted cross-entropy loss function, Adam optimization with an initial learning rate of 0.01, and a mini-batch size of four slices per training iteration. Network weights were randomly initialized using a variance scaling initializer. The U-Nets were implemented in Python (Python Software Foundation, DE, USA) and used the TensorFlow framework (Google LLC, Mountain View, CA, USA).
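The odd/even split and the class-weighted loss described above can be illustrated with a minimal sketch in plain Python. This is not the trial's TensorFlow implementation: the label order (0 = background, 1 to 4 = MT, LT, cMF, cLF), the averaging over voxels, and the function names `split_odd_even` and `weighted_cross_entropy` are assumptions made for illustration only.

```python
import math

# Assumed label order: 0 = background, 1-4 = MT, LT, cMF, cLF.
# Weights as stated in the text: 0.12 for background, 0.22 per cartilage.
CLASS_WEIGHTS = [0.12, 0.22, 0.22, 0.22, 0.22]


def split_odd_even(participant_ids):
    """Split participant numbers into the odd and even cross-training groups."""
    odd = [p for p in participant_ids if p % 2 == 1]
    even = [p for p in participant_ids if p % 2 == 0]
    return odd, even


def softmax(logits):
    """Numerically stable softmax over the class logits of one voxel."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(logits_per_voxel, labels, weights=CLASS_WEIGHTS):
    """Mean over voxels of -w[label] * log(p[label]) (averaging is an assumption)."""
    losses = []
    for logits, label in zip(logits_per_voxel, labels):
        probs = softmax(logits)
        losses.append(-weights[label] * math.log(probs[label]))
    return sum(losses) / len(losses)
```

With these weights, a misclassified cartilage voxel contributes almost twice as much to the loss as a misclassified background voxel, which counteracts the large share of background voxels in each slice.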
The two independently trained U-Nets were then applied to the segmentation of the respective other group (even or odd-numbered) of participants (Fig. 1). After the automated segmentation, automated post-processing was performed to detect and correct segmentation problems. This included filling small enclosed unsegmented areas, eliminating implausible segmentations (e.g., fragments not connected to the main segmentation in the same or other slices), and smoothing segmentation spikes. In contrast to the manual segmentation, no quality control or manual correction was performed for the automated cartilage segmentations. The automated analysis took approximately 26 s for each qDESS volume (1 s for the U-Net segmentation and 25 s for the post-processing).
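Two of the post-processing steps named above (removing disconnected fragments and filling small enclosed gaps) can be sketched for a single binary 2D slice. This is a simplified illustration, not the post-processing used in the study: it works in 2D only, assumes 4-connectivity, uses a hypothetical `max_hole_size` threshold, and omits the smoothing of segmentation spikes.

```python
from collections import deque

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumption)


def connected_regions(mask, value):
    """Return all 4-connected regions of `value` as lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen, regions = set(), []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != value or (r, c) in seen:
                continue
            region, queue = [], deque([(r, c)])
            seen.add((r, c))
            while queue:
                cr, cc = queue.popleft()
                region.append((cr, cc))
                for dr, dc in STEPS:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] == value and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            regions.append(region)
    return regions


def postprocess(mask, max_hole_size=4):
    """Keep the largest foreground region, then fill small enclosed holes."""
    out = [row[:] for row in mask]  # work on a copy, leave the input untouched
    rows, cols = len(out), len(out[0])
    foreground = connected_regions(out, 1)
    if foreground:
        main = max(foreground, key=len)
        for region in foreground:
            if region is not main:  # fragment not connected to the main segmentation
                for r, c in region:
                    out[r][c] = 0
    for hole in connected_regions(out, 0):
        # A background region that never touches the image border is enclosed.
        enclosed = all(0 < r < rows - 1 and 0 < c < cols - 1 for r, c in hole)
        if enclosed and len(hole) <= max_hole_size:
            for r, c in hole:
                out[r][c] = 1
    return out
```

A volumetric version would apply the same connectivity logic across slices, so that a fragment counts as connected if it touches the main segmentation in a neighboring slice.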
Cartilage T2 and thickness analysis
Cartilage T2 was estimated after acquisition using in-house software (Chondrometrics GmbH, Freilassing, Germany) for each of the segmented voxels as described previously [7, 14]. Briefly, T2 was calculated from the two steady-state free precession (SSFP) echoes (1st echo: S+, 2nd echo: S−) by numerically minimizing the difference between the observed signal ratio \(\frac{S^{-}}{S^{+}}\) and the ratio between the analytically derived signal intensities, given by \(\frac{1 - \left( 1 - E_{1} \cdot \cos\alpha \right) \cdot r}{1 - \left( E_{1} - \cos\alpha \right) \cdot r}\) multiplied by a T2-dependent exponential term (with \(E_{1} = e^{-\frac{TR}{T_{1}}}\), \(r = \left( 1 - E_{2}^{2} \right) \cdot \left( p^{2} - q^{2} \right)^{-\frac{1}{2}}\), \(E_{2} = e^{-\frac{TR}{T_{2}}}\), \(p = 1 - E_{1} \cos\alpha - E_{2}^{2} \left( E_{1} - \cos\alpha \right)\), \(q = E_{2} \left( 1 - E_{1} \right)\left( 1 + \cos\alpha \right)\), α = 15°, TR = 17 ms, TE = 4.85 ms, and T1 = 1000 ms, with the T1 estimate based on [21]). The numerical minimization was performed using a golden section search method, which was initialized with a T2 search range of 0 to 5000 ms.
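The voxel-wise fit can be sketched as a golden section search over T2 that minimizes the absolute difference between an observed echo ratio and a model ratio. The sketch below is not the in-house software: the exponential term is assumed to be e^(2·TE/T2), which corresponds to both echoes being placed symmetrically at TE after and before the RF pulse, and the function names, the tolerance, and the guard for very small T2 are illustrative choices.

```python
import math

ALPHA = math.radians(15.0)        # flip angle
TR, TE, T1 = 17.0, 4.85, 1000.0   # ms, values stated in the text
INV_PHI = (math.sqrt(5.0) - 1.0) / 2.0


def model_ratio(t2):
    """Analytical S-/S+ ratio for a given T2 in ms (exponential term ASSUMED)."""
    if t2 < 0.1:
        return 0.0  # limit of the ratio for T2 -> 0; avoids overflow in exp()
    e1 = math.exp(-TR / T1)
    e2 = math.exp(-TR / t2)
    cos_a = math.cos(ALPHA)
    p = 1.0 - e1 * cos_a - e2 ** 2 * (e1 - cos_a)
    q = e2 * (1.0 - e1) * (1.0 + cos_a)
    r = (1.0 - e2 ** 2) / math.sqrt(p ** 2 - q ** 2)
    bracket = (1.0 - (1.0 - e1 * cos_a) * r) / (1.0 - (e1 - cos_a) * r)
    return bracket * math.exp(2.0 * TE / t2)  # assumed form of the exponential term


def golden_section_minimize(func, lo, hi, tol=1e-6):
    """Minimize a unimodal function on [lo, hi]; only interior points are evaluated."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = func(c), func(d)
    while hi - lo > tol:
        if fc < fd:   # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = func(c)
        else:         # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = func(d)
    return 0.5 * (lo + hi)


def estimate_t2(observed_ratio, lo=0.0, hi=5000.0):
    """Return the T2 (ms) whose model ratio is closest to the observed ratio."""
    return golden_section_minimize(
        lambda t2: abs(model_ratio(t2) - observed_ratio), lo, hi)
```

Because the model ratio increases monotonically with T2 over the search range for these sequence parameters, the absolute difference has a single minimum, which is the condition a golden section search requires.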
Because cartilage T2 is dependent on tissue depth [2], the four cartilage plates were divided into the top (superficial) and bottom (deep) 50%, based on the distance of each cartilage voxel to the cartilage surface and bone interface, respectively. Laminar T2 across the total femorotibial joint (FTJ) was derived as the average of all four femorotibial cartilages. Laminar T2 in the medial and lateral femorotibial compartment (MFTC & LFTC) was derived as the average T2 observed in the respective medial and lateral cartilages (MFTC: MT and cMF, LFTC: LT and cLF).
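The 50% depth split can be sketched as follows, assuming each segmented voxel comes with its T2 value and its distances to the cartilage surface and to the bone interface. The tuple layout, the function name `laminar_t2`, and the rule of assigning a voxel by its relative depth are assumptions for illustration; the values in the usage note are made up.

```python
from statistics import mean


def laminar_t2(voxels):
    """Mean T2 of the superficial and deep 50% of one cartilage plate.

    `voxels` is an iterable of (t2_ms, dist_to_surface_mm, dist_to_bone_mm).
    """
    superficial, deep = [], []
    for t2, d_surface, d_bone in voxels:
        # Relative depth: 0 at the cartilage surface, 1 at the bone interface.
        rel_depth = d_surface / (d_surface + d_bone)
        (superficial if rel_depth < 0.5 else deep).append(t2)
    return mean(superficial), mean(deep)
```

For example, `laminar_t2([(50, 0.2, 1.8), (46, 0.8, 1.2), (38, 1.2, 0.8), (34, 1.8, 0.2)])` assigns the first two voxels to the superficial layer and the last two to the deep layer.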
Cartilage thickness was computed for the four cartilage plates from the cartilage segmentations as described previously [22]. Cartilage thickness in the MFTC and LFTC was calculated as the sum of the cartilage thickness of the respective medial/lateral cartilages. Total FTJ cartilage thickness was calculated as the average of MFTC and LFTC cartilage thickness.
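The regional aggregation differs between the two measures: T2 is averaged across plates, whereas thickness is summed within a compartment and then averaged across compartments. A minimal sketch, assuming one summary value per plate and an unweighted average across plates (the weighting is an assumption, and the function names are hypothetical):

```python
from statistics import mean

MEDIAL, LATERAL = ("MT", "cMF"), ("LT", "cLF")


def aggregate_t2(plate_t2):
    """Compartment and joint T2 as averages of the plate values (ms)."""
    return {
        "MFTC": mean([plate_t2[p] for p in MEDIAL]),
        "LFTC": mean([plate_t2[p] for p in LATERAL]),
        "FTJ": mean([plate_t2[p] for p in MEDIAL + LATERAL]),
    }


def aggregate_thickness(plate_thickness):
    """Compartment thickness as a sum of plates, joint thickness as their mean (mm)."""
    mftc = sum(plate_thickness[p] for p in MEDIAL)
    lftc = sum(plate_thickness[p] for p in LATERAL)
    return {"MFTC": mftc, "LFTC": lftc, "FTJ": (mftc + lftc) / 2.0}
```

With illustrative plate thicknesses of 1.8 mm (MT), 1.9 mm (cMF), 2.1 mm (LT), and 2.0 mm (cLF), the sketch gives 3.7 mm for the MFTC, 4.1 mm for the LFTC, and 3.9 mm for the total FTJ.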
Statistical analysis
Segmentation agreement between automated and manual cartilage segmentations was evaluated using the Dice similarity coefficient (DSC, range: 0 to 1) and the Hausdorff distance (HD), with the HD measuring the distance between border voxels of automated vs. manual segmentations (in mm, 0 mm representing perfect overlap of border voxels). The accuracy of laminar cartilage T2 and cartilage thickness from automated vs. manual cartilage segmentations was evaluated using Bland & Altman plots. Intraclass correlation coefficients (ICC, two-way mixed effects, absolute agreement, single rater/measurement) were used to evaluate the correlation between automated vs. manual cartilage segmentations for laminar cartilage T2 and cartilage thickness. Pearson correlation coefficients are reported in Supplemental Table 1 to allow comparisons with other studies.
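The agreement measures can be sketched for a single 2D slice, with each segmentation given as a set of (row, column) voxel indices. This is an illustration under stated assumptions rather than the analysis code of the study: border voxels are defined by 4-connectivity, the HD is computed in-plane only using the 0.31 mm in-plane resolution, and the Bland & Altman limits use the conventional mean difference ± 1.96 SD.

```python
import math
from statistics import mean, stdev

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def dice(auto, manual):
    """Dice similarity coefficient: 2|A n B| / (|A| + |B|)."""
    auto, manual = set(auto), set(manual)
    return 2.0 * len(auto & manual) / (len(auto) + len(manual))


def border(mask):
    """Voxels of the mask with at least one 4-neighbor outside the mask."""
    mask = set(mask)
    return {(r, c) for r, c in mask
            if any((r + dr, c + dc) not in mask for dr, dc in STEPS)}


def hausdorff(auto, manual, spacing=(0.31, 0.31)):
    """Symmetric Hausdorff distance (mm) between the border voxels."""
    def directed(src, dst):
        return max(
            min(math.hypot((r1 - r2) * spacing[0], (c1 - c2) * spacing[1])
                for r2, c2 in dst)
            for r1, c1 in src)
    b_auto, b_manual = border(auto), border(manual)
    return max(directed(b_auto, b_manual), directed(b_manual, b_auto))


def bland_altman(auto_values, manual_values):
    """Bias and 95% limits of agreement for paired automated/manual measures."""
    diffs = [a - m for a, m in zip(auto_values, manual_values)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

As a usage example, a 4 × 4 voxel square compared with the same square shifted by one column overlaps in 12 of 16 voxels, which gives a DSC of 0.75 and an in-plane HD of one voxel width (0.31 mm).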
Table 1 Segmentation agreement between automated and manual cartilage segmentations (mean and standard deviation (SD)), measured using the Dice Similarity Coefficient (DSC, range: 0 (no agreement) to 1 (perfect agreement)) and the Hausdorff Distance (HD, in mm) in the medial and lateral tibia (MT & LT) and the central medial and lateral femoral condyle (cMF & cLF)
Cross-sectional differences between ACL-injured and contralateral knees (between-knee comparison) and between ACL-injured and left and right knees from control participants (between-group comparison) were evaluated as previously done in the primary analysis [14], but now for both segmentation techniques. The primary focus of analysis was on comparing effect sizes of between-knee and between-group differences in laminar T2 and cartilage thickness in the total FTJ between segmentation techniques. Effect sizes of between-knee and between-group differences were also reported for the MFTC and LFTC for both segmentation techniques. Cartilage T2 and thickness measures were tested for outliers and the Shapiro–Wilk test and Q–Q plots were used to test for normality. Because the data of some of the groups were not normally distributed, results of between-knee and between-group comparisons were reported as medians and 25th/75th percentiles. Non-parametric tests were also used for detecting between-knee and between-group differences. Kruskal–Wallis tests were used to assess overall effects. Conover–Iman (between-knee) and Dunn (between-group) tests were used for the post hoc comparisons. To account for multiple comparisons, the p values were adjusted using Holm correction (padj); the significance level was padj < 0.05.
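The Holm step-down adjustment applied to the post hoc p values can be sketched as follows. The study used R for this step; the Python function below is a hypothetical re-implementation of the standard procedure for illustration, and the p values in the usage note are made up.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p may not fall below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

For three unadjusted p values of 0.01, 0.04, and 0.03, the adjusted values are 0.03, 0.06, and 0.06, so only the first comparison would remain below the significance level of 0.05.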
A non-parametric variant of Cohen’s d (dn-p) was used for comparing effect sizes of between-knee and between-group comparisons between segmentation techniques. dn-p was computed using the difference in the medians instead of the difference in the means, and using the pooled median absolute deviation instead of the pooled standard deviation. The statistical group comparisons were performed with the R package PMCMRplus (v1.9.6; Pohlert; 2022). The correlation analyses were performed using the R packages irr (v0.84.1; Gamer et al.; 2019) and stats (v3.5.2; R Core Team; 2023).
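The non-parametric effect size can be sketched as the difference in medians divided by a pooled median absolute deviation (MAD). Two details are assumptions here, because the text does not specify them: the MAD values are pooled in the same way as standard deviations in the classical Cohen's d, and the MAD is left unscaled by default (R's `mad` function, for comparison, multiplies by 1.4826 by default). The function names are hypothetical.

```python
import math
from statistics import median


def mad(values, scale=1.0):
    """Median absolute deviation; scale=1.4826 matches R's default (assumption)."""
    center = median(values)
    return scale * median([abs(v - center) for v in values])


def cohens_d_nonparametric(group_a, group_b, scale=1.0):
    """Difference in medians divided by the pooled MAD (pooling rule assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    mad_a, mad_b = mad(group_a, scale), mad(group_b, scale)
    pooled = math.sqrt(((n_a - 1) * mad_a ** 2 + (n_b - 1) * mad_b ** 2)
                       / (n_a + n_b - 2))
    return (median(group_a) - median(group_b)) / pooled
```

For two illustrative groups `[1, 2, 3, 4, 5]` and `[2, 3, 4, 5, 6]`, the medians differ by one unit and both MADs equal 1, so the sketch returns an effect size of -1.0.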