Tables 2 and 3 present a comparative analysis of the performance of different modalities—tabular data, text, and medical images—for metastasis prediction. Table 2 highlights the results using imbalanced data, while Table 3 demonstrates the impact of applying SMOTE (Synthetic Minority Oversampling Technique) to mitigate class imbalance. These tables provide valuable insights into how class imbalance affects the performance of various modalities and how techniques like SMOTE can enhance the utility of image-based data. The terms Case 1 and Case 2 refer to distinct image generation scenarios, each characterized by the use of different sub-strings incorporated into the prompt.
Case 1: the prompt describes tumor structure, presence of lymph nodes compared to the primary tumor site, histological type, and tumor differentiation
Case 2: the prompt describes histological type and differentiation status of tumors
Table 2 Comparison of generated text and image modality-based unimodal prediction performance with the baseline, with tabular data. In this case, the same imbalanced data was used for analysis
In the imbalanced data scenario (Table 2), image modalities such as histopathology, mammogram, and ultrasound exhibit significant performance limitations, particularly in detecting the minority class (class 1, representing metastasis cases). The recall values for class 1 are notably low across these modalities. For instance, histopathology achieves a recall of only 0.03 for class 1, while mammograms and ultrasound show recall values ranging from 0.03 to 0.09 and 0.06 to 0.12, respectively. These results underscore the challenges posed by class imbalance, as the models tend to favor the majority class (class 0) and struggle to identify metastasis cases. In contrast, the text modality outperforms the image modalities under these conditions, achieving an accuracy of 0.81 and a recall of 0.57 for class 1. This suggests that the text modality captures semantic features that are more robust to class imbalance, making it more effective for metastasis prediction in imbalanced datasets.
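The gap between overall accuracy and class 1 recall described above can be illustrated with a small sketch. The labels below are a hypothetical toy example, not the study's data, and the helper name `per_class_metrics` is an assumption made for illustration. It shows how a classifier that almost always predicts the majority class can still reach a moderate accuracy while its recall and F1-score for the minority class stay close to zero.

```python
def per_class_metrics(y_true, y_pred, positive=1):
    """Accuracy plus precision, recall, and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy labels: 70 majority (class 0) and 30 minority (class 1) samples.
y_true = [0] * 70 + [1] * 30
# A classifier biased toward class 0: every negative is correct,
# but only 1 of the 30 positives is detected.
y_pred = [0] * 70 + [1] * 1 + [0] * 29

m = per_class_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in m.items()})
```

In this toy setting the accuracy is driven almost entirely by the 70 correctly classified majority samples, which is why class-wise recall and F1-score are the more informative figures when reading the tables.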
Table 3 Comparison of generated text and image modality-based unimodal prediction performance with the baseline, with tabular data. In this case, SMOTE is applied to mitigate data imbalance
The application of SMOTE to address class imbalance, as shown in Table 3, leads to a dramatic improvement in the performance of image modalities. The recall values for class 1 increase significantly, demonstrating that SMOTE effectively balances the model’s ability to predict both classes. For example, histopathology recall for class 1 improves from 0.03 to 0.79 in Case 1, while mammogram and ultrasound show increases from 0.03 to 0.74 and 0.06 to 0.85, respectively. These results highlight SMOTE’s effectiveness in enhancing the detection of metastasis cases. Additionally, the overall accuracy of image modalities improves substantially. Histopathology’s accuracy rises from 0.72 to 0.86, mammogram’s from 0.69 to 0.82, and ultrasound’s from 0.74 to 0.85 in Case 1.
Table 4 Early fusion performance with imbalanced data, using various classifiers with combinations of generated images with generated text
Table 5 Early fusion performance with SMOTE, using various classifiers with combinations of generated images with generated text
The F1-scores for class 1 also show significant improvements, indicating a more balanced performance between the two classes. Histopathology’s F1-score for class 1 increases from 0.05 to 0.84, mammogram’s from 0.05 to 0.82, and ultrasound’s from 0.11 to 0.83 in Case 1. These findings demonstrate that SMOTE is highly effective in addressing class imbalance, significantly improving the performance of image modalities for metastasis prediction. The results emphasize the importance of balancing datasets to enhance the predictive capabilities of models, particularly when working with medical imaging data.
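For readers unfamiliar with SMOTE, the core idea can be sketched in a few lines: each synthetic minority sample is placed at a random position on the line segment between a real minority sample and one of its nearest minority-class neighbours. The function below is a simplified illustration under that assumption, not the implementation used in the study; the name `smote_sketch`, the toy feature vectors, and the choice of `k` are all hypothetical. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would typically be used instead.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D feature vectors: 6 majority samples, 3 minority samples.
majority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.4], [0.4, 0.2]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Generate just enough synthetic points to equalise the class sizes.
new_points = smote_sketch(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class and are not exact duplicates of existing samples.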
Early Fusion
Tables 4 and 5 provide a comparative analysis of early fusion performance using various classifiers and combinations of generated images and text. Table 4 presents the results with imbalanced data, while Table 5 demonstrates the impact of applying SMOTE to address class imbalance in early fusion scenarios. These tables offer insights into how class imbalance affects early fusion performance and how SMOTE can enhance the detection of the minority class.
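As a rough illustration of what early fusion means here, the sketch below joins image and text features at the feature level before any classifier is trained. It assumes that fusion is done by simple concatenation, which is a common form of early fusion but is not confirmed by this section; the function name `early_fusion` and the tiny feature vectors (standing in for image features and BERT text embeddings) are hypothetical.

```python
def early_fusion(image_features, text_features):
    """Concatenate per-sample image and text feature vectors into one vector."""
    if len(image_features) != len(text_features):
        raise ValueError("both modalities must describe the same samples")
    return [list(img) + list(txt) for img, txt in zip(image_features, text_features)]


# Hypothetical features for 2 samples: 3 image dimensions and 4 text dimensions.
image_features = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
text_features = [[1.0, 0.0, 0.5, 0.2], [0.3, 0.9, 0.1, 0.7]]

fused = early_fusion(image_features, text_features)
print(len(fused), len(fused[0]))
```

The fused vectors would then be passed to a single classifier (XGBoost is one of those reported in this section), so the classifier sees both modalities at once.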
In the imbalanced data scenario (Table 4), early fusion shows significant limitations across all modality combinations, particularly in detecting the minority class (class 1). The recall values for class 1 are extremely low, highlighting the challenges of class imbalance. For instance, Histopathology+BERT achieves a recall for class 1 ranging from 0.06 to 0.32, while Mammogram+BERT and Ultrasound+BERT show recall values ranging from 0.00 to 0.35 and 0.06 to 0.41, respectively. These results indicate that the models struggle to identify metastasis cases effectively when the data is imbalanced.
The F1-scores for class 1 are also consistently low across all modality combinations. For example, Histopathology+BERT achieves F1-scores of 0.29 to 0.41 for class 1, while Mammogram+BERT and Ultrasound+BERT show F1-scores ranging from 0.21 to 0.49 and 0.20 to 0.53, respectively. This further emphasizes the difficulty in detecting the minority class under imbalanced conditions.
Different classifiers exhibit varying performance, but all struggle with minority class detection. The best performance is achieved by XGBoost with Ultrasound+BERT, which attains an accuracy of 0.81. However, even this combination shows limitations in recall and F1-score for class 1, underscoring the pervasive impact of class imbalance on early fusion performance.
The application of SMOTE, as shown in Table 5, leads to a dramatic improvement in early fusion performance. SMOTE effectively balances the performance between the majority class (class 0) and the minority class (class 1), significantly enhancing the detection of metastasis cases. The recall for class 1 increases substantially across all modality combinations. For example, Histopathology+BERT shows an improvement in recall for class 1 from 0.06–0.32 to 0.75–0.93, while Mammogram+BERT and Ultrasound+BERT demonstrate increases to 0.78–0.93 and 0.77–0.98, respectively.
The overall accuracy of early fusion also improves significantly with SMOTE. Histopathology+BERT achieves an accuracy increase from 0.65–0.77 to 0.75–0.89, Mammogram+BERT improves from 0.64–0.80 to 0.75–0.89, and Ultrasound+BERT rises from 0.65–0.81 to 0.73–0.90. These improvements highlight the effectiveness of SMOTE in enhancing the predictive capabilities of early fusion models.
Furthermore, the F1-scores become much more balanced between class 0 and class 1. For class 1, F1-scores improve from 0.20–0.53 to 0.72–0.90, while class 0 F1-scores remain strong, ranging from 0.76 to 0.91. This demonstrates that SMOTE not only improves the detection of the minority class but also maintains robust performance for the majority class.
In conclusion, the results from Tables 4 and 5 illustrate the significant impact of class imbalance on early fusion performance and the effectiveness of SMOTE in addressing these challenges. By balancing the dataset, SMOTE enables early fusion models to achieve more accurate and reliable predictions, particularly for the critical task of metastasis detection.
MCGA Fusion
Tables 6 and 7 present a comparative analysis of the performance of Mutual Co-Guided Attention (MCGA) fusion using various classifiers combined with generated images and text. Table 6 highlights the results with imbalanced data, while Table 7 demonstrates the improvements achieved by applying SMOTE to address class imbalance in MCGA fusion. These tables provide insights into how class imbalance affects MCGA fusion and how SMOTE can enhance its performance, particularly for detecting the minority class.
Table 6 Mutual Co-Guided Attention (MCGA) fusion performance with imbalanced data, using various classifiers with combinations of generated images with generated text
In the imbalanced data scenario (Table 6), MCGA fusion shows significant limitations, particularly in detecting the minority class (class 1). The recall values for class 1 are extremely low across all modality combinations, reflecting the challenges posed by class imbalance. For instance, Histopathology+BERT achieves a recall for class 1 ranging from 0.15 to 0.35, while Mammogram+BERT and Ultrasound+BERT show recall values ranging from 0.00 to 0.53 and 0.03 to 0.32, respectively. These results indicate that the models struggle to identify metastasis cases effectively when the data is imbalanced.
The F1-scores for class 1 also remain low across all modality combinations. For example, Histopathology+BERT achieves F1-scores of 0.16 to 0.47 for class 1, while Mammogram+BERT and Ultrasound+BERT show F1-scores ranging from 0.00 to 0.55 and 0.05 to 0.38, respectively. This further underscores the difficulty in detecting the minority class under imbalanced conditions.
Despite the poor performance for class 1, the models exhibit strong performance for the majority class (class 0). Precision for class 0 ranges from 0.70 to 0.83 across modalities, while recall ranges from 0.76 to 1.00. This highlights the model’s tendency to favor the majority class, which is a common issue in imbalanced datasets.
Table 7 Mutual Co-Guided Attention (MCGA) fusion performance with SMOTE, using various classifiers with combinations of generated images with generated text
The application of SMOTE, as shown in Table 7, leads to significant improvements in the performance of MCGA fusion. SMOTE effectively balances the performance between the majority class (class 0) and the minority class (class 1), significantly enhancing the detection of metastasis cases. The recall for class 1 increases dramatically across all modality combinations. For example, Histopathology+BERT shows an improvement in recall for class 1 from 0.15–0.35 to 0.71–0.87, while Mammogram+BERT and Ultrasound+BERT demonstrate increases to 0.67–0.90 and 0.71–0.90, respectively.
The overall accuracy of MCGA fusion also improves substantially with SMOTE. Histopathology+BERT achieves an accuracy increase from 0.59–0.77 to 0.74–0.88, Mammogram+BERT improves from 0.67–0.78 to 0.70–0.88, and Ultrasound+BERT rises from 0.69–0.80 to 0.72–0.90. These improvements highlight the effectiveness of SMOTE in enhancing the predictive capabilities of MCGA fusion models.
Furthermore, the F1-scores become much more balanced between class 0 and class 1. For class 1, F1-scores improve from 0.00–0.55 to 0.68–0.91, while class 0 F1-scores remain strong, ranging from 0.74 to 0.91. This demonstrates that SMOTE not only improves the detection of the minority class but also maintains robust performance for the majority class.
Thus, the results from Tables 6 and 7 illustrate the significant impact of class imbalance on MCGA fusion performance and the effectiveness of SMOTE in addressing these challenges. By balancing the dataset, SMOTE enables MCGA fusion models to achieve more accurate and reliable predictions, particularly for the critical task of metastasis detection.