Through an extensive search that included reference cross-checking of relevant articles, a total of 4715 unique articles were collected. An initial screening, based only on the studies’ titles, led to the exclusion of 3946 articles. A subsequent evaluation of the remaining 769 articles, based on their abstracts, led to the removal of 666 articles. Among these, about 44% were excluded because they were not in English, presented non-original data, or were classified as case reports. Full-text assessment of the remaining 103 articles (Supplementary Table S5) resulted in the exclusion of another 74 articles. The main reason for exclusion at this stage was the lack of an adequate description of calcification morphology.
Ultimately, 29 studies met the strict inclusion criteria, which covered CMDs and associated clinicopathological factors in patients diagnosed with DCIS. The results section is further organized in three sections: (i) study characteristics, presenting an overview of the included studies and patient populations; (ii) results of synthesis, consolidating the extracted outcomes from the included studies; and (iii) assessment of bias using the QUIPS tool, evaluating the risk of bias across the selected studies.
Figure 1 outlines the approach used for the systematic literature search and subsequent study selection process.
Fig. 1 Overview of the Medline, EMBASE, and Web of Science literature search and selection process of eligible articles. The searches were performed on January 25, 2022. Note that 4713 articles were identified, of which 29 met our inclusion criteria. Abbreviations: DCIS, ductal carcinoma in situ; IBC, invasive breast cancer
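The screening counts reported in the text can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above (starting from the 4715 unique articles given in the text); the variable names are illustrative and not taken from the study.

```python
# Screening counts as reported in the text.
unique_articles = 4715
excluded_on_title = 3946
excluded_on_abstract = 666
excluded_on_full_text = 74

# Articles remaining after each screening stage.
after_title = unique_articles - excluded_on_title
after_abstract = after_title - excluded_on_abstract
included = after_abstract - excluded_on_full_text

print(after_title, after_abstract, included)  # 769 103 29
```

Each intermediate count matches the figures given in the text (769 abstracts screened, 103 full texts assessed, 29 studies included).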
Study characteristics
In all 29 studies, data were collected retrospectively from hospital registries, national registries, or clinical trials. The study designs were cohort (n = 27), case–control (n = 1), and case-cohort within a randomized controlled trial (n = 1). Twenty were single-center studies, and nine involved multiple centers. While 20 studies exclusively studied DCIS, nine studies included patients with both DCIS and IBC. The number of DCIS patients with calcifications per study ranged from 32 to 1783. Fifteen studies described calcification morphology according to the BI-RADS system, while the remaining 14 studies used non-BI-RADS descriptors. In ten studies, the lesions were described as only screen-detected, while in another ten studies, they were reported to be both screen-detected and non-screen-detected. The remaining studies did not specify the method of lesion detection. Thirteen studies specified using mammograms with calcifications only, without other mammographic abnormalities.
Table 1 shows the characteristics of the selected studies between 2000 and 2022.
Table 1 Characteristics of studies reporting on mammographic morphology of calcifications associated with clinicopathological factors
Reported clinicopathological factors in relation to CMDs
A total of 29 studies investigated 17 distinct factors concerning CMDs (Table 2), with 28 studies assessing non-prognostic outcomes, including high grade (n = 16), (micro)invasion (n = 8), (comedo)necrosis (n = 7), HER2 overexpression (n = 6), ER positivity (n = 6), age (n = 3), Ki67 or proliferation (n = 2), histological size (n = 2), neoductgenesis (n = 2), calcification distribution (n = 2), margin status (n = 1), comedocarcinoma (n = 1), multicentricity (n = 1), tenascin-C (n = 1), and Oncotype DX score (n = 1). Furthermore, five studies assessed prognostic outcomes including recurrence (n = 4) and DCIS progression to IBC (n = 1).
Table 2 Overview of clinicopathological factors that were assessed in the studies
Data synthesis and meta-analysis
Of the 17 clinicopathological factors reported across 29 studies, 14 were significantly associated with CMDs in at least one study (Table 2). A meta-analysis was conducted for five clinicopathological factors deemed sufficiently homogeneous across 20 studies (Fig. 2): high grade (n = 11), HER2 overexpression (n = 4), ER positivity (n = 4), (comedo)necrosis (n = 5), and the presence of (micro)invasion (n = 5). The meta-analysis shows the aggregated results for low-, intermediate-, and high-risk CMDs concerning the clinicopathological factors.
Fig. 2 The meta-analysis results for each clinicopathological factor in a forest plot. For the calcification morphology descriptor (CMD) risk groups, the pooled odds ratios (pORs), 95% confidence intervals and associated p-values are shown. Furthermore, associated heterogeneity measures (I², P(Q)) and publication bias (Egger’s p-value), as well as certainty of evidence summarized in the GRADE score are given. The low-risk CMDs served as a reference. Per CMD risk-group, details on the studies (number, number of calcified lesions, and number of cases) are given
High-risk CMDs demonstrated a significant association with four clinicopathological factors including high grade (pOR, 4.92; 95% CI, 2.64–9.17), (comedo)necrosis (pOR, 3.46; 95% CI, 1.29–9.30), (micro)invasion (pOR, 1.53; 95% CI, 1.03–2.27), and ER positivity (pOR, 0.33; 95% CI, 0.12–0.89). High-risk CMDs were negatively associated with ER positivity, indicating a reduced incidence of ER positivity in high-risk versus low-risk CMDs.
Intermediate-risk CMDs were significantly associated with high grade (pOR, 2.07; 95% CI, 1.44–2.96) and (comedo)necrosis (pOR, 2.58; 95% CI, 1.87–3.54), while showing an increased pOR of 1.66 (95% CI, 0.92–2.99) with a p-value of 0.09 for (micro)invasion.
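As an illustration of how the reported p-values relate to the confidence intervals, a two-sided p-value can be approximated from a pooled odds ratio and its 95% CI on the log scale. This is a minimal sketch assuming a Wald-type (normal) approximation; it is not necessarily the procedure used in the meta-analysis, and the function name `p_from_ci` is hypothetical.

```python
import math

def p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z statistic and two-sided p-value from an OR and its 95% CI.

    Assumes the CI is symmetric on the log-odds scale (Wald interval).
    """
    # Standard error of log(OR), recovered from the CI width.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Intermediate-risk CMDs and (micro)invasion, values taken from the text.
z, p = p_from_ci(1.66, 0.92, 2.99)
print(round(p, 2))  # 0.09
```

Under this approximation, the interval 0.92–2.99 around a pOR of 1.66 corresponds to p ≈ 0.09, in line with the value reported for (micro)invasion.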
Heterogeneity measures I² and P(Q) revealed inconsistencies in the estimates reported in the included studies concerning high grade (I², P(Q): 54%, p = 0.002 for high-risk and 47%, p = 0.04 for intermediate-risk CMDs), ER positivity (I², P(Q): 49%, p = 0.12 for high-risk CMDs), (comedo)necrosis (I², P(Q): 52%, p = 0.08 for high-risk CMDs), and (micro)invasion (I², P(Q): 55%, p = 0.06 for intermediate-risk CMDs).
Neither high-risk CMDs (pOR, 1.80; 95% CI, 0.28–11.46) nor intermediate-risk CMDs (pOR, 0.72; 95% CI, 0.19–2.82) were significantly associated with HER2. One contributing study by Zhou et al. [47] reported odds ratios below one, indicating a reduced risk. A considerable discrepancy existed between the odds ratios calculated from the different included studies, reflected in the heterogeneity measures, with I² > 83% and P(Q) ≤ 0.001.
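For readers less familiar with the heterogeneity measures, the sketch below shows the standard definitions of Cochran's Q and I² (I² = max(0, (Q − df)/Q), expressed as a percentage) computed from per-study log odds ratios and standard errors. The input values are hypothetical and chosen only for illustration; they are not data from the included studies, and the function name `cochran_q_i2` is an assumption.

```python
import math

def cochran_q_i2(log_ors, ses):
    """Return Cochran's Q and I² (in percent) for per-study log odds ratios."""
    # Inverse-variance weights and the fixed-effect pooled estimate.
    weights = [1 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I²: share of total variation attributable to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2

# Hypothetical example: four studies with odds ratios on both sides of one.
log_ors = [math.log(0.5), math.log(4.0), math.log(6.0), math.log(0.8)]
ses = [0.30, 0.35, 0.40, 0.30]
q, i2 = cochran_q_i2(log_ors, ses)
print(f"Q = {q:.2f}, I2 = {i2:.0f}%")
```

P(Q) is then obtained by comparing Q against a chi-square distribution with k − 1 degrees of freedom, where k is the number of studies. Odds ratios that fall on opposite sides of one, as in the HER2 analysis, inflate Q and therefore I².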
Egger’s test was not significant for high grade, providing no evidence of publication bias for this outcome. For the other outcomes, publication bias could not be determined because of the small sample sizes.
Certainty of evidence
According to the GRADE approach, the certainty of evidence for all outcomes can be rated as low (Supplementary Table S3), because the included studies assessed associations observationally. The calculated GRADE score denoted the level of insufficient evidence or bias across five domains (risk of bias according to the QUIPS tool, heterogeneity, indirectness, imprecision, and publication bias). The highest GRADE score, −11 (high-risk and intermediate-risk CMDs combined), was found for ER positivity and high grade, indicating the highest level of evidence among the outcomes. The GRADE scores for the other outcomes were as follows: invasion (−13), (comedo)necrosis (−14), and HER2 overexpression (−16). The next section evaluates the risk of bias across the selected studies in detail using the QUIPS tool.
Risk of bias per QUIPS domain
To further understand the reliability of the included studies, a thorough assessment of bias was conducted using the QUIPS tool. The risk of bias was assessed across five study domains, namely study participation, exposure measurement, outcome measurement, study confounding, and statistical analysis and reporting. For studies measuring prognostic outcomes, a sixth domain, study attrition, was also evaluated (Fig. 3).
Fig. 3 Risk of bias per QUIPS domain for each individual study with (a) non-prognostic outcome(s) and (b) prognostic outcome(s)
Across the 29 studies, five out of 170 individually rated QUIPS domains exhibited a low-high discrepancy between the two reviewers in their rating of bias. Following consultation with the third reviewer (A.W.B.D.), these domains were assigned a moderate risk of bias rating. This suggests that the discrepancies in the assessment of bias using the QUIPS tool were limited.
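One way to arrive at the total of 170 rated domains is sketched below. It assumes that non-prognostic and prognostic outcomes were rated separately, as the two panels of Fig. 3 suggest, with five domains per study for non-prognostic outcomes and six for prognostic outcomes; this breakdown is an inference from the reported numbers, not a stated method.

```python
# Study counts as reported in the text.
non_prognostic_studies = 28
prognostic_studies = 5

# Assumed number of QUIPS domains rated per study and outcome type.
domains_non_prognostic = 5  # no attrition domain
domains_prognostic = 6      # includes study attrition

total_domains = (non_prognostic_studies * domains_non_prognostic
                 + prognostic_studies * domains_prognostic)
discrepant = 5

print(total_domains)                               # 170
print(round(100 * discrepant / total_domains, 1))  # 2.9
```

Under this assumption, the five low-high discrepancies correspond to roughly 3% of all rated domains.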
The study participation domain revealed eight studies with a high risk of bias and 15 with a moderate risk in either prognostic or non-prognostic outcomes. The downgrading of studies was primarily attributed to small sample sizes and inadequate description of study groups, data collection criteria, and methods or reasons for missing data.
In the exposure domain, seven studies exhibited a high risk of bias, while 13 demonstrated a moderate risk. The downgrading mainly resulted from situations where only one reader determined the CMDs, or when crucial details were omitted, such as whether the readers were blinded to the outcome and how consensus was achieved between readers.
The outcome measurement domain indicated three studies with a high risk of bias and 11 studies with a moderate risk. High-risk studies were characterized by a severe lack of detail regarding the definition and method of measuring the outcome variable. Moderate-risk studies contained insufficient information on the measurement of outcome variables and, if applicable, blinding of reviewers.
With regard to the confounding domain, most individual studies did not adjust their results for potential confounders. Five studies were rated as having a high risk of bias in this domain and 18 as having a moderate risk of bias. High-risk studies failed to account for potential confounding through matching, stratification, or the initial assembly of comparable groups. Prognostic studies were rated as moderate or high risk when they did not adjust for treatment or age in their statistical analyses. Studies with a design that somewhat limited the risk of confounding were rated as moderate risk.
The statistical analysis and reporting domain predominantly displayed a low risk of bias. However, in thirteen studies, this domain was rated as moderate because the analysis was insufficiently powered to test the hypothesis. This occasionally occurred for individual CMD groups, e.g., when chi-square tests were applied to small sample sizes.
In the study attrition domain, three prognostic studies were rated as having a moderate risk of bias because the follow-up or characteristics of women who completed the study and those who did not were not described.
Notably, the average risk of bias was significantly higher in the exposure measurement (p = 0.01) and confounding (p = 0.025) domains for studies published before 2010 compared to those published after 2010.