While it is almost always possible to calculate a TD50 or benchmark dose, in many cases the values are not meaningful, in that they do not represent a biologically relevant response. For TD50s this typically occurs when there is no response (no carcinogenic effects were observed in the respective experiments), in which case the TD50 model returns an arbitrarily large number. For the TD50 model, an AIC test (Akaike 1973) is used to identify cases where no response is present, and these values are then excluded from further analysis. Benchmark doses calculated using the current version of PROAST do not provide the explicit response curve necessary for calculating a log-likelihood, making an equivalent test impossible. Additionally, PROAST can fail to converge on some responses. While a BMD analysis results in BMDs and BMDLs derived from the dose–response data, it also serves as a tool for evaluating the (statistical) quality of the data. Criteria for data rejection are currently under discussion in various groups, including EFSA (Benford et al. 2010). It has been suggested that the ratio of BMD to BMDL might serve as a measure for this and that the data should be rejected if this ratio exceeds a particular value [e.g. 10 or 100 (Barlow et al. 2006)]. Based on this, we used a convergence threshold of 100 for the ratio of BMDU to BMDL (more stringent than the BMD to BMDL ratio), with benchmark doses only being accepted if the upper confidence limit was less than 100 times the lower confidence limit and the response had passed the AIC test using the TD50 model. We thus categorise responses into three types depending on these criteria:
1. Not significant: responses which show no significant response (under the TD50 model), whether or not the benchmark dose model converged.
2. Un-converged: responses which were significant, but PROAST was unable to find a good solution for the benchmark dose.
3. Converged: responses which were significant and PROAST was able to find a good benchmark dose.
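The three categories can be expressed as a simple decision rule. The following is a minimal sketch only: the function name, the argument names and the use of `None` to represent a failed PROAST fit are illustrative assumptions, not part of the analysis pipeline described here.

```python
from typing import Optional

# BMDU/BMDL limit stated in the text; benchmark doses are accepted below it.
RATIO_THRESHOLD = 100.0


def categorise(significant: bool,
               bmdl: Optional[float],
               bmdu: Optional[float],
               threshold: float = RATIO_THRESHOLD) -> str:
    """Assign a response to one of the three categories.

    significant: result of the AIC test under the TD50 model.
    bmdl, bmdu:  lower/upper confidence limits of the benchmark dose,
                 or None if no fit was obtained (assumed representation).
    """
    if not significant:
        # Category 1 applies whether or not the BMD model converged.
        return "not significant"
    if bmdl is None or bmdu is None or bmdl <= 0:
        # No usable confidence limits were returned.
        return "un-converged"
    if bmdu / bmdl < threshold:
        return "converged"
    return "un-converged"
```

For example, a significant response with BMDL = 1 and BMDU = 50 would be classed as converged, whereas the same response with BMDU = 500 would not.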
A breakdown of the categories is given in Table 1.
Table 1 Comparison of model fits for the TD50 and BMD models
Comparison of individual responses
Analysis of individual responses was performed on the converged results. For the calculation of an acceptable intake (AI), linear extrapolation from some point of departure is commonly used. In line with this, the TD50 was linearly extrapolated downwards to a TD10 to give an estimate comparable to the BMD10. It should be noted that, because linear extrapolation is non-conservative relative to the true TD10, the extrapolated TD10 over-estimates the true value under the TD50 model (that is, under-estimates the potency) by approximately 31% (see Supplementary Material 1 for calculations). However, while not an accurate estimate of the TD10, the extrapolated value is in line with common use under ICH M7 and so represents the relevant value in a regulatory context.
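The size of this over-estimation can be reproduced with a short calculation. The sketch below assumes that the TD50 model takes the one-hit form P(d) = 1 − exp(−b·d); this functional form is an assumption made for illustration, and the authoritative derivation is the one in Supplementary Material 1.

```python
import math

# Assumed one-hit model: P(d) = 1 - exp(-b * d), so the dose giving
# incidence p is TD_p = -ln(1 - p) / b and the rate b cancels in ratios.
true_td10_over_td50 = math.log(1 / 0.9) / math.log(2)

# Linear extrapolation from 50% to 10% incidence divides the TD50 by 5.
extrapolated_td10_over_td50 = 10 / 50

# Relative over-estimation of the extrapolated TD10 versus the true TD10.
overestimate = extrapolated_td10_over_td50 / true_td10_over_td50 - 1
print(f"Over-estimation: {overestimate:.1%}")
```

Under this assumption the true TD10 is about 0.152 times the TD50, compared with 0.2 times the TD50 from linear extrapolation, which corresponds to the figure of approximately 31% quoted above.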
Both individual TD50s and BMDs appear similarly distributed (p = 0.85 based on a Kolmogorov–Smirnov test of converged results), showing good agreement between the models on aggregate.
Individual responses showed a high level of agreement between the TD10 values and the BMD10s, with an r² value of 0.94. A best-fit line was calculated using orthogonal least-squares on the logged values, which found a close match with ln(BMD10) = 0.96 ln(TD10) − 0.18 (Fig. 4).
Fig. 4 Comparison of response-level potencies using BMD10 and TD10 values. The best-fit line is fitted against the (natural) log of these values for the converged responses. The importance of model convergence in robust potency estimation is illustrated by the poor correlation obtained from unconverged results relative to the agreement of the data to the best-fit line for the converged values. Potencies have been adjusted for experiment duration vs. lifetime
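For readers unfamiliar with orthogonal least squares, the fit minimises perpendicular distances to the line rather than vertical ones, so neither potency estimate is treated as the error-free variable. The sketch below uses the standard closed-form solution; the function name and the synthetic input values are illustrative assumptions and do not reproduce the study data or the software used for the published fit.

```python
import math
from typing import Sequence, Tuple


def orthogonal_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the orthogonal least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    # Closed-form slope minimising the summed squared perpendicular distances.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic TD10/BMD10 pairs (hypothetical values), fitted on the natural log scale.
td10 = [0.01, 0.1, 1.0, 10.0, 100.0]
bmd10 = [0.012, 0.08, 0.9, 7.5, 70.0]
slope, intercept = orthogonal_fit([math.log(v) for v in td10],
                                  [math.log(v) for v in bmd10])
```

Unlike ordinary least squares, the fitted line is the same whichever potency estimate is placed on the horizontal axis, which suits a comparison of two methods that are both subject to estimation error.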
While the overall agreement was high, there was significant variation around the best-fit line, with the log-difference in estimates appearing non-normal (p = 0.002) and heavy tailed, with a mean error of 0.40 log10 units or an approximately 2.5-fold relative error (Fig. 5).
Fig. 5 Comparison of the difference between BMD10 and TD10 potency estimates. The distribution of the difference in potencies is heavy tailed with a mean error of 0.40 log10 units (an approximately 2.5-fold relative error). Additionally, the unconverged BMDs (BMD ratio > 100) are highly skewed towards low (high potency) estimates. “Error factor” is the fold-change of error between BMD10 and TD10 values, so a value of 10² would indicate a BMD 100 times greater than the corresponding TD10
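The conversion between log10 units and fold-change can be checked directly. In the sketch below, interpreting the "mean error" as the mean absolute difference of the log10 potencies is an assumption, and the function name is illustrative.

```python
import math
from typing import Sequence


def mean_abs_log10_error(bmd10: Sequence[float], td10: Sequence[float]) -> float:
    """Mean absolute difference between paired potencies on the log10 scale.

    Treating the reported mean error as a mean absolute log10 difference
    is an assumption made for this illustration.
    """
    diffs = [abs(math.log10(b) - math.log10(t)) for b, t in zip(bmd10, td10)]
    return sum(diffs) / len(diffs)


# A difference of 0.40 log10 units corresponds to this fold-change.
fold_change = 10 ** 0.40
print(f"{fold_change:.2f}-fold")
```

An error of 0.40 log10 units therefore means that a typical pair of estimates differs by a factor of about 2.5 in either direction.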
Potency comparisons summarised at compound-level
Summary values for each compound were calculated in line with the Carcinogenic Potency Database (CPDB) and Lhasa Carcinogenicity Database (LCDB) methods. Studies were grouped according to the species of test subject in which the compound was tested. For each study that showed an increase in tumour formation, the most potent TD50 was selected. The harmonic mean of the most potent TD50s across each species was then used as a representative summary value for that compound within the species.
Compound summary TD10s could be calculated for 53 of the 55 analysed compounds, with two compounds showing no significant responses. Summary BMDs were calculated for 41 compounds, with the remaining compounds having no converged findings (Table 1).
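The summarisation procedure described above can be sketched as follows. The record layout (species, study identifier, TD50) and the function name are illustrative assumptions; only positive responses are expected as input, and "most potent" is taken to mean the lowest TD50 within a study.

```python
from collections import defaultdict
from statistics import harmonic_mean
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (species, study identifier, TD50).
Record = Tuple[str, str, float]


def compound_summary(records: Iterable[Record]) -> Dict[str, float]:
    """Harmonic mean, per species, of the most potent TD50 from each study."""
    most_potent: Dict[Tuple[str, str], float] = {}
    for species, study, td50 in records:
        key = (species, study)
        # The most potent response in a study is the one with the lowest TD50.
        most_potent[key] = min(td50, most_potent.get(key, td50))

    by_species = defaultdict(list)
    for (species, _study), td50 in most_potent.items():
        by_species[species].append(td50)

    return {species: harmonic_mean(values)
            for species, values in by_species.items()}


# Hypothetical example: two rat studies and one mouse study.
example = [("rat", "study 1", 2.0), ("rat", "study 1", 4.0),
           ("rat", "study 2", 6.0), ("mouse", "study 3", 5.0)]
summary = compound_summary(example)
```

Because the harmonic mean is dominated by the smallest values, the summary is weighted towards the more potent studies within each species.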
While there was a high correlation between compound summary BMD10s and TD10s (r² = 0.905), the benchmark doses do not follow the same distribution as the TD10s. A best-fit line using orthogonal least squares shows BMDs on average predict a lower potency but with higher variation in results [ln(BMD10) = 1.35 ln(TD10) + 1.54] (Fig. 6).
Fig. 6 Comparison of compound level potencies using the BMD10 and the TD10. The best-fit line is fitted against the (natural) log potencies. Compound-level BMD estimates only account for the converged responses
This is supported by the overall distributions which both appear log-normal (p = 0.476 and p = 0.471 for log TD10s and BMDs respectively). The TD10s show a lower (more potent) mean value with log(TD10) = –1.92 compared to –1.02 for the log(BMD10), and a lower dispersion with a standard deviation of 1.88 compared to 2.46 for the BMDs (Fig. 7).
Fig. 7 The distributions of compound-level potencies using TD10 or BMD10 methods. The benchmark dose approach yielded, on average, lower potency estimates (i.e., higher values) coupled with greater variation. Points show the 5th, 50th, and 95th percentiles
Acceptable intakes
One notable application of the benchmark dose is in the setting of acceptable intakes for nitrosamine impurities. Although partially superseded by the “Carcinogenic Potency Categorisation Approach” (CPCA) (EMA 2023; Kruhlak et al. 2024), a class limit of 18 ng has been proposed for nitrosamine intake based on the 5th percentile of nitrosamine TD50s from LCDB. For comparison, we present an equivalent limit calculated from both the TD10s and benchmark doses included in this analysis. Acceptable intakes have been calculated based on Eq. (5):
$$\text{AI}=\frac{\text{TD}_{10}}{10{,}000}\times 50\ \text{kg}\quad\text{or}\quad\text{AI}=\frac{\text{BMD}_{10}}{10{,}000}\times 50\ \text{kg}$$
(5)
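As a worked illustration of the acceptable intake calculation, the sketch below assumes that the point of departure (TD10 or BMD10) is expressed in mg/kg/day, that linear extrapolation from a 10% response to a 1 in 100,000 risk uses a divisor of 10,000, and that a 50 kg body weight applies, in line with common use under ICH M7. The function name and the input value are hypothetical.

```python
BODY_WEIGHT_KG = 50.0   # body weight assumed in the calculation
RISK_DIVISOR = 10_000   # 10% response scaled linearly to a 1 in 100,000 risk
NG_PER_MG = 1e6


def acceptable_intake_ng(pod10_mg_per_kg_day: float) -> float:
    """Acceptable intake in ng/day from a TD10 or BMD10 in mg/kg/day (assumed units)."""
    return pod10_mg_per_kg_day / RISK_DIVISOR * BODY_WEIGHT_KG * NG_PER_MG


# Hypothetical 5th-percentile TD10 of 0.0036 mg/kg/day
# (equivalent to a TD50 of 0.018 mg/kg/day divided by 5).
print(f"{acceptable_intake_ng(0.0036):.1f} ng/day")
```

With this hypothetical input the calculation returns 18 ng/day, which shows how a limit of the size quoted above follows from a 5th-percentile potency of that magnitude.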
It is important to note that we derive a slightly different limit from the TD50s to those previously published (Fine et al. 2023; Thomas et al. 2021). This likely occurred due to the exclusion of compounds for which we were unable to calculate a benchmark dose. Interestingly, however, the limit derived from the benchmark dose values is similar to that derived using the TD10s (Table 2).
Table 2 Comparison of acceptable intakes based on the 5th percentile of the TD10 and BMD10 distributions
Although the benchmark doses were less potent estimates on average, the increased variability means that the lower extreme of the BMD10 distribution lies below that of the TD10 distribution. By chance, the cross-over point where both models produce the same potency estimate is at the 5.78th percentile, very close to the 5th percentile used to calculate the acceptable intakes. This suggests that while a class-level acceptable intake is unlikely to change significantly with the wider acceptance of benchmark doses, at least in the case of nitrosamines, individual estimates used for read-across may change significantly.
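The location of the cross-over can be approximated from the summary statistics reported for the compound-level distributions. The sketch below is an idealisation that treats both log-potency distributions as exactly normal with the reported means and standard deviations; the function name is illustrative, and the published 5.78th percentile was presumably derived from the data themselves rather than from this approximation.

```python
from statistics import NormalDist


def crossover_percentile(mean_a: float, sd_a: float,
                         mean_b: float, sd_b: float) -> float:
    """Percentile at which two normal distributions have equal quantiles.

    Solves mean_a + sd_a * z = mean_b + sd_b * z for the standard normal
    deviate z and converts it to a percentile.
    """
    z = (mean_a - mean_b) / (sd_b - sd_a)
    return 100 * NormalDist().cdf(z)


# Reported log-potency summaries: TD10 (mean -1.92, SD 1.88)
# and BMD10 (mean -1.02, SD 2.46).
pct = crossover_percentile(-1.92, 1.88, -1.02, 2.46)
print(f"Cross-over at approximately the {pct:.1f}th percentile")
```

Under this approximation the cross-over falls at roughly the 6th percentile, in the same region as the reported value, and the result does not depend on the base of the logarithm because the means and standard deviations scale together.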