Fully automated computational measurement of noise in positron emission tomography

Study population

Thirty-eight patients who underwent clinically indicated [18F]FDG PET/CT imaging between March and April 2021 were retrospectively selected. There were no specific inclusion criteria except the full availability of clinical and imaging data. The patients included in the study were part of an earlier investigation by our institution evaluating an unrelated image quality classifier (currently under review). Written informed consent for the scientific use of medical data was obtained from all patients. The study was approved by the local ethics committee (BASEC 2021–00444, Cantonal Ethics Committee Zürich, Switzerland).

PET acquisition and reconstruction

Examinations were performed on a latest-generation six-ring digital detector PET/CT scanner (Discovery MI Gen 2, GE Healthcare). A body mass index (BMI)-adapted [18F]FDG dosage protocol was used as outlined in detail elsewhere [6]. To generate standardized uptake value (SUV) images with increasing noise levels, five datasets were reconstructed from each exam by unlisting list-mode data, resulting in reduced emission counts equivalent to acquisition times of 120 s, 90 s, 60 s, 30 s, and 15 s per bed position. For each patient, 6–8 bed positions were acquired (depending on patient size), with an overlap of 23% (17 slices) [6]. Images were reconstructed with a proprietary reconstruction kernel using block sequential regularized expectation maximization (Q.Clear, GE Healthcare) with beta values of 450 and 600, as suggested in a previous study [6]. Proprietary image analysis software (Advantage Workstation Version 4.7, GE Healthcare) was used to generate maximum intensity projection (MIP) images in anteroposterior orientation.

Manual assessment of image noise and image quality

One reader (A.G., board-certified radiologist with 6 years of experience in diagnostic imaging) measured the voxel-wise standard deviation within semi-automated cubic VOIs (2 × 2 × 2 cm³) in the right liver lobe and in lung parenchyma, avoiding focal lesions and vasculature. Two readers (M.M. and S.S., board-certified radiologists and/or nuclear medicine physicians with 9 and 6 years of experience in diagnostic imaging, respectively) reviewed all MIP images per patient in consensus. Each image was assigned the label “sufficient image quality” if both readers rated the image quality as sufficient, and the label “insufficient image quality” if at least one reader rated the image quality as insufficient. Readers were blinded to image reconstruction settings during the readout.

Automated measurement of image noise

An automated algorithm previously used for global image noise measurements in clinical CT examinations [12, 13, 14] was adapted for the analysis of SUV images from [18F]FDG PET. This algorithm builds on an approach originally described by Christianson et al. [15] and was recently implemented in the open-source statistics programming language R (version 4.1.0, R Foundation for Statistical Computing) [16]. On both CT images and SUV images of PET, the standard deviation of pixel/voxel values in a given region is taken as the measure of noise [6, 12, 13]. Thus, the exact approach of the original algorithm designed for CT imaging may also be used for SUV images of PET. A visual representation of the method is provided in Fig. 1. In brief, the original SUV images from [18F]FDG PET are first subjected to a thresholding procedure, in which all voxels that are not part of patient tissue are excluded (part A of Fig. 1). This procedure is performed slice-by-slice as the algorithm loops through all images of a dataset (part B, left side of Fig. 1). Then, to generate so-called noise maps, the SUV images are resampled slice-by-slice to a lower matrix size so that each new pixel (a so-called macro pixel) in a resampled image (i.e., noise map) contains information from 64 pixels (i.e., 8 × 8 pixels) of the original SUV image. Importantly, however, the value assigned to each of these macro pixels in the noise maps corresponds to the standard deviation of the SUV values of the 64 pixels contained in the original SUV image. Thus, each noise map contains locally resolved standard deviation values of the original SUV images, which should provide an accurate representation of the image noise (part B, right side of Fig. 1).
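The noise-map step can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the authors' R implementation. The function name `noise_map`, the `threshold` parameter, the rule that a macro pixel is discarded as soon as one of its 64 voxels falls at or below the threshold, and the use of the sample standard deviation are all assumptions made for illustration; the text does not specify these details.

```python
from statistics import stdev


def noise_map(slice_suv, block=8, threshold=0.0):
    """Return a 2D list of per-block standard deviations for one SUV slice.

    slice_suv : 2D list of SUV values (rows x columns).
    block     : edge length of a macro pixel (8 gives 8 x 8 = 64 voxels).
    threshold : hypothetical SUV cutoff separating tissue from background.
    Blocks that contain any voxel at or below the threshold are set to None
    (an assumption; the handling of partially covered blocks is not stated).
    """
    rows, cols = len(slice_suv), len(slice_suv[0])
    result = []
    for r0 in range(0, rows - block + 1, block):
        result_row = []
        for c0 in range(0, cols - block + 1, block):
            values = [
                slice_suv[r][c]
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)
            ]
            if all(v > threshold for v in values):
                # Sample standard deviation (n - 1), chosen here for illustration.
                result_row.append(stdev(values))
            else:
                result_row.append(None)
        result.append(result_row)
    return result
```

Applied to every slice of a dataset, this yields one noise map per slice, with `None` marking macro pixels outside the patient.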

Fig. 1

Illustration of the generation of the Global Noise Index (GNI). A shows a maximum intensity projection image. To calculate the GNI, the whole imaging volume is subjected to further processing. B shows representative transversal image slices of the imaging volume at 4 different locations and the corresponding noise maps. C shows the distribution of noise values across the whole imaging volume. Specifically, the histogram is generated by considering all noise values from each image slice. The mode value of the histogram corresponds to the GNI, a single global surrogate parameter of image noise across the whole imaging volume of a given imaging dataset.

From these noise maps (i.e., one noise map per slice), a histogram of the noise distribution across the whole patient is computed. Importantly, all noise values from each noise map of each slice are considered for the computation of the histogram (part C of Fig. 1). From this histogram (typically showing a right-skewed distribution), the mode value is extracted, representing the global image noise level (the so-called Global Noise Index, GNI). Notably, the histogram is right-skewed because noise sharply increases at anatomical borders (for example, if the standard deviation of SUV values is computed across a bordering area such as thoracic wall and lung tissue). Consequently, the few pixels in the noise maps that cover anatomical borders will have very high noise values. However, because most pixels cover homogeneous tissue, in which noise should be relatively lower, the histogram rises sharply at lower noise values. Thus, by using the mode value, the noise distribution in homogeneous tissue is accurately and effectively represented by the GNI (part C of Fig. 1).
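The extraction of the mode can be sketched as follows, again in Python and not as a reproduction of the R code used in the study. The function name `global_noise_index` and the `bin_width` parameter are hypothetical; the text does not state how the histogram is binned. Noise maps are assumed to be 2D lists in which excluded macro pixels are stored as `None`.

```python
def global_noise_index(noise_maps, bin_width=0.01):
    """Return the mode of the pooled noise histogram (center of the fullest bin).

    noise_maps : iterable of 2D lists, one per slice; None marks excluded pixels.
    bin_width  : hypothetical histogram bin width in SUV units.
    """
    counts = {}
    for noise_map in noise_maps:
        for row in noise_map:
            for value in row:
                if value is None:
                    continue  # background or otherwise excluded macro pixel
                bin_index = int(value // bin_width)
                counts[bin_index] = counts.get(bin_index, 0) + 1
    if not counts:
        raise ValueError("no valid noise values to build a histogram from")
    # The fullest bin is the mode; on ties the first bin encountered wins.
    mode_bin = max(counts, key=counts.get)
    return (mode_bin + 0.5) * bin_width
```

Because only the fullest bin matters, the few very high values from anatomical borders in the right tail do not shift the result, which is the property the paragraph above relies on.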

Notably, for the GNI as computed in the current study, all noise values from each noise map of each image slice are considered. However, theoretically, the GNI could also be computed from individual image slices (Fig. 2). While not investigated in our study, this would allow the user to focus on the noise levels of individual anatomical regions.

Fig. 2

Illustration of the Global Noise Index (GNI) as computed slice-wise. A shows a coronal image slice. B shows the GNI as calculated separately for each image slice. While not considered in the current study, a slice-wise computation of the GNI would allow the user to analyze specific anatomical regions in terms of their image noise level.

Statistical analysis

All statistical analyses were performed in the open-source statistics programming language R (version 4.1.0, R Foundation for Statistical Computing) [16]. Categorical variables are expressed as frequency distributions. Continuous variables are presented as mean ± standard deviation. As absolute values may differ between GNI and manual noise measurements, we quantified whether the distribution of noise values was similar between GNI and manual measurements irrespective of the absolute values. To this end, noise values from GNI and manual measurements were first standardized (z-scoring). After standardization, two-sample Kolmogorov–Smirnov tests modified for paired data were computed to compare the distribution of noise values between GNI and manual measurements of the liver and lung.
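The standardization and the comparison of distributions can be illustrated with the following Python sketch. It computes only the classical two-sample Kolmogorov–Smirnov statistic D, i.e., the largest gap between the two empirical distribution functions. The modification for paired data used in the study is not described in the text and is not reproduced here, and no p-value is derived. The function names are hypothetical.

```python
from bisect import bisect_right
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to zero mean and unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def ks_statistic(sample_a, sample_b):
    """Classical two-sample KS statistic: max distance between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

With hypothetical lists `gni_values` and `liver_sd_values`, the comparison would be `ks_statistic(z_scores(gni_values), z_scores(liver_sd_values))`. After z-scoring, two measurements that differ only by a linear rescaling yield D = 0, which is why standardization removes the dependence on absolute values.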

To further benchmark the GNI relative to manual measurements, we quantified the correlation between GNI and manual measurements in liver and lung parenchyma (i.e., without prior standardization of noise values) by computing Spearman’s rank correlation coefficients. Coefficients were interpreted according to Chan [17, 18] as follows: at least 0.8 very strong, 0.6 up to 0.8 moderately strong, 0.3 to 0.5 fair, less than 0.3 poor. To assess whether the GNI can differentiate between sufficient and insufficient image quality as determined subjectively by expert readers, receiver operating characteristic analysis was performed. The area under the curve (AUC) was computed, and sensitivity and specificity were calculated at the cutoff value maximizing Youden’s index. Two-sided p-values of < 0.05 were considered statistically significant.
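Both analyses can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the R code used in the study: the function names are hypothetical, "insufficient image quality" is assumed to be the positive class, and a higher GNI is assumed to indicate insufficient quality. Significance testing is omitted.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ssx = sum((a - mx) ** 2 for a in rx)
    ssy = sum((b - my) ** 2 for b in ry)
    return cov / (ssx * ssy) ** 0.5


def roc_auc_and_youden(gni, insufficient):
    """Return (AUC, (J, cutoff, sensitivity, specificity)) at the maximal J.

    gni          : list of GNI values, one per image.
    insufficient : list of booleans, True if rated insufficient (assumed positive).
    An image is called positive when its GNI is >= the cutoff (assumption).
    """
    pos = [g for g, y in zip(gni, insufficient) if y]
    neg = [g for g, y in zip(gni, insufficient) if not y]
    # AUC as the chance that a positive exceeds a negative; ties count half.
    auc = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc /= len(pos) * len(neg)
    best = None
    for cutoff in sorted(set(gni)):
        sensitivity = sum(p >= cutoff for p in pos) / len(pos)
        specificity = sum(n < cutoff for n in neg) / len(neg)
        j = sensitivity + specificity - 1  # Youden's index
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return auc, best
```

Only observed GNI values are tried as cutoffs, which is sufficient because sensitivity and specificity change only at those values.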
