Data from 51 head and neck (H&N) VMAT plans were used, with 43, 1 and 7 cases as the training, validation, and testing datasets, respectively. The VMAT plans were optimized using the direct machine parameter optimization (DMPO) method of the Pinnacle system. The MLC coordinates and MU at each CP were exported. The FM was derived by accumulating the fluence between every two sequential CPs. The corresponding dose was also exported. Each plan contained two full arcs. With a CP spacing of 3° or 4°, the number of CPs was 241 or 181, and the number of FMs and corresponding doses was 240 or 180, respectively. In total, 7182, 180 and 1260 samples were included in the training, validation, and testing datasets.
Data preparation
FM accumulation
Figure 1 illustrated the FM accumulation between two sequential CPs with a simple example. In the example, the leading MLC leaf was static, and the tracking MLC leaf moved from the position plotted as a solid line to the position plotted as a dashed line. More complicated cases could be readily deduced from this example. Since the MLC movement between two sequential CPs was small, we assumed for simplicity that the MLC moved at a uniform speed. The intensity at any position (x) was calculated as:
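$$I(x)=\left\{ \begin{array}{ll} 1, & x_{0} \leqslant x < x_{1} \\ \dfrac{x_{2}-x}{x_{2}-x_{1}}, & x_{1} \leqslant x < x_{2} \\ 0, & x \geqslant x_{2} \end{array} \right.$$
(1)
where $x_{0}$ is the position of the static leading leaf, and $x_{1}$ and $x_{2}$ ($x_{1} < x_{2}$) bound the interval swept by the tracking leaf between the two CPs.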
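The piecewise intensity of Eq. (1) can be sketched for a single leaf pair as follows. This is a minimal illustration, not the authors' MATLAB implementation: the names `x0`, `x1`, `x2` and the orientation of the axis (static leaf edge at the low-x side, moving edge sweeping the interval from `x1` to `x2`) are assumptions made for the example.

```python
import numpy as np

def ramp_intensity(x, x0, x1, x2):
    """Relative intensity at positions x for one leaf pair between two CPs.

    Assumed geometry (hypothetical names): the static leaf edge sits at x0 and
    the moving leaf edge travels at uniform speed over [x1, x2), with
    x0 <= x1 < x2. Positions open for the whole interval receive 1, positions
    swept by the moving leaf receive the fraction of time they were open,
    and all other positions receive 0.
    """
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    out[(x >= x0) & (x < x1)] = 1.0                   # open during the whole interval
    swept = (x >= x1) & (x < x2)                      # partially exposed region
    out[swept] = (x2 - x[swept]) / (x2 - x1)          # linear ramp from 1 down to 0
    return out

# Example: static edge at 0 mm, moving edge sweeping from 1 mm to 2 mm
profile = ramp_intensity([-1.0, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0], 0.0, 1.0, 2.0)
```

Leaf transmission and the leaf gap mentioned in the text are not modelled in this sketch; they would be added on top of the ramp.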
Fig. 1 Illustration of FM accumulation
Figure 2 showed the accumulated FMs and the corresponding doses of the CPs at gantry angles of 181°, 185° and 189°. The MLC positions were exported from the DICOM RT plan file. The leaf gap and transmission were also considered, and were set to 0.25 mm and 1%, respectively, according to the commissioning data. The corresponding doses were shown in coronal view. Due to the leaf transmission, a low dose was observed within the body in the region where the MLC was closed.
Fig. 2 Examples of the accumulated FM and corresponding dose. The corresponding doses were shown in coronal view
FM projection
As shown in Fig. 2, the accumulated FM was a 2D image, whereas the desired dose distribution was a 3D volume. It was difficult for the DL network to take the 2D FM as input and derive the 3D dose distribution directly. The 2D FM therefore needed to be projected onto the 3D volume representing the patient body for further processing, that is, the corresponding FM intensity was assigned to each voxel of the 3D volume. The commonly used 3D-DDA [13] and Bresenham [14] traversal methods first constructed the vector connecting the source point and an FM pixel, and then calculated the voxels that the vector intersected within the 3D volume. These methods were sensitive to the resolution of the FM because of the divergence effect, which was especially significant for FM pixels far from the center. If the resolution of the FM was coarse, the divergence effect could cause certain voxels within the 3D volume to be missed. To handle this issue, we instead constructed the vectors by connecting the source point and the voxels. The intersection points of these vectors with the FM plane were calculated, and each voxel was assigned the value at its corresponding intersection point. The comparison of the 3D-DDA and Bresenham methods against our method was shown in Fig. 3.
Fig. 3 FM projection. 3-a and 3-b plotted only one row of voxels and one row of FM pixels for clarity. 3-c showed our method on a patient case. One coronal layer of the 3D volume was shown. The example voxel was plotted as a blue dot, and the intersection point and source point as orange and red dots
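The voxel-driven projection described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' MATLAB code: voxel centres are assumed to be already expressed in a beam frame with the source at the origin and the central axis along +z, the FM plane is assumed to lie at a hypothetical source-to-plane distance `sad_mm`, and nearest-pixel lookup is used. The function and parameter names are illustrative.

```python
import numpy as np

def project_fm_to_voxels(fm, pixel_mm, voxels_beam, sad_mm=1000.0):
    """Assign each voxel the FM value where its source-to-voxel ray crosses the FM plane.

    fm          : 2D array (rows along y, columns along x), centred on the beam axis
    pixel_mm    : FM pixel size in mm
    voxels_beam : (N, 3) voxel centres (x, y, z) in a beam frame with the source at
                  the origin and the central axis along +z (assumed convention, z > 0)
    sad_mm      : source-to-FM-plane distance in mm (assumed value)
    """
    v = np.asarray(voxels_beam, dtype=float)
    scale = sad_mm / v[:, 2]                 # similar triangles along the ray
    u = v[:, 0] * scale                      # intersection point on the FM plane (x)
    w = v[:, 1] * scale                      # intersection point on the FM plane (y)
    n_rows, n_cols = fm.shape
    col = np.floor(u / pixel_mm + n_cols / 2.0).astype(int)
    row = np.floor(w / pixel_mm + n_rows / 2.0).astype(int)
    inside = (col >= 0) & (col < n_cols) & (row >= 0) & (row < n_rows)
    out = np.zeros(len(v))                   # voxels whose ray misses the FM get 0
    out[inside] = fm[row[inside], col[inside]]
    return out

# Two voxels on the same ray (upstream and downstream of the plane) and one off-field voxel
fm = np.zeros((4, 4))
fm[2, 3] = 0.7
voxels = np.array([[2.7, 0.9, 900.0], [3.3, 1.1, 1100.0], [100.0, 0.0, 1000.0]])
values = project_fm_to_voxels(fm, 2.5, voxels)
```

Because every voxel issues its own ray, no voxel is skipped regardless of the FM resolution, which is the property the text contrasts with pixel-driven traversal.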
Model input
The input of the network included the projected 3D FM, the CT, the radiological depth and the source-to-voxel distance (SVD). All inputs were 3D volumes. Figure 4 showed the inputs at one axial layer. The CT value was converted to Hounsfield unit (HU) value. The radiological depth was calculated using the ray-tracing method proposed by Siddon [15]. The SVD was calculated as the magnitude of the vector connecting the source point and each voxel within the 3D volume. The range of the 3D FM and HU was 0 to 1 and 0 to around 2, respectively. To keep the value ranges of the inputs consistent, the radiological depth and SVD were normalized by 10 and 100, respectively.
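A minimal sketch of the SVD calculation and the channel stacking is given below. It assumes voxel centres at index times spacing in the same coordinate frame as the source, and it simply divides the radiological depth by 10 and the SVD by 100 as stated in the text; the function names, the axis order and the length units are assumptions for illustration.

```python
import numpy as np

def source_to_voxel_distance(shape, spacing, source):
    """Euclidean distance from the source point to every voxel centre.

    shape   : volume size along the three axes
    spacing : voxel spacing along the three axes
    source  : source position in the same frame and axis order (assumed)
    """
    axes = [np.arange(n) * s for n, s in zip(shape, spacing)]
    a0, a1, a2 = np.meshgrid(*axes, indexing="ij")
    src = np.asarray(source, dtype=float)
    return np.sqrt((a0 - src[0]) ** 2 + (a1 - src[1]) ** 2 + (a2 - src[2]) ** 2)

def stack_inputs(fm3d, ct, rad_depth, svd):
    """Stack the four input volumes as channels, scaling depth by 1/10 and SVD by 1/100."""
    return np.stack([fm3d, ct, rad_depth / 10.0, svd / 100.0], axis=0).astype(np.float32)

# Tiny example volume with 4-unit spacing and a source 1000 units away along the last axis
shape = (2, 2, 2)
svd = source_to_voxel_distance(shape, (4.0, 4.0, 4.0), (0.0, 0.0, -1000.0))
inputs = stack_inputs(np.zeros(shape), np.ones(shape), np.full(shape, 50.0), svd)
```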
Fig. 4
Model architecture
We used the classical 3D UNet [16] in this study. The architecture was shown in Fig. 5. UNet is currently one of the most widely used networks in the field of medical image processing. The network included encoder and decoder modules. The encoder module sequentially performed convolution, rectified linear unit (ReLU), and down-sampling operations, and the decoder module sequentially performed up-sampling, convolution, and ReLU operations. Additionally, UNet used skip (copy and crop) connections to fuse the features at different scales and improve the network's performance.
Fig. 5 Network architecture. The number of channels was denoted at the bottom
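The effect of the down-sampling steps on the spatial size can be checked with a short sketch. The number of down-sampling steps (`n_down`) is not stated in the text and is a hypothetical parameter here; the default volume size is the cropped dimension of 128 × 80 × 80 given in the training section, and a factor of 2 per step is assumed.

```python
def encoder_shapes(shape=(128, 80, 80), n_down=3):
    """Spatial size after each 2x down-sampling step of a UNet-style encoder.

    n_down is an assumed value; the source only shows the depth in Fig. 5.
    Raises ValueError if a size cannot be halved exactly, since the skip
    connections would then need cropping or padding to align.
    """
    shapes = [tuple(shape)]
    for _ in range(n_down):
        prev = shapes[-1]
        if any(s % 2 for s in prev):
            raise ValueError(f"size {prev} cannot be halved exactly")
        shapes.append(tuple(s // 2 for s in prev))
    return shapes

levels = encoder_shapes()
```

With these assumptions the volume can be halved exactly up to four times, so the encoder and decoder feature maps line up at every scale without cropping.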
Training and evaluation
All plans were delivered on the Varian Novalis linac. The MLC leaf width was 5 mm at the center and 10 mm at the top and bottom. The resolution of the accumulated FM was set to 2.5 mm. The dose was calculated using the adaptive convolution (AC) method of the Pinnacle system. The dose resolution was set to 4 mm × 4 mm × 4 mm. The dose volume was cropped to 128 × 80 × 80, which was wide enough to cover the regions of interest (ROIs) of all patients. The network inputs were also resampled and cropped to align with the dose volume. The FM accumulation, projection, radiological depth and SVD calculations were implemented in MATLAB. The network training was implemented with PyTorch [17] on a desktop computer with an Intel i9-11900K processor and an NVIDIA GeForce RTX 3090 GPU. The batch size was set to 10. The learning rate was set to 1.0e-4 with a weight decay of 1.0e-6.
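Cropping the volumes to a common size can be sketched as below. The text does not state how the crop window was positioned, so this example assumes a crop (or zero padding, when a volume is smaller than the target) about the volume centre; the function name and the `fill` parameter are illustrative.

```python
import numpy as np

def center_crop_or_pad(vol, target=(128, 80, 80), fill=0.0):
    """Crop or pad a 3D volume about its centre so that it has the target size.

    Centring is an assumption; the source only states the final dimension.
    """
    out = np.full(target, fill, dtype=vol.dtype)
    src, dst = [], []
    for n, t in zip(vol.shape, target):
        if n >= t:                                   # crop this axis
            start = (n - t) // 2
            src.append(slice(start, start + t))
            dst.append(slice(0, t))
        else:                                        # pad this axis
            start = (t - n) // 2
            src.append(slice(0, n))
            dst.append(slice(start, start + n))
    out[tuple(dst)] = vol[tuple(src)]
    return out

# Example: one axis cropped slightly, one padded, one cropped
vol = np.arange(130 * 70 * 90, dtype=np.float32).reshape(130, 70, 90)
cropped = center_crop_or_pad(vol)
```

Applying the same function to the dose and to each input channel keeps them voxel-aligned, provided they were first resampled onto the same 4 mm grid.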