J. Imaging, Vol. 8, Pages 304: Evaluation of a Linear Measurement Tool in Virtual Reality for Assessment of Multimodality Imaging Data—A Phantom Study

1. Introduction

The past 20 years have seen major advances in the management of structural and congenital heart defects, with the development of increasingly complex surgical techniques as well as the emergence of catheter-based and minimally invasive interventions. As complexity has increased, operators have become more reliant on noninvasive imaging data such as echocardiography, computed tomography (CT) and magnetic resonance imaging (MRI) to plan procedures. Traditional interrogation of such 3D datasets uses a flat screen to display either two-dimensional (2D) multiplanar reconstructions (MPR) or volume-rendered images, which simulate the appearance of depth using algorithms that generate colour and lighting effects. More recently, there has been rising interest in novel three-dimensional (3D) imaging techniques, including augmented, mixed and virtual reality (VR), together termed ‘extended reality’ (XR). These applications enable cardiac surgeons, interventionists and cardiologists to visualise and interact with 3D imaging data in an intuitive way, giving realistic depth perception and enhanced anatomical understanding [1]. It is hoped that these benefits may lead to improved outcomes for patients with structural heart disease.

An important feature of any procedural planning tool is the ability to perform reliable measurements. While there has been a surge in the number of XR systems developed for use in cardiac patients in the past 5 years, there is a paucity of published measurement validation data [2,3,4,5,6,7,8]. Measurement accuracy is the closeness of a measured value to the true value, which can only be assessed when the actual dimension (ground truth) is known [9]. Previously, XR measurement tools have been evaluated by comparing XR measurements with those from another imaging platform, using anatomic data where ground truth is not known. Only two publications have compared measurements in cardiac XR systems to ground truth using phantoms; however, in both studies, only a single imaging modality was assessed [5,7]. An XR tool used to plan surgical or catheter intervention must be able to measure accurately in each of the imaging modalities used to plan such procedures.

This study aims to assess the accuracy and reliability of both VR and industry-standard “flat screen” software packages to measure phantoms of known dimensions using 3D echocardiographic, CT and MRI imaging data.

4. Discussion

Intraobserver and interobserver reliability, as assessed by intraclass correlation coefficient, was excellent both in VR and on standard software, with values greater than 0.99 across all imaging modalities (Table 1). When intraobserver variability was assessed by coefficient of variation, there was good agreement in all modalities in both VR and Sectra, although the value was higher in 3D echocardiographic (3DE) measurements in VR at 4.7%. Interobserver variability was good for all standard software measurements and for CT and MRI in VR, and acceptable in 3DE measurements in VR at 6.01%. These results suggest that measurement reliability was lower, although still acceptable, for 3DE measurements in VR. We hypothesise that this may relate to the overall smaller measurement dimensions in the echocardiography phantom compared to those in the CT/MRI phantom, with 4/7 measurements ≤10 mm in the 3DE data compared to no measurement ≤10 mm in CT/MRI. As shown in Table 1, when the 3DE measurement values […]

In terms of measurement accuracy, MAPE was very low for VR measurements in CT and MRI data, at 1.6% for both modalities. This was comparable with MAPE on standard software, which was 1.8% for CT and 2.2% for MRI measurements. MAPE was highest for VR measurements in the 3DE phantom at 7.7%, as compared to 2.3% in standard software. This higher error in echocardiographic data may again be explained by the substantially smaller measurement dimensions in the 3DE phantom compared to CT/MRI. MAPE magnifies differences for relatively smaller measurements; for example, a 1 mm measurement error in the 6 mm anechoic cyst would give a percentage error of 16.7%, whereas the same error in the smallest 12.7 mm measurement on the ACR phantom is 7.9%. Nevertheless, this does not explain the comparatively low error of the same measurements performed on the standard software platform. This trend may suggest lower accuracy of VR in the measurement of smaller structures.
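The percentage-error arithmetic in the cyst example above can be checked with a short sketch. It assumes MAPE is the mean of the absolute percentage errors against ground truth; the function names are illustrative and are not taken from the study's analysis code.

```python
def percentage_error(measured_mm: float, true_mm: float) -> float:
    """Absolute percentage error of one measurement against ground truth."""
    return abs(measured_mm - true_mm) / true_mm * 100.0


def mape(measured_mm: list[float], true_mm: list[float]) -> float:
    """Mean absolute percentage error over paired measurements."""
    if len(measured_mm) != len(true_mm) or not true_mm:
        raise ValueError("need equal-length, non-empty lists")
    errors = [percentage_error(m, t) for m, t in zip(measured_mm, true_mm)]
    return sum(errors) / len(errors)


# The same 1 mm error applied to the two structures named in the text.
cyst_error = percentage_error(7.0, 6.0)    # 6 mm anechoic cyst
acr_error = percentage_error(13.7, 12.7)   # smallest ACR phantom dimension
print(f"{cyst_error:.1f}% vs {acr_error:.1f}%")  # prints "16.7% vs 7.9%"
```

Because the true dimension is the denominator, a fixed absolute error weighs more heavily on a phantom whose structures are mostly small, which is the effect described for the 3DE data.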
Although not performed on phantom data, studies that assessed measurements in other cardiac XR systems suggested similar or larger measurement discrepancies, especially in the smallest measurements. Sadeghi et al. reported differences between VR and 2D CT of −0.3 ± 0.9 mm and −1.4 ± 1.5 mm in measurements of paravalvar leaks [8]. Ballocca et al. compared a VR system to standard software in 3DE, with Bland–Altman plots suggesting up to 4 mm measurement discrepancy across all measurement dimensions, including those […] [4].

Bland–Altman analysis demonstrated no systematic error (bias) in VR measurements in CT and MRI phantom data (Figure 3a and Figure 4a). This is in contrast to measurements made on standard clinical software (Sectra) in these data, where a systematic bias towards overmeasurement was demonstrated (Figure 3b and Figure 4b). We hypothesise that this may relate to the lack of true depth perception when viewing 3D structures on a 2D screen. In Sectra, it is possible to place measurement points at any depth in the volume-rendered image; however, perception of depth is challenging on a 2D screen and may have led to some overestimation of measurement. VR can potentially overcome these issues, as realistic depth perception and the ability to intuitively orientate and move the images can enable more confident 3D point placement.

This pattern of larger-than-truth measurements was not seen in the 3DE phantom; instead, there was a small bias towards undermeasurement in both VR and the standard platform TomTec (Figure 5). However, the degree of bias was very small (VR mean −0.52 mm ± 0.36 mm; TomTec mean −0.22 mm ± 0.18 mm) and is unlikely to be of clinical significance. TomTec software differs from Sectra in that it allows users to place measurement points only on the user-defined cropping plane. While this prevents the inadvertent placement of measurements at a different depth in the image, it does not allow true perception of the 3D nature of structures, which can facilitate procedure planning.

The limits of agreement of the measurement differences were similar for VR and Sectra in MRI data (VR: −2.4 to +1.7 mm, total 4.1 mm; Sectra: −1.0 to +2.8 mm, total 3.8 mm), indicating similar precision for both measurement tools in this modality. Limits of agreement for CT (VR: −1.4 to +2.2 mm, total 3.6 mm; Sectra: +0.3 to +2.1 mm, total 1.8 mm) and 3DE data (VR: −1.7 to +0.7 mm, total 2.4 mm; TomTec: −0.9 to +0.4 mm, total 1.3 mm) were wider in VR than with the standard measurement tool. These results suggest lower precision of VR measurements than of standard software measurements in these two modalities. However, the acceptability of a measurement tool is usually based on clinical requirements, and the absolute limits of agreement were relatively small. Whether the limits determined in this experiment are significant will likely depend on the clinical situation, i.e., the overall dimensions of the structures of interest.
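As a rough illustration of how bias and limits of agreement of this kind are typically derived, the sketch below uses the conventional Bland–Altman definition (mean difference ± 1.96 sample standard deviations of the differences). This section does not state the exact computation used in the study, so the formula is an assumption, and the input values are hypothetical rather than study data.

```python
from statistics import mean, stdev


def bland_altman(measured_mm, true_mm):
    """Return (bias, lower limit, upper limit) of agreement in mm."""
    diffs = [m - t for m, t in zip(measured_mm, true_mm)]
    bias = mean(diffs)                 # systematic over- or undermeasurement
    half_width = 1.96 * stdev(diffs)   # sample SD of the differences
    return bias, bias - half_width, bias + half_width


# Hypothetical values for illustration only, not data from the study.
truth = [10.0, 20.0, 30.0, 40.0, 50.0]
measured = [10.5, 19.5, 30.5, 39.5, 50.5]
bias, lower, upper = bland_altman(measured, truth)
print(f"bias {bias:+.2f} mm, limits {lower:+.2f} to {upper:+.2f} mm, "
      f"total width {upper - lower:.2f} mm")
```

The "total" figures quoted in the text correspond to the upper limit minus the lower limit; a narrower total indicates more precise measurement regardless of where the bias sits.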

Performing very small measurements in VR may be more challenging for a number of reasons. On the whole, VR headsets and controllers are designed for gaming and other applications with gross controller movements, and as such, very fine hand movements may not be adequately tracked and displayed in the VR space. For this study, we used the HTC Vive Cosmos headset, which uses ‘inside-out’ motion tracking, through which user and controller positions are tracked using sensors located within the headset, in contrast to traditional ‘room-space’ VR, which uses external tracking stations placed around the room. This made the VR system portable and less cumbersome, but a drawback of ‘inside-out’ motion tracking can be lower responsiveness compared to systems using external sensors [15,16]. In addition, it may be more challenging to place measurement points in VR with very high accuracy, as controllers are held in free space without the stability provided by a desktop mouse. In future, developments in headset and tracking technology, as well as innovative mechanisms for fine point placement within the VR environment, would lend additional user confidence in this context. Further validation work is required to assess VR measurement of smaller dimensions in all modalities; this has clinical relevance for procedural planning in smaller patients and children.

Limitations

Whilst the use of phantom data was necessary to properly assess measurement accuracy, and phantoms are designed to simulate human tissues, they cannot substitute for the heterogeneity and complexity of real cardiac imaging data. Additionally, this study did not assess the measurement variation that might arise from different image acquisition techniques, such as variation in MRI sequence parameters or the use of other ultrasound probes. In this study, measurements were performed only on volume-rendered images, which may be more susceptible than MPR to under- or overestimation due to changes in gain or contrast; however, measurements performed within the 3D space in a user-defined fashion may be more intuitive, and arguably more useful.
