Clinical validation of an artificial intelligence‐enabled wound imaging mobile application in diabetic foot ulcers

1 INTRODUCTION

Diabetes mellitus (DM) is a debilitating chronic disease that affected 6.28% of the world's population in 2017.1 Patients with DM have a 15% to 25% lifetime risk of developing diabetic foot ulcers (DFUs).2 DFUs are associated with significant morbidity and mortality, with major amputation rates ranging between 19.9% and 39%.3-8 Proper management of DFUs is therefore essential.

Management of DFUs is complex and requires a multidisciplinary team of doctors, nurses, and allied health professionals.7 Wound care is an important facet of the DFU “care bundle” and consists of wound assessment, monitoring, and management. Traditionally, wound assessment and monitoring are performed by specialised, trained wound nurses. With advancements in technology, several wound assessment or monitoring systems are now commercially available for chronic wounds.9 We recently conducted a systematic review on wound imaging modalities and found (a) a paucity of studies evaluating the efficacy of wound assessment and imaging systems and (b) bias in existing studies evaluating the effectiveness of wound imaging systems, with a lack of sample size calculation.10 Although there are several commercially available wound assessment systems, the measurement accuracy of most of them has not been reviewed in the literature, which matters especially because DFUs may occur on curved or angled surfaces of the foot. Hence, the aim of this study is to add to the existing literature on commercially available wound imaging systems by clinically validating an artificial intelligence-enabled wound imaging mobile application (CARES4WOUNDS [C4W] system, Tetsuyu, Singapore) against traditional wound assessment measurements by a trained specialist wound nurse in patients with DFUs.

2 MATERIALS AND METHODS

This was a prospective cross-sectional study of patients with DFUs conducted from June 2020 to January 2021 at a single tertiary hospital. All patients aged 21 years and above with DFUs were eligible for inclusion. Patients were excluded if they were pregnant or breastfeeding, had non-DFUs such as primarily venous ulcer or neuropathic ulcer, or did not have capacity to consent. This study was approved by a local institutional review board (National Healthcare Group Domain Specific Review Board Ref No: 2019/00837). Written consent was obtained from all patients included in our study, with appropriate translations as required for non-English speakers.

2.1 Tetsuyu C4W wound imaging solution

The C4W system is intended to be used for wound imaging, automatic measurement of wound dimensions, and tissue classification to aid in the assessment of pressure ulcers, diabetic ulcers, venous ulcers, and problem healing wounds. It is a non-contact digital wound assessment and documentation tool that aims to provide a standardised approach to the monitoring of wound healing progress by medical professionals. It consists of a software application that is used in combination with commercial off-the-shelf (COTS) mobile computing platforms (iPad with 3D structure sensor and iPhone with dual camera) and a web-based software application that is tailored to the mobile platforms. The mobile application uses locked software algorithms to automatically measure the dimensions of wounds, detect the wound bed, and classify the tissue types as epithelisation, granulation, necrotic, or slough tissue. The data output is documented in real time in accordance with the current state-of-the-art clinical workflow to assure consistency and accuracy in wound assessment. Patients' data are accessible remotely through a secured web-based portal to improve overall patient management.

2.2 Study protocol

Our study protocol is shown in Figure 1. Eligible patients were identified in both inpatient and outpatient settings. Baseline demographics and clinical profile were collected prior to the commencement of the study. The approximate study duration for each patient was five visits or until complete resolution of the ulcer, whichever was earlier. All patients included in the study followed a standardised DFU management pathway with standardised follow-up; no additional clinic visits were required for the purpose of this study. During each clinic visit, wound measurements were recorded traditionally by a trained specialised wound nurse and electronically by a dedicated research coordinator using the C4W imaging system. An additional digital wound image with a reference ruler was also taken independently with a Canon (PowerShot G7X Mark II) digital camera. A wound episode was defined as any clinic consultation for this study. The total number of wound images was calculated from the number of images taken by each C4W device (ie, each wound episode should result in nine wound images, with three wound images taken per device).

FIGURE 1. Study protocol for participant recruitment and standardisation of the wound measurement process

2.3 Standardisation of ulcer measurement

For patients with multiple DFUs, an index ulcer was identified for the purpose of the study during the first clinic visit and was monitored during subsequent clinic visits. Ulcer measurement was conducted in a dedicated room with adequate lighting prior to the clinic consultation with the doctor. All participants were seated on a chair with their feet overhanging. Wound measurements (length and width) were first taken traditionally by a specialised trained wound nurse (Figure 2A) and subsequently taken using the C4W application by a dedicated research coordinator. The method of wound measurement was standardised between the wound nurses prior to the conduct of the study: tracing paper was placed over the wound, and a sterile marker pen was used to outline the wound on the tracing paper. The length (defined as the longest axis in the wound box) and width (defined as the longest axis perpendicular to the length in the wound box) were measured from the traced wound (Figure 2B). Area was computed by overlaying the tracing paper on graph paper and counting the squares covered by the wound.
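
The tracing-and-grid procedure can be illustrated with a minimal sketch. It assumes that the traced outline has already been digitised into a grid of filled squares, that the "wound box" is the axis-aligned bounding box of those squares, and that each square has a side of `cell_cm` centimetres. The function name, the example grid, and the cell size are hypothetical and are not part of the study protocol.

```python
def measure_traced_wound(grid, cell_cm):
    """Return (length_cm, width_cm, area_cm2) for a traced wound.

    grid    : list of rows; a truthy cell means the square is covered by the wound
    cell_cm : side length of one graph-paper square in cm (assumed value)
    """
    covered = [(r, c) for r, row in enumerate(grid)
               for c, cell in enumerate(row) if cell]
    if not covered:
        return 0.0, 0.0, 0.0

    rows = [r for r, _ in covered]
    cols = [c for _, c in covered]

    # Extents of the bounding "wound box" along each grid axis.
    extent_rows = (max(rows) - min(rows) + 1) * cell_cm
    extent_cols = (max(cols) - min(cols) + 1) * cell_cm

    length_cm = max(extent_rows, extent_cols)   # longest axis of the box
    width_cm = min(extent_rows, extent_cols)    # perpendicular axis
    area_cm2 = len(covered) * cell_cm ** 2      # square counting

    return length_cm, width_cm, area_cm2


# Hypothetical tracing on 0.5 cm graph paper.
traced = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
length_cm, width_cm, area_cm2 = measure_traced_wound(traced, cell_cm=0.5)
```

In this toy tracing the counted area is smaller than the product of length and width, because the corners of the wound box are not covered by the wound.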

FIGURE 2. Foot wound from a patient: A, manual measurement of the wound parameters by a trained wound nurse; B, schematic diagram for determination of length and width; C, application interface of the Tetsuyu CARES4WOUNDS system on an iPhone 11 Pro (C4Wi version 1, build 1)

Digital measurement of the wound was performed by a dedicated research coordinator with the C4W application (Version 1, build 1) installed on three different iPhone (Apple Inc., California) device models: device 1 was an iPhone 8 Plus, device 2 was an iPhone 11 Pro, and device 3 was an iPhone XS. All of the iPhone devices were running the iOS 13.0 operating system. A novel feature of C4W is the ability to capture wound images and measure wound dimensions using software algorithms via an application on conventional Apple iPhone/iPad devices (Figure 2C). This widens the application of C4W to more clinical scenarios and removes the need for attachment cameras. An optical zoom of ×1.0 was used to capture the images at a distance of approximately 20 cm from the wound. Three images of the same wound were taken with each C4W device. Each repeat image of the same wound involved repositioning of the patient and the research coordinator. This was repeated across the three different devices. The parameters measured (length, width, and area) were automatically calculated based on the wound boundaries determined by the imaging system. For images where the automatically detected boundary differed vastly from the actual boundary, manual adjustments were made to ensure accurate detection of the wound boundaries. This occurred when wounds had poor colour contrast with the patient's underlying skin tone, were too small (<1 cm), or were located in areas with large variation in contours (eg, bony prominences such as the malleolus).

2.4 Sample size calculation

Sample size was calculated per number of wound images rather than per number of subjects, as this is a cross-sectional study on wound imaging. Based on Tetsuyu's internal validation, the baseline mean accuracy is 90%. Hence, assuming a baseline correlation (R0) of 0 and an alternative correlation (R1) of 0.2, the sample size required for one correlation test with power of 90% and α of 0.05 is 258 wound images. Intra-rater reliability was evaluated by comparing three different C4W measurements of the same wound using intra-class correlation (ICC) statistics. Assuming a baseline correlation (R0) of 0 and an alternative correlation (R1) of 0.2, the sample size required for ICC with power of 90% and α of 0.05 is 83. Given a weekly average of 20 outpatient and 30 inpatient DFU wound inspections in our institution, our target sample size was 341 wound images (258 + 83).
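
As a rough cross-check of the first figure, the sketch below applies the standard Fisher z approximation for testing a single correlation. It assumes a two-sided test, which the text does not state, and it is not necessarily the method or software the authors used; the function name is hypothetical. Under these assumptions the result falls within one image of the reported 258.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r1, r0=0.0, alpha=0.05, power=0.90):
    """Approximate sample size to detect correlation r1 against r0.

    Uses the Fisher z transformation; a two-sided alpha is an assumption.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)
    effect = atanh(r1) - atanh(r0)               # difference on the Fisher z scale
    return ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


n_corr = n_for_correlation(r1=0.2, r0=0.0, alpha=0.05, power=0.90)

# The target sample size is the sum of the two reported components.
target_total = 258 + 83
```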

2.5 Statistical analysis

All statistical analyses were performed with SPSS version 25.0 (SPSS Inc., Chicago, Illinois). Statistical significance was defined as P < .05. ICC statistics were used to analyse intra-rater and inter-rater reliability.11 Intra-rater reliability between measurements taken by the same C4W device and inter-rater reliability between the wound nurse measurements and each C4W device were analysed using a two-way mixed-effects model with absolute agreement and single measures. Inter-rater reliability between the C4W devices was analysed using a two-way random-effects model with absolute agreement and single measures. The two-way random-effects model was chosen for the comparison between C4W devices so that our results could be generalised to all existing C4W devices on the market.12
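
For readers unfamiliar with the single-measure, absolute-agreement ICC, the following sketch computes it from two-way ANOVA mean squares using the McGraw and Wong ICC(A,1) formula. The point estimate is calculated in the same way for the two-way random-effects and two-way mixed-effects models; the models differ in interpretation. This is an illustration only and not the SPSS procedure used in the study; the function name and the toy ratings are hypothetical.

```python
def icc_absolute_single(ratings):
    """ICC(A,1): two-way model, absolute agreement, single measure.

    ratings : list of n rows (wounds), each with k measurements (raters or images)
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-wound mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k / n * (ms_cols - ms_error)
    )


# Toy data: rater 2 reads every wound exactly 1 cm larger than rater 1.
offset_ratings = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
icc_offset = icc_absolute_single(offset_ratings)

# Toy data: three identical repeat measurements per wound.
identical_ratings = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]
icc_identical = icc_absolute_single(identical_ratings)
```

The first toy example shows why absolute agreement is the stricter choice: a constant offset between raters lowers the ICC even though the two sets of readings are perfectly correlated.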

Although there is no standard definition or cut-offs available for ICC to determine the extent of reliability, we have defined ICC values as the following: <0.5 indicates poor reliability, between 0.5 and 0.75 indicates moderate reliability, between 0.75 and 0.9 indicates good reliability, and >0.9 indicates excellent reliability.13
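
These cut-offs can be written as a small helper. The handling of values falling exactly on a boundary (0.5, 0.75, or 0.9) is not specified in the text, so the choices below are assumptions, and the function name is hypothetical.

```python
def classify_icc(icc):
    """Map an ICC value to the reliability category used in this study.

    Boundary handling is an assumption: 0.5 and 0.75 fall in the higher
    category, while exactly 0.9 is treated as "good" (excellent requires > 0.9).
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


labels = [classify_icc(value) for value in (0.40, 0.60, 0.872, 0.932)]
```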

3 RESULTS

3.1 Patient baseline characteristics

The median age of the study population was 60 years (IQR 52.5-66), with a male predominance (n = 24, 85.7%). Common comorbidities included DM (n = 28/28, 100%), hypertension (n = 24/28, 85.7%), chronic kidney disease (n = 20/28, 71.4%), and peripheral vascular disease (n = 18/28, 64.3%). Table 1 summarises the overall patient demographics.

TABLE 1. Clinical profile of patients included in the study

| Characteristic | Number of patients (n = 28) |
| --- | --- |
| Age, years | 60 (52.5–66) |
| Gender, male | 24 (85.7) |
| Smoking, yes | 7 (25) |
| Ethnicity | |
| Chinese | 17 (60.7) |
| Malay | 2 (7.1) |
| Indian | 9 (32.1) |
| Comorbidities | |
| Diabetes mellitus | 28 (100) |
| Hypertension | 24 (85.7) |
| Coronary artery disease | 3 (10.7) |
| Chronic heart failure | 4 (14.3) |
| Chronic kidney disease | 20 (71.4) |
| Cerebrovascular accident | 4 (14.3) |
| Chronic obstructive pulmonary disease | 0 (0) |
| Peripheral vascular disease | 18 (64.3) |
| Previous ulcer | 7 (25) |
| Previous surgical debridement | 11 (39.3) |
| Previous revascularisation | 7 (25) |
| Previous amputation (minor/major) | 6 (21.4) |
| Location of ulcer | |
| Heel | 4 (14.3) |
| Dorsum | 7 (25) |
| Sole | 9 (32.1) |
| Toes | 8 (28.6) |
| Wound parameters taken at the first visit^a | |
| Length, cm | 3.00 (1.63–6.03) |
| Width, cm | 2.35 (1.10–3.88) |
| Area, cm² | 3.75 (1.40–16.50) |

Note: All categorical variables are expressed as n (%) unless otherwise specified. All continuous variables are expressed as median (interquartile range) unless otherwise specified.

3.2 Wound images

There were 75 wound episodes from 28 patients, and a total of 547 wound images were analysed in this study. A further 128 images (not included in the above count) were excluded because of inaccurate data: overestimated wound boundaries or inaccurate wound detection owing to unclear wound boundaries. Table 2 summarises the baseline characteristics of all included wound episodes. Intra-rater reliability of C4W across three different image captures of the same wound was excellent for length (ICC 0.956-0.993 across the three devices), width (ICC 0.933-0.963), and area (ICC 0.984-0.994). Table 3 summarises the intra-rater reliability of three different images taken with the same C4W device.
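
The image counts are consistent with the protocol of nine images per wound episode, as the short check below shows. It assumes that every episode yielded the full set of nine images, which the text does not state explicitly; the variable names are hypothetical.

```python
episodes = 75
devices = 3
images_per_device = 3

# Images expected if every episode produced the full nine captures (assumption).
expected_images = episodes * devices * images_per_device

analysed_images = 547
excluded_images = 128
captured_images = analysed_images + excluded_images

# Share of captured images that were excluded from analysis.
excluded_fraction = excluded_images / captured_images
```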

TABLE 2. Baseline characteristics of wound episodes

| Characteristic | Number of wound episodes (n = 75) |
| --- | --- |
| Age, years | 61 (49–66) |
| Gender, male | 70 (93.3) |
| Smoking, yes | 23 (30.7) |
| Ethnicity | |
| Chinese | 51 (68) |
| Malay | 4 (5.3) |
| Indian | 20 (26.7) |
| Comorbidities | |
| Diabetes mellitus | 75 (100) |
| Hypertension | 67 (89.3) |
| Coronary artery disease | 10 (13.3) |
| Chronic heart failure | 14 (18.7) |
| Chronic kidney disease | 53 (70.7) |
| Cerebrovascular accident | 12 (16) |
| Chronic obstructive pulmonary disease | 0 (0) |
| Peripheral vascular disease | 50 (66.7) |
| Previous ulcer | 33 (44) |
| Previous surgical debridement | 35 (46.7) |
| Previous revascularisation | 26 (34.7) |
| Previous amputation (minor/major) | 23 (30.7) |
| Location of ulcer | |
| Heel | 10 (13.3) |
| Dorsum | 18 (24) |
| Sole | 21 (28) |
| Toes | 26 (34.7) |
| Wound parameters^a | |
| Length, cm | 2.60 (1.00–5.70) |
| Width, cm | 1.40 (0.50–3.30) |
| Area, cm² | 3.10 (0.60–14.84) |

Note: All categorical variables are expressed as n (%) unless otherwise specified. All continuous variables are expressed as median (interquartile range) unless otherwise specified.

TABLE 3. Intra-rater reliability of the same Tetsuyu device on three different images obtained from the same wound

| Device | Parameter | Image 1, mean ± SD | Image 2, mean ± SD | Image 3, mean ± SD | Intra-rater reliability (95% CI) | P value |
| --- | --- | --- | --- | --- | --- | --- |
| Device 1 | Length | 4.12 ± 3.46 | 3.79 ± 3.25 | 3.77 ± 3.31 | 0.956 (0.931–0.973) | <.001 |
| Device 1 | Width | 2.50 ± 1.86 | 2.37 ± 1.76 | 2.33 ± 1.77 | 0.946 (0.917–0.967) | <.001 |
| Device 1 | Area | 9.30 ± 12.00 | 8.68 ± 10.90 | 8.48 ± 11.01 | 0.984 (0.974–0.990) | <.001 |
| Device 2 | Length | 4.15 ± 3.39 | 4.06 ± 3.35 | 4.07 ± 3.36 | 0.993 (0.989–0.995) | <.001 |
| Device 2 | Width | 2.49 ± 1.76 | 2.48 ± 1.79 | 2.58 ± 2.00 | 0.933 (0.901–0.957) | <.001 |
| Device 2 | Area | 9.81 ± 11.33 | 9.58 ± 11.07 | 9.72 ± 11.33 | 0.994 (0.991–0.996) | <.001 |
| Device 3 | Length | 4.32 ± 3.62 | 4.20 ± 3.40 | 4.20 ± 3.41 | 0.984 (0.976–0.990) | <.001 |
| Device 3 | Width | 2.48 ± 1.73 | 2.44 ± 1.72 | 2.54 ± 1.87 | 0.963 (0.944–0.977) | <.001 |
| Device 3 | Area | 10.09 ± 11.66 | 9.86 ± 11.42 | 9.86 ± 11.28 | 0.994 (0.991–0.996) | <.001 |

Note: Length and width are expressed in cm, while area is expressed in cm².

Table 4 summarises the inter-rater reliability between the three C4W devices; we obtained excellent inter-rater reliability between the three C4W devices for length (inter-rater reliability 0.947 [95% CI: 0.923–0.964, P < .001]), width (inter-rater reliability 0.923 [95% CI: 0.890–0.948, P < .001]), and area (inter-rater reliability 0.965 [95% CI: 0.949–0.977, P < .001]).

TABLE 4. Inter-rater reliability of the three different Tetsuyu devices on the average measurements across all three images taken on each device

| Parameter | Device 1, mean ± SD | Device 2, mean ± SD | Device 3, mean ± SD | Inter-rater reliability (95% CI) | P value |
| --- | --- | --- | --- | --- | --- |
| Length | 3.13 ± 3.21 | 3.40 ± 3.38 | 3.35 ± 3.47 | 0.947 (0.923–0.964) | <.001 |
| Width | 1.95 ± 1.78 | 2.09 ± 1.88 | 2.01 ± 1.83 | 0.923 (0.890–0.948) | <.001 |
| Area | 7.23 ± 10.35 | 7.98 ± 10.75 | 7.85 ± 10.81 | 0.965 (0.949–0.977) | <.001 |

Note: Length and width are expressed in cm, while area is expressed in cm².

When comparing the manual measurements obtained by the trained wound nurse with each of the C4W devices, we obtained good to excellent inter-rater reliability for length (ICC range 0.825–0.934), width (ICC range 0.825–0.930), and area (ICC range 0.872–0.932). The inter-rater reliability values between the trained wound nurse and each of the C4W devices are summarised in Tables 5-7. The mean wound area measured manually was greater than that measured by the C4W devices by 25.2%, 13.4%, and 15.3% for devices 1, 2, and 3, respectively.
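
The reported percentages can be reproduced from the mean areas in Tables 5-7 by expressing the difference relative to each device's mean area. The sketch below assumes this is how the percentages were derived; the variable and function names are hypothetical.

```python
# Mean wound areas in cm2, taken from Tables 5-7.
nurse_mean_area = 9.05
device_mean_area = {"device 1": 7.23, "device 2": 7.98, "device 3": 7.85}


def percent_greater(manual, digital):
    """Percentage by which the manual area exceeds the digital area."""
    return (manual - digital) / digital * 100


area_excess = {
    name: round(percent_greater(nurse_mean_area, area), 1)
    for name, area in device_mean_area.items()
}
```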

TABLE 5. Inter-rater reliability between wound nurse and Tetsuyu device 1 for the corresponding image

| Parameter | Device 1, mean ± SD | Wound nurse, mean ± SD | Inter-rater reliability (95% CI) | P value |
| --- | --- | --- | --- | --- |
| Length | 3.13 ± 3.21 | 3.85 ± 3.49 | 0.825 (0.714–0.892) | <.001 |
| Width | 1.95 ± 1.78 | 2.12 ± 1.86 | 0.825 (0.737–0.886) | <.001 |
| Area | 7.23 ± 10.35 | 9.05 ± 12.35 | 0.872 (0.794–0.920) | <.001 |

Note: Length and width are expressed in cm, while area is expressed in cm².

TABLE 6. Inter-rater reliability between wound nurse and Tetsuyu device 2 for the corresponding image

| Parameter | Device 2, mean ± SD | Wound nurse, mean ± SD | Inter-rater reliability (95% CI) | P value |
| --- | --- | --- | --- | --- |
| Length | 3.40 ± 3.38 | 3.85 ± 3.49 | 0.934 (0.885–0.961) | <.001 |
| Width | 2.09 ± 1.88 | 2.12 ± 1.86 | 0.930 (0.892–0.955) | <.001 |
| Area | 7.98 ± 10.75 | 9.05 ± 12.35 | 0.932 (0.893–0.957) | <.001 |

Note: Length and width are expressed in cm, while area is expressed in cm².

TABLE 7. Inter-rater reliability between wound nurse and Tetsuyu device 3 for the corresponding image

| Parameter | Device 3, mean ± SD | Wound nurse, mean ± SD | Inter-rater reliability (95% CI) | P value |
| --- | --- | --- | --- | --- |
| Length | 3.35 ± 3.47 | 3.85 ± 3.49 | 0.915 (0.857–0.948) | <.001 |
| Width | 2.01 ± 1.83 | 2.12 ± 1.86 | 0.908 (0.858–0.941) | <.001 |
| Area | 7.85 ± 10.81 | 9.05 ± 12.35 | 0.923 (0.878–0.952) | <.001 |

Note: Length and width are expressed in cm, while area is expressed in cm².

4 DISCUSSION

Advancements in technology permit the use of adjuncts to improve the efficiency of clinical care. Innovation is also ongoing to improve user experience and to simplify the set-up of imaging systems for clinical use. For instance, a prior version of the C4W imaging system required a three-dimensional (3D) sensor attachment on a tablet device (iPad) for measurement. With improvements in mobile technology and design, depth sensors are now integrated into mobile devices, permitting 3D measurements with depth perception. Hence, this study aimed to validate the new C4W imaging system. Our study demonstrated high intra-rater reliability within each device, as well as high inter-rater reliability between devices and between the C4W imaging system and traditional wound measurements.

Our recent systematic review, which covered the literature until March 2020, summarised the use of wound assessment, imaging, and monitoring systems in DFUs and identified 17 articles (5 on computer applications or handheld devices, 2 on mobile interfaces, 2 on optical imaging, 4 on spectroscopy or hyperspectral imaging, and 4 on artificial intelligence).10 The C4W imaging system allows easy use on mobile and/or handheld devices. In our study, the application was used on three iPhone models (iPhone 8 Plus, iPhone 11 Pro, and iPhone XS). Yap et al in 2018, who analysed the use of the application FootSnap (Manchester Metropolitan University, Manchester, UK) on DFUs, demonstrated high intra-rater reliability for operator 1 (mean Jaccard Similarity Index [JSI] 0.89 [range 0.84–0.93]) and operator 2 (mean JSI 0.91 [range 0.84–0.95]).14 Inter-rater reliability was also high (mean JSI 0.89 [range 0.83–0.94]). Comparatively, in our study, we demonstrated excellent (ICC > 0.900) intra-rater reliability for each C4W device and excellent inter-rater reliability between the C4W devices.
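
The JSI reported by Yap et al is a different statistic from the ICC used in our study: it measures the overlap between two segmented regions as the size of their intersection divided by the size of their union. A minimal sketch, assuming each wound outline is represented as a set of pixel coordinates (the function name and toy regions are hypothetical):

```python
def jaccard_similarity(region_a, region_b):
    """Jaccard Similarity Index of two regions given as pixel coordinates."""
    a, b = set(region_a), set(region_b)
    union = a | b
    if not union:
        return 1.0  # two empty regions are treated as identical (assumption)
    return len(a & b) / len(union)


# Two hypothetical 2 x 2 pixel outlines that overlap in one column.
outline_1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
outline_2 = {(0, 1), (1, 1), (0, 2), (1, 2)}
jsi = jaccard_similarity(outline_1, outline_2)
```

Because the JSI and the ICC measure different things (spatial overlap vs agreement of numeric measurements), the values from the two studies are not directly interchangeable.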

Our systematic review also drew on WoundSource, a reference guide on wound care products, for a list of commercially available wound assessment and monitoring tools for DFUs.9 Six of the 18 identified wound assessment products have been described and/or validated in the literature: EPISCAN I-200 (Longport, Inc., Pennsylvania), Scout (WoundVision, Indiana), Silhouette (ARANZ Medical Ltd., Christchurch, New Zealand), Swift Skin and Wound (Swift Medical, Ontario, Canada), WoundMatrix (WoundMatrix, Inc., Pennsylvania), and WoundZoom (WoundZoom, Inc., Wisconsin).15-23 Khong et al evaluated the use of WoundAide (Konica Minolta Business Solutions Asia Pte Ltd, Tokyo), an application that uses image segmentation and enhancement techniques.23 The C4W imaging system in our study was non-inferior to the systems assessed in their study (WoundAide, WoundZoom [WoundZoom, Inc., Wisconsin], and Visitrak [Smith & Nephew, London, UK]); we similarly demonstrated ICC of >0.900 for both intra-rater and inter-rater reliability. Unfortunately, we were unable to compare our ICC for inter-rater reliability between manual measurements by a trained wound nurse and the wound imaging system with their study.

Previous literature has shown that wound measurement with a standard ruler overestimates wound area (calculated as length × width) by up to 41%.19 This is expected, as the simple multiplication of length and width cannot account for wounds of irregular shape. Our study similarly shows that the wound area from manual measurements is overestimated relative to the C4W devices (by 13.4% to 25.2%). However, inter-rater reliability was good for device 1 (ICC 0.872 [95% CI: 0.794–0.920]) and excellent for device 2 (ICC 0.932 [95% CI: 0.893–0.957]) and device 3 (ICC 0.923 [95% CI: 0.878–0.952]). This implies that despite the seemingly significant overestimation of wound area based on the multiplication of length and width, this overestimation is not significant when assessed over a large number of wound images. An additional advantage of digital image capture is its ability to calculate the wound parameters in a shorter time.20 Unfortunately, we did not collect data on time taken, as our study was primarily focused on assessing the intra- and inter-rater reliability of traditional wound measurement vs digital image capture with the C4W system.
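
A simple geometric example illustrates why length × width overestimates area. Assuming, purely for illustration, a wound shaped like an ellipse whose axes equal the measured length and width, the true area is π/4 × length × width, so the rectangle formula overestimates it by about 27% regardless of wound size. Real wounds are irregular, so this is only a sketch; the function name and dimensions are hypothetical.

```python
from math import pi


def rectangle_overestimate_for_ellipse(length_cm, width_cm):
    """Relative overestimate of length x width for an elliptical wound."""
    rectangle_area = length_cm * width_cm
    ellipse_area = pi / 4 * length_cm * width_cm
    return rectangle_area / ellipse_area - 1


# Hypothetical wound dimensions; the ratio does not depend on them.
overestimate = rectangle_overestimate_for_ellipse(3.0, 2.0)
```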

The C4W imaging system is an adaptable software, which u
