Many dysmorphic features are present in some genetic conditions but not in others, which makes the diagnostic process difficult. For this reason, the genetic and pediatric communities find it appealing to have software that can recognize a particular facial pattern and suggest the appropriate diagnosis. While Face2Gene is not the first software tool to attempt this, its user-friendly interface and free access make it a highly popular choice among end users [14].
Our study shows that Face2Gene listed the correct diagnosis among its top three suggestions in 56% of the studied cases. This result is consistent with those of other research groups. For 25 patients primarily affected by neurological disorders, Elmas and Gogus reported a 48% success rate in correctly matching diagnoses [9]. Mishima et al. also used F2G and reached 60% correctly identified diagnoses when including disorders for which the algorithm was not trained [7]. Since we also included such conditions, our somewhat lower success rate could be expected. Narayanan et al. reported 73% correctly diagnosed cases; however, they checked whether the correct diagnosis was among the first ten suggestions [15], whereas we used a stricter cut-off, requiring the correct diagnosis to be among the first three. Our rate of listing the correct diagnosis anywhere among all 30 proposed syndromes (68%) is comparable to theirs. Carrer et al. reported one of the highest success rates: 93% for the correct diagnosis among the top three suggestions [14]. This rate, however, relates to diagnoses that the researchers classified as “common”, meaning that their prevalence is greater than 1:100,000 but less than 5:100,000. Although such an approach is understandable, it would miss the ultra-rare and hyper-rare disorders, with frequencies of < 1:50,000 and < 1:10^8, respectively [16, 17]. Patients with such conditions, which are individually infrequent but not when taken as a group, would therefore not benefit from the program’s potential. To evaluate the algorithm’s precision under conditions more closely resembling a real clinical setting, where it can be difficult to determine the exact category of the disorder, we did not divide our diagnoses by prevalence.
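Because the studies compared above apply different cut-offs (top three, top ten, or the full list of 30 suggestions), the reported rates are not directly interchangeable. The minimal Python sketch below illustrates how such a top-k rate can be computed from the position of the confirmed diagnosis in each suggestion list. The function name `top_k_rate` and the example ranks are hypothetical and serve only as an illustration; they are neither data from our cohort nor part of the Face2Gene software.

```python
from typing import Optional, Sequence


def top_k_rate(ranks: Sequence[Optional[int]], k: int) -> float:
    """Return the fraction of cases whose confirmed diagnosis
    appears within the first k suggestions.

    ranks[i] is the 1-based position of the confirmed diagnosis in
    the suggestion list for case i, or None if it was not listed.
    """
    if not ranks:
        raise ValueError("at least one case is required")
    if k < 1:
        raise ValueError("k must be a positive integer")
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


# Hypothetical ranks for ten illustrative cases (NOT study data).
example_ranks = [1, 2, 1, 5, None, 3, 12, 1, None, 8]

if __name__ == "__main__":
    # The same cases yield different rates depending on the cut-off.
    for k in (1, 3, 10, 30):
        print(f"top-{k} rate: {top_k_rate(example_ranks, k):.0%}")
```

For any fixed set of cases the rate can only stay the same or increase as k grows, which is why a top-ten figure from one study cannot be read as better performance than a top-three figure from another.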
Another important finding from our study is that F2G performs better when a patient’s photo is uploaded together with phenotypic features. This highlights the importance of entering dysmorphic findings identified and selected by a clinical geneticist or a properly trained professional into the software. With this specialized input, the software would not only benefit from more refined and clinically relevant data but could also potentially expand its database. Such contributions from experienced professionals could bridge gaps in the software’s learning process, helping it to evolve into a more reliable and effective diagnostic tool [18]. A likely explanation for the cases in which the correct diagnosis was not suggested is that F2G is not trained for all genetic disorders. Some rare genetic diseases are seldom encountered in clinical practice and are represented in the application by only a small number of patient photos [9]. We nevertheless included a wide range of conditions, since we were interested in the tool’s potential in a genuine clinical situation. Users should carefully check both the gestalt score and the feature score available in the program. The gestalt score is computed from the analysis of the patient’s photograph, whereas the feature score derives from the phenotypic features added manually by the user [6]. This underlines the importance of adding clinical findings to the patient’s virtual case in the application.
For conditions represented by more than one patient in our studied group, we additionally examined which had the highest diagnostic yield with F2G. The software demonstrated a 100% diagnostic success rate for disorders with clearly evident phenotypes, such as Angelman, Bardet-Biedl, Cornelia de Lange, Down, Fragile X, Kabuki, Prader-Willi, Silver-Russell, trichorhinophalangeal, Turner, and Williams-Beuren syndromes. This was also reported by other research groups [19, 20, 21]. It raises the question of whether the output of F2G for such prevalent genetic disorders is better than a clinician’s, given that the conditions’ typical phenotypic features should make them easy to identify.
One potential limitation of F2G is that much of the research examining its effectiveness was conducted on Caucasian patients, with relatively little research on individuals of Asian, Latin American, and African heritage [7, 14, 20]. It is critical to upload additional images of patients from a range of ancestral backgrounds, because the different evolutionary histories of human populations have had a significant impact on typical facial morphology [22].
Our study’s findings indicate that, because of its limited capabilities, the algorithm should be used as a supplementary tool to assist the dysmorphologist, not as a replacement. One of the main goals in rare disease cases has so far been to shorten the well-known “diagnostic Odyssey” as much as possible, from years to potentially months or weeks [23]. It therefore appears highly promising to use an AI-based strategy, like F2G, to support clinical geneticists in their daily diagnostic work.