Implant therapy has established itself as a reliable and predictable treatment modality for anterior tooth replacement. The literature has shown an excellent survival rate for single implant-supported prostheses.1 Most studies have related the success of implant therapy only to osseointegration and functional criteria, and have neglected the esthetic component.2 However, given the growing interest in esthetics in our society, these criteria can be considered insufficient for evaluating the success of implant therapy.
For this reason, in 2004, a consensus statement recommended that the analysis of esthetic results be included in studies of implant dentistry.3 Reliable and reproducible tools are therefore needed for an objective evaluation of the esthetic result. Numerous esthetic indices have been proposed to meet this demand.4-15
The first esthetic index proposed was the papilla index score (PIS). Its objective was to provide a clinical assessment of the degree of recession and regeneration of the papillae adjacent to single implant-supported prostheses.4 The pink esthetic score (PES) was then introduced in 2005 to assess the esthetics of the peri-implant mucosa using a numerical scale.5
However, indices that consider only one aspect of esthetics, such as the PIS (papillae) or the PES (peri-implant mucosa), are insufficient tools to assess esthetics. Indeed, an index must be complete and must include all the parameters influencing esthetics to allow a global evaluation. Scores evaluating both the prosthetic crown and the peri-implant mucosa were then proposed.6, 9-15
The PES was modified in 2009 to incorporate the “white esthetic score” evaluating the prosthesis.10 Other indices, such as the implant crown esthetic index and the Copenhagen index score, also provided an overall assessment of the pink and white components.6, 13
The esthetic indices cited so far have all used numerical scales as their annotation system. Some studies have shown that visual analog scales have greater sensitivity and accuracy, that is, they are better able to detect and score small differences.16 In addition, most clinical studies involving patient-reported esthetic outcome measures (PROMs) assess patient perception with visual analog scales.17 Therefore, esthetic indices based on numerical scales do not allow a direct comparison between the perception of patients and that of practitioners. Two more recent esthetic indices, the PICI and the IREI, used visual analog scales to overcome the drawbacks of numerical scales. According to their authors, visual analog scales are less limiting and provide a wider range for assessment than three- or four-point scoring systems.14, 15
Currently, most clinical studies evaluating esthetic outcomes use esthetic indices. These scores have made it possible to evaluate and compare the esthetic results of different surgical and prosthetic protocols, thus facilitating the analysis of the relevance of the indications and clinical effects. They have also helped to study the stability of esthetic results over time.18-21
However, despite their frequent use, none of these indices has yet been universally validated and recommended by consensus.22 In fact, few studies have investigated the reliability of these scores. Most authors who have introduced or investigated these indices have examined their reproducibility.5, 6, 10, 12-15, 23 However, the fact that the authors judged their own indices reflects the lack of hindsight concerning these scores and is not sufficient to validate them, especially since these studies consider only reproducibility and neglect validity.24 Among the studies discussing the reliability and validity of these indices, none has met the standardized quality criteria proposed by groups such as the Consolidated Standards of Reporting Trials.25 A review of the literature raised the following issues regarding the selection of eligible studies: the number of prospective randomized controlled studies is very small, and the retrospective studies and/or case series collected are difficult to compare and have little scientific value. The lack of data regarding the clinical protocol and the methodological differences in study design represent an additional difficulty.26
One of the ultimate goals of any dental treatment is patients' satisfaction. The correlation of the results of the esthetic indices with the perception of the nonexpert subjects is therefore an important parameter to study. The results are controversial in the literature.10, 27-30
The objective of this research paper was therefore to study the reproducibility, validity, and correlation of these indices with patients' satisfaction. The null hypotheses of this study were that the esthetic indices studied have (1) poor inter-rater reproducibility, (2) poor intra-rater reproducibility, (3) poor validity, and (4) poor correlation with the perception of nonexpert subjects.
2 MATERIAL AND METHODS
2.1 Selection of esthetic indices
Four esthetic indices were selected for the study; two with numerical scales (the “pink and white esthetic score” and the “copenhagen index score”) and two with visual analog scales (the “peri-implant and crown index” [PICI] and the “implant restoration esthetic index” [IREI]) (Figure 1).
FIGURE 1. Illustration of the different esthetic indices studied and their parameters: (A) PES/WES; (B) PICI; (C) CIS; (D) IREI
For subjective evaluation, a score was assigned using visual analog scales.
2.2 Selection of clinical cases
Three clinical cases involving single tooth implant-supported restorations in the maxillary anterior sector were chosen for the study. The photographs of the prostheses were chosen to ensure good visibility of the prosthetic crown, its peri-implant mucosa, and the adjacent and contralateral natural teeth with their keratinized tissues. The first two cases were selected from the literature. The first involved an immediate implantation with provisionalization to replace a maxillary central incisor.19 The second involved an early implant placement (Type 2 according to the ITI classification: implantation 4 to 8 weeks after extraction31) to replace a maxillary central incisor.32 The third was a case followed in the department of periodontology at the Dental University clinic of Monastir, in which a delayed implantation protocol (Type 4 according to the ITI classification: implantation at 6 months or more after extraction31) was performed to replace a maxillary canine.
2.3 Participants
Fifteen expert examiners were divided according to their specialty and level of expertise (two periodontology professors, one prosthodontic professor, two third-year periodontal residents, two third-year prosthodontic residents, four general practitioners, two recently graduated students, and two dental interns). Thirty nonexpert examiners of different age ranges also participated in the study. Nonexpert subjects included para-medical staff, hospital administration staff, and anyone who had never started a course of study or obtained a degree in dentistry.
3 DATA COLLECTION
Each of the expert participants received the photographs of the selected clinical cases. They also received a form to be filled out containing the selected esthetic indices and the visual analog scales for subjective analysis. Assessment was performed a second time by the same participants at an interval ranging from 2 to 3 weeks after the first assessment.
For the nonexpert examiners, photographs of the clinical cases were provided along with a form containing three visual analog scales to score the peri-implant mucosa, the prosthetic crown, and then the overall esthetics.
4 STATISTICAL ANALYSES
Statistical Package for Social Science (SPSS, version 26.0) was used for data analysis. Inter-rater reproducibility represents the degree of agreement of results between different participants. Intra-rater reproducibility represents the degree of agreement between the first and second assessments. They were assessed for all participants, specialist participants, and nonspecialist participants. A two-way mixed model intraclass correlation coefficient (ICC) was used for analysis. An ICC value greater than 0.6 was considered acceptable and a value greater than 0.8 was considered excellent.
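As an illustration only, the ICC and the interpretation thresholds described above can be sketched in Python. The analysis itself was run in SPSS; the text does not state whether the consistency or absolute-agreement definition was used, nor whether single or average measures were reported, so the consistency form computed here is an assumption. The function names and the example ratings are hypothetical and are not study data.

```python
import numpy as np


def icc_two_way_mixed(ratings) -> tuple[float, float]:
    """Return (single-measure, average-measure) ICC, consistency definition.

    ratings: array of shape (n_targets, k_raters) with no missing values.
    Assumption: two-way mixed model, consistency (not absolute agreement).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()

    # Sums of squares from a two-way ANOVA without replication.
    ss_targets = k * ((x.mean(axis=1) - grand_mean) ** 2).sum()
    ss_raters = n * ((x.mean(axis=0) - grand_mean) ** 2).sum()
    ss_total = ((x - grand_mean) ** 2).sum()
    ss_error = ss_total - ss_targets - ss_raters

    ms_targets = ss_targets / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_targets - ms_error) / (ms_targets + (k - 1) * ms_error)
    average = (ms_targets - ms_error) / ms_targets
    return float(single), float(average)


def interpret_icc(icc: float) -> str:
    """Apply the thresholds stated in the text: >0.8 excellent, >0.6 acceptable."""
    if icc > 0.8:
        return "excellent"
    if icc > 0.6:
        return "acceptable"
    return "poor"


# Hypothetical ratings: 5 rated items (rows) scored by 3 examiners (columns).
example = [[7, 8, 7], [4, 5, 5], [9, 9, 8], [3, 4, 2], [6, 6, 7]]
single_icc, average_icc = icc_two_way_mixed(example)
print(f"single = {single_icc:.3f} ({interpret_icc(single_icc)})")
print(f"average = {average_icc:.3f} ({interpret_icc(average_icc)})")
```

For intra-rater reproducibility, the same function could be applied with the first and second assessments of one examiner as the two columns.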
The validity of an esthetic index represents its capacity to truly evaluate esthetics, through its composition and the relevance of the parameters that make it up. To evaluate this, the scores of the different parameters that constitute the indices were compared with the visual analog scales. A correlation matrix was created based on Pearson's correlation.
The correlation of the esthetic indices with the perception of the nonexpert examiners was evaluated using Pearson's correlation between the esthetic scores assessed by the expert examiners and the visual analog scales by the nonexpert examiners.
A Pearson's correlation above 0.5 was considered acceptable.
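The correlation step can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: the helper names, the parameter labels, and the data are hypothetical, and only the 0.5 threshold comes from the text.

```python
import numpy as np


def pearson_r(x, y) -> float:
    """Pearson product-moment correlation between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    # The result is undefined (nan) if either sequence is constant, for
    # example a parameter that receives the same score in every rating.
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def validity_by_parameter(parameter_scores, vas_scores, threshold=0.5):
    """Map each parameter name to (r, acceptable), with acceptable = r > threshold."""
    result = {}
    for name, scores in parameter_scores.items():
        r = pearson_r(scores, vas_scores)
        result[name] = (r, r > threshold)
    return result


# Hypothetical data: one entry per rating (examiner x case), not study data.
vas = [62, 80, 45, 90, 55, 70]
params = {
    "param_A": [1, 2, 0, 2, 1, 2],
    "param_B": [2, 1, 2, 1, 0, 1],
}
for name, (r, ok) in validity_by_parameter(params, vas).items():
    print(f"{name}: r = {r:.3f}, acceptable = {ok}")
```

The same `pearson_r` helper would apply to the comparison between expert index scores and nonexpert visual analog scales, provided the two sequences are paired case by case.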
5 RESULTS
5.1 Inter-rater reproducibility
The ICC values are presented in Table 1. No significant differences were found between the different values and all the esthetic indices had an ICC value >0.8, reflecting an excellent inter-rater reproducibility.
TABLE 1. Inter-rater reproducibility of all studied indices measured by the ICC

|  | PES/WES | PES | WES | CIS | PICI | PICI pink | PICI white | PICI SUB | IREI | IREI pink | IREI white |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All participants | 0.861 | 0.884 | 0.718 | 0.906 | 0.888 | 0.960 | 0.380 | 0.682 | 0.854 | 0.920 | 0.000 |
| Specialist participants | 0.614 | 0.767 | 0.000 | 0.802 | 0.728 | 0.917 | 0.000 | 0.127 | 0.448 | 0.722 | 0.67 |
| Nonspecialist participants | 0.848 | 0.823 | 0.843 | 0.818 | 0.818 | 0.930 | 0.511 | 0.485 | 0.871 | 0.903 | 0.609 |

Abbreviations: CIS, copenhagen index score; PES, pink esthetic score; PES/WES, pink and white esthetic score; PICI, peri-implant and crown index; PICIPINK, pink component of the PICI; PICIWHITE, white component of the PICI; WES, white esthetic score.

The “white” components of all the esthetic indices had a low ICC value, which was much lower than that of the “pink” component. Their inter-rater reproducibility was poor (0.000 for the IREI; 0.380 for the PICI), except for the WES (ICC = 0.718), which remained acceptable. The “pink” components of all the esthetic indices showed excellent reproducibility (ICC > 0.8).
Regarding the difference in reproducibility between specialist and nonspecialist participants, significantly lower ICC values were observed for specialist participants. For the latter, only the CIS showed excellent reproducibility (ICC > 0.8). The PES/WES and PICI had an acceptable reproducibility (ICC > 0.6); however, the IREI had poor reproducibility (ICC = 0.448). The “white” components revealed very poor reproducibility.
5.2 Intra-rater reproducibility
The ICC values reflecting the reproducibility between the first and second observation for all the esthetic indices and their components are presented in Table 2. No significant differences were noted between the values, and all the esthetic indices had an ICC value >0.8, reflecting excellent intra-rater reproducibility, except for the CIS (ICC = 0.798).
TABLE 2. Intra-rater reproducibility of the different esthetic indices studied as measured by the ICC

|  | PES/WES | PES | WES | CIS | PICI | PICI pink | PICI white | PICI SUB | IREI | IREI pink | IREI white |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All participants | 0.815 | 0.844 | 0.683 | 0.798 | 0.921 | 0.925 | 0.890 | 0.891 | 0.923 | 0.937 | 0.890 |
| Specialist participants | 0.622 | 0.890 | 0.456 | 0.896 | 0.886 | 0.664 | 0.822 | 0.819 | 0.850 | 0.885 | 0.761 |
| Nonspecialist participants | 0.873 | 0.818 | 0.813 | 0.695 | 0.934 | 0.934 | 0.915 | 0.926 | 0.958 | 0.961 | 0.952 |

Abbreviations: CIS, copenhagen index score; PES, pink esthetic score; PES/WES, pink and white esthetic score; PICI, peri-implant and crown index; PICIPINK, pink component of the PICI; PICIWHITE, white component of the PICI; WES, white esthetic score.

The two components “white” and “pink” of the indices revealed an excellent intra-rater reproducibility (ICC > 0.8), except for the WES (ICC = 0.683).
The values found for the specialist participants were lower than those for the nonspecialist ones. However, the difference was not significant. All the indices and their components showed at least acceptable intra-rater reproducibility for both specialists and nonspecialists (with the exception of the WES for specialist participants).
5.3 Validity
The Pearson correlation values between the different parameters of the esthetic indices studied and the visual analog scales are presented in Tables 3–6. The highest correlations were found for the PICI and IREI (Pearson correlation was higher than 0.5 for most of them). The different parameters of the PES/WES and CIS had rather low Pearson correlation values (below 0.5 for most of them).
TABLE 3. Validity of the PES/WES: Pearson correlation between the different parameters and the VAS

| Parameters | VAS | VAS pink or VAS white |
| --- | --- | --- |
| P1 | 0.103 | 0.122 |
| P2 | 0.075 | 0.117 |
| P3 | 0.394 | 0.311 |
| P4 | 0.024 | 0.104 |
| P5 | 0.414 | 0.481 |
| W1 | 0.145 | 0.129 |
| W2 | 0.273 | 0.312 |
| W3 | 0.198 | 0.336 |
| W4 | 0.163 | 0.321 |
| W5 | 0.337 | 0.407 |

Abbreviations: P “n”, n corresponds to the number of the PES parameter; VAS pink, visual analog scale evaluating the peri-implant mucosa; VAS white, visual analog scale evaluating the prosthetic crown; VAS, visual analog scale evaluating the esthetics of the mucosa-prosthetic crown; W “n”, n corresponds to the number of the WES parameter.

TABLE 4. Validity of the CIS: Pearson correlation between the different parameters and the VAS

| Parameters | VAS | VAS pink or VAS white |
| --- | --- | --- |
| CIS1 | 0.319 | 0.391 |
| CIS2 | 0.285 | 0.406 |
| CIS3 | 0.397 | 0.41 |
| CIS4 | 0.078 | 0.318 |
| CIS5 | 0.052 | 0.028 |
| CIS6 | 0.24 | 0.392 |

Note: CIS “n”, n is the number of the CIS parameter.

TABLE 5. Validity of the PICI: Pearson correlation between the different parameters and the VAS

| Parameters | VAS | VAS pink or VAS white |
| --- | --- | --- |
| PICIP1 | 0.366 | 0.364 |
| PICIP2 | 0.38 | 0.412 |
| PICIP3 | 0.622 | 0.614 |
| PICIW1 | 0.665 | 0.78 |
| PICIW2 | 0.665 | 0.768 |
| PICIW3 | 0.651 | 0.69 |

Note: PICIP “n”, n corresponds to the number of the parameter of the component “pink” of the PICI; PICIW “n”, n corresponds to the number of the parameter of the component “white” of the PICI.

TABLE 6. Validity of the IREI: Pearson correlation between the different parameters and the VAS

| Parameters | VAS | VAS pink or VAS white |
| --- | --- | --- |
| IREIP1 | 0.512 | 0.534 |
| IREIP2 | 0.348 | 0.373 |
| IREIP3 | 0.4 | 0.512 |
| IREIP4 | 0.69 | 0.703 |
| IREIP5 | 0.371 | 0.351 |
| IREIP6 | 0.507 | 0.641 |
| IREIW1 | 0.744 | 0.824 |
| IREIW3 | 0.645 | 0.717 |
| IREIW4 | 0.768 | 0.843 |
| IREIW5 | 0.647 | 0.719 |
| IREIW6 | 0.018 | 0.008 |

Note: IREIP “n,” n corresponds to the number of the parameter of the component “pink” of the IREI; IREIW “n,” n corresponds to the number of the parameter of the component “white.”

It can also be noted that the parameters belonging to the “white” component of the esthetic indices correlated better with the VAS compared with those belonging to the “pink” component.
The highest Pearson correlation with the VAS evaluating overall esthetics was found for the fourth parameter of the “white” component of the IREI (crown characterization), whereas the lowest was noted for its sixth parameter (exposure of the prosthetic abutment). A lower correlation was also noted for the distal papillae than for the mesial ones for all the esthetic indices.
5.4 Correlation with the perception of nonexpert examiners
Pearson's correlation values between objective assessment by expert examiners and subjective assessment by nonexpert examiners are summarized in Table 7. The correlation with the perception of nonexpert subjects was rather poor for all the esthetic indices and their components, with values not exceeding 0.3. The best correlation was found for the CIS (0.236) and the worst for the IREI (0.097).
TABLE 7. Correlation between the perception of expert and nonexpert raters by the Pearson correlation measure

|  | PES/WES | PES | WES | CIS | PICI | PICIP | PICIW | IREI | IREIP | IREIW |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pearson correlation | 0.147 | 0.161 | 0 | 0.236 | 0.143 | 0.175 | 0.107 | 0.097 | 0.175 | 0.079 |

Abbreviations: IREIP, pink component of IREI; IREIW, white component of IREI; PICIP, pink component of PICI; PICIW, white component of PICI.

The Pearson correlations of the “pink” components of the different esthetic indices were higher than those of the “white” components.
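As a small consistency check, the statements made about Table 7 can be reproduced from the reported values. The snippet below is only a sketch: the dictionary copies the values from Table 7, while the variable names and the grouping of total scores and pink/white pairs are illustrative choices.

```python
# Pearson correlations with nonexpert perception, copied from Table 7.
table7 = {
    "PES/WES": 0.147, "PES": 0.161, "WES": 0.0, "CIS": 0.236,
    "PICI": 0.143, "PICIP": 0.175, "PICIW": 0.107,
    "IREI": 0.097, "IREIP": 0.175, "IREIW": 0.079,
}

# Total scores of the four indices (components excluded).
totals = ["PES/WES", "CIS", "PICI", "IREI"]
best_index = max(totals, key=table7.get)
worst_index = min(totals, key=table7.get)

# (pink component, white component) for each index that reports both.
pink_white_pairs = {
    "PES/WES": ("PES", "WES"),
    "PICI": ("PICIP", "PICIW"),
    "IREI": ("IREIP", "IREIW"),
}
pink_always_higher = all(
    table7[pink] > table7[white] for pink, white in pink_white_pairs.values()
)
none_above_0_3 = all(value <= 0.3 for value in table7.values())

print(best_index, worst_index, pink_always_higher, none_above_0_3)
```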
6 DISCUSSION
The present study has shown that all the esthetic indices considered have a good inter- and intra-rater reproducibility. Thus, the first two null hypotheses were rejected.
These findings support the results of the studies that introduced the different esthetic indices.4, 6, 10, 14 However, inter-rater reproducibility showed a slightly higher ICC value for the CIS (0.906) and a slightly lower one for the IREI (0.854). This can be explained by the small number of parameters for the CIS and the large number of parameters for the IREI, which reflects a difference in the range of possible responses between the indices.
In the study introducing the PES/WES, the reproducibility was judged to be excellent. The authors explained these results by the fact that in comparison to the PES of Fürhauser et al., with 7 parameters and a maximum score of 14, the new PES contains only 5 parameters, which makes it easier to use. In addition, the maximum score of 10 is easier to remember due to its common use in several domains.10
One study compared the PICI with the ICAI and the PES/WES. The authors reported that the best intra-rater reproducibility was found for the PES/WES, followed by the PICI, and the lowest for the ICAI. Inter-rater reproducibility did not show significant differences between the three indices.14
In a previous study comparing eight esthetic indices, a fairly good intra- and inter-rater reproducibility was found for total PES/WES but it was weakened by WES. Indeed, the reproducibility for WES is average compared with that of PES, which is very good. These results are in agreement with those of the present study.23
A more recent study compared the IREI with the PES/WES. They reported that both scores show excellent intra-rater reproducibility, but the inter-rater reliability of PES/WES is lower than that of IREI.15
A difference in inter- and intra-rater reproducibility between the different groups of examiners was noted. Indeed, the group of specialist examiners had lower ICC values than the group of nonspecialist ones. As the specialists in fixed prosthodontics and periodontology were in the same study group, it can be assumed that this poor reproducibility within the specialist group was due to the difference in the area of expertise between the two specialties.
Another study investigated the reproducibility and influence of the different specialties on the PES/WES. They noted a very good to moderate intra-rater reproducibility. No significant differences in inter-rater agreement were observed for total PES/WES between the four groups, which is in contradiction with the results found in the present study.27
In the present study, it was found that the esthetic indices using visual analog scales (PICI, IREI) presented good validity, while those using numerical rating scales (PES/WES, CIS) revealed low validity. These results partially reject the third null hypothesis.
Contrary to reproducibility, the validity of the different indices has rarely been evaluated in the literature. One study evaluated the validity of the Copenhagen Index Score. Spearman correlation analysis of the CIS and the overall VAS scores showed a significant correlation between the two scores. The six esthetic parameters showed a highly significant correlation with the corresponding VAS scores.13
In the study introducing the IREI, its validity and that of the PES/WES were evaluated. A significant correlation between total IREI and PES/WES scores was noted. With regard to the presence of mesial papillae, presence of distal papillae, gingival trigone, soft tissue curvature, and crown contour, the correlations between the IREI and the PES/WES were significant. Therefore, the authors concluded that the IREI is valid.15 These results are in agreement with those of the present study.
In a previous study, each individual parameter was analyzed against the VAS. The highest significant impact on the VAS was observed for the following parameters (p < 0.001): PES3 (mucosal level), PES7 (mucosal texture), ICA3 (crown convexity), CEI P2 (distal interproximal bone level), CEI P3 (gingival biotype), CEI R1 (crown color and translucency), and CEI R5 (crown surface).23
A critical review published in 2021 reported that good validity has so far only been proven for the pink esthetic score and the complex esthetic index.24
It was also noted that in contrast to reproducibility, the parameters belonging to the “white” component of the esthetic indices had better validity than those belonging to the “pink” component. This reflects a greater interest for most participants in the appearance of the prosthetic crown than in the peri-implant mucosa.
The lowest Pearson correlation value was for the sixth parameter of the IREI (prosthetic abutment exposure). This is justified by the absence of exposure of the prosthetic abutments in all the cases selected. Therefore, the influence of this parameter on overall esthetics cannot be evaluated.
The lower Pearson correlation of the distal papillae compared with the mesial papillae may be explained by their lesser visibility during smiling, and therefore their lesser impact on esthetics, as well as by the difficulty of evaluating them on photographs.
A weak correlation with patients' perception was found for all esthetic indices in the present study. The fourth null hypothesis is therefore accepted.
The results are controversial in the literature. The study introducing the PES/WES included a questionnaire on esthetics associated with a VAS. The results were then compared with the PES/WES scores, and no statistically significant correlation between the total PES/WES score and the VAS was noted.10
In another study, the relationship between the PES/WES total score and VAS provided to patients was evaluated. Spearman's analysis revealed a statistically significant correlation between the total PES/WES and the VAS score, which is in agreement with the findings of the present study.27
Another study investigated the correlation between PES/WES and patients' satisfaction using a dental esthetics questionnaire including VAS. A significant correlation between the two scores was found.