Optimizing Signal Management in a Vaccine Adverse Event Reporting System: A Proof-of-Concept with COVID-19 Vaccines Using Signs, Symptoms, and Natural Language Processing

Signal management is critical to vaccine vigilance, ensuring that AEFIs are appropriately identified and evaluated [16]. Detecting and analyzing disproportionality signals in adverse event reporting systems can be challenging, particularly when dealing with a large volume of data. Signals of disproportionality must undergo clinical review before they are considered signals of suspected causality. This review is time-consuming, and any advance that helps focus it can provide potentially enormous benefits [17,18,19,20,21,22]. To reduce this burden, researchers have explored various approaches to automating the signal management process. Data-mining techniques, such as data clustering, association rule mining, and machine learning algorithms, have been used to identify patterns and relationships within adverse event data, enabling more efficient and targeted signal detection [23,24]. In this study, we employed a combination of manual and automated techniques to dismiss signals of disproportionality, and used NLP to translate plain-text signs and symptoms of listed AEFIs retrieved from the scientific literature into MedDRA PTs.

The results of our statistical alert dismissal process were promising, as our approach successfully dismissed 17% of disproportionality signals. By accurately eliminating irrelevant signals, resources can be allocated more efficiently towards investigating potential adverse events that warrant further analysis.

Our attempt to automate the translation of plain-text symptoms to MedDRA PTs using NLP techniques presented several challenges. When we employed basic NLP techniques [10], we found that only 56% of signs and symptoms extracted from FACTA+ matched MedDRA PTs exactly. None of the NLP methods using partial string matching showed a clear separation boundary that could effectively distinguish good matches from poor matches. This was evident from the density functions, which exhibited a minimum overlap of 65%.
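The distinction between exact and partial string matching can be illustrated with a minimal sketch. It assumes Python's standard-library `difflib.SequenceMatcher` as a stand-in similarity measure, which is not necessarily one of the metrics used in this study, and the short vocabulary and input terms are hypothetical placeholders rather than actual MedDRA or FACTA+ content.

```python
from difflib import SequenceMatcher

# Hypothetical miniature vocabulary (assumption); not real MedDRA content.
MEDDRA_PTS = ["Pyrexia", "Headache", "Injection site pain", "Myalgia"]


def is_exact(term, vocabulary):
    """Case-insensitive exact match of a plain-text term against the vocabulary."""
    return term.lower() in {pt.lower() for pt in vocabulary}


def best_match(term, vocabulary):
    """Return (closest vocabulary entry, similarity ratio in [0, 1])."""
    scored = [
        (pt, SequenceMatcher(None, term.lower(), pt.lower()).ratio())
        for pt in vocabulary
    ]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical plain-text signs and symptoms (assumption).
terms = ["headache", "fever", "pain at injection site"]
for term in terms:
    pt, score = best_match(term, MEDDRA_PTS)
    print(f"{term!r}: exact={is_exact(term, MEDDRA_PTS)}, best={pt!r}, score={score:.2f}")
```

In this toy example, a reordered phrase such as "pain at injection site" still finds its counterpart with an intermediate score, whereas a synonym such as "fever" scores poorly against every entry. Intermediate scores of this kind are where a single cut-off between good and poor matches becomes hard to place.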

The 78% accuracy obtained with GPT-3.5 is noteworthy. This accuracy suggests that GPT-3.5 possesses a significant level of apparent understanding and proficiency in identifying and categorizing MedDRA PTs. These findings are promising, as accurate PT assignment is crucial for efficient and effective pharmacovigilance, adverse event reporting, and medical documentation. However, the remaining 22% of PTs were not correctly assigned, so there is still room for improvement. Further research and enhancements to GPT-3.5's training and fine-tuning processes could potentially increase its accuracy and make it an even more valuable tool in the field of signal management. The high level of agreement across 10 attempts of PT assignment suggests strong consistency in GPT-3.5's responses and indicates robust and reliable performance of the model for this specific task.

Despite the encouraging results from GPT-3.5, manual matching of signs and symptoms from FACTA+ with lower-level terms (LLTs) from MedDRA was deemed necessary to ensure accurate translation of plain-text signs and symptoms into the appropriate MedDRA PTs. Manual matching of signs and symptoms retrieved from FACTA+ (or any other system) to PTs can be labor-intensive and time-consuming. It requires human experts to review the output of FACTA+ and align each concept with the appropriate PT. This process is prone to human error and subjectivity, which may introduce inconsistencies and affect the accuracy of the results.
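Agreement across repeated PT assignments can be summarized in several ways. The following sketch assumes one simple option, the share of runs that return the most frequent PT for a given term. The measure itself, the function name `modal_agreement`, and the example outputs are all illustrative assumptions and are not taken from the study.

```python
from collections import Counter


def modal_agreement(assignments):
    """Return (most frequent PT, share of runs that produced it)."""
    counts = Counter(assignments)
    pt, n = counts.most_common(1)[0]
    return pt, n / len(assignments)


# Hypothetical outputs of 10 repeated assignments per term (assumption).
runs = {
    "fever": ["Pyrexia"] * 10,
    "sore arm": ["Pain in extremity"] * 7 + ["Injection site pain"] * 3,
}

for term, outputs in runs.items():
    pt, share = modal_agreement(outputs)
    print(f"{term}: modal PT = {pt}, agreement = {share:.0%}")
```

A term with full agreement across runs can still be assigned the wrong PT, so a consistency measure of this kind complements an accuracy check and does not replace it.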

Therefore, despite the progress in NLP and other techniques, this study underscores the continued importance of human expertise in pharmacovigilance at this stage. In our study, combining automated methods with manual verification allowed more reliable and accurate signal validation.

Notably, the different NLP techniques produced different results. This highlights the impact that algorithmic differences, sensitivity to textual differences, string length and complexity, and language and domain specificity have on this task [25].

Regarding the severity of signs and symptoms detected as disproportionality signals, we checked whether the dismissed disproportionality signals were present in the IME list over the years, to determine whether we were dismissing signals suggestive of an increased severity of labeled AEFIs [14]. Among the dismissed signals present in the IME list, none suggested an increase in the severity of listed AEFIs.
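A cross-check of this kind can be sketched as a set intersection per year. The sketch below is only an illustration under stated assumptions: the function name `dismissed_in_ime`, the years, and all terms are hypothetical, and the clinical judgment of whether a flagged term reflects increased severity remains a manual step outside the code.

```python
def dismissed_in_ime(dismissed_by_year, ime_terms):
    """Map each year to the dismissed PTs that also appear on the IME list."""
    ime = {term.casefold() for term in ime_terms}
    return {
        year: sorted(pt for pt in pts if pt.casefold() in ime)
        for year, pts in dismissed_by_year.items()
    }


# Hypothetical inputs, for illustration only (assumption).
IME_TERMS = {"Anaphylactic reaction", "Myocarditis", "Seizure"}
DISMISSED = {
    2021: {"Pyrexia", "Headache", "Myocarditis"},
    2022: {"Myalgia", "Anaphylactic reaction", "Fatigue"},
    2023: {"Chills"},
}

flagged = dismissed_in_ime(DISMISSED, IME_TERMS)
for year, pts in flagged.items():
    print(year, pts or "no dismissed PT on the IME list")
```

Each flagged term would then be passed to clinical review to judge whether it indicates a more severe presentation of a listed AEFI.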

Lastly, we have not investigated the use of resources such as MedNorm [26] or PubMedBERT for classification tasks. One avenue for future work could involve using MedNorm [27] to explicitly train PubMedBERT for multiclass classification of MedDRA terms. Approaches such as the XTARS [28] methodology could potentially enhance the accuracy and robustness of classification tasks. However, these approaches may also have their own limitations, such as the availability and quality of training data [29,30].

4.1 Limitations

While our study has provided valuable insights into the use of FACTA+, certain limitations should be taken into consideration. First, our NLP approach was unsupervised, which may limit its precision and recall. Supervised approaches, on the other hand, could potentially provide more accurate results by leveraging labeled data for training NLP models. Second, our study relied on string similarity metrics, which may not capture the semantic meaning of medical terms comprehensively. This could result in misclassifications or ambiguities in the classification process. Additionally, FACTA+ itself may have inherent limitations, such as its reliance on specific resources or datasets, which could introduce biases or restrict its generalizability to other domains or languages. Overall, these limitations highlight the need for further research to address these concerns and improve the performance and applicability of FACTA+ in medical classification tasks.

Another potential limitation of this study is the lack of specific information regarding the version of MedDRA used in the analysis. The NLP language model employed in the study, based on the GPT-3.5 architecture, was trained on a diverse range of data up until September 2021. Consequently, it does not possess explicit knowledge of MedDRA versions released after that time. As the MedDRA terminology is regularly updated to incorporate new medical knowledge and evolving terminology, it is essential to consider whether the version used in this research aligns with the most current standards. Future studies could benefit from reporting the exact version of MedDRA used, thereby ensuring the accuracy and relevance of the findings within the context of that specific version.

Regarding our analysis of the severity of signs and symptoms detected as disproportionality signals, it is important to mention that we only used 3 years of data and signals not common among those years, which certainly imposes limitations. One potential issue is the lack of granularity in the analysis, as a smaller sample size may not fully capture the range and variation of the data. Additionally, if the signals being analyzed are not consistent across the 3 years, the analysis may not provide a comprehensive understanding of the underlying patterns and relationships in the data. Therefore, it is important to carefully consider the limitations when interpreting the results of our analysis.

4.2 Implications

The results of this study have significant implications for the field of pharmacovigilance and vaccine vigilance. First, the successful combination of the manual and automated signal dismissal approach provides a valuable way to improve the efficiency and accuracy of the signal validation process.

The challenges observed in converting plain-text symptoms to MedDRA PTs using NLP algorithms, on the other hand, demonstrate the limitations of the current automated methods. While NLP techniques show promise, there is a need for further research and development to enhance their accuracy and effectiveness in this context. Improving the performance of NLP algorithms to accurately convert plain-text symptoms into MedDRA PTs would considerably streamline the signal validation process and lessen reliance on manual matching. Further research should focus on exploring innovative approaches and technologies, such as deep learning models, which may provide further insights into improving automated signal dismissal and NLP translation.

Moreover, it is essential to continue evaluating the dismissed signals within the IME list and their implications for the severity of listed AEFIs. Monitoring and analyzing the trends and patterns of dismissed signals can help identify emerging risks or potential safety concerns that may have been overlooked. Long-term studies assessing the temporal patterns and severity changes in listed AEFIs should be conducted to ensure the continued safety and effectiveness of immunization programs.

Finally, we firmly believe that the methodology we have developed possesses a high degree of transferability to other vaccines/drugs and spontaneous reporting databases. This confidence stems from the foundational principles that underpin these databases and the striking similarities observed in their data structures. The core principles governing the collection and organization of adverse event reports remain consistent across various databases within the same domain. Consequently, we are convinced that our approach, which has demonstrated its efficacy in our current dataset, can be extended to analyze and extract valuable insights from other spontaneous reporting databases. This belief underscores our commitment to providing a robust and broadly applicable solution for the broader scientific community and stakeholders who rely on these databases for pharmacovigilance and healthcare decision making.
