
A new era of antibody discovery: an in-depth review of AI-driven approaches

Antibodies(p1),(p2),(p3),(p4),(p5),(p6) are Y-shaped proteins produced by plasma B cells and are essential for immune defense.(p7),(p8),(p9) They specifically recognize molecules known as antigens, which are carried by pathogens such as viruses and bacteria. An antibody comprises two light chains and two heavy chains, each with a variable domain (VL and VH, respectively), linked by disulfide bonds (Figure 1).(p10),(p11) Antibodies are categorized into five classes based on their heavy chains, whereas there are two types of light chain. Therapeutically, immunoglobulin G (IgG),(p12) particularly IgG4(p13) or IgG1,(p12) is predominantly used. The fragment antigen-binding (Fab) regions, which form the two arms of the Y-structure, are crucial for antigen binding.(p14) They contain variable regions at the N terminus, which house hypervariable loops called complementarity-determining regions (CDRs). Modifications to CDRs can drastically alter antibody specificity or affinity.(p15) The base, termed the fragment crystallizable (Fc) region, comprises constant heavy-chain domains and interacts with cellular receptors and complement proteins, influencing antibody half-life and complement-dependent cytotoxicity (CDC).(p6)

Antibodies protect the host by identifying infectious agents, including viruses and pathogenic bacteria, and initiating an immune response.(p16) Currently, over 100 monoclonal antibodies (mAbs) have been approved as drugs(p17) and are widely used in the treatment of cancer, autoimmune diseases, drug abuse,(p18) and AD.(p19) Antibodies also play a crucial role in combating infectious viruses, particularly severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2).(p20),(p21),(p22) They are used in coronavirus disease 2019 (COVID-19) testing(p23) and also serve to neutralize SARS-CoV-2 by binding to its spike protein.(p24) Consequently, therapeutic antibodies have garnered increasing attention in recent years and have become indispensable agents in therapy.(p25) Continued progress in exploiting antibodies for therapeutic purposes hinges on the development of more efficient approaches for engineering these molecules. Experimental techniques such as phage display, antibody affinity maturation, and antibody humanization have been used in the development of therapeutic antibodies.(p11) In addition, X-ray crystallography(p26) and electron microscopy (EM)-based methods(p27) are used to determine antibody 3D structures, whereas phage display libraries are used to optimize affinity. Moreover, in vitro antibody affinity maturation enhances antibody affinity by mimicking the somatic hypermutation (SHM) that occurs in the CDRs of the immunoglobulin.(p28) However, these methods are time-consuming, expensive, and labor-intensive. Computational tools and algorithms offer complementary approaches that can decrease the time and cost of antibody design by reducing the likelihood of failures and raising the success rate of experimental trials.(p29)

To date, various computational algorithms and tools(p30),(p31),(p32),(p33),(p34),(p35) have been developed to facilitate antibody discovery. For instance, RosettaAntibody(p31) is a reliable homology modeling method that predicts the 3D structure of an antibody by searching for existing antibody structures with highly similar sequences to use as templates. RosettaAntibody performed well in the Antibody Modeling Assessment II (AMA-II). Recently, Jumper et al. introduced AlphaFold,(p32) a deep learning system that can accurately predict protein structures in the absence of similar templates. AlphaFold integrates innovative neural network architectures and training methodologies that are informed by physical and biological understanding of protein structures. In the 14th Critical Assessment of Protein Structure Prediction (CASP14),(p36) AlphaFold achieved near-experimental accuracy in predicting protein structures, with a median backbone accuracy of 0.96 Å root-mean-square deviation (RMSD) when compared with crystal structures, outperforming the competing methods. RoseTTAFold,(p33) another AI-based algorithm, has also made significant progress in predicting the 3D structures of proteins. For antibody docking, SnugDock(p34) was the first flexible docking method specifically designed for antibodies. It refines the CDR loops and optimizes the VH-VL orientation at the antibody–antigen interface. It can take an antibody homology model as input and produces more accurate docking results than standard rigid-body methods. In the Critical Assessment of Prediction of Interactions (CAPRI)(p37) challenge, SnugDock achieved the best predictions among all methods. For computational affinity maturation, a de novo antibody design tool called RosettaAntibodyDesign (RAbD)(p35) has been developed. Experimental validation has shown that RAbD can improve antibody affinity by 10- to 50-fold.
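Because RMSD is the accuracy measure quoted for these structure prediction and docking tools, a minimal sketch of how it is computed may be useful. The function name `rmsd` and the toy coordinates are illustrative assumptions, not taken from any of the cited tools, and the sketch assumes that the two structures are already superimposed and that their atoms are listed in the same order.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in Å) between two paired coordinate sets.

    Assumption: both inputs are equally long sequences of (x, y, z) tuples,
    already superimposed, with atom i of one set matching atom i of the other.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms
        squared_sum += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical toy data: three backbone atoms, each displaced by 1 Å along z
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(f"RMSD = {rmsd(predicted, reference):.2f} Å")
```

In practice, the predicted and experimental structures are first optimally superimposed (for example, with the Kabsch algorithm) before the deviation is computed; that step is omitted here for brevity.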

However, the accuracy of these in silico methods remains a major challenge. Although certain tools, such as RosettaAntibody(p31) and AlphaFold,(p32) can predict framework regions and non-H3 CDR loops with high accuracy, accurately modeling the CDR H3 loop remains a core challenge for these methods. Computational antibody docking faces the challenge of accurately predicting binding conformations between antibodies and antigens without prior binding information.(p38) For instance, SnugDock(p34) requires binding information (paratope or epitope) for the antibody–antigen complex to generate low-RMSD models. Additionally, the scoring function of SnugDock is of limited reliability. Similarly, RosettaAntibodyDesign(p35) relies on an antibody–antigen complex as input, which might not be available in many cases. Lastly, the in silico design process faces the challenge of accurately determining the conformation of antibody–antigen interactions or the paratopes on antibodies.

Recently, a wealth of innovative AI technologies(p39),(p40),(p41),(p42),(p43),(p44),(p45),(p46),(p47),(p48) has emerged, showing potential to overcome several of the aforementioned limitations. For example, deep learning models, such as recurrent neural networks (RNNs)(p49) and transformers,(p41) can be trained on extensive antibody sequence and structure data sets to recognize intricate patterns in the data. These learned patterns improve the ability of AI models to predict the CDR loops, including the H3 loop, potentially outperforming traditional methods. AI technologies can also be used to predict the paratopes on antibodies,(p50) which are the regions responsible for binding to antigens. By analyzing the sequence and structural features of antibodies, machine learning models can identify the likely paratope regions, providing valuable insights for the in silico design of antibody–antigen complexes.

In this review, we examine the AI-driven methodologies developed over the past 5 years and assess their efficacy in predicting antibody–antigen interactions, fine-tuning antibody affinity, and creating new antibody candidates. Furthermore, we briefly discuss the challenges encountered when incorporating AI-driven models into conventional antibody discovery processes and highlight promising avenues for future advances in this rapidly evolving domain.
