Protein cysteines garner interest for their distinct chemical properties, which render them amenable to a diverse range of modifications that mediate context-dependent regulation of the cysteine proteome and a range of cellular processes. The regulatory potential and druggability of most cysteines within the proteome are currently understudied and demand in-depth analysis of the cysteine proteome. The comparably low abundance of cysteines, paired with the wide variety of possible cysteine modifications, poses major challenges to their systematic interrogation. Since the landmark paper by Weerapana and Cravatt [1] in 2010, which established cysteine reactivity profiling by mass spectrometry, significant advances have begun to enable deeper analysis of the cysteine proteome, and mass spectrometry has become the main discovery tool to study protein cysteine biology. It is now possible to quantify up to ∼25 % of the 204,707 (out of 261,865) theoretically accessible human cysteines [2] within a single experiment. Most experimental strategies to interrogate the cysteine proteome follow two general principles (Figure 1): 1) derivatization of cysteines with electrophilic molecules, frequently coupled to selective reduction strategies, and 2) cysteine peptide enrichment to achieve deep proteome coverage. Significant efforts have been made to enhance all aspects of these workflows, which now provide an opportunity to study the fundamentals of cysteine biology and to develop novel cysteine-targeted covalent drugs. Herein, we provide an overview of recent developments and highlight critical technological and scientific advances in cysteine proteomics.
As part of the continuous optimization of proteomic workflows, the recent implementation of solid-phase-enhanced sample-preparation (SP3) and/or high field asymmetric waveform ion mobility spectrometry (FAIMS) technologies has led to significant enhancements in cysteine coverage and reproducibility [2, ∗∗3, 4, ∗∗5, ∗6, 7, ∗∗8]. This was further complemented by integration of TMTpro-based multiplexing [3,5,6,8,9], as well as enhanced on- and off-line fractionation and the development of new mass spectrometers [2, ∗∗3, 4, ∗∗5, ∗6, 7, ∗∗8]. Circumventing the semi-stochastic nature of data-dependent acquisition (DDA), Yang et al. [10] developed a data-independent acquisition activity-based protein profiling (DIA-ABPP) strategy. The approach demonstrated increased coverage and high reproducibility when used for profiling of 4-HNE reactive cysteines, an electrophile library, and circadian changes in the liver cysteinome.
To overcome prevailing limitations in the selectivity, reactivity and adduct-stability of commonly used cysteine-reactive moieties, numerous novel derivatization strategies were recently developed. Tessier et al. [11] introduced TMS-ethynylbenziodoxolone (EBX) reagents to directly ethynylate cysteines with high chemoselectivity. Ethynylated cysteines (thioalkynes) can be reacted with azide-functionalized enrichment moieties. Concurrently, Motiwala et al. [12] applied heteroaromatic sulfones, which exhibit tunable reactivity and were functionalized for enrichment. In a different approach, Tang et al. [13] developed heteroaromatic azoline thioethers (HATs), a highly reactive cysteine-selective chemotype, through systematic tuning of thioether probes. Optimized HATs displayed high hydrolytic stability, were azide-functionalized, and exhibited improved mass accuracy for cysteine profiling. HATs allow for chemical removal, yielding dehydroalanine, which can be conjugated to other nucleophiles. Moreover, Koo et al. [14] created N-acryloylindole-alkynes (NAIAs), activated acrylamide-based warheads that display high selectivity and fast reaction kinetics, whereas Qin et al. [15], in synthesizing the monosaccharide-based 1-OH-Az (3,4,6-O-Ac3ManNAz), developed a distinct S-glycosylation-based chemotype. In addition, a robust SP3-enabled redox-proteomic workflow (SP3-Rox) was introduced by Desai et al. [7], relying on novel isotopic isopropyl-iodoacetamide-alkyne probes (L-/H-IPIAA) combined with a FragPipe-IonQuant analysis pipeline.
The development of novel enrichment strategies has culminated in ever-increasing coverage of the cysteine proteome. Cysteine-reactive phosphonate tags (CPT) were initially introduced for simultaneous cysteine-based protein enrichment and phospho-proteome profiling [32,33]. Xiao et al. [3] redesigned CPT to enable efficient proteome-wide labeling of cysteines and developed a novel multiplexed and stoichiometric redox proteomic technology, relying on IMAC enrichment paired with TMT multiplexing. This strategy was leveraged to generate the Oximouse dataset. Oximouse defines the redox proteome landscape in young and aged mice across 10 tissues, identifying 60,262 unique cysteines and quantifying ∼34,000 unique sites. CPT represents a highly efficient enrichment strategy, allowing for the deepest analysis of the redox-regulated cysteine proteome from living tissues, and highlighting the utility of phosphonate-based tags [3,4,6,62]. A similar cysteine proteomics workflow, but based on desthiobiotin iodoacetamide (DBIA), was established by Kuljanin et al. [5]. The streamlined cysteine activity-based protein profiling (SLC-ABPP) approach incorporates TMT multiplexing and offers a significant advantage over conventional isoTOP-ABPP strategies [2,5]. Another novel enrichment strategy relies on fluorous solid phase extraction (FSPE), which is used by the FluoroTRAQ approach, established by Zhang et al. [28] and based on N-[(3-perfluorooctyl)propyl]iodoacetamide (FIAM), while Qin et al. [34] created the tridecafluoronyl iodoacetamide (TFIA)-based fluorous affinity tag (FAT)-switch method. In contrast, Xiao et al. [29] accelerated sample preparation with superTOP-ABPP, which utilizes resin functionalized with azide groups and cleavable linkers for cysteine peptide enrichment. Moreover, Yan et al. [35] highlighted the potential of improved data processing pipelines, exemplified by the integration of click chemistry-derived labile fragment search algorithms.
To enhance the capabilities of established chemoproteomic workflows, Burton et al. [36] developed a silane-based cleavable isotopically labeled proteomics (sCIP) method paired with an SP3-FAIMS workflow. sCIP isobaric tags are based on iodoacetamide-alkyne (IAA) that is click-assembled with a biotin-linked reporter moiety. The strategy is extendable to MS2-level quantification by using IAA isotopologues as balancers, creating isobaric sCIP tags, which yield dihydrooxazolium reporter ions upon fragmentation.
To date, subcellular analyses of the cysteine proteome have been limited to database annotations, which fail to account for the dynamic localization of proteins, a notable limitation given the compartmentalized nature of redox processes. However, novel elegant strategies afford subcellular resolution by integrating location-specific biotinylation. Yan et al. [37] developed a strategy for compartment-specific interrogation of the cysteinome (Cys-LoC) and redoxome (Cys-LOx) by targeting TurboID to different subcellular compartments for proximity ligation (PL), paired with the SP3-Rox [7] workflow and selective enrichment. In a similar approach, Kisty et al. [38] combined a redox proteomic workflow [18] with targeting of TurboID to the cytosol and mitochondrial subcompartments. In addition, an APEX PL approach relying on endogenous H2O2 for APEX activation was established, which enables subcellular interrogation of cysteine redox states at local ROS hotspots. Furthermore, Yan et al. [39] established a strategy to profile the cell surface cysteine proteome (Cys-Surf) by utilizing cell surface capture (CSC; biotinylation of glycoproteins) and selective enrichment, which unveiled unknown redox dynamics of the surface cysteinome. Taken together, these strategies show great potential to define redox processes with subcellular resolution.
Bak & Weerapana 2023 [17] developed the first chemoproteomic strategy for the systematic profiling of FeS (iron-sulfur) cluster proteins. Fe restriction and genetic strategies were leveraged to manipulate the binding status of FeS cluster proteins, followed by diagonal correlation of cysteine occupancy and protein abundance (reductive dimethylation) to determine FeS cluster binding. Thereby, this work extends the portfolio of cysteine-ligands that can be interrogated by chemoproteomics.
Many possible metabolite-derived cysteine modifications exist; however, there is a dearth of information on the proteome-wide extent of these modifications. Fumarate and itaconate can undergo a Michael addition reaction with cysteine thiolates, yielding S-succination [24] and S-2,3-dicarboxypropylation (S-itaconation) [40], respectively. Kulkarni et al. [24] used fumarate hydratase KO cells to establish a global map of fumarate-regulated cysteines with IAA-based chemoproteomics, which defined a unique amino acid signature of S-succination. Cysteine targets were additionally validated utilizing the clickable fumarate-alkyne. In contrast, Mills et al. [40] identified itaconate as an anti-inflammatory metabolite acting via cysteine adduction of proteins in immunomodulatory pathways. To systematically profile targets of itaconate, Qin et al. [15] used the novel chemotype 1-OH-Az. This work was extended by Qin et al. [41] with development of the cell-permeable itaconate-alkyne (ITalk) probe, which identified 1131 cysteine targets of itaconate. Moreover, Coukos et al. [42] assessed modification of cysteines by methylglyoxal, a glucose metabolism byproduct, using IAA. Focusing on acylation, Kumar et al. [43] developed the acyl-resin-assisted capture (acyl-RAC) assay to build an atlas of the S-acylated plant proteome. In contrast, Ji et al. [44] employed nanographite fluoride-based solid-phase extraction for the global analysis of S-acylated proteins in human cells, and Zhou et al. [45] developed an enhanced acyl-biotin exchange strategy. Thus, novel chemoproteomic strategies help to define metabolite regulation over the proteome.
A major challenge in the interrogation of cysteine oxidation is the difficulty of distinguishing diverse oxidative modifications. Selective reduction and chemoselective labeling overcome this shortfall but are frequently non-stoichiometric, which is a major limitation (see Table 1).
To enable selective profiling of disulfide-constituting cysteines, Qiang et al. [46] developed a strategy that relies on C-terminal degradation of linear peptides with carboxypeptidase Y, which is unable to fully cleave disulfide-linked peptides. This enriches disulfide-linked peptides and enables their identification when searching against a linearized sequence database.
Glutathione (GSH) is critical for cellular redox homeostasis and can modify proteins upon reaction with cysteine sulfenic acids, S-nitrosothiols, or thiyl radicals, or via disulfide exchange [47]. To study glutathionylation, glutaredoxin (Grx)-based selective reduction of mixed disulfides, coupled to affinity capture and isobaric tagging, has predominantly been employed [47]. This strategy was adapted by Duan et al. [48] to quantify both oxidation and glutathionylation, providing stoichiometric insights into the subcellular redox tone of macrophages. In contrast, VanHecke et al. [49] generated isotope-coded and clickable glutathione derivatives, enabling quantification of glutathionylation on hundreds of cysteines. This strategy was improved by Kukulage et al. [50] in developing G-ICAT (clickable glutathione-based isotope-coded affinity tags), which relies on heavy azido-glutathione-based endogenous glutathionylation followed by reaction of free thiols with azido-glutathione pyridine disulfide, enabling relative quantification of over 1500 glutathionylated cysteines.
Primarily dimedone-based chemoselective probes have been used for profiling of protein sulfenic acids (-SOH) [51]. Leveraging a novel chemoselective strategy, Shi et al. [52] utilized the unique properties of triphenylphosphonium (TPP) ylides (Wittig Reagents) for conjugation of sulfenic acids. Wittig alkyne (WYne) probes showed ∼1500-fold increased reactivity over dimedone-based probes, while maintaining selectivity, significantly improving coverage of sulfenylated cysteines. Importantly, upon base-promoted loss of the TPP moiety the derivative WYneN yields a cysteine modification equal to alkynyl iodoacetamide (IPM)-modified cysteine, allowing for hitherto impossible stoichiometric assessment of sulfenylation.
Protein sulfinic acids (-SO2H) have been poorly characterized due to their weak nucleophilicity and the lack of effective chemoselective labeling strategies. Akter et al. [53] developed DiaAlk (Di-tert-butyl azodicarboxylate-alkyne), a diazene-based compound that reacts with sulfinic acids to form a stable sulfonamide, whereas adducts with cysteine thiols are cleavable by DTT. From this, the first map of the human S-sulfinylome was established, revealing its unique dynamics. This strategy was later employed by Meng et al. [54], paired with analysis of sulfenylation and cysteine thiols, to profile distinct cysteine redox forms across the proteome in C. elegans.
The proteomic interrogation of cysteine S-nitrosation (-SNO) is facilitated by its susceptibility to chemoselective reduction with copper and ascorbate, whereas reduction with high concentrations of ascorbate is non-specific [55]. Traditionally, biotin-switch [56], cysteine-resin-assisted capture (SNO-RAC) [30], or triaryl phosphine-based trapping (SNOTRAP) [57] methods were used for SNO assessment, although recent proteomic strategies have also utilized different iodoacetamide- or maleimide-based strategies [20,22,28,33,34,58]. Notably, SNO-RAC was leveraged by Seth et al. [30] for SNO profiling in C. elegans, capturing ∼1000 target proteins, whereas the FAT-switch method by Qin et al. [34] afforded profiling of >1500 SNO targets, similar to SNOTRAP by Yang et al. [57].
Interrogation of persulfidation (S-sulfhydration) is challenging due to its reactivity, instability and similarity to cysteine thiols [59]. Zivanovic et al. [59] developed a dimedone-based chemoselective tag-switch strategy, facilitating the interrogation of the global persulfidome. Cysteine thiols, persulfides and sulfenic acids are capped with NBF-Cl (4-chloro-7-nitrobenzofurazan), generating a mixed disulfide from the persulfide, which is then switched with DCP-Bio1 (a biotinylated dimedone derivative), enabling selective enrichment. From this, persulfides were identified as a protective modification against overoxidation of cysteines in aging. An alternative strategy by Fu et al. [60] builds on the pKa difference between persulfide and cysteinyl thiol. This afforded efficient labeling of persulfides at a low pH, at which cysteine thiols exhibit low nucleophilicity, facilitating deep profiling of the human persulfidome.
Cysteine residues are the predominant amino acid coordinating zinc. However, the non-covalent nature of coordination bonds and their concomitant lability impose a major challenge to the proteome-wide interrogation of zinc binding to proteins. Pace & Weerapana [61] pioneered an early chemoproteomics approach measuring cysteine accessibility of native proteins upon zinc manipulations, capturing a small fraction of the zinc binding proteome. In a new study from our group on bioRxiv, Burger et al. [62] advance this strategy using CPT-based chemoproteomics to quantify the zinc binding status of over 52,000 cysteines across the human proteome. This work amounts to a significant expansion of the experimentally defined zinc binding proteome.
Selenocysteine is a rare cysteine analogue containing selenium instead of sulfur. It has a unique isotopic distribution and a lower pKa than cysteine, which increases its reactivity and underlies its critical role in redox biology [63,64]. Canonical selenocysteine incorporation into proteins occurs co-translationally at UGA stop codons marked with a selenocysteine insertion sequence (SECIS) [63,64]. However, discovery of non-canonical selenoproteins remains challenging. The low pKa (∼5.2) of selenocysteine enables derivatization at a low pH, while precluding labeling of cysteines (pKa ∼8.5), allowing for selective selenoprotein enrichment. Bak et al. [63] employed this strategy combined with targeted MS, which enabled robust identification of known selenoproteins. Similarly, Guo et al. [64] developed SecMS, which utilizes low pH labeling to profile selenoproteins in cells and tissues. Additionally, generation of a SECIS-independent selenoprotein database facilitated the identification of non-canonical selenoproteins. In contrast, Gao et al. [65] developed selenium-encoded isotopic signature targeted profiling (SESTAR), a computational algorithm that detects the unique isotopic envelope pattern of selenopeptides to improve their identification. Employing a comparable strategy, Jedrychowski et al. [66] identified non-canonical facultative selenation of specific sites across predominantly metabolic proteins and incorporation of selenomethionine at select methionine sites.
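The selectivity of low-pH labeling can be illustrated with a short calculation. The sketch below is not taken from any of the cited studies; it simply applies the Henderson-Hasselbalch relation to the two pKa values quoted above (∼5.2 for selenocysteine, ∼8.5 for cysteine) to estimate the fraction of each residue present in its reactive, deprotonated form. The function name and the example pH values are illustrative assumptions, and real side-chain pKa values vary with the local protein environment.

```python
def deprotonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a monoprotic acid in its deprotonated (nucleophilic) form.

    Uses the Henderson-Hasselbalch relation: [A-]/([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# Approximate pKa values as quoted in the text; individual residues will differ.
PKA_SELENOCYSTEINE = 5.2
PKA_CYSTEINE = 8.5

# Hypothetical buffer pH values chosen for illustration only (assumption).
EXAMPLE_PH_VALUES = (5.5, 7.4)

if __name__ == "__main__":
    for ph in EXAMPLE_PH_VALUES:
        sec = deprotonated_fraction(PKA_SELENOCYSTEINE, ph)
        cys = deprotonated_fraction(PKA_CYSTEINE, ph)
        print(
            f"pH {ph:.1f}: selenolate {sec:.1%}, thiolate {cys:.3%}, "
            f"ratio {sec / cys:.0f}x"
        )
```

Under these simplifying assumptions, lowering the pH leaves most selenocysteine deprotonated while only a very small fraction of cysteine remains as thiolate, which is the rationale behind selective derivatization at low pH.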
Tremendous technological advances in cysteine proteomics have facilitated a new era in the development of covalent drugs. A landmark study by Backus et al. [26], defining the ligandability of thousands of cysteines by screening a covalent fragment library, marked the beginning of a substantial acceleration in the field. Bar-Peled et al. [68] applied these powerful approaches to assess cysteine ligandability in KEAP1-mutant lung cancer cells and identified covalent small molecules that selectively target NRF2-dependent lung cancers. Another seminal study by Vinogradova et al. [25] leveraged cysteine reactivity and ligandability profiling, as well as activity-guided screening (immunomodulation), to establish a map of ligandable cysteines in primary human T cells, which identified inhibitory electrophiles preventing their activation. Chemoproteomic profiling was further developed by Kuljanin et al. [5] in integrating TMT multiplexing with a novel DBIA-based strategy for rapid and deep cysteine profiling. In a recent notable study, Yang et al. [8] extended this work into a high-throughput SP3-TMT-enabled platform for deep reactive cysteine profiling, further pushing the boundaries of large-scale electrophile screening. Further work by our group [6] employed deep CPT-based cysteine proteomics to develop a selective covalent creatine kinase (CK) inhibitor, driving cytotoxicity in CK-dependent acute myeloid leukemia by disruption of the creatine phosphagen system. Taking a different approach, Fu et al. [69] built on the paradigm that oxidizable cysteines frequently exhibit regulatory function and explored the sulfenylome as a distinct druggable space. Utilizing the chemoproteomic platform SulfenQ [70], the authors screened nucleophilic fragments for engagement with sulfenic acids, emphasizing the potential of chemoselective drugs. To extend chemoproteomic profiling capabilities, Cao et al. [71] developed a paired copper and ascorbate-based click-chemistry and Suzuki-Miyaura cross-coupling tandem labeling approach for orthogonal multiplexed profiling of cysteine- and lysine-reactive probes. Furthermore, Cookson et al. [72] employed activity-based (biotinylated ubiquitin derivative) probes for profiling of electrophilic fragments against deubiquitinases (DUBs). In addition to these examples, numerous other studies screened cysteine ligandability, often as a means to validate novel chemoproteomic strategies [2,5,10,18,27,71,73].
Although new technologies achieve deep coverage of the cysteinome, a major remaining challenge is the assignment of function to cysteines. A novel strategy to address this was recently developed by Li et al. [74], combining base-editing cysteine mutagenesis and chemoproteomics. This work establishes an atlas defining the relevance of over 13,800 cysteines in 1750 cancer-critical proteins. In contrast, Zhang et al. [75] integrated DBIA-based chemoproteomics and CRISPRi screening to interrogate cysteines subject to oxidation mediated by anti-cancer treatment. The study defined hundreds of cysteines that were commonly regulated by multiple drugs and identified a novel H2O2-sensing pathway. Additionally, two noteworthy studies were published on bioRxiv. Takahashi et al. [76] established DrugMap, a quantitative analysis of cysteine ligandability across 416 cancer cell lines covering 25 cancer types using DBIA-based chemoproteomics. Paired with novel cysteine set enrichment analysis (CSEA) and structural profiling, these data revealed substantial heterogeneity of cysteine ligandability across cancers. Furthermore, Desai et al. [77] interrogated the landscape of gain-of-cysteine variants by integrating chemoproteomics with exome and RNA sequencing across 11 cell lines. This determined the ligandability of select target cysteines, highlighting the potential of developing proteoform-specific covalent molecules. Together, this marks an unprecedented acceleration in proteomics-based covalent drug development.
The ever-increasing collection of cysteine proteomics datasets contains an untapped wealth of information, which is being made accessible to researchers via novel interactive websites. Recent efforts have focused on integrating these datasets into unified resources (Table 2). For example, Wang et al. [78] curated iCysMod, covering 8 types of cysteine PTMs, whereas Meng et al. [79] created CysModDB, covering 12 distinct cysteine PTMs (Table 2). Similarly, Boatner et al. [80] integrated chemoproteomics data in developing CysDB, a web application that combines numerous datasets with extensive annotations and analysis tools to empower covalent drug development (Table 2), facilitated by optimized database integration [73].