When atomic coordinates are available, α-helices are characterized by backbone torsion angles, typically falling in a relatively narrow range (ϕ = − 63 ± 6°; ψ = − 42 ± 6°), and their H-bonding between the C=O of residue i and the HN of i + 4 (Kabsch and Sander 1983). In the absence of structural information, α-helices are readily identified by their characteristic deviations from random coil chemical shifts (Spera and Bax 1991; Wishart et al. 1991) or, if available, sequential NOE patterns (Wüthrich 1986). The H-bonding pattern in α-helices results in highly regular structures that often closely fit an idealized α-helix (see Table S1 for ideal helical parameters used in this study). As a consequence, RDCs for backbone N-H pairs, whose internuclear vectors differ by 15.8° in orientation from the helix axis, show a characteristic sinusoidal pattern as a function of residue number with a periodicity of 3.6 residues, known as the “dipolar wave” (Mesleh et al. 2003). Below, we use the neural-network based TALOS-N program (Shen and Bax 2013) for identifying α-helices of ≥ 5 residues in length. For our current purpose of finding close-to-ideal α-helices, we found it necessary to terminate helices at the last residue preceding a Pro residue, or to start them at the first residue following a Pro when such a residue is embedded in a longer, kinked α-helix.
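The dipolar wave can be illustrated with a minimal sketch. It assumes an axially symmetric alignment tensor, so that each coupling is Da(3cos²θ − 1), and an ideal helix in which successive N-H vectors rotate by 100° about the helix axis (360°/3.6 residues). The function name `dipolar_wave` and the inputs `helix_tilt_deg` and `rho0_deg` are hypothetical and serve only this illustration.

```python
import numpy as np

def dipolar_wave(n_res, da, helix_tilt_deg, delta_deg=15.8, rho0_deg=0.0):
    """1D(NH) values along an ideal helix for an axially symmetric tensor.

    n_res          : number of residues
    da             : alignment magnitude Da (Hz)
    helix_tilt_deg : angle between the helix axis and the tensor z axis
    delta_deg      : angle between each N-H vector and the helix axis
    rho0_deg       : azimuth of the first N-H vector around the helix axis
    """
    tilt = np.radians(helix_tilt_deg)
    delta = np.radians(delta_deg)
    # 3.6 residues per turn corresponds to a 100 degree rotation per residue.
    rho = np.radians(rho0_deg + 100.0 * np.arange(n_res))
    # Cosine of the angle between each N-H vector and the tensor z axis.
    cos_theta = (np.cos(tilt) * np.cos(delta)
                 + np.sin(tilt) * np.sin(delta) * np.cos(rho))
    return da * (3.0 * cos_theta**2 - 1.0)

# Example: 19 residues, helix axis tilted 40 degrees from the tensor z axis.
wave = dipolar_wave(n_res=19, da=10.0, helix_tilt_deg=40.0)
```

Because 18 residues correspond to exactly five turns, the pattern repeats after 18 residues. When the helix axis is parallel to the tensor z axis, the wave collapses to a constant value.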
RDC fitting of α-helices
Agreement between RDCs and atomic coordinates is limited by the precision at which RDCs can be measured as well as by uncertainties in the atomic coordinates. The latter often dominate the scatter between observed and best-fitted RDCs, obtained from a singular value decomposition (SVD) fit (Losonczi et al. 1999) of the RDCs to the internuclear vector orientations (Shen et al. 2023). The SVD fit can include weighting to account for the experimental error estimates, but in our examples below, the experimental errors are assumed to scale with the inverse of the dipolar interaction constants that are determined by the magnetogyric ratios and internuclear distances of the pairs of atoms. For example, the errors in 1DC′Cα are assumed to be five times smaller than for 1DNH, effectively giving 1DC′Cα and 1DNH equal weight in the analysis. The use of equal weights for fitting the various types of normalized RDCs, available for MBP, calmodulin, and monomeric MPro, is justified by comparable qualities of their SVD fits, indicating that the errors in the fit are dominated by coordinate uncertainty, not by measurement error of the RDCs. However, in the program Helix-Fit, as well as all our other RDC analysis software, the estimated error in the measured RDC will be used if specified in the input table.
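A minimal sketch of such a fit is given here, assuming that all RDCs have already been normalized to a common interaction constant, so that the fitted matrix is the Saupe matrix scaled to the units of the input couplings. This is not the Helix-Fit code. The function names `design_matrix`, `svd_fit` and `back_calculate` are hypothetical, and the optional `errors` argument simply divides each equation by its estimated uncertainty.

```python
import numpy as np

def design_matrix(vectors):
    """Rows relate the parameters [Syy, Szz, Sxy, Sxz, Syz] to normalized RDCs."""
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)   # unit internuclear vectors
    x, y, z = v.T
    # Sxx is eliminated through the traceless condition Sxx = -Syy - Szz.
    return np.column_stack([y**2 - x**2, z**2 - x**2, 2*x*y, 2*x*z, 2*y*z])

def svd_fit(vectors, rdcs, errors=None):
    """Least-squares Saupe matrix (3x3, symmetric, traceless) from normalized RDCs."""
    a = design_matrix(vectors)
    d = np.asarray(rdcs, dtype=float)
    if errors is not None:
        w = 1.0 / np.asarray(errors, dtype=float)      # weight = 1 / uncertainty
        a, d = a * w[:, None], d * w
    # The pseudo-inverse is computed by NumPy through an SVD.
    syy, szz, sxy, sxz, syz = np.linalg.pinv(a) @ d
    return np.array([[-syy - szz, sxy, sxz],
                     [sxy, syy, syz],
                     [sxz, syz, szz]])

def back_calculate(vectors, saupe):
    """Predicted normalized RDCs for a given (scaled) Saupe matrix."""
    params = np.array([saupe[1, 1], saupe[2, 2],
                       saupe[0, 1], saupe[0, 2], saupe[1, 2]])
    return design_matrix(vectors) @ params
```

With noise-free synthetic couplings generated by `back_calculate`, `svd_fit` returns the input matrix, which is a convenient self-check before applying the fit to measured data.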
For a well-ordered globular domain that contains multiple helices, separate fits of the RDCs to its various helices are expected to yield very similar alignment tensors, with the uncertainty in each alignment tensor determined by jackknifing. The jackknife procedure cyclically omits one RDC from the total set of N RDCs, and then carries out N SVD fits on the remaining N-1 RDCs, resulting in N alignment tensors. Similarly, jackknifing at the residue level can be carried out by cyclically leaving out all RDCs (e.g. 1DNH, 1DC′N, 1DCαC′, 2DHC′) measured for a given amide, a procedure useful for identifying whether a residue identified by TALOS-N as the terminal helix residue indeed is consistent with helical geometry.
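The jackknife procedure, at either the single-RDC or the residue level, can be sketched as follows. This is an illustration under the assumption of normalized RDCs and unweighted fits. The names `fit_saupe` and `jackknife` and the `groups` argument are hypothetical and do not refer to the Helix-Fit interface.

```python
import numpy as np

def fit_saupe(vectors, rdcs):
    """Five-parameter least-squares fit of a traceless, symmetric tensor."""
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    x, y, z = v.T
    a = np.column_stack([y**2 - x**2, z**2 - x**2, 2*x*y, 2*x*z, 2*y*z])
    syy, szz, sxy, sxz, syz = np.linalg.lstsq(a, rdcs, rcond=None)[0]
    return np.array([[-syy - szz, sxy, sxz],
                     [sxy, syy, syz],
                     [sxz, syz, szz]])

def jackknife(vectors, rdcs, groups=None):
    """Leave-one-out fits.

    groups : optional label per RDC (for example the residue number of the
             amide); all RDCs sharing a label are then omitted together.
    Returns a dict mapping each omitted label (or RDC index) to its tensor.
    """
    vectors = np.asarray(vectors, dtype=float)
    rdcs = np.asarray(rdcs, dtype=float)
    labels = np.arange(len(rdcs)) if groups is None else np.asarray(groups)
    tensors = {}
    for label in np.unique(labels):
        keep = labels != label                 # drop one RDC or one residue
        if keep.sum() < 5:
            raise ValueError("fewer than five RDCs remain after omission")
        tensors[label.item()] = fit_saupe(vectors[keep], rdcs[keep])
    return tensors
```

Calling `jackknife(vectors, rdcs)` yields N tensors from N fits to N − 1 couplings each. Passing residue numbers as `groups` yields one tensor per omitted residue, so a terminal residue whose omission changes the fit markedly stands out.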
We represent the uncertainty in the orientation and rhombicity of the final, averaged alignment tensor, < S > , by ε(S), which is related to the spread in their normalized scalar products (Sass et al. 1999):
$$\varepsilon \left( S \right) = \left( {N - 1} \right)^{1/2} \,{\text{RMS}}\left[ {1 - P\left( {S_{i} ,\, < S > } \right)} \right]$$
(1a)
where RMS = \(\sqrt{\sum_{i}\left(1-P(S_{i},\langle S\rangle )\right)^{2}/N}\) is the root-mean-square (RMS) deviation from unity for the normalized scalar products P(Si, < S >) of the N jackknifed five-dimensional Saupe matrix vectors, Si = [Szz, (1/\(\sqrt{3}\))(Sxx-Syy), (2/\(\sqrt{3}\))(Sxy), (2/\(\sqrt{3}\))(Sxz), (2/\(\sqrt{3}\))(Syz)]i=1..N, relative to the averaged Saupe matrix vector < S > , or P(Si, < S >) = Si· < S > /(|Si| ×|< S >|).
Analogously, the fractional uncertainty in the alignment strength is
$$\varepsilon \left( G \right) = \left( {N - 1} \right)^{1/2} \,{\text{RMS}}\left( {G_{i} - \, < G > } \right)/ < G >$$
(1b)
with Gi given by (Clore and Garrett 1999):
$$G_{i} = \left\{ {D_{a,i}^{2} \left[ {4 + 3Rh_{i}^{2} } \right]/5} \right\}^{1/2}$$
(1c)
where Da,i and Rhi are the magnitude and rhombicity of the alignment tensor, obtained when RDC i is removed from the fit. The fit is repeated N times and each SVD fit omits a different coupling from the N available RDCs. < G > represents the average over the N Gi values obtained by this jackknife procedure, with reported values normalized to the interaction strength of the 15N-1H backbone amide pair.
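The jackknife statistics of Eqs. 1a–1c can be sketched as follows. The sketch assumes that each jackknifed tensor is supplied as a 3×3 Saupe matrix already scaled to the units of the normalized couplings, and that Da = Szz/2 and Rh = (2/3)(Sxx − Syy)/Szz in the principal frame with |Szz| ≥ |Syy| ≥ |Sxx|. The function names are hypothetical.

```python
import numpy as np

def saupe_vector(s):
    """Five-dimensional vector representation of a 3x3 Saupe matrix."""
    r3 = np.sqrt(3.0)
    return np.array([s[2, 2], (s[0, 0] - s[1, 1]) / r3,
                     2 * s[0, 1] / r3, 2 * s[0, 2] / r3, 2 * s[1, 2] / r3])

def da_rh(s):
    """Da and Rh under the convention stated in the lead-in (an assumption)."""
    ev = np.linalg.eigvalsh(s)
    sxx, syy, szz = ev[np.argsort(np.abs(ev))]     # |Sxx| <= |Syy| <= |Szz|
    return szz / 2.0, (2.0 / 3.0) * (sxx - syy) / szz

def g_value(s):
    """Generalized alignment strength G (Eq. 1c)."""
    da, rh = da_rh(s)
    return np.sqrt(da**2 * (4.0 + 3.0 * rh**2) / 5.0)

def jackknife_uncertainties(tensors):
    """Return (eps_S, eps_G, mean_G) from N jackknifed Saupe matrices."""
    tensors = np.asarray(tensors, dtype=float)
    n = len(tensors)
    vecs = np.array([saupe_vector(s) for s in tensors])
    mean_vec = vecs.mean(axis=0)                   # averaged Saupe vector <S>
    # Normalized scalar product of each jackknifed vector with <S>.
    p = vecs @ mean_vec / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(mean_vec))
    eps_s = np.sqrt(n - 1) * np.sqrt(np.mean((1.0 - p) ** 2))             # Eq. 1a
    g = np.array([g_value(s) for s in tensors])
    eps_g = np.sqrt(n - 1) * np.sqrt(np.mean((g - g.mean()) ** 2)) / g.mean()  # Eq. 1b
    return eps_s, eps_g, g.mean()
```

Under this convention G depends only on the sum of the squared Saupe matrix elements, G = [(2/15) Σ Sij²]^(1/2), so a uniform scaling of the tensor changes ε(G) but leaves ε(S) at zero.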
A jackknifed Qjk factor (Shen et al. 2023) is used to evaluate the SVD fit:
$$Q_{jk}=\sqrt{\frac{\sum_{i=1}^{N}{\left(D_{i}^{pred}-D_{i}^{meas}\right)}^{2}}{N\left\{D_{a}^{2}\left(4+3Rh^{2}\right)/5\right\}}}$$
(2)
where \(D_{i}^{pred}\) and \(D_{i}^{meas}\) refer to the predicted and the measured values for coupling i when the SVD is carried out for all other N − 1 couplings, excepting i. When fitting a large number (N >> 5) of RDCs, Qjk approaches the standard Q value, derived when including all RDCs in the SVD fit. However, because the SVD fit includes five adjustable parameters, the standard method for deriving Q strongly overestimates the goodness of the fit when the number of RDCs is small (N < ~ 25). This problem is solved by the computationally more burdensome jackknifing procedure.
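The difference between the standard Q and the jackknifed Qjk can be made concrete with a short sketch. It assumes normalized, unweighted RDCs and, as a simplification that the text does not specify, takes Da and Rh in the denominator from the fit to all N couplings. The function name `q_factors` is hypothetical.

```python
import numpy as np

def _design(vectors):
    """Design matrix for the parameters [Syy, Szz, Sxy, Sxz, Syz]."""
    v = np.asarray(vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    x, y, z = v.T
    return np.column_stack([y**2 - x**2, z**2 - x**2, 2*x*y, 2*x*z, 2*y*z])

def q_factors(vectors, rdcs):
    """Return (Q, Qjk) for normalized RDCs fitted to fixed vector orientations."""
    a = _design(vectors)
    d = np.asarray(rdcs, dtype=float)
    n = len(d)
    if n < 6:
        raise ValueError("jackknifing a five-parameter fit needs at least six RDCs")
    p_all = np.linalg.lstsq(a, d, rcond=None)[0]
    syy, szz, sxy, sxz, syz = p_all
    sxx = -syy - szz
    # Da^2 (4 + 3 Rh^2) / 5 equals (2/15) * sum of squared Saupe elements,
    # with Da = Szz/2 and Rh = (2/3)(Sxx - Syy)/Szz in the principal frame.
    denom = (2.0 / 15.0) * (sxx**2 + syy**2 + szz**2
                            + 2 * (sxy**2 + sxz**2 + syz**2))
    q_std = np.sqrt(np.mean((a @ p_all - d) ** 2) / denom)
    # Predict each coupling from a fit to the other N - 1 couplings.
    pred = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        p_i = np.linalg.lstsq(a[keep], d[keep], rcond=None)[0]
        pred[i] = a[i] @ p_i
    q_jk = np.sqrt(np.mean((pred - d) ** 2) / denom)
    return q_std, q_jk
```

For an unweighted least-squares fit, each leave-one-out residual is at least as large in magnitude as the corresponding in-sample residual, so with this shared denominator Qjk is never smaller than Q, and the gap widens as N approaches the number of fitted parameters.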
With five adjustable parameters in the SVD fit, the jackknifing procedure requires a minimum of six RDCs. For more robust results, in practice we only consider helices with at least eight RDCs. Therefore, if only 1DNH RDCs are measured, helices are required to be relatively long for such an analysis. However, since several other backbone RDCs can be obtained at high relative precision, we limit the minimum length of α-helical elements evaluated by Helix-Fit to five residues, i.e. including at least one i to i − 4 amide-to-carbonyl H-bond.
Helix-Fit can use either the atomic coordinates of the reference structure for SVD fitting or, when selected by a flag, the coordinates of an idealized helix that is best-fit superimposed on the heavy backbone atoms (N, Cα, and C′) of the selected residues.
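The best-fit superposition step can be sketched with the standard Kabsch algorithm, which is used here as an assumption because the text does not name the method implemented in Helix-Fit. The function name `superimpose` is hypothetical, and the ideal-helix backbone coordinates (built, for example, from the parameters of Table S1) must be supplied by the user in the same atom order as the reference atoms.

```python
import numpy as np

def superimpose(mobile, target):
    """Least-squares rigid-body superposition of mobile onto target.

    Both arrays have shape (n_atoms, 3), with atoms in the same order, e.g.
    the N, CA and C' atoms of the selected residues.
    Returns the transformed mobile coordinates and the RMSD to the target.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mc, tc = mobile.mean(axis=0), target.mean(axis=0)   # centroids
    h = (mobile - mc).T @ (target - tc)                 # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Guard against an improper rotation (reflection).
    sign = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T
    fitted = (mobile - mc) @ rot.T + tc
    rmsd = np.sqrt(np.mean(np.sum((fitted - target) ** 2, axis=1)))
    return fitted, rmsd
```

The internuclear vectors of the superimposed ideal helix then replace those of the reference structure in the SVD fit, and the returned RMSD indicates how closely the reference helix follows ideal geometry.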
Maltose binding protein
For ligated maltose binding protein (MBP), there is close agreement between α-helices identified from crystallographic atomic coordinates (PDB entry 3MBP) (Quiocho et al. 1997) and from NMR backbone chemical shifts (BMRB entry 4354) (Gardner et al. 1998) (Fig. 1A). Small differences are confined to the N- and C-termini which may be recognized as helical by TALOS-N, even though their respective ϕ and ψ angles sometimes deviate by more than 30° from canonical helix values (Fig. 1A). For RDC-fitting purposes, we therefore evaluate whether removal of RDCs that report on the orientation of the N-terminal and C-terminal amide planes, identified as helical by TALOS-N, can greatly improve the quality of the RDC fit to the coordinates of an idealized helix (see Methods). For example, the Qjk value obtained from fitting helix 12 decreases from 0.53 to 0.33 after removal of the RDCs arising from the N-terminal amide plane of Q335; a similar improvement of Qjk from 0.30 to 0.19 is obtained for helix 8 after removing the RDCs related to the C-terminal amide plane of E281. The G values and their standard deviations can be compared for fits of the RDCs to the ideal helical coordinates and those in the actual X-ray structures (Fig. 1B). As can be seen, for MBP the spread in G values is small, and the normalized scalar products between the global alignment tensor of the entire protein, SGlobal, and fits to tensors obtained for the individual X-ray (SXray-helix) or idealized (Sideal-helix) α-helices are very large (Fig. 1C), confirming that the helices in ligated MBP are highly ordered.
Fig. 1 1DHN, 1DC′N and 1DCαC′ RDC analysis of α-helices in MBP. The generalized sampling parameter for these couplings is Ξ = 0.066. A Colored bars along the amino acid sequence refer to residues defined as helical by TALOS-N. Only helices with length ≥ 5 residues are used for RDC analysis, with some terminal residues (red) culled prior to final fitting due to their RDC incompatibility with α-helical structure. Helices < 5 residues are marked in blue and were not considered in the Helix-Fit analysis. B G values and their jackknifed standard deviations for the 13 helices in maltotriose-ligated MBP obtained when using idealized α-helical coordinates (orange) and X-ray atomic coordinates (PDB entry 3MBP, gray); the G value obtained when fitting all RDCs is plotted as the horizontal line, and the corresponding Qjk values (open circles) correspond to the scale at the right side of the panel. C Correlation of normalized scalar products between the alignment tensors obtained for the individual α-helices (labeled by helix number) and the tensor obtained from the full set of RDCs (Sglobal), covering the entire protein
Calmodulin
Calmodulin is a two-domain protein, regulating the activity of more than one hundred targets, many of them kinases, in a Ca2+-dependent manner (Crivici and Ikura 1995). It contains two globular domains that are homologous in sequence, and each consists of two EF-hand Ca2+-binding motifs. The two domains are connected by a linker that is entirely α-helical in the crystalline state (Babu et al. 1988; Wilson and Brunger 2000). However, based on solution 15N relaxation analysis that showed independent, near-isotropic rotational diffusion of its two domains, this so-called “central helix” is highly disordered near its midpoint (Barbato et al. 1992). As expected, binding of a paramagnetic lanthanide in one of the two N-terminal Ca2+-binding sites imposes substantial magnetic field alignment for the N-terminal domain, but much weaker alignment on the flexibly linked C-terminal domain (Bertini et al. 2004).
A large set of 1H-15N, 1Hα-13Cα, 13C′i−1-Ni and 13Cα-13C′, as well as two-bond 1Hα-13C′ RDCs were previously reported for Ca2+-ligated calmodulin (Chou et al. 2001). For each calmodulin domain, the RDC data pointed to an, on average, narrower target-binding groove in solution than seen in the X-ray structure. Indeed, a best-fit of the RDCs in calmodulin’s four N-terminal domain α-helices to the 1 Å X-ray structure shows a rather poor fit (Qjk = 0.44) (Fig. 2A). Fitting the same RDCs to the experimental X-ray structure but with coordinates of each of the four helices replaced by best-fit superimposed ideal helices shows a comparably poor fit (Qjk = 0.43). However, separate fits of the RDCs to coordinates of individual idealized helices that are best-fit superimposed on the X-ray helices are considerably better (Fig. 2B–E), comparable to results obtained by fitting to the 1.0 Å X-ray coordinates of these four helices (SI Fig. S1). This result is consistent with the earlier observation that the average EF-hand interhelical angles in solution differ from those seen in the X-ray structure by ca 25° (Chou et al. 2001).
Fig. 2 RDC analysis of α-helices in Ca2+-calmodulin’s N-terminal domain. A Fit of the normalized 1DNH (red ball), 1DHαCα (yellow ball), 1DC′N (green ball), 1DCαC′ (blue ball), and 2DHαC′ (pink ball) RDCs, reported by Chou et al. (2001) against values predicted by an SVD fit to the 1 Å X-ray structure (PDB entry 1EXR) (Wilson and Brunger 2000). B–E Individual SVD fits of normalized experimental RDCs to coordinates of idealized α-helices, best-fit superimposed on the corresponding backbone atoms (N, Cα, C′) of the X-ray coordinates. B Helix 1 (E6-F19); C Helix 2 (T29-S38); D Helix 3 (E45-E54); and E Helix 4 (F65-R74). N refers to the number of backbone RDCs available for the SVD fits, and Qjk to the jackknifed Q values
RDC fits of the four individual N-terminal domain helices all exhibit G values well above the G value obtained when fitting RDCs of the entire N-terminal domain, regardless of using idealized coordinates or the X-ray coordinates for the SVD fits (Fig. 3A). This discrepancy in G values is primarily caused by a difference between the average relative helix orientations seen in the X-ray structure and those present in solution (Chou et al. 2001). These differences in average helix orientation are reflected in the normalized scalar products of the alignment tensors obtained for the individual helices, which fall well below unity both relative to one another and relative to a fit of each entire domain (Fig. 3B, C). Indeed, the variability in the interhelical EF-hand angles seen in high-resolution X-ray structures of complexes with different target sites highlights the importance of the flexible nature of these EF-hands, which allows calmodulin to fine-tune the interactions with its wide range of targets (Akke and Chazin 2001).
Fig. 3 Alignment of helices in Ca2+-ligated calmodulin, oriented in 15 mg/ml Pf1 (Chou et al. 2001). A Generalized alignment strengths, G, obtained by eight separate SVD fits of its α-helices (E6-F19; T29-S38; E45-E54; F65-R74; E82-F92; A102-N111; D118-E127; Y138-M145) when using the X-ray structure (1EXR; grey) or coordinates of idealized helices that are best-fitted to the X-ray structure (orange). 1DNH, 1DHαCα, 1DC′N, 1DCαC′, and 2DHαC′ couplings were used (Ξ = 0.00). Error bars correspond to the ε(G) values derived by jackknifing (Eq. 1b). The left horizontal line corresponds to the <G> value obtained by simultaneously fitting RDCs of all four N-terminal domain helices to the X-ray structure (Da = 9.97 ± 0.17 Hz; Rh = 0.41 ± 0.03; G = 9.46 ± 0.15); the right horizontal line corresponds to the <G> value obtained by fitting RDCs of all four C-terminal domain helices to the X-ray structure (Da = 9.27 ± 0.17 Hz; Rh = 0.65 ± 0.02; G = 9.52 ± 0.19). B Qjk values obtained by fitting RDCs to the X-ray (grey) and idealized (orange) helical coordinates. C Normalized scalar product values P(Si, Sj) for the alignment tensors S obtained for N-terminal domain α-helices relative to one another, and relative to the alignment tensor obtained for a global RDC fit of the entire domain (i,j = ‘H1’, ‘H2’, ‘H3’, ‘H4’, and ‘Nterm’). The lower right half corresponds to use of idealized helical coordinates; the top left half corresponds to using X-ray coordinates. D Same as C but for the C-terminal domain (see also SI Fig. S2)
As previously reported (Chou et al. 2001), RDCs measured for calmodulin’s C-terminal domain are more consistent with the X-ray structure than those of the N-terminal domain, simply reflecting a smaller difference in the average EF-hand interhelical angles in solution relative to the crystalline state. This conclusion is also reflected in P(Si, Sj) values for the C-terminal domain helices (i,j = ‘H5’ to ‘H8’) (Fig. 3D) that are closer to unity than for the N-terminal domain (i,j = ‘H1’ to ‘H4’) (Fig. 3C). Remarkably, the G values of the N- and C-terminal domains are very similar to one another, despite the fact that these domains are flexibly linked. While the P(SNterm, SCterm) value for the alignment tensors of the N- and C-terminal domains, SNterm and SCterm, is low (0.08) when using the X-ray structure as a reference, reorienting the C-terminal domain such that its principal axis system coincides with that of the N-terminal domain raises this P(SNterm, SCterm) value to 0.98 (Fig. S3). It could be argued that the latter result is consistent with a static structure that adopts this alternate relative domain orientation. This incorrect conclusion highlights the fact that a P(SNterm, SCterm) value near unity does not prove the absence of large angular motions; however, a low P(SNterm, SCterm) value reflects a difference in static structure and/or relative domain motions. A statistically meaningful difference in G value for different domains requires interdomain motion, typically of substantial angular amplitude. For randomly oriented pairs of alignment tensors with random rhombicity, [Si, Sj], the P(Si, Sj) distribution spans the range of 0 to 1, with the relative probability of a P(Si, Sj) = α value approximately decreasing with cos(2α/π) (SI Fig. S4).
Monomeric SARS-CoV-2 MPro
A SARS-CoV-2 MPro construct, lacking the N-terminal residues S1-P9 that stabilize the dimerization of the native form of the enzyme, and further inactivated by the active site H41Q mutation (MPro10−306,H41Q), was expressed in D2O medium. For multiple β-sheet residues in the N-terminal domain and α-helical residues in the C-terminal domain, back exchange of the amide protons with the protonated solvent was incomplete, even after several days at neutral pH and room temperature, causing multiple backbone amide signals to be very weak or absent. Nevertheless, a sufficiently large set of 1DNH, 1DC′N, 1DCαC′, and 2DC′H RDCs was obtained (Table S2) that yielded well-defined alignment tensors for each of its six α-helices: one in the N-terminal domain (Y54-I59) and five in the C-terminal domain (T201-I213; L227-Y237; D245-L250; V261-Q273; and F294-Q299) (Fig. 4). There are several X-ray structures for monomeric MPro: 2QCY (Shi et al. 2008), 2PWX (Chen et al. 2008), and 3F9E (Hu et al. 2009). These all pertain to the highly homologous (~ 96% identity, Zhang et al. 2011) previous SARS coronavirus isolate and display a different orientation of the C-terminal domain (I200-Q306) relative to the N-terminal catalytic domain from that seen in the native dimeric state (Fig. 4). As was observed for calmodulin, good SVD fits were obtained for RDCs of the individual helices measured in the monomeric state to coordinates of helices in the 1.2-Å X-ray structure of the native dimer, 7K3T (Andi et al. 2022) (Fig. 5A–F). On average, slightly lower quality fits and less well-defined alignment tensors, reflected in higher Qjk and ε(G) values, were obtained for fits to idealized helices (Fig. 5G).
Fig. 4 Superposition of MPro subunit taken from the native homodimer X-ray structure, 7K3T (pink), and its R298A monomeric mutant, 2QCY (grey). The six 7K3T helices used for RDC analysis are shown in green
Fig. 5 RDC analysis of α-helices in monomeric SARS-CoV-2 MPro10−306,H41Q. Fits of the normalized 1DNH (red ball), 1DC′N (green ball), 1DCαC′ (blue ball), and 2DHNC′ (pink ball) RDCs to the X-ray coordinates of homodimeric MPro (PDB entry 7K3T) are shown for the following helices: A Y54-I59; B T201-I213; C L227-Y237; D D245-L250; E V261-Q273; and F F294-Q299. The generalized sampling parameter for these couplings, Ξ, is 0.062. G G values obtained for the six helices when using 7K3T (grey) or idealized helices (orange) as reference structures; the reference <G> value of 7.83 obtained for the five C-terminal helices (Helix-2 to Helix-6), and < G > = 5.34 obtained for Helix-1 in the N-terminal domain, are plotted as horizontal lines. Qjk values (open circles) correspond to the scale at the right side of the panel. H Normalized scalar products P(Si, Sj) for all pairs of the alignment tensors of six helices, [Si, Sj] (i,j = “H1” to “H6”), obtained when fitting the helical RDCs to the coordinates of the homodimeric X-ray structure (7K3T) (upper-left half), and ideal helices (lower-right half). For the corresponding P(Si, Sj) values when using the monomeric MPro X-ray structures (2QCY; 2PWX; 3F9E), see SI Fig. S6
Remarkably, the generalized alignment magnitude, G, is considerably larger for the C-terminal domain than for the Y54-I59 helix in the N-terminal domain, indicative of large amplitude motions of this helix or of the entire N-terminal domain relative to the C-terminal domain (Fig. 5G). Although assignments for the N-terminal domain remain incomplete due to the above-mentioned slow back exchange as well as conformational exchange broadening, a large number of yet to be fully identified RDCs was measured for the N-terminal domain. The distribution of normalized RDCs seen in the N-terminal domain appears considerably narrower than for the C-terminal domain (Fig. 6), pointing to a lower alignment strength (Clore et al. 1998a, b). However, the histogram RDC distribution expected for the N-terminal domain when using the alignment tensor obtained for its Y54-I59 helix is somewhat narrower than the observed distribution (Fig. 6A), suggesting that the helix orientation undergoes dynamic fluctuations relative to the N-terminal domain.
Fig. 6 Histogram distributions of backbone RDCs in monomeric MPro, normalized to 1DNH. A Experimental RDC distribution for the N-terminal domain in blue, and in red the histogram expected for uniformly distributed vector orientations and the alignment tensor obtained for helix Y54-I59. B Analogous plot of C-terminal domain RDCs, and the expected histogram distribution if vectors were uniformly distributed with an alignment strength and rhombicity obtained from simultaneously fitting all helical RDCs in this domain
For the C-terminal domain, its helices show very similar alignment strengths, indicative of a well-ordered domain which is consistent with the slow hydrogen exchange rates seen for many of its backbone amides. Mobility of helix Y54-I59 relative to the C-terminal domain differs strongly from what is observed when evaluating the RDCs previously measured for the homodimeric state (Robertson et al. 2021), which show very similar G values and high P(Si, Sj) values (i,j = ‘H1’ to ‘H6’) (Fig. S5).
For RDCs measured here in the monomeric state and for all of the X-ray structures evaluated (2QCY; 2PWX; 3F9E; and dimeric 7K3T), we find a low value of the normalized scalar products, P(Si, Sj), of the alignment tensor of the Y54-I59 helix (Si, i = ’H1’) relative to those of the C-terminal domain helices (Sj, j = ‘H2’ to ‘H6’) (Fig. 5H; SI Fig. S6), confirming that in solution the relative domain orientation differs from that in any of the X-ray structures and presumably is subject to large amplitude dynamics.
For the C-terminal domain helices, the P(Si, Sj) values (i,j = ‘H2’ to ‘H6’) are close to unity when using the 1.2 Å X-ray structure of the homodimer as a reference but, on average, somewhat lower when using the monomer X-ray structures as references (Fig. S6). However, superposition of the C-terminal domain of monomeric MPro X-ray structures on those of the homodimer shows close correspondence, with a Cα RMSD ≤ 0.64 Å. Therefore, these somewhat lower P(Si, Sj) values likely result from slightly less accurate atomic coordinate positions in these monomeric X-ray structures.