Population-based prevalence and mutational landscape of von Willebrand disease using large-scale genetic databases

Global mutational spectrum of the VWF using population-based exome and genome sequencing data

We collected high-quality data from gnomAD including 141,456 subjects with different ethnicities (Table 1), i.e., Africans/African Americans (12,487 subjects), Latinos/Admixed Americans (17,720), Ashkenazi Jews (5,185), East Asians (9,977), Finnish Europeans (12,562), non-Finnish Europeans (64,603), South Asians (15,308) and also 3,614 additional persons without an assigned ethnicity. The gender distribution of participants was 54% males and 46% females.

Table 1 gnomAD database composition according to population details.
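As a quick consistency check, the subgroup sizes quoted above can be summed and compared with the stated cohort size. The sketch below uses only figures from the text; the dictionary keys are illustrative labels, and the allele count assumes two VWF alleles per subject (VWF is an autosomal gene).

```python
# Subgroup sizes as quoted in the text (labels are illustrative).
cohort = {
    "African/African American": 12_487,
    "Latino/Admixed American": 17_720,
    "Ashkenazi Jewish": 5_185,
    "East Asian": 9_977,
    "Finnish European": 12_562,
    "Non-Finnish European": 64_603,
    "South Asian": 15_308,
    "Unassigned": 3_614,
}

total_subjects = sum(cohort.values())

# Assumption: every subject contributes two alleles at an autosomal locus.
total_alleles = 2 * total_subjects

print(f"subjects: {total_subjects:,}  alleles: {total_alleles:,}")
```

The totals agree with the 141,456 subjects and 282,912 alleles reported in this section.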

The mean depth of coverage per base in all VWF exons was generally greater than 30 for both exome and genome sequencing, except for exon 26 (Supplementary Fig. 1). The lower coverage of exon 26 arises primarily at the alignment step, where some reads are aligned to the VWF pseudogene rather than to VWF in the human reference genome. Because gnomAD sets a minimum depth of coverage of 10 (DP ≥ 10), only genotypes passing this threshold were included in our study; the depth of coverage of exon 26 exceeds this threshold. A total of 4,313 different genetic variants were identified within VWF in the gnomAD population. Following a conservative approach to classifying variants as pathogenic (i.e., as responsible for VWD), we found 505 distinct deleterious VWF variants, of which 287 (57%) have not been reported to be associated with VWD in the literature or in VWD-related databases (Supplementary Table 1), whereas 218 (43%) had already been reported (Supplementary Table 2). The distribution of mutation types for the 505 variants identified in gnomAD is depicted in Fig. 1. Missense variants accounted for the majority (n = 355, 70%), followed by frameshift variants (n = 53, 10%). Variants affecting stop codons, including stop-gained (n = 40, 8%) and stop-loss (n = 1), as well as variants affecting a splice site (n = 41, 8%), were also identified. There were also 14 inframe indels (3%) and one synonymous variant (Fig. 1a). A similar distribution of mutation types was observed for novel (n = 287) and previously reported (n = 218) variants (Fig. 1a). Gene constraint data provided by gnomAD indicate that VWF seems to be intolerant of missense variants while being tolerant of synonymous and loss-of-function variants (Supplementary Table 3).
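The mutation-type breakdown above can be checked with a few lines of arithmetic. This is a minimal sketch using only the counts quoted in the paragraph; the category labels are illustrative, and rounding to whole percentages is an assumption about how the published figures were derived.

```python
# Counts per mutation type as quoted in the text (n = 505 pathogenic variants).
mutation_counts = {
    "missense": 355,
    "frameshift": 53,
    "stop-gained": 40,
    "stop-loss": 1,
    "splice-site": 41,
    "inframe indel": 14,
    "synonymous": 1,
}

n_variants = sum(mutation_counts.values())

# Percentage share of each type, rounded to the nearest whole percent.
shares = {
    kind: round(100 * count / n_variants)
    for kind, count in mutation_counts.items()
}

for kind, pct in shares.items():
    print(f"{kind:>14}: {mutation_counts[kind]:>3} ({pct}%)")
```

The counts sum to 505, and the rounded shares reproduce the percentages given in the text.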

Fig. 1: Distribution of various mutation types for VWF genetic variants identified in the gnomAD, HGMD and LOVD databases.

a Identified pathogenic variants in the gnomAD population (n = 505), including novel predicted pathogenic variants (n = 287, b) and those already reported to be associated with VWD (n = 218, c). d, e VWF variants (n = 927) that have been reported so far to be associated with VWD in LOVD (d) and HGMD (e).

Of the 505 selected pathogenic variants, 244 (48%) were unique, each identified in a single subject. Novel variants (n = 287) were found most often in non-Finnish Europeans (35%), Africans/African Americans (27%) and Latinos/Admixed Americans (18%), and to a lesser extent in East Asians (13%); only 3% were identified in Ashkenazi Jews and 2% in Finnish Europeans. Among the 282,912 alleles analyzed, 31,785 carried VWF pathogenic variants. Only 2.9% of the affected alleles carried novel variants. In the East Asian population, as many as 18.9% of affected alleles carried novel variants, whereas among other ethnicities the impact of novel variants was considerably lower (1.3–4%, Table 2).

Table 2 The number of affected alleles by already reported and novel variants identified in the gnomAD population.
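The allele-level proportions quoted in this section follow directly from the reported counts. The sketch below reproduces them; the count of 935 alleles carrying novel variants is the figure given later in the text with reference to Table 2, and the variable names are illustrative.

```python
# Allele counts as quoted in the text.
total_alleles = 282_912          # alleles analyzed in gnomAD
affected_alleles = 31_785        # alleles carrying a VWF pathogenic variant
novel_affected_alleles = 935     # affected alleles carrying a novel variant

# Share of all alleles that carry a pathogenic variant.
affected_share = affected_alleles / total_alleles

# Share of affected alleles attributable to novel variants.
novel_share = novel_affected_alleles / affected_alleles

print(f"affected alleles: {affected_share:.1%}")
print(f"novel among affected: {novel_share:.1%}")
```

The second ratio corresponds to the 2.9% stated above, which is why restricting the analysis to previously reported variants changes the estimates very little.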

Among the 141,456 participants in gnomAD, 1,206 were homozygous for one of 26 different VWF pathogenic variants (Supplementary Table 4); the remaining subjects with pathogenic variants were heterozygotes or compound heterozygotes.

Mutational spectrum of the VWF in the HGMD and LOVD databases

When the analysis was extended to the two main databases containing VWD-associated variants, i.e., HGMD and LOVD, we found that 1,024 different VWF variants have so far been associated with VWD, 927 of them being single nucleotide variants (SNVs) or short insertions/deletions. Of these 927 variants, 872 were found in HGMD and 608 in LOVD. The distribution of VWF mutation types in the gnomAD dataset was similar to that in HGMD and LOVD and did not differ between novel and already reported variants (Fig. 1).

VWD type distribution in the gnomAD population and HGMD/LOVD datasets

In the gnomAD population, 218 of the 505 different pathogenic variants had already been reported to be associated with VWD. Of these, 61% were responsible for quantitative VWF defects, including 36% for type 1 and 25% for type 3. For qualitative defects, 10% were type 2A, 10% type 2M, 7% type 2N and 5% type 2B. About 7% of these variants were unclassified (UCs). Compared with all variants reported so far in VWD, the gnomAD population showed a higher proportion of type 1, 2M, 2N and unclassified variants (Fig. 2).

Fig. 2: VWD type distribution of all the so far reported and gnomAD identified variants.

a According to our analysis, 927 VWF variants (SNVs and short insertions/deletions) have been reported so far in the two VWD-related databases (HGMD and/or LOVD) to be associated with VWD. Of these, 555 (60%) were reported in quantitative VWF defects, including type 3 VWD (n = 345, 37%) and type 1 (n = 210, 23%). For type 2 VWD with qualitative VWF defects, 20% were reported in type 2A (n = 189), 8% in type 2M (n = 76), 5% in type 2B (n = 41) and 4% in type 2N (n = 36). Out of 4,313 different VWF variants, we identified 505 pathogenic variants in the gnomAD population, of which 287 were novel and 218 had already been reported in patients with VWD. b For the latter group, the number of VWF variants identified was higher for type 1 (n = 78, 36%) than for type 3 VWD (n = 54, 25%). Among type 2 variants identified in gnomAD, 10% (n = 23) were type 2A, 10% (n = 21) type 2M, 7% (n = 15) type 2N and 5% (n = 12) type 2B.

Domain distribution of VWF variants in the gnomAD, HGMD and LOVD datasets

The domain distribution of all VWF variants in the HGMD and LOVD datasets, of all variants selected in gnomAD (n = 505) and of the novel variants in this database (n = 287) is shown in Fig. 3. Among all pathogenic variants in gnomAD, and also among the novel ones, fewer variants were identified in the VWF D’-D3, A1-A2 and CK domains. However, more novel variants were identified in the D1-D2, A3, D4 and C1-C6 domains (Fig. 3). We further explored the location on VWF domains of different VWD types for all variants in the HGMD and LOVD datasets (Fig. 3). Type 1 and 3 VWD variants were spread all over the VWF domains, mostly at D1-D2, D’-D3 and C1-C6. For type 2A VWD, 45% of variants were located at the A2 domain and the rest at the D1-D2 (15%), D’-D3 (18%), A1 (14%) and CK domains (5%). All variants of type 2B VWD were located at the A1 domain (85%) or D3-A1 junction (12%). Almost all type 2M variants were at the A1 (74%) or A3 domains (17%), with a few exceptions in the remaining domains. The majority of type 2N variants were at the D’-D3 (89%), and the remaining 11% at the VWFpp. The UCs were distributed throughout all domains.

Fig. 3: VWF domain distribution and the type of VWD for all the so-far reported (SNV and short insertions/deletions) and pathogenic variants selected from gnomAD.

a There were fewer variants in the D’-D3, A1, A2 and CK domains among all identified (n = 505) and novel variants (n = 287) in the gnomAD population compared with those of the HGMD and LOVD datasets. b We further explored the location of different VWD types on VWF domains for all the so-far reported (SNV and short insertions/deletions) variants in the HGMD and LOVD datasets. Variants of type 1 and 3 VWD were found all over the VWF domains, mainly VWFpp (D1-D2 domain), D’-D3, D4 and C1-C6. In type 2A, 45% of variants were at the A2 domain and the rest were at the D1-D2 (15%), D’-D3 (18%), A1 (14%) and CK domains (5%). All variants of type 2B were at the A1 domain (85%) or D3-A1 junction (12%). Type 2M variants were located mostly at the A1 (74%) but also at the A3 domain (17%), with a few exceptions in the other domains. A majority of type 2N variants were at the D’-D3 (89%) and the remaining 11% at the VWFpp. The unclassified VWF variants (UC) were distributed throughout the VWF domains.

Most frequent variants in the gnomAD population stratified by VWD type and ethnicity

The five most frequent VWF variants identified in each ethnic group are shown in Table 3. Several VWF variants previously associated with VWF deficiency were relatively common in different ethnicities: p.Arg2185Gln, p.Met740Ile, p.Pro2063Ser, p.His817Gln, p.Arg924Gln, p.Met576Ile, p.Thr2647Met, p.Gly967Asp, p.Thr1034del and p.Ser1731Thr. In all ethnicities, type 1 variants were generally the most frequent (Table 3). Two type 2N variants were recurrent: p.His817Gln in Africans/African Americans (MAF = 0.115) and Latinos/Admixed Americans (MAF = 0.0062), and p.Arg854Gln in Finnish (MAF = 0.0056) and non-Finnish Europeans (MAF = 0.0053). For type 2M, p.Ser1731Thr was common in Ashkenazi Jews (MAF = 0.0209) and p.Val1439Met was one of the most frequent variants in Finnish Europeans (MAF = 0.0048). The type 2A variants p.Gly624Ser (MAF = 0.0048) and p.Gly1672Arg (MAF = 0.0018) were among the most frequent variants in East Asians. Type 2B variants, including p.Pro1266Leu (MAF = 0.0036) and p.Asn1231Ser (MAF = 0.0099), were among the most frequent in Finnish Europeans and South Asians. Type 3 VWD variants were also identified in Africans/African Americans (p.Thr1034del, MAF = 0.0152) and South Asians (c.1730-5C>T, MAF = 0.0049).

Table 3 Most frequent ethnicity-specific variants identified in gnomAD with an already established association with VWF deficiency.

Ten different variants had a MAF > 0.01 (1%) in at least one population (Supplementary Table 4). Of these, five had an overall population MAF of > 1% in Africans/African Americans (p.Arg2185Gln, p.Met740Ile and p.His817Gln), Latinos/Admixed Americans (p.Arg2185Gln, p.Met740Ile and p.Pro2063Ser), Ashkenazi Jews (p.Pro2063Ser), non-Finnish Europeans (p.Arg924Gln) and South Asians (p.Pro2063Ser). Linkage disequilibrium analysis revealed that the three most common variants in Africans/African Americans (p.Arg2185Gln, p.Met740Ile and p.His817Gln) cosegregated within a common haplotype in 8% of the 1000 Genomes Project (Supplementary Fig. 2), whereas no combination of these three, or even two, variants was observed in other ethnicities.

Population-based prevalence of autosomal recessive and dominant VWD

We calculated the worldwide and within-population prevalence of VWD for both autosomal dominant and recessive forms, because VWD can be inherited in either pattern (types 1, 2A, 2B and 2M as dominant; types 3 and 2N as recessive). When we considered all identified pathogenic variants (n = 505), 13% of the gnomAD alleles carried VWF pathogenic variants in the heterozygous state and 0.48% in the recessive state (Table 4). This overall frequency estimate was calculated after removing the three common variants in Africans/African Americans (p.Arg2185Gln, p.Met740Ile and p.His817Gln). In Africans/African Americans, the frequencies of carriership and recessive forms were 17.2% and 0.90%, respectively. Similar frequencies were estimated for Latinos/Admixed Americans (18.6% and 1.07%), South Asians (16.8% and 0.86%) and Ashkenazi Jews (15% and 0.67%), whereas a lower prevalence was estimated among East Asians (5.3% and 0.07%), Finnish (9.4% and 0.24%) and non-Finnish Europeans (11% and 0.34%). In a second approach to estimating the global prevalence of VWF alleles with pathogenic variants, the analysis was limited to the gnomAD variants previously described in VWD (n = 218). This analysis yielded estimates almost identical to those of the first approach (Table 4), indicating that the novel variants identified are very rare. Indeed, the novel variants identified in gnomAD affected only about 3% of mutant VWF alleles (935 of 31,785 alleles; Table 2). To calculate the true global prevalence of dominant and recessive VWD types, we used only the variants reported to be associated with VWD in the gnomAD population (n = 218) with an already established autosomal dominant or recessive inheritance pattern. The global prevalence of dominant VWD was 7.4% for type 1, 0.3% for 2A, 0.3% for 2B and 0.6% for 2M. For the recessive VWD forms, it was 0.31% for 2N and 0.7% for type 3 (Table 5).
The within-population prevalence of VWD subtypes is summarized in Table 5 and Supplementary Tables 5–10.
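This section does not state the formula behind the carriership and recessive estimates. The paired figures quoted above are, however, numerically consistent with Hardy-Weinberg proportions, so the sketch below assumes that model: with a cumulative pathogenic allele frequency q, carriers are expected at 2q(1 − q) and the recessive state at q². The function names are illustrative and are not taken from the study.

```python
import math


def carrier_frequency(q: float) -> float:
    """Expected heterozygote (carrier) frequency under Hardy-Weinberg: 2q(1 - q)."""
    return 2.0 * q * (1.0 - q)


def recessive_frequency(q: float) -> float:
    """Expected frequency of two pathogenic alleles under Hardy-Weinberg: q^2."""
    return q * q


def allele_frequency_from_recessive(recessive: float) -> float:
    """Back-calculate the cumulative allele frequency q from a recessive frequency."""
    return math.sqrt(recessive)


# Recessive frequencies quoted in the text (assumed to be q^2).
q_global = allele_frequency_from_recessive(0.0048)   # overall estimate
q_african = allele_frequency_from_recessive(0.0090)  # Africans/African Americans

carrier_global = carrier_frequency(q_global)
carrier_african = carrier_frequency(q_african)

print(f"global:  q = {q_global:.4f}, carriers = {carrier_global:.1%}")
print(f"African: q = {q_african:.4f}, carriers = {carrier_african:.1%}")
```

Under this assumption, the back-calculated carrier frequencies come out close to the 13% and 17.2% reported above. Note that this model assumes random mating and treats all pathogenic alleles as a single pooled allele class.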

Table 4 Estimated global prevalence of carriership and recessive VWF variants.

Table 5 Estimated global prevalence of autosomal dominant- and recessive von Willebrand disease (VWD).
