Table 1 presents a comprehensive summary of the scientific publications, documents, and other sources that were gathered by our team to develop this standardized evidence-based framework for classification of sequencing variants.
Table 1 Summary of sources used to develop the Hospital Israelita Albert Einstein Standards for Constitutional Sequence Variants Classification
Variant databases
Our service utilizes several variant databases as a source of molecular, epidemiologic, and clinical information. These databases are widely recognized as valuable resources for the classification and interpretation of sequence variants and contain extensive information on the clinical significance, frequencies in different populations, and functional impact of known variants.
Databases for population frequencies include the Genome Aggregation Database (gnomAD) [18] and the Online Archive of Brazilian Mutations (ABraOM) [19]. Population frequency criteria are applied for allele frequencies in overall populations and subpopulations of gnomAD (African/African American (AFR); American Admixed/Latino (AMR); East Asian (EAS); Non-Finnish European (NFE); South Asian (SAS)) with no founder effect, more than 2,000 alleles tested and variants present in 5 alleles in the databases.
Databases for pathogenic variants and phenotypic and genotypic data include ClinVar [20]; Single-Nucleotide Polymorphism Database (dbSNP) [21]; Database of Chromosomal Imbalance and Phenotype in Humans using Ensembl Resources (DECIPHER); Online Mendelian Inheritance in Man (OMIM); Clinical Genome Resource (ClinGen) [4, 22, 23]; published articles; and our internal database, which currently includes genomic information from 13,609 exomes, 3,706 genomes and 2,737 targeted-gene panels.
Terminology of molecular findings
The workgroup adopted the specific standard terminology proposed by ACMG/AMP for variant classification, including the terms 'pathogenic', 'likely pathogenic', 'variant of uncertain significance' (VUS), 'likely benign', and 'benign', to describe variants identified in Mendelian disorders.
Disease-Gene association and clinical impact of genes
Our service uses a systematic approach to collect and evaluate available scientific evidence to establish the clinical validity of gene-disease associations for all genes. To achieve this, we followed the gene-disease classification framework proposed by ClinGen, including “limited”, “moderate”, “strong”, and “definite” evidence, and classified genes with contradictory evidence as “disputed” or “refuted” [24]. In cases where ClinGen had already curated a gene, we utilized their classification as a standard. For genes not previously classified by ClinGen, our team followed the same approach to curate the gene-disease association validity.
Overall, variant classification is applied only for those genes whose clinical validity is defined as at least limited by the ClinGen group or our internal assessment. Additionally, the molecular impact of a variant and its consequent classification are assessed according to its effect on the primary transcript, which is prioritized in our service in the following order of transcript references: MANE Select, RefSeq Select, MANE Plus Clinical, and RefSeq. If only RefSeq transcripts are described, the largest transcript is chosen as the primary transcript.
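The transcript prioritization described above can be sketched as follows. This is a minimal illustration only: the function name, the dictionary keys (`id`, `tag`, `length`) and the tag strings are hypothetical and do not correspond to any specific annotation tool, and "largest" is assumed here to mean the greatest transcript length.

```python
# Priority order taken from the text; the tag strings themselves are illustrative.
TRANSCRIPT_PRIORITY = ["MANE Select", "RefSeq Select", "MANE Plus Clinical", "RefSeq"]


def pick_primary_transcript(transcripts):
    """Return the primary transcript from a list of dicts with keys
    'id', 'tag' and 'length' (all hypothetical field names)."""
    for tag in TRANSCRIPT_PRIORITY:
        candidates = [t for t in transcripts if t["tag"] == tag]
        if candidates:
            # Within a tier (in practice, the plain RefSeq tier), take the largest.
            return max(candidates, key=lambda t: t["length"])
    return None  # no usable transcript annotation


example = [
    {"id": "NM_A", "tag": "RefSeq", "length": 5000},
    {"id": "NM_B", "tag": "MANE Select", "length": 3000},
]
primary = pick_primary_transcript(example)  # the MANE Select entry wins
```

Iterating over the tiers in order keeps the rule "higher tier always wins, size only breaks ties within a tier" explicit.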
Criteria nomenclature
These standards have adopted the use of two sets of criteria originally proposed by the ACMG/AMP: (1) pathogenic criteria include PVS1, PS1–4, PM1–6 and PP1–5; (2) benign criteria include BA1, BS1–4 and BP1–6. These criteria are divided into five categories, namely population frequency data, variant type and location, case-level data, functional and computational data, and renewable source [3]. The weight for each criterion was modified based on the latest evidence from the literature and professional judgment. A comprehensive summary of all criteria and their categories and weights is presented in Fig. 1.
Fig. 1 Adapted from Harrison et al. [3]
Criteria categories and their strength levels for the classification and interpretation of sequence variants. This scheme shows the five categories of all criteria, their direction (benign or pathogenic), and the corresponding strength. The third line shows the scaled odds of pathogenicity using the Bayesian statistical reasoning approach [5]. Pathogenic criteria include PVS1, PS1–4, PM1–6, and PP1–5, while benign criteria include BA1, BS1–4, and BP1–6. Each criterion is followed by possible weight modifications (stand-alone [A], very strong [VS], strong [S], moderate [M], or supporting [P]). Criteria marked with “*” are not used by our team.
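To make the "scaled odds of pathogenicity" concrete, the sketch below shows how strength levels can be combined into a posterior probability. It assumes the parameters commonly published for the Bayesian adaptation of the ACMG/AMP framework (a prior of 0.10 and odds of 350 for very strong evidence, with each lower level scaled by a square root); these values are not restated in this section and should be checked against reference [5]. Function names are hypothetical, and the stand-alone level [A] is not modelled.

```python
PRIOR = 0.10              # assumed prior probability of pathogenicity
ODDS_VERY_STRONG = 350.0  # assumed odds of pathogenicity for very strong evidence

# Each step down in strength takes the square root of the odds (assumption).
EXPONENT = {"P": 1 / 8, "M": 1 / 4, "S": 1 / 2, "VS": 1.0}


def odds(strength):
    """Odds of pathogenicity for one criterion at the given strength level."""
    return ODDS_VERY_STRONG ** EXPONENT[strength]


def posterior(pathogenic, benign=()):
    """Combine criteria (lists of strength codes) into a posterior probability."""
    combined = 1.0
    for s in pathogenic:
        combined *= odds(s)
    for s in benign:
        combined /= odds(s)  # benign evidence divides the combined odds
    return combined * PRIOR / ((combined - 1.0) * PRIOR + 1.0)


p = posterior(["VS", "S"])  # e.g. one very strong plus one strong pathogenic criterion
```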
Genes with specific criteria modulation and classification defined by variant curation expert panels (VCEPs)
The classification process for numerous genes involves specific rules curated by VCEPs. Our service strictly adheres to the criteria modulation and variant classification workflow for all genes curated by their respective VCEP. The lists of all genes curated by a VCEP and their gene-specific rules are compiled in the ClinGen CSpec Registry UI, a centralized database storing approved Criteria Specifications from VCEPs in a structured, machine-readable format. The latest version of the CSpec Registry can be accessed online at https://cspec.genome.network/cspec/ui/svi.
Cancer genes
The process of criteria modulation and variant classification for cancer genes has particular specifications that differ from our general rules. These differences are described below. The list of cancer genes to which these specifications apply is compiled in Additional file 1: Table S1. It is important to note that our internal specifications for cancer genes do not apply to genes curated by VCEPs, including the APC, ATM, CDH1, DICER1, PALB2, PIK3CA, PTEN, RUNX1, and TP53 genes. For these genes, our service strictly follows the criteria modulation and variant classification workflow curated by their respective VCEP, as stated above.
The Hospital Israelita Albert Einstein Standards for Constitutional Sequence Variants Classification: version 2023
Below, we outline our general rules for classifying sequence variants. It is worth mentioning again that these general rules do not apply to genes curated by VCEPs, as they follow their own specific workflow, which is not entirely covered in this article. Additionally, we provide specific modulations for cancer genes.
Our standards are based on the ACMG/AMP guidelines, and we have retained the original nomenclature for each criterion followed by its corresponding weight modulation. For clarity, each entry gives the original nomenclature of the criterion, followed by its definition (D), possible weight modulations (W), conflicts with other criteria (C), and the internal modifications and adaptations (MA) that have been incorporated into our standards.
Population frequency data
1A) BA1
D: Allele frequency is above 5%.
W: BA1_A
C: BA1, BS1 and PM2 are mutually exclusive.
MA: This represents the only criterion that can be used as a single piece of evidence to classify a variant as benign according to allele frequencies from the databases previously listed. The BA1_A criterion is employed if the variant is not included in the ClinGen exception list for BA1 (clinicalgenome.org/site/assets/files/3460/ba1_exception_list_07_30_2018.pdf) according to the following requirements:
1. For a subset of 61 genes with specific recommendations from ClinGen, the frequency thresholds are specified in Additional file 1: Table S2 [16].
2. For the remaining genes, BA1_A is applied for an allele frequency ≥ 0.05.
1B) BS1
D: Allele frequency is greater than expected for the disorder.
W: BS1_S
C: BS1, BA1 and PM2 are mutually exclusive.
MA: We have used a conservative approach for the general use of this criterion. The requirements for the use of BS1 and its corresponding weight are as follows:
1. For a subset of 61 genes with specific recommendations from ClinGen, the frequency thresholds are specified in Additional file 1: Table S2.
2. For the remaining genes, BS1_S is used if the allele frequency is ≥ 0.1% for AD or ≥ 1% for AR, AD/AR or XL (Table 2).
Table 2 Frequency thresholds for the BS1_S, PM2_P and PM2_M criteria
1C) PM2
D: Absent from controls or at an extremely low frequency if recessive
W: PM2_P; PM2_M
C: PM2, BA1 and BS1 are mutually exclusive
MA: The PM2 criterion is used according to the following requirements (also shown in Table 2):
1. For a subset of 61 genes with specific recommendations from ClinGen, the thresholds of PM2 and its corresponding weight are specified in Additional file 1: Table S2.
2. For cancer genes (Additional file 1: Table S1), PM2_M is used when the variant is absent in AD conditions or with a frequency < 0.004% in AR, AD/AR or XL conditions; PM2_P is used when PM2_M conditions are not met and the allele frequency is < 0.004% for AD or < 0.04% for AR, AD/AR or XL.
3. For the remaining genes, PM2_M is applied when the variant is absent from controls for AD conditions or at a frequency < 0.001% for AR, AD/AR or XL conditions; PM2_P is used when PM2_M conditions are not met and the allele frequency is < 0.001% for AD or < 0.01% for AR, AD/AR or XL (Table 2).
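Because BA1, BS1 and PM2 are mutually exclusive, the general frequency rules for these three criteria can be read as a single decision cascade. The sketch below is an illustration under stated assumptions: frequencies are passed as fractions (0.05 for 5%), the function name and parameters are hypothetical, the cancer-gene adjustment is applied to PM2 only, and neither the 61 genes with ClinGen-specific thresholds (Table S2) nor the BA1 exception list is modelled.

```python
def frequency_criterion(af, inheritance, cancer_gene=False):
    """Return 'BA1_A', 'BS1_S', 'PM2_M', 'PM2_P' or None.

    af          -- allele frequency as a fraction (0.0 means absent from controls)
    inheritance -- 'AD', 'AR', 'AD/AR' or 'XL'
    """
    dominant = inheritance == "AD"

    if af >= 0.05:                              # BA1_A: >= 5%
        return "BA1_A"
    if af >= (0.001 if dominant else 0.01):     # BS1_S: >= 0.1% (AD) or >= 1% (others)
        return "BS1_S"

    # PM2 base threshold: 0.004% for cancer genes, 0.001% for the remaining genes.
    base = 0.00004 if cancer_gene else 0.00001
    if dominant:
        if af == 0.0:                           # absent from controls
            return "PM2_M"
        if af < base:
            return "PM2_P"
    else:
        if af < base:
            return "PM2_M"
        if af < base * 10:                      # 0.04% or 0.01%
            return "PM2_P"
    return None                                 # no frequency criterion applies
```

Checking BA1 first, then BS1, then PM2 guarantees that at most one of the three criteria is returned for any variant.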
Variant type and location
2A) BP1
D: Missense variant in a gene for which primarily truncating variants are known to cause disease
W: Not applied, except for curated variants from a recognized VCEP
MA: Due to the challenges associated with determining the deleterious effects of rare missense variants that have not been subjected to functional studies or are not located within critical domains, our group has decided not to apply this criterion at present.
2B) BP3
D: In-frame deletions/insertions in a repetitive region without a known function.
W: BP3_P
C: BP3, PM4 and PVS1 are mutually exclusive.
MA: Due to the inherent difficulties in accurately defining repetitive regions and critical domains, particularly in genes with limited evaluation, our group has currently chosen not to routinely employ this criterion. However, in exceptional cases where compelling evidence is available, the use of this criterion may be considered.
2C) BP7
D: A synonymous (silent) variant for which splicing prediction algorithms predict no impact on the splice consensus sequence or the creation of a new splice site, and the nucleotide is not highly conserved
W: BP7_P
C: PP3, BP4 and BP7 are mutually exclusive
MA: Our current approach for applying the BP7 criterion involves utilizing a combination of a SpliceAI score ≤ 0.2 and a GERP score < 0.
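As a minimal sketch, the BP7 rule above reduces to a conjunction of two score cut-offs for a synonymous variant. The function and parameter names are hypothetical; the scores are assumed to have been computed beforehand by the respective tools.

```python
def bp7_applies(is_synonymous, spliceai_score, gerp_score):
    """BP7_P: synonymous variant with SpliceAI <= 0.2 and GERP < 0."""
    return is_synonymous and spliceai_score <= 0.2 and gerp_score < 0


bp7 = bp7_applies(True, 0.05, -1.2)  # low splice impact, non-conserved position
```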
2D) PP2
D: Missense variant in a gene that has a low rate of benign missense variation and in which missense variants are a common mechanism of disease
W: PP2_P
C: PM1 and PP2 may overlap, and their simultaneous use requires caution
MA: The PP2_P criterion is used when both of the following requirements are met, based on the gnomAD missense constraint Z score:
1. The gene has at least three previously reported pathogenic missense variants.
2. The gnomAD missense constraint Z score for the region harboring the variant is > 3.09 [25].
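The two PP2_P requirements can be expressed as a single check. This is only a sketch; the function and parameter names are hypothetical, and the count of reported pathogenic missense variants is assumed to come from a prior curation step.

```python
def pp2_applies(n_pathogenic_missense, missense_z):
    """PP2_P: at least three reported pathogenic missense variants in the gene
    and a gnomAD missense constraint Z score above 3.09."""
    return n_pathogenic_missense >= 3 and missense_z > 3.09
```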
2E) PM1
D: Located in a mutational hot spot and/or critical and well-established functional domain (e.g., active site of an enzyme) without benign variation
W: PM1_M; PM1_S
C: PM1 and PP2 may overlap, and their simultaneous use requires caution; PM1 and PM5 may overlap, and their simultaneous use requires caution
MA: The PM1 criterion is used according to the following requirements, based on the DECIPHER database:
1. PM1_S for cysteine substitutions that result in an uneven number of cysteine residues within an EGF-like repeat in NOTCH3.
2. PM1_S for glycine substitutions in COL1A1 or other collagen genes.
3. PM1_S for cysteine or histidine substitutions in C2H2 zinc fingers (such as GLI3).
4. PM1_M for substitutions within a region with DECIPHER missense constraint < 0.4 for the remaining genes [22].
5. PM1_M for variants in a region with sufficient evidence from the literature supporting it as a hotspot or an important functional domain for the gene.
2F) PM4
D: Protein length changes due to in-frame deletions/insertions in a nonrepeat region or stop-loss variants.
W: PM4_P; PM4_M
C: PM4, BP3 and PVS1 are mutually exclusive; variants should meet PM2_P or PM2_M for PM4 to be applied at any level
MA: The PM4 rule is not applicable to repetitive regions, defined as having more than 3 identical sequences (bases or sets of bases). Based on the principle that larger deletions/insertions in nonrepeating regions offer stronger evidence for pathogenicity, we have made the following modifications:
1. PM4_P is employed for cases involving the insertion or deletion of 1 or 2 amino acids.
2. PM4_M is employed for insertions or deletions of 3 or more amino acids.
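The PM4 modulation above can be sketched as a small function. The names are hypothetical; whether a region counts as repetitive and whether PM2 is met are assumed to be decided upstream, and stop-loss variants are not modelled because no size-based weight is given for them here.

```python
def pm4_weight(n_amino_acids, in_repetitive_region, pm2_met):
    """Return 'PM4_P', 'PM4_M' or None for an in-frame insertion/deletion.

    n_amino_acids -- number of amino acids inserted or deleted
    """
    if in_repetitive_region or not pm2_met or n_amino_acids < 1:
        return None                     # PM4 is not applicable
    return "PM4_P" if n_amino_acids <= 2 else "PM4_M"
```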
2G) PM5
D: Novel missense change at an amino acid residue where a different missense change determined to be pathogenic has been observed before.
W: PM5_P; PM5_M
C: PM1 and PM5 may overlap, and their simultaneous use requires caution. PM5 and PS1 are mutually exclusive
MA: PM5 is only used if there is sufficient supporting evidence that the molecular mechanism of pathogenicity is solely due to the missense effect; therefore, caution is recommended for variants with functional studies indicating aberrant splicing or a deleterious splicing prediction (SpliceAI delta score ≥ 0.8). The PM5 criterion is applied under the following conditions:
1. PM5_P is applied when a different missense variant in the same codon has been independently classified as likely pathogenic in only one previous report.
2. PM5_M is applied when a different missense variant in the same codon has been independently classified as pathogenic in at least one previous report or as likely pathogenic in two or more independent reports.
2H) PS1
D: Same amino acid change (referred to as equivalent missense) as a previously established pathogenic variant regardless of nucleotide change
W: PS1_M; PS1_S
C: PS1 and PM5 are mutually exclusive
MA: PS1 is used only if there is sufficient supporting evidence that the molecular mechanism of pathogenicity is solely due to the missense effect; therefore, caution is recommended for variants with functional studies indicating aberrant splicing or a deleterious splicing prediction (SpliceAI delta score ≥ 0.8). The PS1 criterion is used according to the following conditions:
1. PS1_M is applied if the equivalent missense variant has been independently classified as likely pathogenic in only one previous case.
2. PS1_S is applied if the equivalent missense variant has been independently classified as pathogenic in at least one previous case or likely pathogenic in two or more independent cases.
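PS1 and PM5 share the same counting logic (one likely pathogenic report gives the lower weight; one pathogenic or two likely pathogenic reports give the higher weight), so both can be sketched with one helper. All names are hypothetical, and the splicing flag only signals that caution is recommended; it does not decide anything by itself.

```python
def _weight_from_reports(n_pathogenic, n_likely_pathogenic, lower, upper):
    """Shared rule: upper weight for >= 1 P or >= 2 LP reports, lower for exactly 1 LP."""
    if n_pathogenic >= 1 or n_likely_pathogenic >= 2:
        return upper
    if n_likely_pathogenic == 1:
        return lower
    return None


def ps1_weight(n_pathogenic, n_likely_pathogenic):
    """Equivalent missense change (same amino acid substitution)."""
    return _weight_from_reports(n_pathogenic, n_likely_pathogenic, "PS1_M", "PS1_S")


def pm5_weight(n_pathogenic, n_likely_pathogenic):
    """Different missense change at the same codon."""
    return _weight_from_reports(n_pathogenic, n_likely_pathogenic, "PM5_P", "PM5_M")


def needs_splicing_review(spliceai_delta):
    """Flag variants whose effect may not be purely missense (SpliceAI delta >= 0.8)."""
    return spliceai_delta >= 0.8
```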
2I) PVS1
D: Null variant (nonsense, frameshift, canonical splice sites, initiation codon, single or multiexon deletion) in a gene where loss of function is a known mechanism of disease
W: PVS1_P; PVS1_M; PVS1_S; PVS1_VS
MA: The ClinGen haploinsufficiency score has been adopted as the primary tool for determining whether haploinsufficiency is the underlying disease mechanism for each gene [26]. In cases where ClinGen curation is unavailable, alternative sources such as the ExAC/gnomAD probability of LoF intolerance score (pLI > 0.9), observed/expected score (o/e < 0.35), OMIM, and available literature are utilized to assess the disease mechanism [25]. To guide the application of the PVS1 criterion, we adopted the decision tree guidelines proposed by Tayoun et al. [8] for appropriate modulation.
Case-level data
3A) BS2
D: Observed in a healthy adult individual for a recessive (homozygous), dominant (heterozygous), or X-linked (hemizygous) disorder, with full penetrance expected at an early age.
W: BS2_S
MA: The BS2 criterion is employed according to the following specifications:
1. BS2_S is used for in-house situations when individuals have a positive genotype but no corresponding phenotype for high-penetrance, early-onset conditions (such as cases with variants believed to be de novo in the proband but confirmed in healthy parents) [27].
2. For cancer genes (Additional file 1: Table S1), BS2_S is used when the variant is found in homozygosity in at least one individual from control databases.
3. For genes associated with rare AR or AD/AR diseases with complete penetrance, BS2_S is used when the variant is found in homozygosity in at least one individual from control databases.
4. For genes associated exclusively with high-penetrance, pediatric-onset AD conditions (only genes included in the “green list” of the Severe Pediatric Disorders Database, available at panelapp.genomicsengland.co.uk/panels/921/), BS2_S is employed if the variant is present in at least five alleles.
5. For genes associated with XLR or XL diseases, BS2_S is used when the variant is present in hemizygosity or homozygosity in at least one control subject.
6. For genes associated exclusively with XLD conditions, BS2_S is used when the variant is found in heterozygosity in at least 5 control subjects.
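The control-database rules for BS2_S (items 2 to 6 above) can be sketched as a lookup on a gene category. The category labels, class and field names below are hypothetical, and assigning a gene to a category (for example, "green list" membership) is assumed to happen elsewhere. The in-house rule in item 1 depends on clinical information and is not modelled.

```python
from dataclasses import dataclass


@dataclass
class ControlCounts:
    """Counts of a variant in control databases (hypothetical structure)."""
    alleles: int = 0
    heterozygotes: int = 0
    homozygotes: int = 0
    hemizygotes: int = 0


def bs2_from_controls(gene_category, counts):
    """Return 'BS2_S' if the control-database rule for the category is met."""
    if gene_category in {"cancer", "AR_complete_penetrance"}:
        met = counts.homozygotes >= 1
    elif gene_category == "AD_pediatric_green_list":
        met = counts.alleles >= 5
    elif gene_category == "XL_or_XLR":
        met = counts.hemizygotes >= 1 or counts.homozygotes >= 1
    elif gene_category == "XLD_only":
        met = counts.heterozygotes >= 5
    else:
        met = False                      # no rule defined for this category
    return "BS2_S" if met else None
```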
3B) BS4
D: Lack of segregation in affected members of a family.
W: BS4_S
C: BS4 and PP1 are mutually exclusive.
MA: The BS4_S criterion is utilized specifically for genes linked to high-penetrance, early-onset conditions where individuals possess a positive phenotype but lack a corresponding genotype.
3C) BP2
D: Observed in trans with a pathogenic variant for a fully penetrant dominant gene/disorder or observed in cis with a pathogenic variant in any inheritance pattern
W: Not applied, except for curated variants from a recognized VCEP
MA: Due to the findings of recent molecular studies, which consistently indicate that phenotypes associated with recurrent variants can have broader manifestations or atypical findings, our group has made the decision not to utilize the BP2 criterion at present. Caution is advised when considering the use of this criterion that relies solely on the presence of a rare variant in cis or trans with a pathogenic variant, as we believe that it should not be considered as evidence for benignity.
3D) BP5
D: Variant found in a patient with an alternate molecular cause for the disease.
W: BP5_P
MA: The BP5_P criterion may be employed exclusively when analyzing variants linked to AD highly penetrant, childhood-onset diseases. Specifically, it is only applied in cases where there is a clear alternate genetic cause (case with an alternate primary finding) for the observed phenotype and the variant is deemed unlikely to contribute to or modify the expressivity of the primary finding.
3E) PP1
D: Cosegregation with disease in multiple affected family members in a gene definitively known to cause the disease
W: PP1_P; PP1_M; PP1_S
C: PP1 and BS4 are mutually exclusive. PP1, PS4 and PM3 may overlap, and their simultaneous use requires caution
MA: This criterion is used exclusively to count meioses of multiple affected family members (it may include more than one family if more than one affected individual is reported for every family). The number of meioses is used to modulate the weight, as follows:
3F) PP4
D: Patient’s phenotype or family history is highly specific for a disease with a single genetic etiology.
W: PP4_P; PP4_S
MA: To apply the PP4 criterion, the following conditions must be met: a) the test performed must be comprehensive, encompassing all relevant genes and molecular mechanisms, including copy number analysis, that could potentially contribute to the observed phenotype; b) the variant should be rare in the absence of other candidate disease-causing variants; and c) the family history should align with the expected pattern of inheritance. The PP4 criterion is then utilized in the following manner:
1. PP4_P is applied when all these circumstances are satisfied.
2. If there is additional clinical evidence, such as pathognomonic muscle biopsy, biochemistry, or an 'exclusive' clinical diagnosis, the criterion can be upgraded to PP4_S.
3G) PM3
D: For recessive disorders, detected in trans with a pathogenic variant.
W: PM3_P; PM3_M; PM3_S; PM3_VS
C: PM3, PS4 and PP1 may overlap, and their simultaneous use requires caution.
MA: PM3 is the primary criterion used to count probands for AR conditions and has been modified to a scoring system (SS) that incorporates clinical reports from the literature to modulate the PM3 weight. For every unrelated affected individual (possibly including the assessed proband), a score of 1.0 is applied when the variant is observed in trans with a known pathogenic or likely pathogenic variant, a score of 0.5 if the phase is unknown, and a score of 0.5 if the variant is found in homozygosity (downgraded to 0.25 if the parents are consanguineous). PM3 may be used in the following situations (Additional file 1: Table S3):
1. PM3_P for SS ≥ 0.5.
2. PM3_M for SS ≥ 1.0.
3. PM3_S for SS ≥ 2.0.
4. PM3_VS for SS ≥ 4.0.
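The PM3 scoring system can be illustrated with a short sketch that sums the points per unrelated affected individual and maps the total to a weight. The observation labels and function name are hypothetical; the point values and thresholds are those given above.

```python
# Points per unrelated affected individual, as described in the text.
PM3_POINTS = {
    "in_trans": 1.0,                     # in trans with a known P/LP variant
    "phase_unknown": 0.5,
    "homozygous": 0.5,
    "homozygous_consanguineous": 0.25,   # downgraded for consanguineous parents
}

PM3_THRESHOLDS = [(4.0, "PM3_VS"), (2.0, "PM3_S"), (1.0, "PM3_M"), (0.5, "PM3_P")]


def pm3_weight(observations):
    """observations: one label per unrelated affected individual."""
    score = sum(PM3_POINTS[o] for o in observations)
    for threshold, weight in PM3_THRESHOLDS:   # highest threshold first
        if score >= threshold:
            return weight
    return None


w = pm3_weight(["in_trans", "phase_unknown"])  # SS = 1.5
```

For example, one proband in trans plus one with unknown phase gives SS = 1.5, which reaches PM3_M but not PM3_S.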
3H) PM6
D: Assumed to be de novo, but without confirmation of paternity and maternity.
W: PM6_P; PM6_M; PM6_S; PM6_VS
C: PM6 and PS2 may overlap, and their simultaneous use requires caution.
MA: PM6 is the criterion used to count assumed de novo events and has been modified to an SS that modulates its weight, according to the quantitative approach proposed by ClinGen. For every unrelated affected individual (possibly including the assessed proband), a score of 1.0 is applied when the variant is assumed to be de novo in a phenotype highly specific for the gene, a score of 0.5 if the phenotype is consistent with the gene but is not highly specific, a score of 0.25 if the phenotype is consistent with the gene but is not highly specific and has high genetic heterogeneity, and a score of zero if the phenotype is not consistent with the gene. PM6 may be used in the following situations (Additional file 1: Table S4):
1. PM6_P for SS ≥ 0.5.
2. PM6_M for SS ≥ 1.
3. PM6_S for SS ≥ 2.
4. PM6_VS for SS ≥ 4.
3I) PS2
D: De novo (both maternity and paternity confirmed) in a patient with the disease and no family history.
W: PS2_P; PS2_M; PS2_S; PS2_VS
C: PS2 and PM6 may overlap, and their simultaneous use requires caution.
MA: PS2 is the criterion used to count confirmed de novo events and has been modified to an SS that modulates its weight, according to the quantitative approach proposed by ClinGen. For every unrelated affected individual (possibly including the assessed proband), a score of 2.0 is applied when the variant is confirmed to be de novo in a phenotype highly specific for the gene, a score of 1.0 if the phenotype is consistent with the gene but is not highly specific, a score of 0.50 if the phenotype is consistent with the gene but is not highly specific and has high genetic heterogeneity, and a score of zero if the phenotype is not consistent with the gene. PS2 may be used in the following situations (Additional file 1: Table S5):
1. PS2_P for SS ≥ 0.5.
2. PS2_M for SS ≥ 1.
3. PS2_S for SS ≥ 2.
4. PS2_VS for SS ≥ 4.
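The two de novo criteria use the same thresholds and differ only in the points awarded per case: PS2 scores twice as much as PM6 for each phenotype category. The sketch below keeps the two criteria separate, as in the text; the phenotype labels and function name are hypothetical.

```python
# Points per unrelated affected individual, by phenotype consistency.
DE_NOVO_POINTS = {
    "PS2": {"highly_specific": 2.0, "consistent": 1.0,
            "consistent_heterogeneous": 0.5, "not_consistent": 0.0},
    "PM6": {"highly_specific": 1.0, "consistent": 0.5,
            "consistent_heterogeneous": 0.25, "not_consistent": 0.0},
}

THRESHOLDS = [(4.0, "VS"), (2.0, "S"), (1.0, "M"), (0.5, "P")]


def de_novo_weight(criterion, phenotypes):
    """criterion: 'PS2' (confirmed de novo) or 'PM6' (assumed de novo).
    phenotypes: one phenotype label per unrelated affected individual."""
    score = sum(DE_NOVO_POINTS[criterion][p] for p in phenotypes)
    for threshold, level in THRESHOLDS:        # highest threshold first
        if score >= threshold:
            return f"{criterion}_{level}"
    return None
```

A single confirmed de novo case with a highly specific phenotype therefore reaches PS2_S, whereas the same case without parental confirmation reaches only PM6_M.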
3J) PS4
D: The prevalence of the variant in affected individuals is significantly increased compared to the prevalence in controls.
W: PS4_P; PS4_M; PS4_S
C: PS4, PM3 and PP1 may overlap, and their simultaneous use requires caution.
MA: This criterion has been modified to incorporate clinical reports from the literature. It may be used in the following situations (Additional file 1: Table S6):
1. PS4_S is applied when case-control studies demonstrate a statistically significant increased frequency of a variant in affected individuals compared to controls (odds ratio or relative risk > 5 and confidence interval not including 1).
2. PS4_S is applied for known founder variants (pathogenic variant observed at high frequency in a specific population).
3. PS4_S is applied for variants curated as pathogenic by recognized ClinGen expert panels.
4. PS4 is the criterion for counting probands for AD conditions and may be applied for very rare variants that fulfill the PM2_M or PM2_P criteria; are associated with dominant conditions; and are observed in previously described, unrelated patients with a confirmed phenotype with the following weights:
4a) For cancer genes (Additional file 1: Table S1):
4a1) PS4_P for 2-5 probands.
4a2) PS4_M for 6-9 probands.
4a3) PS4_S for ≥ 10 probands.
4b) For the remaining genes:
4b1) PS4_P for 1-2 probands.
4b2) PS4_M for 3-4 probands.
4b3) PS4_S for ≥ 5 probands.
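The proband-counting branch of PS4 (item 4 above) can be sketched as follows. The function and parameter names are hypothetical; the case-control, founder-variant and expert-panel routes to PS4_S (items 1 to 3) are not modelled.

```python
def ps4_proband_weight(n_probands, pm2_met, cancer_gene=False):
    """Weight from the number of unrelated affected probands (AD conditions).

    Requires that the variant already meets PM2_M or PM2_P.
    """
    if not pm2_met:
        return None
    # (minimum number of probands, weight), highest tier first.
    if cancer_gene:
        tiers = [(10, "PS4_S"), (6, "PS4_M"), (2, "PS4_P")]
    else:
        tiers = [(5, "PS4_S"), (3, "PS4_M"), (1, "PS4_P")]
    for minimum, weight in tiers:
        if n_probands >= minimum:
            return weight
    return None
```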
Functional and computational data
4A) BS3
D: Well-established in vitro or in vivo functional studies show no damaging effect on protein function or splicing.
W: BS3_P; BS3_M; BS3_S
C: BS3 and PS3 are mutually exclusive.
MA: For BS3, we adhered to the recommendations and structured approach proposed by Brnich et al. [9] for the assessment of functional assays in variant interpretation and the use of different levels of strength according to assay validation.
4B) PS3
D: Well-established in vitro or in vivo functional studies supportive of a damaging effect on the gene or gene product.