Investigating DNA methylation as a potential mediator between pigmentation genes, pigmentary traits and skin cancer

1 INTRODUCTION

The incidence of skin cancer, comprising malignant melanoma, basal cell carcinoma (BCC) and squamous cell carcinoma (SCC), has increased rapidly in the past decades (Cahoon et al., 2015; Hu et al., 2009; Rees et al., 2014). Melanoma is the most aggressive of skin cancers although it has a low incidence, while non-melanoma skin cancer (NMSC) shows high incidence yet considerably lower mortality rates compared to melanoma (Apalla et al., 2017). To date, recognized risk factors for melanoma and NMSC include fair skin, light-coloured eyes, red hair, freckles and melanocytic naevi as well as genetic variants, some of which underlie these skin pigmentation and sun sensitivity phenotypes (Gordon, 2013). Even though there are missense and nonsense polymorphisms in pigmentation genes strongly associated with skin cancer, particularly in the gene MC1R (Nasti & Timares, 2015), it is not fully understood how non-coding variation in these genes relates to malignancy.

DNA methylation (DNAm) is an epigenetic modification with a prospective role in cancer aetiology. The association of global whole-blood DNA hypomethylation with cancer is well known and has also been described for melanoma (Cappetta et al., 2015; Shen et al., 2017). In addition, recent studies have shown the presence of hypomethylation in skin biopsy samples that were exposed to sunlight or artificial ultraviolet radiation (Grönniger et al., 2010; Holzscheck et al., 2020; Vandiver et al., 2015). Inter-individual DNAm variation at specific sites, measured in peripheral blood, has been uncovered as a predictor of a number of complex trait risk factors as well as all-cause mortality (McCartney et al., 2018). DNAm has the potential to be used as a molecular biomarker to diagnose disease or to assess prognosis in those affected by disease and is a promising candidate leading towards the realization of personalized medicine (Nikolouzakis et al., 2020).

This study investigated whether DNAm plays a role in the relationship between genetic variants, pigmentation-related skin cancer risk factors and skin cancer. First, we examined the effect of genetic variation on DNAm levels in peripheral blood across 10 regions robustly associated with pigmentation traits and skin cancer, in 36 cohorts of European descent. We then analysed the association of pigmentation SNP-associated DNAm sites with sun exposure and pigmentation phenotypes in participants of the Avon Longitudinal Study of Parents and Children (ALSPAC). Finally, using summary data-based Mendelian randomization (SMR), we explored whether SNP-related DNAm was likely to causally underlie the expression of genes associated with pigmentation phenotypes and skin cancer. The purpose of this work was to carry out an integrative analysis to evaluate the joint contribution of genetic and epigenetic risk factors to skin cancer susceptibility. As this is the first instance of such an analysis involving genes associated with pigmentation traits and skin cancer, the clinical relevance of our findings will depend on subsequent replication and further investigation of the genes and pathways that we described here.

2 MATERIALS AND METHODS

The different stages of this study are depicted in Figure 1.

FIGURE 1. Stages of analysis implemented and datasets used in this study. From the 391 unique DNA methylation (DNAm) sites associated with pigmentation SNPs, 25 were selected for further analysis as depicted. GoDMC, Genetics of DNA Methylation Consortium; ALSPAC, Avon Longitudinal Study of Parents and Children; CIT, causal inference test; SMR, summary data-based Mendelian randomization; eQTL browser, blood eQTL browser; CAGE, Cap Analysis of Gene Expression; GTEx, Genotype-Tissue Expression consortium; UKB, UK Biobank.

2.1 Identification of DNAm sites associated with pigmentation SNPs

The Genetics of DNA Methylation Consortium (GoDMC) was created to study the genetic basis of DNAm variation and to bring together resources and researchers with expertise in epigenetics (www.godmc.org.uk). One of its aims was to carry out a meta-GWAS of DNAm, as measured on Illumina 450k or EPIC BeadChips. Results from a meta-GWAS involving 36 European cohorts (N = 27,750) were used here. We provide a brief description of the analysis in Methods S1.

Using GoDMC data, we searched for DNAm sites that were strongly associated (p < 1 × 10⁻⁵) with well-known pigmentation-related SNPs previously identified via genome-wide association studies (GWAS) or candidate gene studies (reviewed in Pavan & Sturm, 2019), located within the following genes or their surrounding regions: ASIP (rs1015362, rs4911414, rs619865), BNC2 (rs2153271), IRF4 (rs12203592, rs12210050), HERC2 (rs12913832), MC1R (rs1110400, rs11547464, rs11648785, rs1805005, rs1805006, rs1805007, rs1805008, rs1805009, rs2228479, rs258322, rs4785763, rs885479), OCA2 (rs1800401, rs1800407), SLC24A4 (rs12896399), SLC24A5 (rs1426654), SLC45A2 (rs16891982, rs28777) and TYR (rs1042602, rs1393350).

We uncovered 874 strong SNP-DNAm associations across eight genomic regions, which included 391 unique DNAm sites. No data were available for SNPs in SLC24A5 and SLC45A2, nor for SNPs rs1110400, rs11547464, rs1805006 and rs1805009 in MC1R and rs1393350 in TYR. Some of these polymorphisms have low minor allele frequencies (MAF), and GoDMC only included SNPs with MAF > 1%. The other SNPs may not have been present on the genotyping platforms, may not have been successfully imputed, or may not have been strongly associated with any DNAm site (p > 1 × 10⁻⁵).
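The filtering step described above can be sketched as follows. This is a minimal illustration, not the GoDMC pipeline: the record layout, the identifiers (`snp_A`, `cg_site_1`, and so on), the p-values and the function name are all hypothetical placeholders, and only the p < 1 × 10⁻⁵ threshold is taken from the text.

```python
P_THRESHOLD = 1e-5  # association threshold stated in the text


def strong_associations(records, p_threshold=P_THRESHOLD):
    """Keep SNP-DNAm associations whose p-value is below the threshold."""
    return [r for r in records if r["p"] < p_threshold]


# Hypothetical toy records (not real GoDMC results).
toy_records = [
    {"snp": "snp_A", "site": "cg_site_1", "p": 3e-12},
    {"snp": "snp_B", "site": "cg_site_1", "p": 4e-8},
    {"snp": "snp_A", "site": "cg_site_2", "p": 2e-3},
    {"snp": "snp_C", "site": "cg_site_3", "p": 9e-6},
]

strong = strong_associations(toy_records)
# Several SNPs can map to the same DNAm site, so sites are de-duplicated.
unique_sites = {r["site"] for r in strong}
print(len(strong), "strong associations;", len(unique_sites), "unique DNAm sites")
```

Counting associations and unique sites separately mirrors the distinction the text draws between the number of SNP-DNAm pairs and the number of distinct DNAm sites.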

Variance in DNAm explained by SNPs was estimated as 2 × β² × MAF × (1 − MAF), where β is the effect size and MAF is the minor allele frequency.
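As a worked example of this formula, the sketch below computes the variance explained for a single SNP. The function name and the input values are illustrative assumptions. The result can be read as a proportion of variance only if β is expressed in standard-deviation units of DNAm, which this section does not state.

```python
def variance_explained(beta, maf):
    """Variance explained by one SNP: 2 * beta^2 * MAF * (1 - MAF).

    beta: per-allele effect size (assumed here to be in SD units of DNAm).
    maf:  allele frequency; the expression is symmetric in maf and 1 - maf,
          so either allele's frequency gives the same answer.
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return 2.0 * beta ** 2 * maf * (1.0 - maf)


# Hypothetical inputs: beta = 0.5, MAF = 0.25
# 2 * 0.25 * 0.25 * 0.75 = 0.09375
print(variance_explained(0.5, 0.25))
```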

Linkage disequilibrium (LD) r² values were obtained with LDlink (https://ldlink.nci.nih.gov/).

2.2 DNA methylation and pigmentation/sun exposure phenotypes in the Avon Longitudinal Study of Parents and Children (ALSPAC)

A description of the cohort and information on the collection of DNAm data in ALSPAC can be found in Methods S1.

2.2.1 Regression analysis

The pigmentation/sun exposure phenotypes evaluated were skin reflectance, freckles, moles, sunburning, tanning ability, hair colour and eye colour. We examined DNAm at the time point nearest to when each phenotype was measured. We assessed the association of cord blood DNAm with pigmentation traits measured before age 7 [i.e. skin reflectance (49 months), freckles (49 and/or 61 months), red hair (15 and 54 months), eye colour (54 months), tanning ability (69 months)]; the association of DNAm in childhood (~7 years old) with sunburning from birth to age 12 and with total number of moles at 15 years old; and the association of DNAm in adolescence (15–17 years old) with total number of moles at 15 years old, red hair at 18 years old and tanning ability at ~25 years old. Most of these phenotypes have been described in earlier work (Bonilla et al., 2014). Data on hair colour and tanning ability in young adults were collected in a recent ALSPAC questionnaire (Life@25+), using the same scales employed in past questionnaires (Bonilla et al., 2014).

We tested the association of SNP-associated DNAm sites with pigmentation and sun exposure phenotypes using t tests, one-way ANOVA, and linear and logistic regression models. Regressions were adjusted for age, sex and the first 10 genetic principal components to account for population stratification. Pairwise correlation between DNAm sites was examined in ALSPAC children at age 7. All analyses were carried out with the statistical package Stata v15.

2.2.2 Mediation analysis

The causal inference test (CIT) approach (Millstein et al., 2009), implemented in the R package ‘cit’, was employed to investigate the causal direction between pigmentation SNPs, DNAm and red hair colour. Regressions were adjusted for sex, age and the top 10 genetic principal components.

2.3 Summary data-based Mendelian randomization (SMR)

To investigate whether DNAm at selected sites could be causally related to gene expression and to phenotypes such as pigmentation characteristics and skin cancer, we carried out a Mendelian randomization analysis with colocalization, implemented in the summary data-based Mendelian randomization (SMR) method (Zhu et al., 2016). We used the platform Complex Trait Genetics Virtual Lab (CTG-VL, https://genoma.io/; Cuellar-Partida et al., 2019) to run the statistical package SMR (Lloyd-Jones et al., 2017; Qi et al., 2018; Zhu et al., 2016). This method uses summary data from GWAS, mQTL or eQTL studies to distinguish causal or pleiotropic associations between DNAm and gene expression, or between either of these and a phenotype, from a situation where the traits are caused by different genetic variants that are in strong LD (see figure 1b of Zhu et al., 2016). If the former is true, that is, if a single genetic variant underlies DNAm and the other tested phenotype, the traits are said to be colocalized. The different SMR analyses run are described below. Unless otherwise reported, the p-value thresholds for the SMR analyses were as follows: p_SMR < 5 × 10⁻⁷/5 × 10⁻⁶ and p_HEIDI ≥ 0.05.
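The decision rule implied by these thresholds can be sketched as below. This is an illustrative filter, not part of the SMR software: the function name and the example p-values are hypothetical. Because this section lists two p_SMR thresholds without saying which analysis each applies to, the SMR threshold is left as a parameter, and the default of 5 × 10⁻⁷ is simply an assumption.

```python
def passes_smr_heidi(p_smr, p_heidi, p_smr_max=5e-7, p_heidi_min=0.05):
    """Return True when a probe meets both criteria.

    p_smr < p_smr_max      : evidence of association in the SMR test.
    p_heidi >= p_heidi_min : HEIDI test does not reject a single shared
                             variant, i.e. no evidence that the signal is
                             driven by distinct variants in LD.
    """
    return p_smr < p_smr_max and p_heidi >= p_heidi_min


# Hypothetical results for three probes: (p_SMR, p_HEIDI)
examples = {
    "probe_1": (1e-8, 0.20),  # passes both criteria
    "probe_2": (1e-8, 0.01),  # fails HEIDI (heterogeneity detected)
    "probe_3": (1e-6, 0.20),  # fails the stricter SMR threshold
}
for name, (p_smr, p_heidi) in examples.items():
    print(name, passes_smr_heidi(p_smr, p_heidi))
```

Note that the HEIDI criterion runs in the opposite direction to the SMR criterion: a small p_HEIDI is evidence against colocalization, so probes are retained when p_HEIDI is at or above the cut-off.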

2.3.1 SMR of whole-blood DNA methylation and gene expression

Peripheral blood DNAm summary data were extracted from McRae et al. (2018), while summary data on gene expression in blood were obtained from three different sources: the Genotype-Tissue Expression consortium (GTEx, https://gtexportal.org/home/), Cap Analysis of Gene Expression (CAGE; Lloyd-Jones et al., 2017), and the blood eQTL browser (https://genenetwork.nl/bloodeqtlbrowser/; Westra et al., 2013). All expression datasets used the hg19 genome assembly.

2.3.2 SMR of gene expression in skin, pigmentation/sun exposure traits and skin cancer

We ran SMR to assess the association of gene expression in sun-exposed and sun-unexposed skin, obtained from GTEx, with pigmentary and skin cancer phenotypes.

Data were obtained by CTG-VL from the UK Biobank (UKB), and trait definitions can be found on the UK Biobank website (https://www.ukbiobank.ac.uk/). Pigmentation characteristics analysed were skin colour (UKB ID#1717, N = 356,530), ease of skin tanning (#1727, N = 353,697), childhood sunburn occasions (#1737, N = 269,734), and black (N = 15,809) and blonde (N = 41,178) hair colour (#1747, N = 360,270). We also considered diagnosed malignant melanoma (#ICD10:C43, N = 1,672, total N = 361,194), self-reported malignant melanoma (N = 2,898), self-reported basal cell carcinoma (N = 3,441), self-reported squamous cell carcinoma (N = 449; all #20001, total N = 361,141) and having melanocytic naevi (#ICD10:D22, N = 3,501 cases, total N = 361,194).

2.3.3 SMR of whole-blood DNA methylation, pigmentation/sun exposure traits and skin cancer

We also investigated the potential colocalization of genetic variants underlying DNAm with pigmentation traits and skin cancer. DNAm utilized in this analysis was measured in blood (McRae et al., 2018).

2.4 Heritability of the DNA methylation sites associated with pigmentation SNPs

We checked the heritability of DNAm sites using the resource provided by the Complex Disease Epigenetics Group (www.epigenomicslab.com/online-data-resources/; Hannon, Knox, et al., 2018) that employs twin data to report the variance in whole-blood DNAm explained by an additive genetic component, a shared environmental component and a unique environmental component.

3 RESULTS

3.1 Identification of DNA methylation sites associated with pigmentation SNPs

We summarized available functional information on the analysed SNPs in Table 1.

TABLE 1. Pigmentation SNPs investigated in GoDMC in association with DNA methylation (DNAm) changes

SNP | Chromosome | Position | Gene (a) | Effect allele (b) | Other allele | EAF (c) | DNAm sites associated | Other information (d)
--- | --- | --- | --- | --- | --- | --- | --- | ---
rs12203592 | 6p25.3 | 396321 | IRF4 | T | C | 0.188 | 3 | eQTL (sun-exposed skin)
rs12210050 | 6p25.3 | 475489 | IRF4 | T | C | 0.173 | 7 |
rs2153271 | 9p22.2 | 16864521 | BNC2 | T | C | 0.592 | 4 | eQTL (whole blood)
rs1042602 | 11q14.3 | 89011046 | TYR | A | C | 0.262 | 5 | NP_000363.1:p.Ser192Tyr
rs12896399 | 14q32.12 | 92773663 | SLC24A4 | T | G | 0.468 | 6 |
rs1800407 | 15q13.1 | 28230318 | OCA2 | T | C | 0.081 | 1 | NP_000266.2:p.Arg419Gln/NP_001287913.1:p.Arg395Gln
rs1800401 | 15q13.1 | 28260053 | OCA2 | G | A | 0.946 | 2 | NP_000266.2:p.Arg305Trp
rs12913832 | 15q13.1 | 28365618 | HERC2 | G | A | 0.744 | 3 | eQTL (whole blood)
rs258322 | 16q24.3 | 89755903 | CDK10 | A | G | 0.118 | 98 | eQTL (sun-exposed and sun-unexposed skin and whole blood)
rs1805005 | 16q24.3 | 89985844 | MC1R | T | G | 0.121 | 26 | rhc (NP_002377.4:p.Val60Leu)
rs2228479 | 16q24.3 | 89985940 | MC1R | A | G | 0.089 | 163 | rhc (NP_002377.4:p.Val92Met)/eQTL (sun-exposed and sun-unexposed skin and whole blood)
rs1805007 | 16q24.3 | 89986117 | MC1R | T | C | 0.086 | 97 | RHC (NP_002377.4:p.Arg151Cys)/eQTL (sun-exposed and sun-unexposed skin and whole blood)
rs1805008 | 16q24.3 | 89986144 | MC1R | T | C | 0.083 | 97 | RHC (NP_002377.4:p.Arg160Trp)/eQTL (sun-exposed and sun-unexposed skin and whole blood)
rs885479 | 16q24.3 | 89986154 | MC1R | A | G | 0.105 | 110 | rhc (NP_002377.4:p.Arg163Gln)/eQTL (sun-exposed skin)
rs4785763 | 16q24.3 | 90066936 | AFG3L1P | A | C | 0.332 | 104 | eQTL/eQTL (sun-exposed and sun-unexposed skin and whole blood)
rs11648785 | 16q24.3 | 90084561 | DBNDD1 | C | T | 0.689 | 67 |
rs4911414 | 20q11.22 | 32729444 | ASIP | T | G | 0.342 | 24 |
rs1015362 | 20q11.22 | 32738612 | ASIP | C | T | 0.727 | 27 | eQTL (sun-exposed and sun-unexposed skin and whole blood)
rs619865 | 20q11.22 | 33867697 | EIF6 | A | G | 0.109 | 29 |
rs16891982 | 5p13.2 | 33951693 | SLC45A2 | G | C | 0.938 | n/a | NP_001012527.2:p.Leu374Phe
rs28777 | 5p13.2 | 33958959 | SLC45A2 | A | C | 0.956 | n/a |
rs1393350 | 11q14.3 | 89011046 | TYR | A | G | 0.244 | n/a |
rs1426654 | 15q21.1 | 48426484 | SLC24A5 | A | G | 0.997 | n/a | NP_995322.1:p.Thr111Ala
rs1805006 | 16q24.3 | 89985918 | MC1R | A | C | 0.010 | n/a | RHC (NP_002377.4:p.Asp84Glu)
rs11547464 | 16q24.3 | 89986091 | MC1R | A | G | 0.009 | n/a | RHC (NP_002377.4:p.Arg142His)
rs1110400 | 16q24.3 | 89986130 | MC1R | C | T | 0.008 | n/a | RHC (NP_002377.4:p.Ile155Thr)
rs1805009 | 16q24.3 | 89986546 | MC1R | C | G | 0.008 | n/a | RHC (NP_002377.4:p.Asp294His)

(a) In this study, for simplicity, SNPs on chromosome 16q24.3 are considered part of the MC1R genetic region, and SNPs on chromosome 20q11.22 are considered part of the ASIP genetic region.
(b) The effect allele is the allele associated with lighter skin, hair and eye colour, and with skin cancer susceptibility.
(c) Effect allele frequency from GoDMC for SNPs with associated DNAm sites. Effect allele frequency from 1000 Genomes for SNPs not available in GoDMC.
(d) RHC = high penetrance red hair colour variant; rhc = low penetrance red hair colour variant. eQTL information obtained from the GTEx Portal v8 release.

Pigmentation SNPs that were strongly associated with DNAm sites in GoDMC are shown in Table S1.

We selected DNAm sites with the most reliable effects for further analysis as follows: sites associated with at least six of the MC1R region SNPs, three of the ASIP region SNPs or showing the strongest association with pigmentation SNPs in the other genes tested. Additionally, chosen DNAm sites had to display consistent associations with the allele that increased fair pigmentation for a minimum of five out of eight SNPs in MC1R and two out of three SNPs in ASIP (i.e. the allele increasing fair pigmentation had to always increase or always decrease DNAm at the same site for a majority of the SNPs tested in that gene). Finally, DNAm sites had to be associated with SNPs in ALSPAC participants consistently across the time points where DNAm was measured (i.e. birth, childhood, adolescence, pregnancy, middle age). Based on these criteria, we followed up a total of 25 DNAm sites: 17 in MC1R, 3 in ASIP and one each in IRF4, BNC2, TYR, SLC24A4 and HERC2. DNAm sites that were selected for further analysis are shown in Table 2.
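The count and direction-consistency criteria for multi-SNP regions can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the data layout (one effect estimate per associated SNP, each aligned to the allele that increases fair pigmentation) and the toy values are hypothetical. The ALSPAC time-point criterion and the "strongest association" rule used for the other genes are not modelled here.

```python
def passes_selection(betas, min_associated, min_consistent):
    """Check the count and direction-consistency criteria for one DNAm site.

    betas: dict mapping SNP id -> effect of the fair-pigmentation allele
           on DNAm at this site (only SNPs associated with the site).
    min_associated: minimum number of associated SNPs in the region.
    min_consistent: minimum number of SNPs whose effects share one sign.
    """
    if len(betas) < min_associated:
        return False
    n_increase = sum(1 for b in betas.values() if b > 0)
    n_decrease = sum(1 for b in betas.values() if b < 0)
    return max(n_increase, n_decrease) >= min_consistent


# Thresholds from the text: MC1R needs >= 6 SNPs with >= 5 consistent,
# ASIP needs 3 SNPs with >= 2 consistent.
MC1R_RULE = {"min_associated": 6, "min_consistent": 5}
ASIP_RULE = {"min_associated": 3, "min_consistent": 2}

# Hypothetical sites with made-up effect estimates.
site_consistent = {f"snp_{i}": 0.1 for i in range(6)} | {"snp_6": -0.2}
site_mixed = {f"snp_{i}": (0.1 if i < 3 else -0.1) for i in range(6)}
site_sparse = {f"snp_{i}": 0.1 for i in range(4)}

print(passes_selection(site_consistent, **MC1R_RULE))
print(passes_selection(site_mixed, **MC1R_RULE))
print(passes_selection(site_sparse, **MC1R_RULE))
```

Taking the larger of the two sign counts reflects the wording that the fair-pigmentation allele has to "always increase or always decrease" DNAm at a site for a majority of SNPs, without prescribing which direction.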

TABLE 2. Pigmentation SNP-associated DNA methylation (DNAm) sites selected for follow-up

DNAm site | Gene region | Gene | chr | chr position | # Associated SNPs | Associated in ALSPAC at all time points?
--- | --- | --- | --- | --- | --- | ---
cg26114043 | MC1R (a) | INTU | 4 | 128544375 | 7 | Y
cg09806625 | IRF4 | EXOC2 | 6 | 611523 | 1 | Y
cg03291755 | BNC2 | BNC2 | 9 | 16868891 | 1 | Y
cg05041596 | TYR | NAALAD2 | 11 | 89867385 | 1 | Y
cg04136915 | SLC24A4 | intergenic | 14 | 92721383 | 1 | Y
cg14091419 | HERC2 | HERC2 | 15 | 28356810 | 1 | N (b)
cg05504729 | MC1R | ANKRD11 | 16 | 89399490 | 6 | Y
cg08845973 | MC1R | SPG7 | 16 | 89592071 | 6 | Y
cg09738481 | MC1R | CPNE7 | 16 | 89653835 | 6 | Y
cg01097406 | MC1R | intergenic | 16 | 89675127 | 7 | Y
cg04240660 | MC1R | CHMP1A | 16 | 89714849 | 6 | Y
cg03605463 | MC1R | LOC284241 | 16 | 89740564 | 6 | Y
cg04287289 | MC1R | FANCA | 16 | 89883240 | 6 |
