SNPs in FAM13A and IL2RB genes are associated with FeNO in adult subjects with asthma

Asthma is a complex chronic disorder of the conducting airways, which is related to an immunological reaction, inflammation of bronchial walls, and increased mucus secretion [1]. Asthma involves the interaction among multiple genetic, environmental, and lifestyle factors [2], and it presents with different phenotypes. The most prevalent phenotype is type 2 inflammation (TH2)-associated asthma, which is strongly linked to atopy, allergy, and the response to corticosteroids [1, 3]. In TH2-associated asthma, the immune process initiates with the development of TH2 cells, which produce interleukin (IL) 4, IL5, and IL13 cytokines. These cytokines are responsible for both the stimulation of the allergic and eosinophilic inflammation, and the epithelial and smooth-muscle changes that contribute to asthma pathobiology [3]. Pro-inflammatory cytokines [interferon gamma (IFNγ), IL1β, IL13, and tumour necrosis factor alpha (TNFα)] induce the production of nitric oxide (NO) in the airway epithelial cells by promoting the expression of the enzyme TH2-regulated inducible NO synthase (iNOS) [4]. NO has different roles in asthma as both an endogenous modulator of airway function and a pro-inflammatory mediator [4].

NO levels can be measured in human breath. Fractional exhaled NO (FeNO) is a reliable, quantitative, non-invasive, simple, and safe biomarker for assessing airway inflammation in subjects with asthma [5]. FeNO is higher in males and increases with age and height [6], but it is negatively associated with tobacco smoking and obesity [7]. Furthermore, previous genome-wide and genetic association studies have shown that different genes [8] and single nucleotide polymorphisms (SNPs) [9–17] are linked to FeNO.

Typically, genetic association studies consist of single-SNP-based tests under the assumptions that genetic variants contribute independently to a given phenotype and that the underlying genetic model of inheritance is additive. The main limitation of this approach is that single SNPs can explain only a small proportion of the genetic variation of complex traits, which results in the missing heritability problem [18]. Furthermore, assuming an additive genetic model when the correct model is recessive or dominant reduces statistical power, so true associations can be missed [19]. These shortcomings have encouraged the application of statistical learning methods, such as the gradient boosting machine (GBM), that allow a large number of SNPs to be analysed jointly in a high-dimensional setting without an a priori specification of the underlying genetic model [19].

The present study is a candidate gene association analysis aimed at identifying SNPs that are associated with FeNO in adult subjects with asthma. To fulfil this purpose, we jointly analysed data from the Gene Environment Interactions in Respiratory Diseases (GEIRD) survey [20]. In the original survey (2008–2010), an overall panel of 384 SNPs tagging 53 candidate genes or gene regions (see table S1) was assessed. The selection of the 53 genes or gene regions was based on their association with asthma, chronic obstructive pulmonary disease (COPD), or allergic rhinitis, as observed in previous studies identified from the literature [21–23] (the full list of references is reported in table S2), or on their involvement in possible related biological pathways (such as inflammation, innate immunity and immunoregulation, oxidative stress and xenobiotic metabolism, regulation of protease-antiprotease equilibrium, and tissue remodelling) [24]. Then, we replicated our findings within the European Community Respiratory Health Survey (ECRHS) III (www.ecrhs.org) [25–27].

2.1. GEIRD study

GEIRD is an Italian, multi-centre, (multi)case-control study on the role of genetic and modifiable factors in asthma, COPD, chronic bronchitis, and allergic rhinitis (the protocol is fully described elsewhere [20]). Briefly, the cases and the controls were identified in pre-existing cohorts [25–28] and in new random samples of the general adult population through a two-stage process, which consists of a mailed screening questionnaire (stage 1) and a clinical examination for accurate phenotyping (stage 2) (figure 1). The participants in GEIRD stage 2 also provided blood samples for genetic data collection.

Figure 1. Selection of the asthma cases included in the genetic association analysis (GEIRD, Verona centre). COPD: chronic obstructive pulmonary disease; GEIRD: Gene Environment Interactions in Respiratory Diseases. a Subjects who did not fulfil the criteria for cases and controls.


Asthma cases were the individuals who had reported at least one of the following two conditions:

(a) ever asthma;

(b) asthma-like symptoms [asthma attacks, wheezing, chest tightness, shortness of breath (SoB) at rest, SoB at night time, SoB following strenuous activities] or the utilization of anti-asthmatic drugs in the previous 12 months, having fulfilled at least one of the following clinical characteristics:

1. positive methacholine challenge test [provocative dose (PD20) <1 mg causing a 20% fall in forced expiratory volume in one second (FEV1)];

2. pre-bronchodilator (BD) airflow obstruction (AO) [FEV1/forced vital capacity (FVC) <lower limit of normal (LLN) [29] or <70%] with reversibility (post-BD FEV1 >12% and >200 ml with respect to pre-BD FEV1 after 400 mcg of salbutamol);

3. pre- but not post-BD FEV1/FVC <LLN or <70%, and post-BD FEV1 ⩾80% predicted.

The subjects with current asthma were those who (i) had reported asthma-like symptoms or the utilization of anti-asthmatic drugs in the previous 12 months, or (ii) had pre-BD AO or (iii) had a positive methacholine challenge test. The criteria used to identify the cases of COPD, chronic bronchitis, or allergic rhinitis are described elsewhere [20]. The controls were the subjects without asthma, COPD, chronic bronchitis, and allergic rhinitis who had pre-BD FEV1 >70% predicted and pre-BD FEV1/FVC ⩾LLN and ⩾70%. The subjects not fulfilling the criteria for cases and controls were included in a residual group.
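The two rules stated in the paragraph above (current asthma among the asthma cases, and eligibility as a control) can be read as simple Boolean conditions. The following is a minimal sketch under that reading; the `Subject` class and all its field names are hypothetical and are not taken from the GEIRD protocol, and pre-BD AO is passed in as an already-evaluated flag.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    # All field names are hypothetical, chosen for this illustration only.
    symptoms_or_drugs_12m: bool       # asthma-like symptoms or anti-asthmatic drugs, previous 12 months
    pre_bd_airflow_obstruction: bool  # pre-BD AO, evaluated elsewhere
    positive_methacholine: bool       # positive methacholine challenge test
    has_respiratory_disease: bool     # asthma, COPD, chronic bronchitis, or allergic rhinitis
    fev1_pct_predicted: float         # pre-BD FEV1, % predicted
    fev1_fvc: float                   # pre-BD FEV1/FVC, as a percentage
    fev1_fvc_lln: float               # lower limit of normal for FEV1/FVC, same unit


def has_current_asthma(s: Subject) -> bool:
    """Conditions (i), (ii), or (iii); meant to be applied to asthma cases."""
    return (s.symptoms_or_drugs_12m
            or s.pre_bd_airflow_obstruction
            or s.positive_methacholine)


def is_control(s: Subject) -> bool:
    """No listed disease, FEV1 >70% predicted, FEV1/FVC >= LLN and >= 70%."""
    return (not s.has_respiratory_disease
            and s.fev1_pct_predicted > 70
            and s.fev1_fvc >= s.fev1_fvc_lln
            and s.fev1_fvc >= 70)
```

A subject who satisfies neither the case nor the control definition would fall into the residual group.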

2.2. FeNO and genetic protocols

In GEIRD, FeNO (exhalation flow rate of 50 ml s−1) was measured according to international guidelines [30] by using a chemiluminescence analyser (CLD88, Ecomedics, Switzerland). FeNO measurements were expressed as absolute values in parts per billion (ppb). Blood samples were collected and stored for genomic DNA extraction according to standardized international protocols [20]. The 384 SNPs in the 53 genes or gene regions [31] were selected to include SNPs tagging most of the haplotype variability in the CEU population (HapMap phase II) and SNPs from the literature (www.ncbi.nlm.nih.gov/snp). These SNPs were chosen with the STAMPA application (GEVALT software; acgt.cs.tau.ac.il/gevalt/#ver2) and constitute the optimal set of tag-SNPs that are representative of a given genomic region with high linkage disequilibrium (LD) and maximum prediction accuracy. Candidate gene SNPs were analysed using a high degree genotyping method (GoldenGate Genotyping assay, Illumina).

2.3. Study subjects

In the present analyses, only the participants in GEIRD stage 2 from the Verona centre were included due to the availability of their genetic data. In this centre, 1322 cases and controls were identified at the clinical stage and 997 of these individuals were genotyped (figure 1). Genetic data from all the genotyped subjects were used for additional SNP quality checks. Of all the cases of asthma who had provided genetic data in the original study in Verona (342 subjects), 264 patients with a FeNO measurement were included in this genetic association analysis.

The appropriate ethics committee ('Comitato Etico per la Sperimentazione dell'Azienda Ospedaliera Istituti Ospitalieri di Verona') approved the GEIRD survey in Verona and all the aspects of the research project were fully explained to the participants, who gave their written informed consent.

2.4. Genetic association analysis

Of the 384 SNPs assessed in the original survey (GEIRD), 221 SNPs tagging 50 genes or gene regions (see table S1) met the following criteria and were included in the present analysis:

1. genotype failure rate ⩽5% in the 997 genotyped subjects;

2. genotype failure rate ⩽5% in the 342 asthma cases;

3. minimum genotype frequency ⩾5% in the 342 asthma cases;

4. allele frequencies needed to respect Hardy–Weinberg equilibrium (HWE) in the 303 controls (the SNPs not available for the controls were excluded from the analysis) [32]. P-values for testing deviation from HWE were corrected for the false discovery rate by using the Benjamini–Yekutieli procedure [33, 34].
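To make the fourth criterion concrete, the sketch below tests one SNP for deviation from HWE and then applies the Benjamini–Yekutieli adjustment to a list of p-values. It assumes a Pearson chi-squared test with one degree of freedom; the text does not state which HWE test was actually used, and the function names are illustrative only.

```python
import math


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-squared test (1 df) for deviation from HWE.

    Assumption: a Pearson chi-squared test; the study may have used
    a different (for example exact) test.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-squared variable with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def benjamini_yekutieli(pvalues):
    """Return Benjamini-Yekutieli adjusted p-values in the input order."""
    m = len(pvalues)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step down from the largest p-value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m * c_m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under this sketch, a SNP would be retained when its adjusted p-value in the controls stays above the chosen significance threshold.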

In order to identify the SNPs that are associated with FeNO, a two-step approach was adopted: GBM [19, 35] was used to select the SNPs (step 1) that were simultaneously included as covariates in a multivariable linear regression model for significance testing (step 2). In both steps, natural log-transformed FeNO (log-FeNO) was the normally distributed outcome and the covariates were age, sex, and the SNPs (the genotype data for all SNPs were coded without assuming an a priori genetic model: 0 = reference = homozygous with higher allele frequency, 1 = heterozygous, 2 = homozygous with lower allele frequency). Other covariates (such as cigarette smoking, obesity, or allergic rhinitis) were not analysed because these variables are not confounders of the relationship between SNPs and FeNO.
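The genotype coding described above can be illustrated as follows. This is a sketch that assumes biallelic SNPs stored as two-letter strings with no missing values; the helper names are hypothetical. The 0/1/2 codes are then expanded into two indicator variables, so that the heterozygous and the minor homozygous genotypes each receive their own coefficient and no additive, dominant, or recessive model is imposed.

```python
from collections import Counter


def code_genotypes(genotypes):
    """Code genotypes as 0/1/2 = number of copies of the less frequent allele.

    0 = homozygous for the more frequent allele (reference),
    1 = heterozygous, 2 = homozygous for the less frequent allele.
    Assumes biallelic SNPs and no missing genotypes.
    """
    allele_counts = Counter(allele for g in genotypes for allele in g)
    major = allele_counts.most_common(1)[0][0]
    return [sum(1 for allele in g if allele != major) for g in genotypes]


def dummy_code(codes):
    """Expand 0/1/2 codes into (heterozygous, minor-homozygous) indicators."""
    return [(int(c == 1), int(c == 2)) for c in codes]
```

For example, `code_genotypes(["GG", "TG", "GG", "TT", "GT"])` treats G as the more frequent allele, so both `"TG"` and `"GT"` are coded as heterozygous.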

At step 1, GBM was used to rank-order the 221 SNPs according to their variable importance measure (VIM), which quantifies the total contribution of each SNP to the prediction of log-FeNO. Because VIM in GBM is biased for SNPs in LD, the 221 SNPs were divided into ten overlapping subsets of weakly correlated SNPs (within-subset correlation <0.1) [36], and GBM was applied to each subset in parallel ('gbm' package in R software; cran.r-project.org/web/packages/gbm) [37]. The GBM hyper-parameters were tuned by ten-fold cross-validation, which suggested a shrinkage rate of 0.01, at least five observations per node of each tree, a bagging fraction of 0.8, and a training fraction of one. Five hundred trees were used to build the model, and the interaction depth was set equal to one. An aggregate VIM was obtained as the median of the VIMs computed in the overlapping subsets. Because of the relatively small sample size, only the 15 SNPs with the highest aggregate VIM were selected for step 2.
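The aggregation and selection at the end of step 1 can be sketched as below. The sketch does not fit any GBM model: it assumes that the per-subset VIMs have already been computed (in the study, with the R 'gbm' package) and are available as dictionaries, and it assumes that a SNP contributes to the median only through the subsets in which it appears. The function name and data layout are illustrative.

```python
from statistics import median


def aggregate_vim(subset_vims, top_k=15):
    """Median VIM per SNP across subsets, then the top_k SNPs by that median.

    subset_vims: list of dicts {snp_id: VIM}, one dict per SNP subset.
    Assumption: SNPs absent from a subset are simply ignored for that subset.
    """
    collected = {}
    for vims in subset_vims:
        for snp, vim in vims.items():
            collected.setdefault(snp, []).append(vim)
    aggregate = {snp: median(values) for snp, values in collected.items()}
    ranked = sorted(aggregate.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

With three toy subsets in which a SNP has VIMs 10, 14, and 12, its aggregate VIM is 12; the ranking then depends only on these medians.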

At step 2, the GBM-selected SNPs were included as covariates in a multivariable linear regression model for significance testing. The strength of the association between each SNP and log-FeNO was measured through the beta regression coefficient, which represents the difference in the expected log-FeNO (Δlog-FeNO) between the heterozygous genotype (or the homozygous genotype with lower allele frequency) and the reference for a given SNP, with the genotype of the other SNPs held constant. A sensitivity analysis was performed to evaluate the generalizability of the results to the European population, repeating the genetic association analysis (step 2) by including only the asthma cases with both parents born in Europe.
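Because the outcome is the natural logarithm of FeNO, a Δlog-FeNO value can be back-transformed with the exponential function into a ratio of geometric mean FeNO between a genotype and the reference. The short sketch below does this for the TNS1 rs13022785 heterozygous estimate reported in table 2 (0.27 [0.06, 0.48]); the function name is illustrative.

```python
import math


def log_difference_to_ratio(delta_log, ci_low, ci_high):
    """Back-transform a difference in log-FeNO (and its CI) to a ratio."""
    return math.exp(delta_log), math.exp(ci_low), math.exp(ci_high)


ratio, low, high = log_difference_to_ratio(0.27, 0.06, 0.48)
percent_change = (ratio - 1.0) * 100.0  # relative difference versus the reference
print(f"ratio = {ratio:.2f} (95% CI {low:.2f}, {high:.2f}); "
      f"change = {percent_change:.0f}%")
```

On this scale, a Δlog-FeNO of 0.27 corresponds to FeNO being roughly 31% higher than in the reference genotype, with the genotypes of the other SNPs, age, and sex held constant.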

2.5. Replication analysis

The GBM-selected SNPs in GEIRD were tested for association with log-FeNO in a replication sample of 296 asthma cases who had participated in ECRHS III (from 15 centres located in Estonia, France, Germany, Norway, Spain, Sweden, and the United Kingdom) [27]. ECRHS is a population-based cohort study that recruited subjects aged 20–44 at baseline (ECRHS I; 1991–1993) [25]. The study subjects answered a mailed screening questionnaire (stage 1) and a 20% 'random sample' of the responders underwent a detailed clinical examination (stage 2). The follow-up of the participants in ECRHS I stage 2 took place in 1998–2002 (ECRHS II) [26] and 2010–2013 (ECRHS III) [27]. A standardized clinical interview, lung function, and laboratory tests were performed on all occasions. Blood samples for genotyping were collected in ECRHS II, and FeNO (exhalation flow rate of 50 ml s−1) was measured in ECRHS III. Additional information regarding the ECRHS III survey, including ethics approvals, was included as supplementary material.

The definition of asthma in ECRHS III was comparable with that used in GEIRD. Asthma cases were the subjects who had fulfilled at least one of the following criteria:

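1. ever asthma OR (asthma attacks/asthma-like symptoms/anti-asthmatic drugs in the previous 12 months AND PD20 <1 mg) OR (asthma attacks/asthma-like symptoms/anti-asthmatic drugs in the previous 12 months AND pre-BD FEV1/FVC <LLN [29] or <70%) at ECRHS I;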

2. ever asthma OR (asthma attacks/asthma-like symptoms/anti-asthmatic drugs in the previous 12 months AND PD20 <1 mg) OR (asthma attacks/asthma-like symptoms/anti-asthmatic drugs in the previous 12 months AND pre-BD FEV1/FVC <LLN or <70%) at ECRHS II;

3. ever asthma OR (asthma attacks/asthma-like symptoms/anti-asthmatic drugs in the previous 12 months AND pre-BD FEV1/FVC <LLN or <70% AND post-BD FEV1 >12% and >200 ml with respect to pre-BD FEV1 after 400 mcg of salbutamol) OR (asthma attacks/asthma-like symptoms/anti-asthmatic drugs in the past 12 months AND pre- but not post-BD FEV1/FVC <LLN or <70% AND post-BD FEV1 >80% predicted) at ECRHS III.

The subjects with current asthma at ECRHS III were those who (i) had reported asthma attacks, asthma-like symptoms, or anti-asthmatic drugs in the previous 12 months or (ii) had pre-BD FEV1/FVC <LLN or <70%.

A 2-level (subject: level 1 unit; centre: level 2 unit) random-intercept linear regression model, with age, sex, and all the GBM-selected SNPs as fixed-effect covariates, was used to account for the ECRHS hierarchical data structure. One-sided p-values were computed for the beta regression coefficients that were statistically significant in GEIRD and were in the same direction in GEIRD and ECRHS III.
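The rule for the one-sided p-values can be sketched as follows. The sketch assumes a symmetric test statistic, so that the one-sided p-value in the direction observed in GEIRD equals half of the two-sided p-value when the two coefficients share the same sign; the function and parameter names are illustrative.

```python
def one_sided_p(beta_discovery, beta_replication, p_two_sided,
                significant_in_discovery=True):
    """One-sided replication p-value, or None when it is not computed.

    Computed only for coefficients that were significant in the discovery
    sample and that have the same sign in both samples. Assumes a symmetric
    test statistic, so the one-sided p-value is half the two-sided one.
    """
    same_direction = beta_discovery * beta_replication > 0
    if not (significant_in_discovery and same_direction):
        return None
    return p_two_sided / 2.0
```

For instance, a two-sided replication p-value of 0.076 with concordant directions gives a one-sided p-value of 0.038, whereas discordant directions give no one-sided p-value at all.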

All statistical analyses were performed by using R software (version 3.6.2; The R Foundation for Statistical Computing, Vienna, Austria) and STATA software (release 16; StataCorp, College Station, Texas, USA).

3.1. Main characteristics of the asthma cases

The 264 asthma cases identified in GEIRD and included in the genetic association analysis (step 1) had a median age of 42.8 years (female 47.7%) and a median BMI of 24.0 (table 1). The vast majority (96.6%) of these patients had both parents born in Europe. Past, current light, and current heavy smokers were 29.2%, 11.0%, and 12.5%, respectively. Of the subjects who reported asthma in their life, about 43% developed the disease before the age of 10. Sixty-two per cent of the asthma cases had current asthma, 54.6% had allergic rhinitis, and 12.1% had chronic cough or phlegm. The median pre-BD FEV1, FVC, and FEV1/FVC % predicted were 96.8, 100.5, and 93.9, respectively, and the median FeNO was 20.1 ppb. The distribution of the demographic and clinical variables and of smoking habits was not significantly different between the 264 study subjects and the 122 eligible patients in GEIRD who were excluded from the analysis due to missing information on genetic data and/or FeNO (see table S3).

Table 1. Main characteristics of the asthma cases included in the genetic association analysis (GEIRD dataset) and in the replication analysis (ECRHS III dataset).

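| Characteristic | Category | Genetic association analysis, step 1 | Genetic association analysis, step 2 | Replication analysis | P-value (b) |
|---|---|---|---|---|---|
| Sample, n | | 264 | 245 (a) | 296 | — |
| Females, % | | 47.7 | 46.9 | 58.8 | 0.006 |
| Age (years), median (IQR) | | 42.8 (35.7, 49.5) | 42.4 (35.2, 49.4) | 54.0 (48.0, 59.0) | <0.0001 |
| European-born parents (c), % | Both | 96.6 | 96.7 | — | — |
| | Only one | 1.5 | 1.2 | — | |
| | None | 0.8 | 0.8 | — | |
| | Unknown | 1.1 | 1.2 | — | |
| BMI, median (IQR) | | 24.0 (21.9, 26.7) | 23.9 (21.7, 26.6) | 26.7 (23.5, 30.3) | <0.0001 |
| Tobacco smoking, % | Never | 47.4 | 46.5 | 42.4 | 0.197 |
| | Past | 29.2 | 29.8 | 35.6 | |
| | Current light | 11.0 | 11.4 | 7.5 | |
| | Current heavy | 12.5 | 12.2 | 14.6 | |
| Age of asthma onset (years) (d), % | 0–9 | 42.9 | 42.0 | 21.2 | <0.001 |
| | 10–19 | 17.9 | 19.1 | 18.7 | |
| | ⩾20 | 37.1 | 36.6 | 52.5 | |
| | Unknown | 2.1 | 2.3 | 7.6 | |
| Current asthma, % | | 61.7 | 60.8 | 69.3 | 0.040 |
| Allergic rhinitis (e), % | Absent | 44.7 | 44.9 | 51.4 | 0.092 |
| | Present | 54.6 | 54.3 | 48.7 | |
| | Unknown | 0.8 | 0.8 | 0.0 | |
| Chronic cough or phlegm (f), % | Absent | 87.1 | 87.4 | 75.0 | 0.001 |
| | Present | 12.1 | 11.8 | 24.0 | |
| | Unknown | 0.8 | 0.8 | 1.0 | |
| Pre-BD FEV1 % predicted, median (IQR) | | 96.8 (87.6, 109.0) | 97.2 (88.1, 108.9) | 85.6 (75.9, 99.5) | <0.0001 |
| Pre-BD FVC % predicted, median (IQR) | | 100.5 (91.9, 110.5) | 100.4 (92.4, 110.7) | 96.3 (86.8, 106.3) | <0.001 |
| Pre-BD FEV1/FVC % predicted, median (IQR) | | 93.9 (88.7, 101.2) | 93.8 (89.1, 101.1) | 86.2 (82.7, 96.2) | <0.0001 |
| FeNO (ppb), median (IQR) | | 20.1 (12.2, 40.2) | 20.0 (12.3, 39.7) | 17.0 (12.0, 25.5) | 0.004 |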

GEIRD: Gene Environment Interactions in Respiratory Diseases; ECRHS: European Community Respiratory Health Survey; IQR: interquartile range; BMI: body mass index; pre-BD: pre-bronchodilator; FEV1: forced expiratory volume in one second; FVC: forced vital capacity; FeNO: fractional exhaled nitric oxide; ppb: parts per billion. a Of the 264 asthma cases included in the genetic association analysis at step 1 (gradient boosting machine, GBM), 19 patients with a missing value in at least one of the GBM-selected SNPs were excluded from the analysis at step 2. b Pearson chi-squared test, Fisher's exact test, or Wilcoxon rank-sum test was used as appropriate to compare the distribution of the main characteristics between the 245 GEIRD patients included in the genetic association analysis at step 2 and the 296 ECRHS III patients included in the replication analysis. c Not measured in ECRHS III. d Information obtained from the patients who reported asthma in their life. e Having reported any nasal allergies, including hay fever. f Having reported cough and/or phlegm from the chest, usually in winter and on most days for as long as three months each year.

Of the 264 asthma cases included in the genetic association analysis at step 1 (GBM), 19 patients with a missing value in at least one of the GBM-selected SNPs were excluded from the analysis at step 2. Compared to the 245 GEIRD patients included in the genetic association analysis (step 2), the 296 asthma cases identified in ECRHS III and assessed in the replication analysis had a higher percentage of females (58.8% vs 46.9%, p = 0.006), a higher age (median: 54.0 vs 42.4 years, p < 0.0001), a higher BMI (median: 26.7 vs 23.9, p < 0.0001), an older age of asthma onset (p < 0.001), a higher percentage of current asthmatics (69.3% vs 60.8%, p = 0.040), a higher percentage of subjects with coexisting cough or phlegm (24.0% vs 11.8%, p = 0.001), a lower pre-BD FEV1% predicted (median: 85.6 vs 97.2, p < 0.0001), a lower FVC % predicted (median: 96.3 vs 100.4, p < 0.001), a lower FEV1/FVC % predicted (median: 86.2 vs 93.8, p < 0.0001), and a lower FeNO (median: 17.0 vs 20.0, p = 0.004) (table 1).

3.2. Genetic association and replication analyses

At step 1, the following 15 SNPs were selected by GBM (table 2): rs2735014 (HLA-G; VIM = 44.54), rs2523793 (HLA-G; VIM = 27.93), rs13022785 (TNS1; VIM = 19.48), rs1419779 (NPSR1; VIM = 17.35), rs987314 (FAM13A; VIM = 13.97), rs2069812 (IL5; VIM = 13.50), rs1063320 (HLA-G; VIM = 13.21), rs2869546 (CHRNA3; VIM = 12.76), rs174579 (FADS2; VIM = 8.56), rs11639224 (IREB2; VIM = 8.15), rs944725 (NOS2; VIM = 7.33), rs647041 (CHRNA5; VIM = 5.83), rs953569 (HAVCR1; VIM = 5.66), rs1610696 (HLA-G; VIM = 4.97), and rs3218258 (IL2RB; VIM = 4.84).

Table 2. SNPs identified in the genetic association analysis (GEIRD dataset) and tested in the replication analysis (ECRHS III dataset). The statistically significant associations are reported in bold.

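| Gene or gene region | SNP | Genotype | Step 1: sample (n = 264) | Step 1: VIM | Step 2: sample (a) (n = 245) | Step 2: Δlog-FeNO (b) [95% CI] | Step 2: two-sided p-value | Replication: genotype (c) | Replication: sample (n = 296) | Replication: Δlog-FeNO (d) [95% CI] | Replication: two-sided p-value | Replication: one-sided p-value (e) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HLA-G | rs2735014 | GG | 134 | 44.54 | 122 | 0.00 | — | CC | 163 | 0.00 | — | — |
| | | TG | 107 | | 101 | −0.14 [−0.36, 0.08] | 0.225 | AC | 120 | −0.02 [−0.19, 0.14] | 0.773 | — |
| | | TT | 22 | | 22 | 0.16 [−0.27, 0.58] | 0.468 | AA | 13 | −0.14 [−0.56, 0.28] | 0.508 | — |
| HLA-G | rs2523793 | GG | 134 | 27.93 | — | — | — | — | — | — | — | — |
| | | AG | 103 | | — | — | — | — | — | — | — | — |
| | | AA | 22 | | — | — | — | — | — | — | — | — |
| TNS1 | rs13022785 | TT | 123 | 19.48 | 115 | 0.00 | — | TT | 116 | 0.00 | — | — |
| | | TC | 109 | | 101 | **0.27 [0.06, 0.48]** | **0.012** | TC | 138 | 0.04 [−0.12, 0.20] | 0.612 | 0.306 |
| | | CC | 31 | | 29 | 0.06 [−0.26, 0.38] | 0.712 | CC | 42 | 0.16 [−0.07, 0.40] | 0.165 | — |
| NPSR1 | rs1419779 | AA | 112 | 17.35 | 104 | 0.00 | — | AA | 121 | 0.00 | — | — |
| | | AG | 131 | | 124 | −0.02 [−0.22, 0.18] | 0.853 | AG | 139 | −0.05 [−0.21, 0.11] | 0.508 | — |
| | | GG | 21 | | 17 | **−0.63 [−1.03, −0.24]** | **0.002** | GG | 36 | −0.09 [−0.34, 0.16] | 0.469 | 0.235 |
| FAM13A | rs987314 | CC | 83 | 13.97 | 75 | 0.00 | — | GG | 114 | 0.00 | — | — |
| | | TC | 123 | | 115 | **0.29 [0.07, 0.51]** | **0.011** | AG | 135 | 0.14 [−0.02, 0.30] | 0.076 | **0.038** |
| | | TT | 57 | | 55 | 0.23 [−0.04, 0.49] | 0.098 | AA | 47 | 0.20 [−0.02, 0.43] | 0.073 | — |
| IL5 | rs2069812 | CC | 124 | 13.50 | 115 | 0.00 | — | GG | 131 | 0.00 | — | — |
| | | TC | 106 | | 99 | −0.14 [−0.34, 0.06] | 0.165 | AG | 135 | −0.15 [−0.30, 0.004] | 0.056 | — |
| | | TT | 33 | | 31 | −0.29 [−0.60, 0.01] | 0.058 | AA | 30 | −0.02 [−0.27, 0.23] | 0.874 | — |
| HLA-G | rs1063320 | CC | 73 | 13.21 | 70 | 0.00 | — | CC | 76 | 0.00 | — | — |
| | | GC | 123 | | 115 | −0.08 [−0.38, 0.21] | 0.585 | GC | 145 | −0.04 [−0.27, 0.18] | 0.716 | — |
| | | GG | 68 | | 60 | 0.11 [−0.30, 0.52] | 0.596 | GG | 75 | 0.01 [−0.30, 0.32] |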
