A multiomics bioinformatics pipeline was used to predict potential GSH loci in the chicken genome (Gallus gallus domesticus) (Fig. 1), drawing on the NCBI genome data viewer (Fig. 2A), Hi-C data (Fig. 2B and Additional file 1), and RNA-seq data (Fig. 2C). Starting from two well-known GSH loci, HIPP (also called H11) and Gt(ROSA)26Sor (also called ROSA26), which have been validated in several organisms including mice, humans, and pigs, we first evaluated the genes surrounding these intergenic loci. In mouse, human, and pig, the HIPP and ROSA26 intergenic loci are flanked by the EIF4ENIF1/DRG1 and THUMPD3/SETD5 genes, respectively.
Fig. 2 Bioinformatic analysis for prediction of genome safe harbor loci in the chicken genome. A-a Schematic presentation of the validated HIPP locus, including its flanking genes, in the mouse, human, and pig genomes. A-b Schematic presentation of the validated ROSA locus, including its flanking genes, in the mouse, human, and pig genomes. A-c, A-d Schematic presentation of the potential cHIPP and cROSA loci in the chicken genome. The genes flanking the validated HIPP locus (i.e., DRG1/EIF4ENIF1) are exactly the same as those flanking the predicted cHIPP locus in the chicken genome, whereas the genes surrounding the validated ROSA locus (i.e., THUMPD3/SETD5/SRGAP3) are largely the same as those around the predicted cROSA locus. A-e Schematic presentation of the non-GSH cOVA locus in the chicken genome. B Coordinates of the DRG1/EIF4ENIF1, THUMPD3/SRGAP3, and OVAL genes relative to the location of TADs, extracted from the chicken Hi-C data and visualized with the JUICEBOX online software (adapted from ref [48]). C Expression levels of the flanking genes in several tissues and developmental stages, adapted from the Gene Expression Atlas. ⇨: (adapted from ref [49]). ➲: (adapted from ref [50]). TPM avg.: transcripts per million averages. E: embryonic day. PN: postnatal day
Our survey using the NCBI genome data viewer revealed that the EIF4ENIF1/DRG1 genomic arrangement in the chicken genome (Fig. 2A-c) was similar to that in the indicated organisms (Fig. 2A-a). Pairwise alignment (EMBOSS Water algorithm) was used to determine the percentage of identity and similarity between the EIF4ENIF1/DRG1 intergenic sequence in the chicken genome and the corresponding intergenic sequence in the mouse, human, and pig genomes. The results showed that this locus in chicken had 35.9%, 44.6%, and 40.9% similarity with the corresponding region in the mouse, human, and pig genomes, respectively (data not shown).
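The kind of local alignment performed by EMBOSS Water (a Smith-Waterman alignment) can be illustrated with the minimal sketch below. It is not the EMBOSS implementation: the scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for brevity, whereas EMBOSS Water uses a substitution matrix with affine gap penalties, and the function name is hypothetical. The sketch reports percent identity over the aligned region, which is related to, but not the same as, the similarity values quoted above.

```python
def smith_waterman_identity(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local score, percent identity over the local alignment).

    Scoring parameters are illustrative assumptions, not EMBOSS defaults.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic-programming matrix; scores never drop below zero.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    identical = 0
    length = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            identical += int(a[i - 1] == b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        length += 1

    identity = 100.0 * identical / length if length else 0.0
    return best, identity
```

For intergenic sequences of several kilobases, this quadratic-memory version would be slow; a dedicated tool such as EMBOSS Water remains the practical choice.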
Contrary to what we found in the mouse, human, and pig genomes (Fig. 2A-b), the SETD5 gene was not adjacent to THUMPD3 in the chicken genome (Fig. 2A-d). We were therefore unable to use the sequence between the THUMPD3 and SETD5 genes as a potential intergenic region. We noticed, however, that the arrangement and order of the SRGAP3/THUMPD3 genes in the mouse, human, and pig genomes (Fig. 2A-b) were identical to those in the chicken genome (Fig. 2A-d). Thus, two regions were chosen as GSH candidates in the chicken genome without considering sequence similarity with other organisms: i) the intergenic region (14327 bp) between the SETD5 and PLNXB3 genes (data not shown), and ii) the intergenic region (20105 bp) between the THUMPD3 and SRGAP3 genes (Fig. 2A-d). The former contains two “LOC” genes (unpublished/undetermined genes; data not shown) and the latter contains one “LOC” gene. It has been demonstrated that some unidentified coding or non-coding genes may reside in intergenic regions and affect the expression of the integrated transgene [51]. We therefore chose the region upstream of the THUMPD3 gene, which is a wide intergenic region (Fig. 2A-d), and compared it with the region upstream of SETD5 (data not shown). The former is gene-poor, whereas the intergenic region upstream of SETD5 is gene-rich. We deliberately decided to integrate the transgene into the unpublished/undetermined gene named “LOC121106669” (the targeted site is located 7742 bp upstream of the THUMPD3 gene, inside the “LOC121106669” gene).
Evaluation of the chicken genome TADs revealed that the cHIPP and cROSA loci were each located inside an individual TAD (Fig. 2B-a, B-c, and Additional file 12a, b). The cOVA locus also resides inside an individual TAD (Fig. 2B-b and Additional file 12c). In addition, chicken RNA-seq data were used to evaluate the expression levels (transcripts per million; TPM) of the genes flanking the intergenic loci of interest (Fig. 2C-a, C-c). Since the expression levels of DRG1 and THUMPD3 outweighed those of EIF4ENIF1 and SRGAP3, respectively, we decided to target the cHIPP and cROSA loci near these genes. The TPM average for the DRG1 gene was 76.16 and 65.16 in tissues and developmental stages, respectively. The TPM average for the EIF4ENIF1 gene was much lower (22.33 in several tissues and 30.75 in developmental stages) (Fig. 2C-a). The TPM average for the THUMPD3 gene was 26.67 and 71.64 in tissues and developmental stages, respectively. For the SRGAP3 gene, it was 28.77 and 43.49 in tissues and developmental stages, respectively (Fig. 2C-c). For the OVA gene, the TPM average was 6.8 in developmental stages, but no expression was reported in tissues. For OVALY, a low TPM was observed only in testis, while it was below the cutoff in other tissues. The TPM in developmental stages was also low for the OVALY gene (Fig. 2C-b). Hence, cROSA and cHIPP were nominated as the potential GSH loci, and cOVA was used as a non-GSH locus. We also evaluated OVA gene expression in DF1 cells and found that this locus is not transcriptionally active in these cells (data not shown).
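The comparison of flanking-gene expression can be restated as a short sketch using the TPM averages reported above. The rule for combining the two values per gene (a simple mean of the tissue and developmental-stage averages) and the function name are assumptions made for illustration; the text does not state how the two data sets were weighed against each other.

```python
# TPM averages as reported in the text (tissues, developmental stages).
TPM_AVG = {
    "DRG1": {"tissues": 76.16, "stages": 65.16},
    "EIF4ENIF1": {"tissues": 22.33, "stages": 30.75},
    "THUMPD3": {"tissues": 26.67, "stages": 71.64},
    "SRGAP3": {"tissues": 28.77, "stages": 43.49},
}


def higher_expressed(gene_a, gene_b, table=TPM_AVG):
    """Return the gene with the higher combined TPM average.

    Combining by an unweighted mean is an illustrative assumption.
    """
    def combined(gene):
        values = table[gene].values()
        return sum(values) / len(values)

    return max((gene_a, gene_b), key=combined)


# Flanking gene pairs for the two candidate loci.
chipp_neighbor = higher_expressed("DRG1", "EIF4ENIF1")
crosa_neighbor = higher_expressed("THUMPD3", "SRGAP3")
```

Under this combining rule the sketch selects DRG1 and THUMPD3. Note that in tissues alone the reported SRGAP3 average (28.77) is slightly above that of THUMPD3 (26.67), so the outcome for the cROSA pair depends on how the two data sets are combined.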
Fig. 3 Transgene expression from the strong heterologous promoter is not entirely locus-dependent. CRISPR-mediated integration of CMV-driven EGFP and promoter-less DsRed2 in the predicted GSH loci and non-GSH locus was performed in DF1 cell lines. A Schematic depiction of CMV-EGFP-expressing heterogeneous cell pools at the end of MTH2. (B-a, C-a, D-a) CRISPR-mediated integration of DsRed2-CMV-EGFP in cROSA, cHIPP, and cOVA loci. (B-b, B-c, C-b, C-c, D-b, D-c) Light and fluorescence microscope images of the cells expressing EGFP heterogeneously driven by the CMV promoter (Scale bar: 100 μm). (B-d, C-d, D-d) Flow cytometry results from EGFP-expressing cROSA, cHIPP, and cOVA cells (each in triplicate). Non-transfected cells were used as the negative control. No expression signal was detected in the red channel. E Mean fluorescent intensity (MFI) index of the cOVA group was higher than that in the cHIPP and cROSA groups. F The integrated density (ID) index of the cOVA group was higher than that in the cHIPP and cROSA groups. G The copy number (CN) of EGFP transcripts in the cOVA group was higher than that in the cHIPP and cROSA groups. **: p < 0.05, ***: p < 0.005, and ****: p < 0.0001 are statistically significant. Avg.: The average expression of EGFP. Exp: Experiment. N: Number
Transgene Expression from the Strong Heterologous Promoter is not Entirely Locus-Dependent
In the first preliminary study, to evaluate the predicted GSH loci, a construct containing DsRed2-CMV-EGFP-IRES-PACr was inserted into the two predicted loci, cROSA and cHIPP, as well as the non-GSH cOVA locus of chicken DF1 cells. Heterogeneous cell pools (in triplicate for each locus) were generated by 1-week puromycin selection, followed by two-month culture without selection (Additional file 7A and Fig. 3A, B-a, C-a, D-a).
CRISPR-mediated knock-ins of the construct harboring the strong heterologous promoter in the designated loci were verified by 5’/3’ junction PCR (Additional file 9A-a, A-d, B-a, B-d, C-a, C-d), restriction enzyme digestion of the amplicons (Additional file 9A-c, B-c, C-c), and Sanger sequencing (Additional file 10A-a, B-a, C-a). When cells were transfected with a gRNA-free Cas9 vector (-gRNA), no integrations were observed, as judged by 5’/3’ junction PCR (Additional file 9A-a, A-d, B-a, B-d, C-a, C-d) and a lack of EGFP fluorescence (data not shown).
At the end of MTH2, correctly knocked-in heterogeneous cell pools for each locus/replicate were evaluated by flow cytometry to estimate the percentage of EGFP-positive cells. The results showed that 19.47%, 21.01%, and 19.81% of cells targeted at the cROSA, cHIPP, and cOVA loci, respectively, were EGFP-positive (Fig. 3B-d, C-d, D-d). EGFP expression was highly variable in the heterogeneous cell pools, as indicated by wide histograms (Fig. 3B-d, C-d, D-d). Also, no expression of the promoter-less DsRed2 was detected at any of the three loci (red square), as judged by flow cytometry (Fig. 3B-d, C-d, D-d). The MFI index showed that the expression of CMV-EGFP inserted in the cOVA locus was significantly higher than that of the CMV-EGFP reporter knocked into the cROSA and cHIPP loci (p < 0.0001 and p < 0.05, respectively; Fig. 3E). Analysis of images captured from each locus (Fig. 3B-b, B-c, C-b, C-c, D-b, D-c, and Additional file 13) showed that the ID index of CMV-EGFP inserted in the cOVA locus was significantly higher than that of the CMV-EGFP reporter knocked into the cROSA and cHIPP loci (p < 0.005 and p < 0.0001, respectively; Fig. 3F). The qPCR results showed that the copy number of EGFP transcripts transcribed from the cOVA locus was significantly higher than that of transcripts from the cROSA and cHIPP loci (p < 0.05 and p < 0.0001, respectively; Fig. 3G).
Collectively, these data suggested that, in the presence of a strong heterologous promoter, a non-GSH locus could support higher transcription than a GSH locus. It may therefore be inferred that transgene expression under a strong heterologous promoter is not entirely locus-dependent and is mostly promoter-dependent.
Transgene Expression from the Weak Heterologous Promoter is Principally Locus-Dependent
We expected the predicted GSH loci to support elevated expression of a transgene under the control of a strong heterologous promoter. However, EGFP expression from the non-GSH cOVA locus greatly outweighed EGFP expression from the predicted GSH loci. We therefore assumed that the presence of a strong heterologous promoter unpredictably affects the expression of the integrated transgene. Thus, in the second preliminary study, we set out to evaluate the expression of EGFP under the control of a weak promoter integrated into the predicted GSH loci, cROSA and cHIPP, as well as the non-GSH cOVA locus (Additional file 7A; Fig. 4A, B). To this end, we generated three new targeting vectors named ∆VR, ∆VH, and ∆VO (Fig. 4B; C-a, D-a, E-a), in which EGFP was under the control of ∆CMV. Heterogeneous cell pools harboring DsRed2-∆CMV-EGFP were cultured for two months. The 5’/3’ junction PCR (Additional file 9A-b, A-e, B-b, B-e, C-b, C-e), restriction enzyme digestion of the amplicons (Additional file 9A-c, B-c, C-c), and Sanger sequencing (Additional file 10A-b, B-b, C-b) were performed to verify the knock-in of ∆VR, ∆VH, and ∆VO in the designated loci. In the absence of locus-specific gRNAs, 5’/3’ junction PCR did not detect knock-ins in the experimental groups (Additional file 9A-b, A-e, B-b, B-e, C-b, C-e).
Fig. 4 Transgene expression from the weak heterologous promoter is principally locus-dependent. CRISPR-mediated integration of ∆CMV-driven EGFP and promoter-less DsRed2 in the predicted chicken GSH loci and non-GSH locus was performed in DF1 cell lines. A Schematic depiction of ∆CMV-EGFP-expressing heterogeneous cell pools at the end of MTH2; B Schematic illustration of the CMV and ∆CMV promoters as well as negatively and positively regulated transcription factor response elements (TFREs). C-a, D-a, E-a) CRISPR-mediated integration of DsRed2-∆CMV-EGFP in cROSA, cHIPP, and cOVA loci verified by 5’/3’ junction PCR, restriction enzyme digestion of the amplicons, and Sanger sequencing; C-b, D-b, E-b) Flow cytometry results from EGFP-expressing cROSA, cHIPP, and cOVA cells obtained in three individual experiments (each in triplicate). Non-transfected cells were used as a negative control. The average expression of EGFP for cROSA, cHIPP, and cOVA cell pools is shown (green square). Expression of EGFP was detected in the green channel. No expression signal was detected in the red channel (red square); C-c, C-d, C-e, D-c, D-d, D-e) Comparison of the integrated density (ID) index, mean fluorescent intensity (MFI) index, and copy numbers (CN) of EGFP transcripts between the main experimental groups (i.e., cROSA and cHIPP) and the control group (cOVA). C-f, D-f, E-c, C-g, D-g, E-d) Fluorescence microscope images of the cells expressing ∆CMV-driven EGFP and CMV-driven EGFP heterogeneously (Scale bar: 100 μm). ns: non-significant, ***: p < 0.005, and ****: p < 0.0001 are statistically significant. Avg.: The average expression of EGFP. N: Number
At the end of MTH2, 23.17%, 21.65%, and 25.07% of cells harboring the transgene in the cROSA, cHIPP, and cOVA loci, respectively, were EGFP-positive (Fig. 4C-b, D-b, E-b). In contrast to CMV, ∆CMV shifted the MFI index and ID index in favor of the GSH loci. The MFI index of ∆CMV-EGFP inserted in the cHIPP locus was significantly higher than that of ∆CMV-EGFP inserted in the cOVA locus (p < 0.0001; Fig. 4D-c), but there was no statistically significant difference between the MFI of ∆CMV-EGFP inserted in the cROSA and cOVA loci (Fig. 4C-c). Highly variable levels of EGFP expression were observed in the heterogeneous cell pools, as demonstrated by wide histograms (Fig. 4C-b, D-b, E-b). Moreover, no expression of the promoter-less DsRed2 was detected at any locus (red square), as judged by flow cytometry (Fig. 4C-b, D-b, E-b). The ID index was calculated by analyzing images captured from each locus (Additional file 14). The ID index findings supported the MFI index (Fig. 4C-d, D-d). The copy number of transcripts from ∆CMV-driven EGFP inserted in the cROSA and cHIPP loci was significantly higher than that of transcripts from the cOVA locus (p < 0.005 and p < 0.0001, respectively; Fig. 4C-e, D-e).
Fig. 5 Transgene expression from the weak heterologous promoter is consistent and homogeneous in the potential GSH loci of isogenous cell clones. Clonally expanded isogenous cells harboring the ∆CMV-driven EGFP in the potential GSH loci were able to consistently express the transgene. A-a, B-a) Schematic depiction of the process in which clonally isolated cells were cultured for about six months. Offset (A-b, B-b for cROSA; A-e, B-e for cHIPP; A-h, B-h for cOVA at the end of MTH4 and MTH6) and overlay (A-c, B-c for cROSA; A-f, B-f for cHIPP; A-i, B-i for cOVA at the end of MTH4 and MTH6) plots illustrate the EGFP expression levels for correctly targeted isogenous cell clones targeted at the cROSA (clones R2, R5, R8), cHIPP (clones H1, H4, H6), and cOVA (clones O3, O5, O8) loci. A shift of the peak to the right in the offsets shows an increase in the expression of EGFP (arrows show the high density of EGFP-positive cell clones). The MFI index of the cROSA (A-d, B-d) and cHIPP (A-g, B-g) clones was compared with that of cOVA clones at the end of MTH4 and MTH6. Green squares show the average expression of EGFP for cROSA, cHIPP, and cOVA clones in the green channel. No expression signal was detected in the red channel (red square). The integrated density (ID) index was compared using the ImageJ (A-j, B-j) and GNUastro (A-k, B-k) software. The copy number of EGFP transcripts (A-l, B-l) and the expression levels of EGFP (A-m, B-m) were determined in the main experimental groups (i.e., cROSA and cHIPP) versus the control group (cOVA) at the end of MTH4 and MTH6. ns: non-significant, ***: p < 0.005, and ****: p < 0.0001 are statistically significant. Avg.: The average expression of EGFP. N: number. Integrated density by ImageJ (ID by Im.J). Integrated density by GNUastro (ID by Gnu). Copy Number (CN) by qPCR. Expression by Western Blotting (Exp. by WB)
Comparison of EGFP expression and transcription status in heterogeneous cell pools harboring the CMV-driven EGFP or ∆CMV-driven EGFP integrated into the designated loci confirmed that the strong activity of the CMV promoter was significantly reduced when the promoter was changed to ∆CMV (p < 0.0001), as judged by the MFI index, ID index, and qPCR (Additional file 15A, B, C). Fluorescence microscopy images also showed a reduction in fluorescence intensity when the weak promoter was used (Fig. 4C-f, C-g, D-f, D-g, E-c, E-d). The only exception was the MFI result for the cHIPP locus, where there was no significant difference in the MFI index between CMV-driven EGFP and ∆CMV-driven EGFP (Additional file 15B-a). Overall, these results highlighted the benefit of a weak heterologous promoter for evaluating and identifying potential GSH loci.
Transgene Expression from the Weak Heterologous Promoter is Consistent and Homogenous in the Potential GSH Loci of Isogenous Cell Clones
We reasoned that the expression of the transgene under the control of a weak promoter in the potential GSH loci might be more locus-dependent and homogeneous in isogenous cell clones. To this end, isogenous cell clones were isolated from the heterogeneous cell pools harboring ∆CMV-driven EGFP (integrated into GSH and non-GSH loci) that had been in culture for more than 2 months (Additional file 7A). The R2, R5, and R8 clones (cROSA clones that contain DsRed2-∆CMV-EGFP-IRES-PACr in the cROSA locus), the H1, H4, and H6 clones (cHIPP clones that contain the same cassette in the cHIPP locus), and the O3, O5, and O8 clones (cOVA clones that contain the same cassette in the cOVA locus) were expanded and analyzed at the end of MTH4 (Fig. 5A-a) and MTH6 (Fig. 5B-a). After isolation by the limiting dilution method (Additional file 8A), single-cell clones were screened for bi- or mono-allelic knock-ins (Additional file 8B) and subjected to 5’/3’ junction PCR (Additional file 8C), with further validation by restriction enzyme digestion (Additional file 8D) and Sanger sequencing (Additional file 10A-c, B-c, C-c). To confirm single-copy transgene knock-in, the copy number of EGFP transcripts transcribed from the GSH loci and non-GSH locus was determined (Additional file 8E).
Evaluation of the homogeneity of EGFP expression in the correctly targeted isogenous cell clones showed that cHIPP clones had highly uniform levels of EGFP expression compared with the cROSA clones, as demonstrated by narrow histograms in the offset graph (Fig. 5A-b, A-e, B-b, B-e). Although EGFP expression was homogeneous in the cOVA clones, transgene silencing occurred over time, as judged by a shift of the peak to the left in the offset graph from MTH4 to MTH6 (Fig. 5A-h, B-h). The average expression of EGFP (green square) at MTH4 for cHIPP, cROSA, and cOVA was 98.27%, 94.54%, and 95.82%, respectively (Fig. 5A-c, A-f, A-i), while at MTH6 it was 92.05%, 94.95%, and 67.31%, respectively (Fig. 5B-c, B-f, B-i). Moreover, no expression of the promoter-less DsRed2 (red square) was detected at any locus during the six-month culture of these cells (Fig. 5A-c, A-f, A-i, B-c, B-f, B-i). Our results also showed that integration of the transgene in the candidate GSH loci does not alter the morphology or doubling time of cells, either in targeted heterogeneous pools or in isogenous clones (Additional file 15D, E, F, G).
At the end of MTH4, comparison of the MFI index of the GSH loci with that of the non-GSH locus showed that the MFI index of cROSA clones (Fig. 5A-d, p < 0.005) as well as cHIPP clones (Fig. 5A-g, p < 0.0001) was significantly higher than that of cOVA clones. At the end of MTH6, the same comparison showed that the cHIPP clones maintained their superior transgene expression over cOVA clones and were consistently expressing the transgene (Fig. 5B-g), whereas EGFP expression in cROSA clones had fallen to nearly the level seen in cOVA clones (Fig. 5B-d). At the end of MTH4, analysis of captured images from each locus with the ImageJ and GNUastro software (Additional file 16) showed that both cROSA (p < 0.005) and cHIPP (p < 0.0001) clones had a significantly higher ID index than cOVA clones (Fig. 5A-j). However, the comparison of this index between cROSA and cOVA clones showed no significant difference when judged by the GNUastro software (Fig. 5A-k). qPCR results showed that the copy number of EGFP transcripts transcribed from the cROSA and cHIPP loci (p < 0.0001) was significantly higher than that of transcripts from the cOVA locus (Fig. 5A-l). Moreover, western blot analyses confirmed that the expression levels of EGFP in cROSA and cHIPP clones were higher than those in cOVA clones (Fig. 5A-m and Additional file 17).
At the end of MTH6, the results were similar to those found at MTH4 (Fig. 5B-j, B-k, B-l, B-m). The only exception was that the ID index of cROSA clones was reduced compared with that at MTH4, and no significant difference from the ID index of cOVA clones was detected (Fig. 5B-j). The coefficient of variation (CV) of the ID index data for each locus at the end of MTH4 and MTH6 was compared to evaluate the homogeneity of transgene expression (Additional file 15A-d, B-d, C-d). Among all loci, the cHIPP locus supported the most homogeneous expression of the knocked-in transgene. Although the data showed that the cROSA locus can support long-term, stable transgene expression better than the cOVA locus, CV values for both loci increased over time, indicating growing heterogeneity in the expression of the transgene (Additional file 15A-d, B-d, C-d).
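As a reminder of how the coefficient of variation summarizes homogeneity, the sketch below computes CV as the standard deviation divided by the mean, expressed as a percentage. The use of the sample standard deviation and the function name are assumptions, and the example values are hypothetical placeholders rather than measured ID index data.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return CV (%) = 100 * sample standard deviation / mean.

    A lower CV indicates more homogeneous expression across measurements.
    """
    if len(values) < 2:
        raise ValueError("at least two values are needed for a sample SD")
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


# Hypothetical ID index readings for two clones (illustrative only).
uniform_clone = [10.0, 10.0, 10.0]
variable_clone = [8.0, 10.0, 12.0]
cv_uniform = coefficient_of_variation(uniform_clone)
cv_variable = coefficient_of_variation(variable_clone)
```

Comparing the CV of the same locus at two time points, as done above for MTH4 and MTH6, then amounts to checking whether this percentage rises or falls.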
To determine whether the isolated isogenous cell clones harbor a single copy or multiple copies of the EGFP transgene in the genome, a standard curve was plotted using a serial dilution of a mix containing the EGFP plasmid and the haploid equivalent of chicken genomic DNA (ratio 1:1). The Ct values of the isogenous cell clones indicated that the EGFP transgene was integrated into the genome of these clones as a single copy (Additional file 8E).
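The standard-curve approach can be sketched as a linear fit of Ct against log10 of the known copy number, followed by interpolation of an unknown sample. This is a generic illustration under assumed inputs: the dilution series, Ct values, and function names below are hypothetical and do not reproduce the data in Additional file 8E.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Interpolate the copy number of a sample from its Ct value."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series (copies per reaction) and Ct values.
std_copies = [1e1, 1e2, 1e3, 1e4]
std_cts = [30.0, 27.0, 24.0, 21.0]
slope, intercept = fit_standard_curve(std_copies, std_cts)

# Hypothetical clone sample: estimated transgene copies in the reaction.
estimated = copies_from_ct(24.0, slope, intercept)
```

Dividing the estimated transgene copies by the number of haploid genome equivalents loaded in the reaction would give copies per genome; a value close to one is what a single-copy integration would be expected to yield under these assumptions.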
Altogether, these results demonstrated that consistent and sustainable expression of a transgene could be achieved using weak promoters integrated into a GSH locus. Among the evaluated GSH loci, the cHIPP locus supports consistent and homogeneous expression of the transgene better than the cROSA locus.