Analysis of 55 Kidd ancestry SNPs in Qatari population using ForenSeq Universal software & STRUCTURE software

This study found no significant deviations (P > 0.05) from Hardy-Weinberg (HW) equilibrium for the 55 AISNPs in the Qatari population. FROG-kb holds extensive allele frequency data on the 55 AISNPs for all 140 reference population samples. This completeness allows likelihood ratios to be calculated across all 140 populations for any DNA profile typed for the 55 AISNPs. STRUCTURE was used to estimate cluster membership values at K = 6, 7, 8, 9, 10 and 11 in the 140 reference populations (Fig. 1). At K = 9, the data come from the highest-likelihood run among those showing the most frequently occurring cluster pattern (10 of 20 runs). This study included 140 reference populations with 8184 individuals, including the Qatari population [Pakstis A.J. Gurkan C. Dogan M. Balkaya H.E. Dogan S. Neophytou P.I. Cherni L. Boussetta S. Khodjet-El-Khil H. ElGaaied A.B.A. Salvo N.M. Genetic relationships of European, Mediterranean, and SW Asian populations using a panel of 55 AISNPs.]. Of the 8184 individuals analysed, 72.3 % had complete genotypes for all 55 AISNPs, and 91.0 % had no more than three missing genotypes. The new Qatari population data generally showed very strong similarity to previously analysed Middle Eastern populations, which are well known to be closely related and geographically close [Rodriguez-Flores J.L. Fakhro K. Agosto-Perez F. Ramstetter M.D. Arbiza L. Vincent T.L. Robay A. Malek J.A. Suhre K. Chouchane L. Badii R. Indigenous Arabs are descendants of the earliest split from ancient Eurasian populations.]. Overall, the Qatari population resembled surrounding groups such as the Kuwaiti, U.A.E. and Saudi populations, as shown in Fig. 1(A). A more focused STRUCTURE analysis was also run on a reduced data set of sixty-nine populations, obtained by omitting the 25 populations from East Asia, the Pacific and the Americas.
This additional analysis, with fewer populations, allowed STRUCTURE to be used to investigate whether any new cluster patterns could be recognized. In the initial analysis of 140 populations, patterns among the nearby regions were obscured by the strong clusters in East Asia, the Pacific and the Americas [Pakstis A.J. Kang L. Liu L. Zhang Z. Jin T. Grigorenko E.L. Wendt F.R. Budowle B. Hadi S. Al Qahtani M.S. Morling N. Mogensen H.S. Themudo G.E. Soundararajan U. Rajeevan H. Kidd J.R. Kidd K.K. Increasing the reference populations for the 55 AISNP panel: the need and benefits.]. For the 69-population analysis, the bar plots for the most frequently occurring cluster pattern at each K level from 6 to 7 are shown in Fig. 1B. The results show the most common cluster pattern, taken from the highest-likelihood run, in this more focused STRUCTURE analysis. The cluster pattern for North African, Southwest Asian and sub-Saharan populations was observed at K = 10 and K = 11 in the analysis of 140 populations. This was similar to the pattern reported earlier by Pakstis et al. (2019).

Fig. 1. (A) STRUCTURE results showing estimated cluster membership values at K = 11 in 140 reference populations, including the Qatari data, for the 55 ancestry informative SNPs. The new populations, including the Qatari population, generally show strong similarity to previously analyzed populations known to be closely related (population abbreviations are given in Table 1 of the Supplementary material). (B) STRUCTURE population bar plots showing estimated cluster membership values at K = 5 and K = 6 in 69 reference populations. The maximum-likelihood run out of 20 runs is shown for K = 7. There are 69 population bars, each of the same width. The height of each colour within a population bar corresponds to the average estimated cluster membership across all individuals in that population.
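The population bars described in the caption can be illustrated with a minimal Python sketch. It assumes each individual has a vector of K estimated cluster membership values and averages those vectors within each population. The function name, population labels and numbers below are hypothetical and are not taken from the study data or from STRUCTURE itself.

```python
from collections import defaultdict


def population_mean_membership(individuals):
    """Average per-individual cluster memberships within each population.

    individuals: iterable of (population_label, [q_1, ..., q_K]) pairs,
    where q_k is the individual's estimated membership in cluster k.
    Returns {population_label: [mean q_1, ..., mean q_K]}.
    """
    sums = {}
    counts = defaultdict(int)
    for pop, q in individuals:
        if pop not in sums:
            sums[pop] = [0.0] * len(q)
        for k, value in enumerate(q):
            sums[pop][k] += value
        counts[pop] += 1
    return {
        pop: [total / counts[pop] for total in totals]
        for pop, totals in sums.items()
    }


# Hypothetical values for illustration only (K = 2, two populations).
example = [
    ("PopA", [0.8, 0.2]),
    ("PopA", [0.6, 0.4]),
    ("PopB", [0.1, 0.9]),
]
means = population_mean_membership(example)
```

Each entry of `means` gives the colour heights of one population bar; because every individual's memberships sum to 1, the averaged heights also sum to 1.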

Genotype data for three Qatari individuals were analysed in FROG-kb using the 55 AISNP panel. The likelihood ratio was calculated as the probability of the genotype in the most likely population divided by its probability in the specified population. The highest-ranked likelihoods for the top 15 populations were within one order of magnitude of each other. FROG-kb listed 45 populations, ordered by calculated probability, for the three Qatari individuals. By the likelihood rule, the population in which the genotype has the highest probability is the most likely population of origin, and the likelihood ratios indicate how much more likely that population is than each of the others. Table (1C) shows that populations with a likelihood ratio of 100 or more are significantly less likely to be the ancestry of the sample.
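The likelihood ratio described above can be sketched in Python. The sketch assumes that the genotype probability in each population is the product of per-SNP genotype frequencies under Hardy-Weinberg proportions, with SNPs treated as independent; that is an assumption of this example, not a statement of how FROG-kb is implemented. The function names, population labels and allele frequencies are hypothetical.

```python
import math


def genotype_log10_prob(genotypes, allele_freqs):
    """log10 probability of a multi-SNP genotype in one population.

    genotypes: per-SNP count (0, 1 or 2) of a chosen reference allele.
    allele_freqs: per-SNP frequency p of that allele in the population,
    assumed strictly between 0 and 1 so that no genotype has zero probability.
    Assumes Hardy-Weinberg proportions and independence between SNPs.
    """
    total = 0.0
    for g, p in zip(genotypes, allele_freqs):
        q = 1.0 - p
        prob = (q * q, 2.0 * p * q, p * p)[g]
        total += math.log10(prob)
    return total


def rank_populations(genotypes, freq_table):
    """Return (population, likelihood ratio) pairs, most likely first.

    The ratio is P(genotype | best population) / P(genotype | population),
    so the best population has a ratio of 1.
    """
    logp = {
        pop: genotype_log10_prob(genotypes, freqs)
        for pop, freqs in freq_table.items()
    }
    best = max(logp.values())
    ordered = sorted(logp.items(), key=lambda item: item[1], reverse=True)
    return [(pop, 10.0 ** (best - lp)) for pop, lp in ordered]


# Hypothetical allele frequencies for three SNPs in three populations.
freq_table = {
    "PopA": [0.9, 0.5, 0.1],
    "PopB": [0.5, 0.5, 0.5],
    "PopC": [0.1, 0.5, 0.9],
}
ranked = rank_populations([2, 1, 0], freq_table)

# Populations with a ratio of 100 or more are treated as much less likely.
unlikely = [pop for pop, ratio in ranked if ratio >= 100]
```

Working in log10 keeps the product over 55 SNPs from underflowing and makes the order-of-magnitude comparison direct: a ratio below 10 means the population is within one order of magnitude of the best one.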

For the Qatari individuals, the populations within one order of magnitude were mostly from the Mediterranean area [Rodriguez-Flores J.L. Fakhro K. Agosto-Perez F. Ramstetter M.D. Arbiza L. Vincent T.L. Robay A. Malek J.A. Suhre K. Chouchane L. Badii R. Indigenous Arabs are descendants of the earliest split from ancient Eurasian populations.]. The high-ranking populations for the Qatari individuals were from the Middle Eastern, African and South Asian regions. Tunisian individuals clustered together or with populations from the nearby Middle East [Cherni L. Pakstis A.J. Boussetta S. Elkamel S. Frigi S. Khodjet-El-Khil H. Barton A. Haigh E. Speed W.C. Ben Ammar Elgaaied A. Kidd J.R. Kidd K.K. Genetic variation in Tunisia in the context of human diversity worldwide.]. Thus, considering the set of populations with non-significant likelihood ratios was more informative for ancestry assignment than using only the single most likely population. The Qatari population is related to other populations of Middle Eastern origin, in line with geography, and is most similar to the UAE, Kuwaiti and Saudi Arabian populations. FROG-kb was found to be quite precise at assigning individual samples to the relevant population groups. FROG-kb reports relative likelihoods of ancestry from different populations for user-specified genotypes; such results are probabilistic, not absolute. The STRUCTURE results for the Qatari population likewise indicated that, among the reference populations, the population of origin was most likely very close to the Middle Eastern populations, in agreement with the FROG-kb comparison. The ForenSeq™ Universal Software did not include any Middle Eastern reference populations, so its ancestry analysis clustered the Qatari individuals with the Admixed American group, which is not the correct result.
The software needs to be updated with a more comprehensive reference database covering a larger number of populations and samples. Forensic applications require accurate biogeographic ancestry assignment, which in turn requires a sufficient number of AISNPs and adequate reference data from appropriate populations.
