Strategies for pairwise searches in forensic kinship analysis

1. Introduction

Inferring the relationship between pairs of individuals is central to many forensic applications. Examples include mass fatality incidents, which can be the result of accidental catastrophes like air crashes with a list of known victims [Olaisen B. Stenersen M. M. M. Identification by DNA analysis of the victims of the August 1996 Spitsbergen civil aircraft disaster.] or shipwrecks without passenger lists [Bertoglio B. Grignani P. Di Simone P. Polizzi N. De Angelis D. Cattaneo C. Iadicicco A. Fattorini P. Presciuttini S. Previderè C. Disaster victim identification by kinship analysis: the Lampedusa October 3rd, 2013 shipwreck., Olivieri L. Mazzarelli D. Bertoglio B. De Angelis D. Previderè C. Grignani P. Cappella A. Presciuttini S. Bertuglia C. Di Simone P. Challenges in the identification of dead migrants in the Mediterranean: the case study of the Lampedusa shipwreck of October 3rd 2013.]. Other applications are natural disasters like tsunamis, where the number of victims is unknown [Some mathematical problems in the DNA identification of victims in the 2004 tsunami and similar mass fatalities.] and terrorism-related events [Issues and strategies in the DNA identification of World Trade Center victims.]. The aim is to link DNA samples from the scene to putative victims (e.g. individuals reported missing since the event) and is known as disaster victim identification (DVI). There are various other important applications like searching for relationships among individuals in mass graves of archaeological relevance [Parsons T.J. Huel R.M.L. Bajunović Z. Rizvić A. Large scale DNA identification: The ICMP experience., Palomo-Díez S. Esparza-Arroyo A. Tirado-Vizcaíno M. Velasco Vázquez J. López-Parra A.M. Gomes C. Baeza-Richer C. Arroyo-Pardo E. Kinship analysis and allelic dropout: a forensic approach on an archaeological case., Palomo-Díez S. López-Parra A.M. Gomes C. Baeza-Richer C. Esparza-Arroyo A. Arroyo-Pardo E. Kinship analysis in mass graves: evaluation of the Blind Search tool of the Familias 3.0 Software in critical samples.]. We may also check databases collected to estimate population statistics such as allele frequencies. Duplicates and close relatives should be excluded prior to the statistical analysis to avoid biased estimates of allele frequencies [Pemberton T.J. Wang C. Li J.Z. Rosenberg N.A. Inference of unexpected genetic relatedness among individuals in hapmap phase iii.].

As these cases involve unidentified DNA samples, a first step in the investigation is to screen the data for related samples. This initial step is referred to as a blind search [Egeland T. Kling D. Mostad P. Relationship inference with Familias and R: Statistical methods in forensic genetics.]. It is helpful to first position the topics that we are addressing in the wider context of database searching. Assume that there is a case database of DNA profiles. This could comprise profiles obtained from a crime scene, a disaster site or a burial site. In addition, there may be a reference database of DNA profiles like a national database of convicted offenders. There are various searches that can be performed to detect pairwise relationships, as illustrated in Fig. 1.

Fig. 1. Different database searches. 1. Direct search: Search for direct matches between or within databases. 2. Familial search: Search for related individuals between databases. 3. Blind search: Search for related individuals within databases.

In a blind search, comparisons are performed among all pairs of DNA samples. A likelihood ratio (LR), comparing the relationship specified by H1 to the one specified by H0, is computed for each pair. The LRs summarise the statistical DNA evidence. For pre-specified threshold values t0 and t1, small values LR < t0 are often interpreted as supporting H0, while large values LR > t1 favour H1. A blind search typically involves a large number of comparisons. If there are n profiles in the database, the number of comparisons is n(n − 1)∕2, e.g. 4950 comparisons for 100 profiles. The implications of this high number of pairwise comparisons in a blind search are of key concern in this paper. Also, it is not obvious how the thresholds t0 and t1 should be specified. Conventional thresholds used in paternity testing, for example, may not apply. The false positive rate FPR = P(LR > t1∣H0) and false negative rate FNR = P(LR < t0∣H1) should both be close to 0. Even if these error rates are small for each comparison, the probability that errors occur when many comparisons are done may be considerable. Determination of thresholds and optimisation of search strategies have been discussed in connection with database searches and familial searching [Kruijver M. Meester R. Slooten K. Optimal strategies for familial searching.]. The classical statistical theory of multiple testing [The positive false discovery rate: a Bayesian interpretation and the q-value.] is also relevant.

Current implementations of blind search are limited to fairly simple outbred pedigree structures connecting the two individuals of interest. For example, Familias [Egeland T. Kling D. Mostad P. Relationship inference with Familias and R: Statistical methods in forensic genetics.], a freely available kinship software package, accommodates parent offspring (PO), sibling (S), half sibling (H), first cousin (FC) and second cousin (SC) [Kling D. Tillmar A.O. Egeland T. Familias 3–extensions and new functionality., Egeland T. Mostad P.F. Mevåg B. Stenersen M. Beyond traditional paternity and identification cases: selecting the most probable pedigree.]. We model general pairwise relationships, possibly with inbreeding, using the Jacquard coefficients [Genetic information given by a relative.]. By including X-chromosomal markers, some additional relationships can be addressed. For instance, paternal and maternal half sisters can be distinguished.

Prior, non-DNA, information can sometimes be important. For instance, two individuals of the same age cannot possibly constitute a parent-offspring pair even if the DNA profiles suggest otherwise. To formally include prior information, we require a Bayesian approach. In the Bayesian framework we start out with a set of prior probabilities, reflecting our belief in the hypotheses, before considering any genetic data. Our belief in each hypothesis is then updated by incorporating the DNA information. Informative priors can contribute additional information to the genetic data and this will be reflected in the posterior probabilities. A more general prior distribution for pedigrees has been discussed elsewhere [Structured incorporation of prior information in relationship identification problems.].

Our paper is structured as follows. We first review the parametric representation of relationships and the corresponding parametric likelihood and likelihood ratio, for both autosomal and X-chromosomal markers. A review of the Bayesian approach to kinship testing is given, before we return to the likelihood ratio and its properties. These properties are incorporated when presenting the theory for evaluating the performance of a blind search. We then introduce the data used in the results section and give a brief description of our implementation. We provide several examples and conclude with a discussion of the challenges and the advantages of the work we present.

5. Results

The first example shows that the LRs in a blind search are not independent. The second example demonstrates how to evaluate the performance of a blind search such as we present in the third example. We then carry out a blind search on X-chromosomal markers before showing how inbreeding can be accommodated.

5.1 Correlation between LRs in a blind search

In this example, we show by simulation a case where the LRs of a blind search are correlated. Consider the pedigree in Fig. 6 and the hypotheses H1 stating a sibling relationship and H0 unrelated. Let LR1,3 denote the likelihood ratio when individual 1 is compared to 3 and define LR2,3 analogously. We use 10 independent loci, each with 10 alleles and equal allele frequencies of 0.1. Note that the LRs are random variables. We simulate 1000 sets of DNA profiles for the three shaded individuals of the pedigree in Fig. 6. The values of LR1,3 and LR2,3 are computed for each simulation. The results are shown in the scatter plot in Fig. 6, where the red line is a regression line.

Fig. 6. Figure corresponding to the correlation discussion in Section 5.1. Left: Scatter plot of LRs of simulated data for two siblings and an unrelated individual. Red line shows regression line. Right: Pedigree used for simulation of data, identifying the id labels 1, 2 and 3 in left panel.

The estimated correlation between the logarithmic values of LR1,3 and LR2,3 is 0.484. This shows that the LRs are not independent. In other words, the outcome of different comparisons cannot be interpreted independently if one individual is involved in several comparisons. We elaborate on the implications of this correlation in the discussion.
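The simulation described above can be sketched as follows. This is a minimal illustration, not the implementation used for Fig. 6: it assumes Hardy-Weinberg equilibrium, no mutation, and the standard IBD-based single-locus LR with sibling coefficients (0.25, 0.5, 0.25). The function names and the random seed are hypothetical.

```python
import numpy as np

KAPPA_SIB = (0.25, 0.5, 0.25)  # IBD coefficients (k0, k1, k2) for full siblings


def genotype_prob(g, p):
    """Probability of an unordered genotype g = (a, b) under Hardy-Weinberg."""
    a, b = g
    return p[a] ** 2 if a == b else 2 * p[a] * p[b]


def lr_vs_unrelated(g1, g2, p, kappa):
    """Single-locus LR: relationship with IBD coefficients kappa vs. unrelated."""
    k0, k1, k2 = kappa
    c, d = g2

    def given_shared(s):
        # P(g2 | one allele of g2 is a copy of allele s, the other is random)
        if c == d:
            return p[c] if s == c else 0.0
        return (p[d] if s == c else 0.0) + (p[c] if s == d else 0.0)

    ibd1 = 0.5 * (given_shared(g1[0]) + given_shared(g1[1]))
    ibd2 = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return k0 + (k1 * ibd1 + k2 * ibd2) / genotype_prob(g2, p)


def simulate_log_lrs(n_sim=1000, n_loci=10, n_alleles=10, seed=1):
    """Simulate siblings 1 and 2 plus unrelated individual 3; return log10 LRs."""
    rng = np.random.default_rng(seed)
    p = np.full(n_alleles, 1.0 / n_alleles)
    log_lr13 = np.zeros(n_sim)
    log_lr23 = np.zeros(n_sim)
    for s in range(n_sim):
        for _ in range(n_loci):
            father = rng.choice(n_alleles, size=2, p=p)
            mother = rng.choice(n_alleles, size=2, p=p)
            sib1 = (rng.choice(father), rng.choice(mother))
            sib2 = (rng.choice(father), rng.choice(mother))
            unrel = tuple(rng.choice(n_alleles, size=2, p=p))
            log_lr13[s] += np.log10(lr_vs_unrelated(sib1, unrel, p, KAPPA_SIB))
            log_lr23[s] += np.log10(lr_vs_unrelated(sib2, unrel, p, KAPPA_SIB))
    return log_lr13, log_lr23


if __name__ == "__main__":
    a, b = simulate_log_lrs()
    print("correlation of log LRs:", np.corrcoef(a, b)[0, 1])
```

Both LRs test "sibling versus unrelated" for a pair that is in fact unrelated; the dependence arises only because individual 3 appears in both comparisons and individuals 1 and 2 share alleles.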

5.2 From FWER to choice of LR threshold

In Section 5.3, we carry out a blind search among 65 individuals, genotyped for a set of 27 STR loci. Here, we present the preliminary evaluation required to obtain optimal LR thresholds for that search.

The first step is to decide on an acceptable value of α. From this value of α we can decide on an upper limit of the FPR and then the corresponding optimal LR threshold. For a blind search of n = 65 individuals, with the requirement that α ≤ 0.05, Equation (8) gives an upper limit for the false positive rate of FPR_0.05 = 2.404 ⋅ 10^−5.
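The upper limit quoted above can be reproduced with a short calculation. The sketch assumes that Equation (8), which is not shown in this section, is the Bonferroni bound FWER ≤ N ⋅ FPR with N = n(n − 1)∕2 comparisons; the Discussion refers to (8) as the Bonferroni bound, and the numbers agree. The function names are hypothetical.

```python
def n_pairs(n):
    """Number of pairwise comparisons among n profiles."""
    return n * (n - 1) // 2


def per_comparison_fpr(alpha, n):
    """Largest per-comparison FPR that keeps the Bonferroni bound on the FWER at alpha."""
    return alpha / n_pairs(n)


if __name__ == "__main__":
    print(n_pairs(100))                   # 4950 comparisons for 100 profiles
    print(n_pairs(65))                    # 2080 comparisons for 65 profiles
    print(per_comparison_fpr(0.05, 65))   # about 2.404e-05
```

Because the bound does not require independence between comparisons, it remains valid when the LRs are correlated.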

The next step is to analyse how the FPR and TPR relate to each other for this particular set of markers. This depends on what hypotheses we test in the blind search. In the following example, we consider the hypotheses H1: PO, H2: S, H3: H/U/G and H4: FC, all against H0: UN. We therefore consider these hypotheses here when estimating FPR and TPR.

Fig. 7 shows ROC curves for the different hypotheses. The values for FPR^ and TPR^ are estimated from simulated data, as described in Section 3.2. For H1: PO, we only obtained estimates of FPR smaller than 10^−7, with a corresponding estimated TPR of 0.999 or higher. This shows that the LR comparing PO to UN is high when the true relationship is PO and low otherwise. Parent offspring and unrelated individuals are easily distinguished, as expected, and so we have omitted this curve from the graph.

Fig. 7. ROC curves for the analysis performed in Section 5.2. The hypothesis H1 stating S, H and FC and H0 unrelated, using 27 STR markers. ROC curves from simulated data. A threshold of 11 corresponds to an estimated false positive rate of about 0.01 and an estimated true positive rate of about 0.26 for FC. The right figure shows the same estimated ROC curves and the line FPR = TPR, with an untransformed first axis.

The ROC curves show the estimated properties of a single computation of the LR, for the respective hypotheses, for this specific set of STR markers. The curves do not depend on the number of individuals in the blind search.
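As an illustration of how such curves can be estimated, the sketch below computes empirical FPR and TPR values from two sets of simulated LRs, one generated under H0 and one under H1. It assumes the estimates are simple exceedance proportions; Section 3.2 is not reproduced here, so the exact estimator used for Fig. 7 may differ. The log-normal LR samples are placeholders, not data from the paper.

```python
import numpy as np


def roc_points(lr_h0, lr_h1, thresholds):
    """Empirical FPR and TPR: proportion of simulated LRs exceeding each threshold."""
    lr_h0 = np.asarray(lr_h0, dtype=float)
    lr_h1 = np.asarray(lr_h1, dtype=float)
    fpr = np.array([(lr_h0 > t).mean() for t in thresholds])
    tpr = np.array([(lr_h1 > t).mean() for t in thresholds])
    return fpr, tpr


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Placeholder LR samples (hypothetical): replace with LRs from simulated profiles.
    lr_h0 = 10 ** rng.normal(-1.0, 1.0, size=10_000)  # simulated under H0
    lr_h1 = 10 ** rng.normal(1.0, 1.0, size=10_000)   # simulated under H1
    thresholds = [0.1, 1, 10, 100]
    fpr, tpr = roc_points(lr_h0, lr_h1, thresholds)
    for t, f, s in zip(thresholds, fpr, tpr):
        print(f"t = {t}: FPR = {f:.4f}, TPR = {s:.4f}")
```

Each point depends only on the distribution of a single LR under each hypothesis, which is why the curves do not change with the number of individuals in the search.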

The last step in the performance analysis is to identify the optimal threshold, by minimizing ER(t), with the constraint FPR^ ≤ FPR_α. The highest optimal thresholds for α = 0.05 and α = 0.1 are listed in Table 3.

Table 3. Optimal thresholds for different relationships, with corresponding FPR^ and TPR^, for the analysis performed in Section 5.2. For α = 0.05 (left table) and α = 0.1 (right table), for blind search with n = 65 individuals.

5.3 Blind search with real data

In this example we do a blind search on the data described in Section 4.1. The data set contains 65 DNA profiles. A blind search among these profiles results in 2080 pairwise comparisons. We want to test the hypotheses H1: PO, κ1 = (0, 1, 0), H2: S, κ2 = (0.25, 0.5, 0.25), H3: H/U/G, κ3 = (0.5, 0.5, 0), H4: FC, κ4 = (0.9375, 0.0625, 0), against H0: UN, κ0 = (1, 0, 0). In the previous section, we obtained optimal thresholds for blind searches with these hypotheses (Table 3). A stepwise mutation model is implemented in the evaluation of PO.

Table 4 summarises the blind searches performed on the real data. This table is possible to construct because we know the true relationship for each pair from the pedigree information. In practice, only the sum of the last three rows (for each relationship) would be known.

Table 4. Results of the blind search among n = 65 individuals in Section 5.3 with α = 0.05 where N1 denotes the total number of pairs in the sample with the tested relationship, TP is the number of these pairs with a LR above the threshold, and FP is the number of unrelated individuals with a LR above the threshold. The last row gives the number of other (differently related) pairs with an LR above the threshold.

For PO, we are left with a list of 47 hits. Of these, 43 are true PO pairs, while 4 are pairs of individuals with another relationship. Three pairs with a true PO relationship are not detected. Lowering the threshold could have detected these 3 pairs, but a lower threshold also increases the probability of false positives. For S, only one true sibling pair is not detected and there is only one false positive. However, the list of hits contains 66 pairs of individuals, 53 of these having another relationship.

We conclude that the summary in Table 4 is consistent with the performance evaluation shown in Table 3. PO can easily be distinguished from UN. The more distant the tested relationship, the lower the power to distinguish it from unrelated. With the obtained optimal thresholds, the number of false positives stays low as desired. For each hypothesis tested, the list of pairs warranting further investigation comprises those in the final row of Table 4, i.e., those who do not have the tested relationship and who are also not unrelated.

5.4 Analysis of posterior probabilities

The result of each of the blind searches performed in Section 5.3 is a list of pairs with a LR above the threshold. Some pairs of individuals may appear in several of the lists, while other pairs may not be present in any of the lists. In this example, we turn to Bayesian analysis to further investigate specific pairs.

Table 5 shows LR values for 7 pairs from the above blind search. Values above the LR thresholds are in bold font. The rightmost column gives the true relationship. Only the first two pairs have LR values above the thresholds given in the left table of Table 3 corresponding to α = 0.05. For pairs 3 to 7, the LRs are low, some below 1, indicating that a UN relationship is more plausible than the alternative hypothesis.

Table 5. LR values for seven pairs of the blind search in Section 5.3. Values for H, U and G are the same and shown in the column H/U/G. Values smaller than 10^−6 are set to 0.

Next we calculate posterior probabilities to see if it is possible to infer a relationship for the different pairs. LR thresholds are not required for this. Table 6 shows posterior probabilities for the different hypotheses, with flat prior probabilities, i.e., πi = 1∕7 for i = 0, . . . , 6. The highest probability for each pair is in bold and corresponds to the true relationship for several of the pairs. For example, the LRs in Table 5 comparing S, H/U/G and FC against UN for the second pair were all above the relevant LR thresholds. The posterior probability of S is close to 1, now making it possible to correctly infer this relationship. For pairs 3, 4 and 5, the highest posterior probabilities are just below 0.3. Even though the corresponding relationship is the most probable, a posterior probability of 0.3 may not be high enough to allow firm conclusions to be drawn.

Table 6. Posterior probabilities, computed from the LR values of Table 5, when applying a flat prior, i.e., πi = 1∕7 for i = 0, . . . , 6, as described in Section 5.4. Values for H, U and G are the same and shown in the column H/U/G. Probabilities smaller than 10^−6 are set to 0.

The relationships H, U and G are indistinguishable in the parametric framework presented in Section 2. Posterior probabilities with a flat prior, as in Table 6, cannot differentiate between them either. Additional information, preferably objective, needs to be considered.
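The step from LRs to posterior probabilities follows Bayes' theorem: with all LRs taken against the reference hypothesis UN (whose LR is 1), the posterior of hypothesis i is πi ⋅ LRi divided by the sum of πj ⋅ LRj over all hypotheses. The sketch below illustrates this with a flat prior. The LR values are made up for illustration and are not taken from Table 5; the function name is hypothetical.

```python
def posteriors(lrs, priors=None):
    """Posterior probabilities from LRs against a common reference hypothesis.

    lrs maps each hypothesis to its LR versus the reference (reference LR = 1).
    priors maps each hypothesis to its prior; a flat prior is used if omitted.
    """
    names = list(lrs)
    if priors is None:
        priors = {h: 1.0 / len(names) for h in names}
    weights = {h: priors[h] * lrs[h] for h in names}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical LRs for one pair; H, U and G share the same LR by construction.
    lrs = {"UN": 1.0, "PO": 0.0, "S": 2.0, "H": 5.0, "U": 5.0, "G": 5.0, "FC": 3.0}
    for h, prob in posteriors(lrs).items():
        print(f"{h}: {prob:.3f}")
```

With a flat prior, hypotheses with identical LRs necessarily receive identical posteriors, which is why H, U and G cannot be separated without an informative prior or further data.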

Suppose now that we have knowledge of how many pairs of the different relationships are present among the DNA profiles. This could be the case in a plane crash with a known passenger list. There are 1867 unrelated pairs, 46 parent-offspring pairs, 13 sibling pairs, 4 half sibling pairs, 33 avuncular pairs, 27 grandparental pairs and 21 first cousin pairs. The remaining 69 pairs have other more distant relationships not investigated here. The prior probabilities are then π0 = 0.928 (UN), π1 = 0.023 (PO), π2 = 0.006 (S), π3 = 0.002 (H), π4 = 0.016 (U), π5 = 0.013 (G) and π6 = 0.010 (FC).
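The listed priors can be reproduced from the pair counts, as the short check below shows. It normalises over the 2011 pairs that fall under one of the seven hypotheses; the 69 more distantly related pairs are left out of the denominator, which is what reproduces the rounded values given above.

```python
# Pair counts as given in the text, keyed by hypothesis.
counts = {"UN": 1867, "PO": 46, "S": 13, "H": 4, "U": 33, "G": 27, "FC": 21}

total = sum(counts.values())  # 2011 pairs covered by the seven hypotheses
priors = {h: c / total for h, c in counts.items()}

if __name__ == "__main__":
    print("pairs covered:", total, "of", total + 69)
    for h, prob in priors.items():
        print(f"{h}: {prob:.3f}")
```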

Posterior probabilities using these more informative priors are shown in Table 7. The prior probability of a PO relationship is π1 = 0.023, i.e., there is a chance of 2.3% that a pair of individuals has a PO relationship. The corresponding posterior probability for the first pair is 0.999. The genetic data give such strong support to PO that, even though the prior probability is low, the posterior probability of this relationship is approximately 1.

Table 7. Posterior probabilities with informative priors, as described in Section 5.4. Probabilities smaller than 10^−6 are set to 0.

In this blind search (as in most other blind searches), most pairs of individuals are unrelated, making the prior probability of UN close to 1 and the others low. This requires the LRs for the other relationships to be high in order to be supported by the posterior probabilities. For the relationships H/U/G and FC, the LR of the true relationships against UN is typically low. The combination of priors and LRs makes the posterior probability of UN high while the posterior probability of the true relationship remains low.

For this reason, this particular set of prior probabilities, even though objective, does not help us to distinguish between the H, U and G relationships in these data.

5.5 Blind search with X-chromosomal markers

Because a male has only one X-chromosome, paternal half sisters (HSP) must inherit the same X-chromosome from their common father. Their second X-chromosomes, inherited from their respective mothers, are not IBD (since their mothers are unrelated), and hence, the IBD coefficients for a HSP relationship are κ = (0, 1, 0). The IBD coefficients for maternal half siblings (HSM), whether considering X-chromosomal or autosomal markers, are κ = (0.5, 0.5, 0). In the following example, we show with simulated data how X-chromosomal markers can distinguish between HSP and HSM.

We simulated genotypes for 12 X-chromosomal STR markers, for the shaded individuals in Fig. 8. Genotypes are simulated for each locus independently, by gene-dropping through the pedigree structure. More specifically, genotypes are sampled for the founders of the pedigree (the parents) according to the allele population frequencies and passed down through the pedigree assuming the rules of Mendelian inheritance. Only the resulting genotypes of the offspring are kept for the applications in this example. Table 8 presents the average posterior probabilities over 100 simulations, for the relationships PO, S, HSP, HSM and UN, for the six possible comparisons between the individuals A, B, C and D. A flat prior πi = 1∕5 for i = 0, …, 4 is assumed.

Table 8. Posterior probabilities averaged over 100 simulations for the comparisons between the four daughters in Fig. 8.

The evidence in favour of C-D being HSM, shown in bold in Table 8, could not be obtained using autosomal markers. Since we are using a flat prior, the LR comparing maternal to paternal half sibs can be found from the posterior probability ratio, 0.81327/0.01916 = 42.4. This value may not be decisive on its own, but supplements other evidence. Note that HSP cannot be distinguished from PO using X-chromosomal markers alone, as the row for the comparison A-C confirms. Age information, autosomal marker data or other non-DNA data may solve such cases.

5.6 Half siblings with inbred founder

Computations of LRs and posterior probabilities are restricted to a limited set of predefined pedigree relationships in many current software implementations. The parametric form of the LR given in (3) enables us to compute LRs and do blind search for any pairwise relationship. In this example we show how background inbreeding can be modelled and how this can be taken into account in the Bayesian framework.

Assume a set of DNA profiles among which we want to do a blind search. The number of profiles is not important. The pedigrees connecting the individuals are unknown, but we know that the individuals come from a population where inbreeding is common. We consider the hypotheses H1: PO, H2: S, H3: H and H4: H with founder inbreeding f = 0.25, all against H0: UN. The relationship in H4 is shown in Fig. 9. Individuals A and B are outbred paternal half siblings, with the father being inbred with an inbreeding coefficient f = 0.25. This value of f corresponds to extreme inbreeding where the parents of the father are siblings. The IBD coefficients for the half sibling relationship are κ = (0.375, 0.625, 0).

Fig. 9. Half sibling pedigree with founder inbreeding assumed in the analysis in Section 5.6.
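The coefficients κ = (0.375, 0.625, 0) follow from a short argument. Each half sibling receives one of the father's two alleles. With probability 1∕2 they receive the same allele, and with probability 1∕2 they receive different alleles, which are themselves IBD with probability f. Hence κ1 = (1 + f)∕2, κ0 = (1 − f)∕2 and κ2 = 0, since the mothers are unrelated. A minimal sketch with a hypothetical function name:

```python
def kappa_half_sibs(f=0.0):
    """IBD coefficients (k0, k1, k2) for outbred half siblings whose common
    parent has inbreeding coefficient f; the other parents are unrelated."""
    k1 = 0.5 * (1.0 + f)
    return (1.0 - k1, k1, 0.0)


if __name__ == "__main__":
    print(kappa_half_sibs(0.0))   # ordinary half siblings: (0.5, 0.5, 0.0)
    print(kappa_half_sibs(0.25))  # inbred common parent:   (0.375, 0.625, 0.0)
```

As f decreases towards 0, the coefficients approach those of ordinary half siblings, so the two hypotheses become harder to separate.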

We consider one pair with true relationship H4. A total of 100 simulations of DNA profiles for this pair is performed. LRs and posterior probabilities, with a flat prior πi = 1∕5, i = 0, …, 4, are computed for each simulation. Mean values of the posterior probabilities for the hypotheses H0, . . . , H4 are: p̄0 = 0.019, p̄1 = 0.017, p̄2 = 0.094, p̄3 = 0.374 and p̄4 = 0.495. The mean posterior probability of H4, half siblings with founder inbreeding, is about 0.5, making it possible to distinguish it from the half sibling relationship without inbreeding.

The coefficient of inbreeding in this example is quite high. Lower values of f make the pair genetically more similar to half siblings without inbreeding, and distinguishing these relationships becomes harder without additional information. This high degree of inbreeding may be more relevant for non-human applications.

6. Discussion

The topic of this paper is blind search, a procedure used to search for pairwise relationships among a set of unidentified DNA profiles. Each pairwise comparison is similar to a kinship test performed, for instance, to resolve a paternity dispute. In the paper, we focus mainly on issues related to multiple testing. For this reason we will not discuss Hardy-Weinberg equilibrium and other assumptions that our applications share with other applications in forensic genetics. For instance, it is not obvious how evidence from different DNA sources like autosomal markers and X-chromosomal markers should be combined. However, this challenge is no different for a blind search than for a kinship test and is therefore not addressed here.

Case workers must decide on how the results of a blind search should be evaluated and reported. The context, or specific application, clearly matters. In a DVI application, a false identification is likely to be a more serious error than missing an identification. To account for this, the metric for determining the threshold in (7) allows a weight to be specified which would penalise false identifications. Other applications, such as screening a database for relatives prior to estimating allele frequencies, may not require a weighting for errors. If costs can be specified for the possible errors, optimal decision rules can be derived as explained in Chapter 8.1 of [Egeland T. Kling D. Mostad P. Relationship inference with Familias and R: Statistical methods in forensic genetics.]. However, there is hardly ever an objective way to balance the two errors that can occur and so specification of weights or costs may not be a viable option. We have used the unweighted form of the metric throughout. We have only considered binary decisions (corresponding to t0 = t1) as stated in the beginning of Section 2.4. One could drop this requirement and, with t0 < t1, declare a test to be inconclusive if the LR falls between the two thresholds. In this case, a cost for making no decision would have to be added and (7) modified accordingly.

We only presented one method to determine an optimal threshold based on the distance illustrated in Fig. 4 although several alternatives are available [Finding the optimal cut-point for Gaussian and Gamma distributed biomarkers.]. Results using different approaches were practically identical for the examples we presented and so we chose not to discuss the thresholds based on the other metrics. Furthermore, other approaches to obtain ROC curves could be considered. For instance, there are several ways to smooth ROC curves. It is also possible to provide confidence bands for the ROC curves and study the impact of assumptions. This has been explored in previous papers [Robin X. Turck N. Hainard A. Tiberti N. Lisacek F. Sanchez J. Müller M. pROC: an open-source package for R and S+ to analyze and compare ROC curves.].

Fig. 6 shows that the LRs from a blind search may be correlated when the same individual is involved in two comparisons. This has several implications. In particular, the results of different comparisons cannot be interpreted independently. Intuitively, we may get a high LR if unrelated individuals A and B happen to share a rare allele. Another individual C, who is a close relative of A, is likely to share this allele IBD with A and so we can also expect a high LR when comparing B and C. Importantly, the methods used to control the overall error rate must allow for dependence and for this reason we used the Bonferroni bound (8) as an upper limit for the FWER. Another frequently used measure to control the overall error rate in multiple testing scenarios is the false discovery rate (FDR) [The positive false discovery rate: a Bayesian interpretation and the q-value.]. When controlling the FDR, the outcome of each test is based on p-values. However, conventional significance testing based on p-values is not recommended to evaluate the strength of DNA evidence in forensic genetics [Kruijver M. Meester R. Slooten K. p-Values should not be used for evaluating the strength of DNA evidence., Gjertson D.W. Brenner C.H. Baur M.P. Carracedo A. Guidet F. Luque J.A. Lessig R. Mayr W.R. Pascali V.L. Prinz M. ISFG: recommendations on biostatistics in paternity testing.].

Furthermore, a blind search will not necessarily provide a globally consistent ‘solution’ in the sense that the LRs may support impossible combinations of relationships, like one individual having two mothers. An interesting extension to this paper would be to investigate alternative search strategies that may improve the results of a blind search. One strategy could be to do the search sequentially, where hypotheses to be tested depend on the outcome of the previous pairwise comparisons. For instance, if individuals A and B are classified as PO and A and C as PO, then it would be logical to test if B and C are siblings, half siblings or a grandparent-grandchild pair. There are also methods and software for pedigree reconstruction, see Chapter 8 of []. Finally, the true relationship may not be among the alternatives considered. This is also true for the Bayesian approach.

Compared with the frequentist alternative, a Bayesian interpretation might seem even more appropriate for blind search applications than for a kinship case. The alternative, based on the LR, is designed to deal with only two hypotheses. If there are several hypotheses, a reference hypothesis must be specified. The posterior probabilities reported using a Bayesian approach make comparison of several competing hypotheses simpler as they are between 0 and 1. However, as always, a prior is needed for the Bayesian approach and the choice of prior may be crucial. If DNA is of poor quality, leading to few markers being typed, or if the competing hypotheses specify relationships that are very close to each other, conclusions may hinge on the choice of the prior.

An important aspect of this paper is the use of the parametric representation of relationships. This enables us to investigate any admissible pairwise relationship between two outbred individuals. By defining founder inbreeding in a pedigree structure, as shown in Fig. 3, background inbreeding can also be modelled [Handling founder inbreeding in forensic kinship analysis.]. Rather than proposing specific alternative relationships, we could simply estimate the coefficients describing the relationship. In the outbred case, these estimates can be plotted in the IBD triangle in Fig. 2 which would indicate where these relationships lie in relation to the well known relationships. For instance, pairs with estimates close to (0.25, 0.25) could be classified as siblings.

Throughout, we have restricted attention to pairwise testing. In principle, the blind search can be extended to search for relationship between triplets. However, the parametric approach based on the Jacquard coefficients then becomes impractical. The number of parameters needed to describe the relationship between three individuals increases, from 2 to 15 in the outbred case.

Issues to do with reporting DNA evidence are currently of key interest as evidenced by the so-called “DNA database controversy” (see [The DNA database search controversy revisited: bridging the Bayesian–frequentist gap.] and references therein). The main message of this paper is that there are also problems related to multiple testing in kinship analyses which cannot be ignored.

B. Importance sampling

Importance sampling is a method that can be used to approximate small probabilities. We first introduce the indicator function I(LR ≥ t), which equals 1 if LR ≥ t and 0 otherwise. Under H0, its expected value is

E(I(LR ≥ t)) = 0 ⋅ P(LR < t) + 1 ⋅ P(LR ≥ t) = P(LR ≥ t) = FPR

It is therefore valid to say that FPR = E(I(LR ≥ t)). Then consider the expression for the expected value in a more general sense. The value of the function I is dependent on the value of the LR, which is a function of the genotypes G of the DNA profiles. The probability distribution of G is governed by the relationship that has generated the data. For this consideration, we assume that this relationship is either H0 or H1. Denote by X the values that I can take on. We then have

E(I(LR≥t))=∑jXj⋅P(Gj∣H0)≈1N∑si=1
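A sketch of how importance sampling can estimate the FPR is given below. It rests on the identity P(G∣H0) = P(G∣H1)∕LR(G), so that FPR = E(I(LR ≥ t)∕LR), with the expectation taken under H1. Whether the appendix uses exactly this estimator is an assumption, since the expression above is cut off. The example tests siblings against unrelated under Hardy-Weinberg equilibrium without mutation; all function names, the marker set and the seed are hypothetical.

```python
import numpy as np

KAPPA_SIB = (0.25, 0.5, 0.25)  # IBD coefficients (k0, k1, k2) for full siblings


def locus_lr(g1, g2, p, kappa=KAPPA_SIB):
    """Single-locus LR for a relationship with IBD coefficients kappa vs. unrelated."""
    k0, k1, k2 = kappa
    c, d = g2

    def given_shared(s):
        # P(g2 | one allele of g2 is a copy of allele s, the other is random)
        if c == d:
            return p[c] if s == c else 0.0
        return (p[d] if s == c else 0.0) + (p[c] if s == d else 0.0)

    p_g2 = p[c] ** 2 if c == d else 2 * p[c] * p[d]
    ibd1 = 0.5 * (given_shared(g1[0]) + given_shared(g1[1]))
    ibd2 = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return k0 + (k1 * ibd1 + k2 * ibd2) / p_g2


def draw_pair(rng, p, related):
    """Draw genotypes at one locus for two siblings (related) or two unrelated people."""
    n = len(p)
    if related:
        father = rng.choice(n, size=2, p=p)
        mother = rng.choice(n, size=2, p=p)
        return ((rng.choice(father), rng.choice(mother)),
                (rng.choice(father), rng.choice(mother)))
    return tuple(rng.choice(n, size=2, p=p)), tuple(rng.choice(n, size=2, p=p))


def simulate_lrs(n_sim, n_loci, p, related, rng):
    """Multi-locus LRs (product over independent loci) for simulated pairs."""
    lrs = np.ones(n_sim)
    for s in range(n_sim):
        for _ in range(n_loci):
            g1, g2 = draw_pair(rng, p, related)
            lrs[s] *= locus_lr(g1, g2, p)
    return lrs


def fpr_naive(lr_h0, t):
    """Plain Monte Carlo: proportion of LRs simulated under H0 that reach t."""
    return np.mean(lr_h0 >= t)


def fpr_importance(lr_h1, t):
    """Importance sampling: simulate under H1 and weight each exceedance by 1/LR."""
    return np.mean((lr_h1 >= t) / lr_h1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = np.full(10, 0.1)
    lr_h1 = simulate_lrs(2000, 10, p, True, rng)
    for t in (1.0, 100.0, 10000.0):
        print(t, fpr_importance(lr_h1, t))
```

Simulating under H1 produces many LRs above a high threshold, whereas simulation under H0 would rarely do so, which is why the weighted estimate remains usable for very small FPR values.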
