A detailed explanation and graphical representation of the Blinder-Oaxaca decomposition method with its application in health inequalities

Blinder-Oaxaca (B-O) decomposition method

Sometimes it is essential to decompose the mean difference in a specific continuous outcome between 2 groups (Group 1 and Group 2) to determine the factors contributing to that difference. For this purpose, a multiple regression model can be employed. In terms of statistical measures, this decomposition method can be considered a combination of the t-test and multiple regression models. Assuming that the outcome (Y) is explained by K variables (\(x_1, \ldots, x_K\)) in the linear regression model, the mean predicted outcome for group g (1 and 2) can be expressed as follows:

$$\overline{Y}^g = \beta_0^g + \sum_{k=1}^{K} \beta_k^g \overline{x}_k^g$$

where \(\overline{x}_k^g\) is the mean value of each predictor and \(\beta_k^g\) is the estimated regression coefficient.

Thus, the mean difference in outcome between the 2 groups (1 and 2) is as follows:

$$\Delta \overline{Y} = \overline{Y}^1 - \overline{Y}^2 = \left(\beta_0^1 - \beta_0^2\right) + \sum_{k=1}^{K}\left(\beta_k^1 \overline{x}_k^1 - \beta_k^2 \overline{x}_k^2\right)$$

(1)

The mean difference in outcome is the sum of the effects of different components, including: (1) the average difference between the levels of each observable variable (\(x_k\)); (2) the differential effects (\(\beta_k\)) of these variables in the 2 comparison groups; and (3) the basic difference, which includes the effect of unknown variables not included in the model. One question worth asking is "How large is the contribution of each of these components to the difference?".

To answer this question, the levels of the explanatory variables and the regression coefficients in the two groups are alternately assumed identical to obtain the net effect of each component. In fact, a counterfactual approach is adopted: the coefficients and variable levels in the equation for one group are replaced with the corresponding values of the other (reference) group. Accordingly, the expected change in a group's mean outcome is obtained when this group takes on the predictor values and regression coefficients of the reference. In this procedure, the contribution of each component can be estimated [9, 11].

If Group 1 (or its outcome) is selected as the reference, the expected change in the predictor levels and regression coefficients of Group 2, and subsequently the change in its outcome, will be considered.

The equation exclusive to Group 1 can be reformulated from the perspective of Group 2 as follows:

$$\begin{aligned} \overline{Y}^1 & = \beta_0^1 + \sum_{k=1}^{K} \beta_k^1 \overline{x}_k^1 \\ & = \beta_0^1 + \sum_{k=1}^{K}\left[\beta_k^2 + \left(\beta_k^1 - \beta_k^2\right)\right]\left[\overline{x}_k^2 + \left(\overline{x}_k^1 - \overline{x}_k^2\right)\right] \\ & = \beta_0^1 + \sum_{k=1}^{K}\beta_k^2 \overline{x}_k^2 + \sum_{k=1}^{K}\beta_k^2\left(\overline{x}_k^1 - \overline{x}_k^2\right) + \sum_{k=1}^{K}\overline{x}_k^2\left(\beta_k^1 - \beta_k^2\right) + \sum_{k=1}^{K}\left(\overline{x}_k^1 - \overline{x}_k^2\right)\left(\beta_k^1 - \beta_k^2\right) \end{aligned}$$

The above equation involves \(\beta_k^1 = \beta_k^2 + (\beta_k^1 - \beta_k^2)\) and \(\overline{x}_k^1 = \overline{x}_k^2 + (\overline{x}_k^1 - \overline{x}_k^2)\), which can be substituted into Eq. 1 to decompose the mean difference in outcome into 4 components as follows:

$$\Delta \overline{Y} = \left(\beta_0^1 - \beta_0^2\right) + \sum_{k=1}^{K}\beta_k^2\left(\overline{x}_k^1 - \overline{x}_k^2\right) + \sum_{k=1}^{K}\overline{x}_k^2\left(\beta_k^1 - \beta_k^2\right) + \sum_{k=1}^{K}\left(\overline{x}_k^1 - \overline{x}_k^2\right)\left(\beta_k^1 - \beta_k^2\right)$$

(2)

The decomposition shown in this equation is formulated from the perspective of Group 2, when Group 1 is selected as the reference.

Accordingly, the predicted difference (D) can be decomposed into 4 components (B, E, C and I); in other words, the contribution of each component to the difference can be estimated:

1.

The first component (B) is attributed to basic differences. It includes the effects of unobservable variables not taken into account (i.e. not included in the model).

2.

The second component (E) indicates the change in Group 2's mean predicted outcome when it takes on the covariate levels of Group 1 (the reference):

$$\sum_{k=1}^{K}\beta_k^2\left(\overline{x}_k^1 - \overline{x}_k^2\right) = \sum_{k=1}^{K}\left(\beta_k^2 \overline{x}_k^1 - \beta_k^2 \overline{x}_k^2\right)$$

In other words, it is the portion of the difference (D) that is explained by group differences in the levels of the observable explanatory variables (the explained component). This portion is known as the "endowments effect".

3.

The third component (C) is the part of the difference that represents the change in Group 2's mean predicted outcome when that group takes on the regression coefficients of the other group:

$$\sum_{k=1}^{K}\overline{x}_k^2\left(\beta_k^1 - \beta_k^2\right) = \sum_{k=1}^{K}\left(\beta_k^1 \overline{x}_k^2 - \beta_k^2 \overline{x}_k^2\right)$$

It involves the portion of the difference (D) caused by the differential effect of the observable variables on the outcome across the 2 comparison groups. It cannot be explained by the levels of the observable explanatory variables (the unexplained component). This portion of the difference is known as the "coefficients effect".

4.

The fourth component (I) involves an interaction due to simultaneous effect of differences in endowments and coefficients [11, 12].
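As a worked illustration, the four components can be computed directly from two group-specific OLS fits. The sketch below uses simulated data (all variable names and values are hypothetical, not from the article) and NumPy's least-squares routine; the identity D = B + E + C + I then holds exactly, because each fitted regression passes through its group's means.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ols(X, y):
    """Return (intercept, slopes) from an OLS fit with an added constant."""
    Z = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[0], coef[1:]

# Two simulated groups with different covariate levels and coefficients
X1 = rng.normal(2.0, 1.0, size=(500, 2))
y1 = 1.0 + X1 @ np.array([0.8, 0.5]) + rng.normal(0, 0.3, 500)
X2 = rng.normal(1.0, 1.0, size=(500, 2))
y2 = 0.5 + X2 @ np.array([0.4, 0.5]) + rng.normal(0, 0.3, 500)

b0_1, b1 = fit_ols(X1, y1)
b0_2, b2 = fit_ols(X2, y2)
x1bar, x2bar = X1.mean(axis=0), X2.mean(axis=0)

D = y1.mean() - y2.mean()                 # total gap in mean outcome
B = b0_1 - b0_2                           # basic (intercept) difference
E = np.sum(b2 * (x1bar - x2bar))          # endowments effect
C = np.sum(x2bar * (b1 - b2))             # coefficients effect
I = np.sum((x1bar - x2bar) * (b1 - b2))   # interaction

# Because an OLS fit passes through the group means, D = B + E + C + I
assert np.isclose(D, B + E + C + I)
```

Note that the decomposition is exact by construction; in practice, standard errors for each component are obtained separately (e.g. by the delta method or bootstrapping).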

Figure 1 schematically displays the decomposition of the group difference in mean predicted outcome from the perspective of Group 2, when Group 1 has been considered the reference (Eq. 2).

Fig. 1

Decomposition of the group difference in mean predicted outcome (the interaction model) by selecting Group 1 as the reference (from the perspective of Group 2)

Similarly, when Group 2 is selected as the reference, the expected change in mean predicted outcome can be expressed from the perspective of Group 1 as follows:

$$\Delta \overline{Y} = \left(\beta_0^1 - \beta_0^2\right) + \sum_{k=1}^{K}\beta_k^1\left(\overline{x}_k^1 - \overline{x}_k^2\right) + \sum_{k=1}^{K}\overline{x}_k^1\left(\beta_k^1 - \beta_k^2\right) - \sum_{k=1}^{K}\left(\overline{x}_k^1 - \overline{x}_k^2\right)\left(\beta_k^1 - \beta_k^2\right)$$

(3)

To obtain Eq. 3, the covariate levels and regression coefficients of Group 2 are reformulated from the perspective of Group 1 as follows:

$$\beta_k^2 = \beta_k^1 - \left(\beta_k^1 - \beta_k^2\right)$$

$$\overline{x}_k^2 = \overline{x}_k^1 - \left(\overline{x}_k^1 - \overline{x}_k^2\right)$$

Their corresponding values are then substituted into Eq. 1. Thus, the difference (D) in Eq. 3 can be decomposed into 4 parts (B, E, C and I): the first component (B) and fourth component (I) relate to the same factors expressed in Eq. 2. The second component (E), however, measures the expected change in Group 1's mean predicted outcome if this group had the covariate levels of Group 2 (the reference):

$$\sum_{k=1}^{K}\beta_k^1\left(\overline{x}_k^1 - \overline{x}_k^2\right) = \sum_{k=1}^{K}\left(\beta_k^1 \overline{x}_k^1 - \beta_k^1 \overline{x}_k^2\right)$$

The third component (C) is similarly the part of the difference that measures the expected change in Group 1's mean predicted outcome when this group takes on the regression coefficients of Group 2:

$$\sum_{k=1}^{K}\overline{x}_k^1\left(\beta_k^1 - \beta_k^2\right) = \sum_{k=1}^{K}\left(\beta_k^1 \overline{x}_k^1 - \beta_k^2 \overline{x}_k^1\right)$$

Figure 2 schematically depicts the decomposition conditions where Group 2 has been selected as a reference.

Fig. 2

Decomposition of difference (the interaction model) by selecting Group 2 as the reference (from the perspective of Group 1)

In Eqs. 2 and 3 (Figs. 1 and 2) the first component (B), \(\left(\beta_0^1 - \beta_0^2\right)\), denotes the difference between the two groups that cannot be explained by the observed covariates (X); in fact, this difference is due to unobserved variables. The coefficients component (C) is likewise unexplained by the observed covariates. We can therefore combine these two components (B and C) into an unexplained part (U), yielding the three-fold decomposition:

$$\Delta \overline{Y} = \sum_{k=1}^{K}\beta_k^2\left(\overline{x}_k^1 - \overline{x}_k^2\right) + \left[\left(\beta_0^1 - \beta_0^2\right) + \sum_{k=1}^{K}\overline{x}_k^2\left(\beta_k^1 - \beta_k^2\right)\right] + \sum_{k=1}^{K}\left(\overline{x}_k^1 - \overline{x}_k^2\right)\left(\beta_k^1 - \beta_k^2\right)$$

(4)

$$\Delta \overline{Y} = \sum_{k=1}^{K}\beta_k^1\left(\overline{x}_k^1 - \overline{x}_k^2\right) + \left[\left(\beta_0^1 - \beta_0^2\right) + \sum_{k=1}^{K}\overline{x}_k^1\left(\beta_k^1 - \beta_k^2\right)\right] - \sum_{k=1}^{K}\left(\overline{x}_k^1 - \overline{x}_k^2\right)\left(\beta_k^1 - \beta_k^2\right)$$

(5)

In other words, if we assume that there are no relevant unobservable explanatory variables, the total unexplained part (U) in Eqs. 4 and 5 will equal the C component in Eqs. 2 and 3.

In this approach, the difference in mean predicted outcome (D) contains three components (E, U and I): the first component (E) is explained by the difference in the levels of the covariates, the second component (U) arises from the differential effects of all those covariates (the unexplained part), and the third component (I) involves an interaction caused by simultaneous group differences in the covariate levels and their coefficients.
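The regrouping can be checked numerically: folding the intercept gap (B) into the coefficients term gives the unexplained part U, and E + U + I reproduces the whole gap. The sketch below uses simulated (hypothetical) data to verify this.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_ols(X, y):
    """OLS with an added constant; returns the full coefficient vector."""
    Z = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Z, y, rcond=None)[0]

# Simulated groups (illustrative values only)
X1 = rng.normal(2.0, 1.0, size=(400, 2))
y1 = 1.0 + X1 @ np.array([0.8, 0.5]) + rng.normal(0, 0.3, 400)
X2 = rng.normal(1.0, 1.0, size=(400, 2))
y2 = 0.5 + X2 @ np.array([0.4, 0.5]) + rng.normal(0, 0.3, 400)

c1, c2 = fit_ols(X1, y1), fit_ols(X2, y2)
b0_1, b1 = c1[0], c1[1:]
b0_2, b2 = c2[0], c2[1:]
x1bar, x2bar = X1.mean(axis=0), X2.mean(axis=0)

E = np.sum(b2 * (x1bar - x2bar))               # explained part
U = (b0_1 - b0_2) + np.sum(x2bar * (b1 - b2))  # unexplained: U = B + C
I = np.sum((x1bar - x2bar) * (b1 - b2))        # interaction

# The three-fold grouping still accounts for the entire gap
assert np.isclose(y1.mean() - y2.mean(), E + U + I)
```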

So far we have postulated that one of the two groups has the best achievable outcome and that the other group should reach it. An alternative approach supposes that there is a nondiscriminatory condition (marked by a nondiscriminatory vector of coefficients) that both groups should reach. This approach therefore requires defining the nondiscriminatory condition, or reference coefficients. Sometimes this nondiscriminatory condition can even be the situation of one of the comparison groups (whose coefficients then serve as the reference).

Suppose \(\beta^*\) is the nondiscriminatory (reference) coefficient vector; the overall equation for the decomposition of \(\Delta \overline{Y}\) will then be:

$$\Delta \overline{Y} = \sum_{k=1}^{K}\beta_k^*\left(\overline{x}_k^1 - \overline{x}_k^2\right) + \left[\left(\beta_0^1 - \beta_0^2\right) + \sum_{k=1}^{K}\overline{x}_k^1\left(\beta_k^1 - \beta_k^*\right) + \sum_{k=1}^{K}\overline{x}_k^2\left(\beta_k^* - \beta_k^2\right)\right]$$

(6)

In this way the outcome difference has been decomposed into two components (two-fold decomposition). The first component is the part of the group difference that is explained by differences in the levels of the observed characteristics; this is also called the "endowments effect". The second component refers to the part of the gap that is due to differences of the \(\beta\)s from the nondiscriminatory \(\beta^*\); it also captures differences in the levels of unobservable variables as well as their differential (discriminating) effects. This component determines the unexplained portion of the disparity. If all the unobserved covariates were included in the model and measured, it would comprise just the difference of the \(\beta\)s from the nondiscriminatory \(\beta^*\). This portion is sometimes considered the "discrimination effect".

\(\beta^*\) always lies between \(\beta^1\) and \(\beta^2\), or equals one or both of them. Then, we have \(\beta^1 \ge \beta^* \ge \beta^2\) or \(\beta^1 \le \beta^* \le \beta^2\).

If \(\beta^1 > \beta^* > \beta^2\), we have positive discrimination "in favor of" Group 1 and negative discrimination "against" Group 2; and if \(\beta^1 < \beta^* < \beta^2\), we have positive discrimination "in favor of" Group 2 and negative discrimination "against" Group 1.

There is also the case in which only one of the two groups experiences discrimination, and the nondiscriminatory \(\beta^*\) will simply be the coefficients of the other group. In such a case, \(\beta^1 = \beta^* > \beta^2\) or \(\beta^1 > \beta^* = \beta^2\). If we replace \(\beta^*\) with \(\beta^1\) in Eq. 6, we reach Eq. 7, and if we replace \(\beta^*\) with \(\beta^2\), we reach Eq. 8.

$$\Delta \overline{Y} = \sum_{k=1}^{K}\beta_k^1\left(\overline{x}_k^1 - \overline{x}_k^2\right) + \left[\left(\beta_0^1 - \beta_0^2\right) + \sum_{k=1}^{K}\overline{x}_k^2\left(\beta_k^1 - \beta_k^2\right)\right]$$

(7)

$$\Delta \overline{Y} = \sum_{k=1}^{K}\beta_k^2\left(\overline{x}_k^1 - \overline{x}_k^2\right) + \left[\left(\beta_0^1 - \beta_0^2\right) + \sum_{k=1}^{K}\overline{x}_k^1\left(\beta_k^1 - \beta_k^2\right)\right]$$

(8)

Therefore, we have a twofold decomposition of the difference in mean predicted outcome (D):

1.

The unexplained component (Uc): this is exactly the U part of the three-fold decomposition (Eqs. 4 and 5). It arises from the differential effects (\(\beta\)) of the observable variables and also from the differential effects and levels of the unobservable variables. This determines the unexplained portion of the gap.

2.

The explained component (Ec): this part combines the E and I parts of the three-fold decomposition (Eqs. 4 and 5). Although this component is called the explained component of the two-fold decomposition in many texts, part of it (the interaction part) in fact reflects the simultaneous difference of coefficients and covariate levels in the two groups. Hence, if the crude explained component is required, the three-fold decomposition provides it [11, 13].

Therefore, Eqs. 7 and 8 can be considered a specific form of Eqs. 4 and 5, where components E and I have been integrated. Thus:

$$\sum_{k=1}^{K}\beta_k^2\left(\overline{x}_k^1 - \overline{x}_k^2\right) + \sum_{k=1}^{K}\left(\overline{x}_k^1 - \overline{x}_k^2\right)\left(\beta_k^1 - \beta_k^2\right) = \sum_{k=1}^{K}\beta_k^1\left(\overline{x}_k^1 - \overline{x}_k^2\right)$$

and

$$\sum_{k=1}^{K}\beta_k^1\left(\overline{x}_k^1 - \overline{x}_k^2\right) - \sum_{k=1}^{K}\left(\overline{x}_k^1 - \overline{x}_k^2\right)\left(\beta_k^1 - \beta_k^2\right) = \sum_{k=1}^{K}\beta_k^2\left(\overline{x}_k^1 - \overline{x}_k^2\right)$$
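These two identities are pure algebra, so they can be verified with arbitrary coefficient and mean vectors, as in this small sketch (all numbers are arbitrary placeholders):

```python
import numpy as np

rng = np.random.default_rng(2)
b1, b2 = rng.normal(size=3), rng.normal(size=3)        # arbitrary coefficients
x1bar, x2bar = rng.normal(size=3), rng.normal(size=3)  # arbitrary covariate means
dx, db = x1bar - x2bar, b1 - b2

# E + I (three-fold, group 1 coefficients as reference) collapses to the
# explained part of Eq. 7
assert np.isclose(np.sum(b2 * dx) + np.sum(dx * db), np.sum(b1 * dx))

# E - I (three-fold, group 2 coefficients as reference) collapses to the
# explained part of Eq. 8
assert np.isclose(np.sum(b1 * dx) - np.sum(dx * db), np.sum(b2 * dx))
```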

Figures 3 and 4 schematically demonstrate the decomposition conditions where the Group 1 and Group 2 coefficients have been selected as the reference, respectively.

Fig. 3

Decomposition of outcome difference using the group 1 coefficients as the reference

Fig. 4

Decomposition of outcome difference using the group 2 coefficients as the reference

It is not clear which regression coefficients should be selected as the reference (Eqs. 7 and 8). This is known as the "index problem" [9, 14,15,16,17,18]. Reimers [19] suggests using the average of the regression coefficients over both groups (\(\frac{\beta^1 + \beta^2}{2}\)), while Cotton [15] expresses \(\beta_k^*\) as the coefficients weighted by each group's size (\(\frac{n^1 \beta_k^1 + n^2 \beta_k^2}{n^1 + n^2}\)). In this regard, Neumark proposes the use of regression coefficients from a pooled model over both groups as an estimate of the nondiscriminatory condition [16, 17].
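The averaging-based choices of \(\beta^*\) are straightforward to compute; the sketch below uses made-up slope vectors and group sizes for illustration. Neumark's pooled-model coefficients, by contrast, require refitting the regression on the stacked data of both groups rather than averaging the group-specific estimates.

```python
import numpy as np

# Hypothetical group-specific slope estimates and sample sizes
b1 = np.array([0.8, 0.5])
b2 = np.array([0.4, 0.3])
n1, n2 = 600, 400

beta_reimers = (b1 + b2) / 2                   # simple average (Reimers)
beta_cotton = (n1 * b1 + n2 * b2) / (n1 + n2)  # size-weighted average (Cotton)
```

With these numbers, Cotton's weighting pulls \(\beta^*\) toward the coefficients of the larger group.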

Nonlinear extension of B-O decomposition method

Although the primary application of the proposed B-O decomposition is based on the linear regression model, several researchers, including Yun and Fairlie, have proposed a nonlinear version of decomposition [14, 20,21,22], which has been widely used in the decomposition of inequalities in the health sector [23,24,25,26,27].

As mentioned, the original B-O decomposition of the 2-group disparity in the average value of the response variable, Y, can be expressed as:

$$\Delta \overline{Y} = \left[\beta^1\left(\overline{X}^1 - \overline{X}^2\right)\right] + \left[\overline{X}^2\left(\beta^1 - \beta^2\right)\right]$$

where \(\overline{X}\) is a row vector of average values of the explanatory variables and \(\beta\) is a vector of coefficient estimates for each of groups 1 and 2. In this case, the coefficient estimates of group 1, \(\beta^1\), have been taken as the reference. According to Fairlie [28], the decomposition for a nonlinear equation, \(Y = F\left(X\beta\right)\), can be expressed as follows:

$$\Delta \overline{Y} = \left[\frac{1}{N^1}\sum_{i=1}^{N^1} F\left(X_i^1 \beta^1\right) - \frac{1}{N^2}\sum_{i=1}^{N^2} F\left(X_i^2 \beta^1\right)\right] + \left[\frac{1}{N^2}\sum_{i=1}^{N^2} F\left(X_i^2 \beta^1\right) - \frac{1}{N^2}\sum_{i=1}^{N^2} F\left(X_i^2 \beta^2\right)\right]$$

(9)

where \(N^g\) is the sample size of group g (1 or 2), and \(\Delta \overline{Y}\) represents the difference in the mean predicted probability of the outcome between the two groups with \(N^1\) and \(N^2\) individuals. This alternative expression for the decomposition is used because, for nonlinear transformations of Y, \(\overline{Y}\) does not necessarily equal \(F\left(\overline{X}\beta\right)\). The original B-O decomposition is a special case of Eq. 9 in which \(F\left(X_i \beta\right) = X_i \beta\) [28].

Similarly, another expression for the decomposition is:

$$\Delta \overline{Y} = \left[\frac{1}{N^1}\sum_{i=1}^{N^1} F\left(X_i^1 \beta^2\right) - \frac{1}{N^2}\sum_{i=1}^{N^2} F\left(X_i^2 \beta^2\right)\right] + \left[\frac{1}{N^1}\sum_{i=1}^{N^1} F\left(X_i^1 \beta^1\right) - \frac{1}{N^1}\sum_{i=1}^{N^1} F\left(X_i^1 \beta^2\right)\right]$$

(10)

where the vector of coefficient estimates for group 2 is used as the reference.
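For a logit model, the decomposition in Eq. 9 (group 1 coefficients as the reference) can be sketched as follows, with \(F\) the logistic CDF. The design matrices and coefficient vectors below are simulated stand-ins for actual fitted values, used only to show how the explained and unexplained parts add up to the gap in mean predicted probabilities.

```python
import numpy as np

rng = np.random.default_rng(3)

def F(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + np.exp(-z))

# Constant column plus one covariate; coefficients are taken as given here
# (in practice they come from group-specific logit fits)
X1 = np.column_stack([np.ones(300), rng.normal(2.0, 1.0, 300)])
X2 = np.column_stack([np.ones(500), rng.normal(1.0, 1.0, 500)])
beta1 = np.array([0.2, 0.6])
beta2 = np.array([-0.1, 0.4])

p11 = F(X1 @ beta1).mean()  # group 1 data, group 1 coefficients
p21 = F(X2 @ beta1).mean()  # group 2 data, group 1 coefficients
p22 = F(X2 @ beta2).mean()  # group 2 data, group 2 coefficients

gap = p11 - p22
explained = p11 - p21    # characteristics (endowments) part of Eq. 9
unexplained = p21 - p22  # coefficients part of Eq. 9

assert np.isclose(gap, explained + unexplained)
```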

Detailed decomposition

In the detailed decomposition, one can determine the relative contribution of each factor (each X variable) to the explained and unexplained components. This can be achieved by sequentially substituting the variable levels or coefficients of one group with those of the other group while keeping the other variables in the model constant.

With linear-regression-based decomposition, the detailed decomposition is not a complicated task, because each component is obtained simply by summing the contributions of each predictor to that component. In the nonlinear method, however, performing the detailed decomposition is not as straightforward. In other words, applying the original (linear) method to nonlinear decomposition models raises some conceptual problems that affect the results [28,29,30,31]. The first problem is known as the "identification problem": for nominal (categorical) predictors, the decomposition estimates depend on the choice of the base (omitted) category. One solution, proposed by Yun [30], is computing normalized effects, which is equivalent to averaging the coefficients effects of a set of dummy variables while changing the reference group.
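One way to see why normalization removes the base-category dependence: changing the omitted category only shifts the dummy coefficients by a constant, and expressing them as deviations from their own mean is invariant to such shifts. A minimal sketch of this idea, with made-up coefficients (not Yun's full procedure, which also adjusts the intercept):

```python
import numpy as np

# Coefficients of a 3-category dummy set, expressed with category 0 omitted
# (its coefficient is 0 by construction); values are illustrative
coefs = np.array([0.0, 0.5, 0.9])

# Normalized effects: deviations from the mean across all categories
normalized = coefs - coefs.mean()

# Choosing category 1 as the base instead shifts every coefficient by a
# constant, but the normalized effects are unchanged
shifted = coefs - coefs[1]
assert np.allclose(shifted - shifted.mean(), normalized)
```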

Another problem is "path dependency". Unlike in linear models, nonlinear decomposition is also sensitive to the order in which variables are included in the decomposition process [22, 28, 29, 32]. One solution to this issue has been suggested by Fairlie, which involves randomly ordering the variables across replications of the decomposition. This procedure requires one-to-one matching of individuals from the 2 comparison groups, so there should be an equal number of individuals in both groups (\(N^1 = N^2\)). Otherwise (which is usually the case), a random subsample of the majority group 1 (usually equal in size to the minority group 2) is selected and then matched according to the predicted probability of the response variable for each person. In fact, the individual observations in each group are separately ranked by predicted probability and then matched according to rank. This procedure matches the individual characteristics in both groups, and the matched (one-to-one) observations then determine the contribution of each factor to the outcome difference. Multiple subsamples (e.g. 100 or 1000) are drawn, and the mean estimate is taken as the final estimate [14, 24, 28, 33]. Using logit coefficient estimates (\(\beta^*\)) from the pooled sample or from the appropriate reference group, the independent contribution of \(X_1\) to the gap can be expressed as:

$$\frac{1}{N^2}\sum_{i=1}^{N^2}\left[F\left(X_{1i}^1 \beta_1^* + \sum_{k \ne 1} X_{ki}^1 \beta_k^*\right) - F\left(X_{1i}^2 \beta_1^* + \sum_{k \ne 1} X_{ki}^1 \beta_k^*\right)\right]$$

or

$$\frac{1}{N^2}\sum_{i=1}^{N^2}\left[F\left(X_{1i}^1 \beta_1^* + \sum_{k \ne 1} X_{ki}^2 \beta_k^*\right) - F\left(X_{1i}^2 \beta_1^* + \sum_{k \ne 1} X_{ki}^2 \beta_k^*\right)\right]$$

A simpler strategy to overcome this issue involves using weights [22, 34, 35]. According to Yun [22], the detailed decomposition using weights can be expressed as follows:

$$\overline{Y}^1 - \overline{Y}^2 = \sum_{k=1}^{K} W_{\Delta x_k}\left[\overline{F\left(X^1 \beta^1\right)} - \overline{F\left(X^2 \beta^1\right)}\right] + \sum_{k=1}^{K} W_{\Delta \beta_k}\left[\overline{F\left(X^2 \beta^1\right)} - \overline{F\left(X^2 \beta^2\right)}\right]$$

(11)

$$\sum_{k=1}^{K} W_{\Delta x_k} = \sum_{k=1}^{K} W_{\Delta \beta_k} = 1$$

where \(W_{\Delta x_k}\) and \(W_{\Delta \beta_k}\) represent the weights of the kth variable in the linearization of the explained and unexplained components of inequality, respectively [22, 32]:

$$W_{\Delta x_k} = \frac{\beta_k^1\left(\overline{x}_k^1 - \overline{x}_k^2\right)}{\sum_{k=1}^{K}\beta_k^1\left(\overline{x}_k^1 - \overline{x}_k^2\right)}$$

$$W_{\Delta \beta_k} = \frac{\overline{x}_k^2\left(\beta_k^1 - \beta_k^2\right)}{\sum_{k=1}^{K}\overline{x}_k^2\left(\beta_k^1 - \beta_k^2\right)}$$
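Given group-specific coefficient estimates and covariate means, these weights are simple ratios. The sketch below uses illustrative numbers (not fitted estimates) and checks that each set of weights sums to 1, as the constraint above requires.

```python
import numpy as np

# Illustrative coefficient estimates and covariate means
b1 = np.array([0.8, 0.5, -0.2])    # group 1 coefficients
b2 = np.array([0.4, 0.3, -0.1])    # group 2 coefficients
x1bar = np.array([2.0, 1.5, 0.4])  # group 1 covariate means
x2bar = np.array([1.0, 1.2, 0.6])  # group 2 covariate means

num_x = b1 * (x1bar - x2bar)       # per-variable explained contributions
W_dx = num_x / num_x.sum()         # weights for the explained part

num_b = x2bar * (b1 - b2)          # per-variable unexplained contributions
W_db = num_b / num_b.sum()         # weights for the unexplained part

assert np.isclose(W_dx.sum(), 1.0)
assert np.isclose(W_db.sum(), 1.0)
```

Note that individual weights can be negative when a variable's contribution opposes the overall gap; only their sum is constrained to 1.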

The Fairlie method mainly focuses on the explained portion of inequality, without calculating the contribution of the differential effect of each factor to the unexplained part [14]. Nonetheless, this can be achieved through the practical technique proposed by Powers et al. [32].

Implementation of the decomposition in related software

The oaxaca command package is available for Stata [9] and R [36], and the SAS macro %BO_decomp [37] performs the Blinder-Oaxaca decomposition. In addition, Stata provides several packages developed for implementing various forms of the Blinder-Oaxaca decomposition in nonlinear models, including fairlie [38], gdecomp [39], mvdcmp [32], and nldecompose [40] (Table 1).

Table 1 Different Stata command packages for decomposition of outcome differences between the two groups
