Selection bias due to omitting interactions from inverse probability weighting

Abstract

Inverse probability weighting (IPW) is often used to adjust for selection bias, typically with a simple logit model without interactions as the missingness model. However, the size of the selection bias depends on the interaction between exposure and outcome in their effect on selection, which implies that it may be important to include interactions in the IPW model. Using a simulation study and a real-data application, we compare the performance of IPW with and without interaction terms for estimating a regression coefficient. The simulation study shows that IPW including interactions gives less biased estimates than IPW without interactions in all scenarios studied. Importantly, IPW using a logit model with no interactions often gives estimates close to those from complete case analysis (CCA), perhaps giving false reassurance that results are robust to selection bias. The real-data application investigates the association between unemployment and sleep duration, using data from Understanding Society. IPW including interactions suggests that unemployment is associated with a reduction in sleep duration of around 23 (9, 38) minutes, compared to 27 (14, 40) minutes for IPW without interactions, and 31 (19, 43) minutes for CCA. We strongly recommend including interactions in missingness models to adjust for selection bias.
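The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation design (their R code is in the repository listed under Data Availability). It assumes a binary exposure and a binary outcome, works with population cell probabilities (the large-sample limit) instead of random draws, and fits the selection model as if exposure and outcome were known for everyone, which is possible in a simulation but not in real data. All parameter values and function names are hypothetical.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_logit(rows, mass, p_sel, n_iter=50):
    """Newton-Raphson fit of a logit selection model on cell-level data."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        mu = [expit(sum(b * v for b, v in zip(beta, r))) for r in rows]
        grad = [sum(m * (p - u) * r[j] for r, m, p, u in zip(rows, mass, p_sel, mu))
                for j in range(k)]
        hess = [[sum(m * u * (1 - u) * r[i] * r[j] for r, m, u in zip(rows, mass, mu))
                 for j in range(k)] for i in range(k)]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta

def risk_difference(w):
    """Weighted P(Y=1 | X=1) - P(Y=1 | X=0) from cell weights keyed by (x, y)."""
    p1 = w[(1, 1)] / (w[(1, 1)] + w[(1, 0)])
    p0 = w[(0, 1)] / (w[(0, 1)] + w[(0, 0)])
    return p1 - p0

# Hypothetical population: exposure X, outcome Y, selection S.
cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
P_X1 = 0.5
P_Y1 = {0: 0.3, 1: 0.5}          # true risk difference is 0.2
A, B, C, D = 0.5, -0.5, -0.5, -1.5  # D is the X*Y interaction in selection

def cell_mass(x, y):
    px = P_X1 if x else 1 - P_X1
    py = P_Y1[x] if y else 1 - P_Y1[x]
    return px * py

mass = [cell_mass(x, y) for x, y in cells]
p_sel = [expit(A + B * x + C * y + D * x * y) for x, y in cells]
selected = {c: m * p for c, m, p in zip(cells, mass, p_sel)}  # P(X, Y, S=1)

def ipw_estimate(design):
    """Weight the selected cells by 1 / fitted selection probability."""
    beta = fit_logit(design, mass, p_sel)
    fitted = [expit(sum(b * v for b, v in zip(beta, r))) for r in design]
    return risk_difference({c: selected[c] / f for c, f in zip(cells, fitted)})

truth = P_Y1[1] - P_Y1[0]
cca = risk_difference(selected)                                   # unweighted
ipw_main = ipw_estimate([[1.0, x, y] for x, y in cells])          # no interaction
ipw_int = ipw_estimate([[1.0, x, y, x * y] for x, y in cells])    # with interaction

for name, est in [("truth", truth), ("CCA", cca),
                  ("IPW, main effects only", ipw_main),
                  ("IPW, with interaction", ipw_int)]:
    print(f"{name:>24}: {est:.4f}")
```

With a binary exposure and outcome, the model with the interaction term is saturated, so its weights recover the full-population estimate exactly. The main-effects model is misspecified whenever D is non-zero, and under these hypothetical values its estimate stays closer to the CCA estimate than to the truth.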

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

LW, KT, RC and AG received funding for this project from the UK Medical Research Council Integrative Epidemiology Unit (funded by MC_UU_00032/02) and the University of Bristol. RAH is supported by a Sir Henry Dale Fellowship that is jointly funded by the Wellcome Trust and the Royal Society (grant 215408/Z/19/Z).

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The University of Essex Ethics Committee has approved all data collection for the Understanding Society main study, COVID-19 surveys and innovation panel waves, including asking consent for all data linkages except to health records. Ethical approval for our data access was granted.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

The R code used for the simulation study and the Stata code used for the real-data application of our paper are available at the GitHub repository https://github.com/Wen-wow/IPW_interaction. Access to Understanding Society (US) data for our real-data application was obtained under Study Number (SN) 6614; the data are available upon request to the US study on the website: https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=6614.

