Look-alike modelling in violence-related research: A missing data approach.

Violence has been analysed in silos due to difficulties in accessing data and concerns for the safety of those exposed. While there is some literature on violence and its associations using individual datasets, analyses using combined sources of data are very limited. Ideally, data from the same indiv...

Bibliographic Details
Main Authors: Estela Capelas Barbosa, Niels Blom, Annie Bunce
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0301155
author Estela Capelas Barbosa
Niels Blom
Annie Bunce
collection DOAJ
description Violence has been analysed in silos due to difficulties in accessing data and concerns for the safety of those exposed. While there is some literature on violence and its associations using individual datasets, analyses using combined sources of data are very limited. Ideally, data from the same individuals would enable linkage and a longitudinal understanding of experiences of violence and their (health) impacts and consequences. This paper aims to provide a proof of concept for creating a synthetic dataset by combining data from the Crime Survey for England and Wales (CSEW) and administrative data from Rape Crisis England and Wales (RCEW), pertaining to victim-survivors of sexual violence in adulthood. Intuitively, the idea was to impute missing information from one dataset by borrowing the distribution from the other. In our analyses, we borrowed information from CSEW to impute missing data in the RCEW administrative dataset, creating a combined synthetic RCEW-CSEW dataset. Using look-alike modelling principles, we provide an innovative and cost-effective approach to exploring patterns and associations in violence-related research in a multi-sectoral setting. Methodologically, we approached data integration as a missing data problem to create a synthetic combined dataset. Multiple imputation by chained equations was employed to collate/impute data from the two different sources. To test whether this procedure was effective, we compared regression analyses for the individual and combined synthetic datasets on binary, continuous and categorical variables. We extended our testing to an outcome measure and, finally, applied the technique to a variable fully missing in one data source. Our results show that the effect sizes for the combined dataset reflect those from the dataset used for imputation. The variance is higher, resulting in fewer statistically significant estimates. Our approach reinforces the possibility of combining administrative with survey datasets using look-alike methods to overcome existing barriers to data linkage.
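The idea summarised in the description, treating a variable that is absent from one source as missing data and borrowing its distribution from the other source, can be illustrated with a minimal sketch. This is not the authors' code or model. It assumes a single shared covariate and a single continuous variable that is fully missing in the recipient dataset, in which case chained equations reduce to one conditional model. It uses stochastic regression imputation repeated m times and pools the results with Rubin's rules. A full multiple-imputation procedure would also draw the regression parameters from their posterior, which is omitted here. All names and the toy data (`make_toy_data`, `impute_and_pool`, the covariate and the score) are hypothetical.

```python
import random
import statistics


def fit_simple_ols(x, y):
    """Least-squares fit of y ~ x; returns (intercept, slope, residual_sd)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    resid_sd = (sum(r * r for r in resid) / (len(x) - 2)) ** 0.5
    return intercept, slope, resid_sd


def make_toy_data(seed=0):
    """Hypothetical toy data: a donor source with x and y, a recipient with x only."""
    rng = random.Random(seed)
    donor_x = [rng.gauss(40, 12) for _ in range(200)]
    donor_y = [2 + 0.1 * xi + rng.gauss(0, 1) for xi in donor_x]
    recipient_x = [rng.gauss(35, 10) for _ in range(150)]
    return donor_x, donor_y, recipient_x


def impute_and_pool(donor_x, donor_y, recipient_x, m=20, seed=0):
    """Impute y for the recipient m times and pool the combined-data mean of y.

    Returns (pooled_mean, mean_within_variance, total_variance).
    """
    rng = random.Random(seed)
    intercept, slope, resid_sd = fit_simple_ols(donor_x, donor_y)
    estimates, within = [], []
    for _ in range(m):
        # Draw each missing y from the donor-fitted conditional distribution.
        imputed = [intercept + slope * xi + rng.gauss(0, resid_sd)
                   for xi in recipient_x]
        combined = list(donor_y) + imputed
        estimates.append(statistics.fmean(combined))
        # Sampling variance of the mean within this completed dataset.
        within.append(statistics.variance(combined) / len(combined))
    pooled_mean = statistics.fmean(estimates)
    mean_within = statistics.fmean(within)
    between = statistics.variance(estimates)
    # Rubin's rules: total variance = within + (1 + 1/m) * between.
    total = mean_within + (1 + 1 / m) * between
    return pooled_mean, mean_within, total


if __name__ == "__main__":
    dx, dy, rx = make_toy_data(seed=1)
    pooled, u_bar, t = impute_and_pool(dx, dy, rx, m=20, seed=2)
    print(f"pooled mean={pooled:.3f} within={u_bar:.5f} total={t:.5f}")
```

Because the between-imputation term is never negative, the total variance is at least as large as the within-imputation variance, which is consistent with the description's remark that the combined dataset shows higher variance.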
format Article
id doaj-art-28e8300aed394b1ab67121bdbb4330ca
institution Kabale University
issn 1932-6203
language English
publishDate 2025-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj-art-28e8300aed394b1ab67121bdbb4330ca 2025-02-05T05:31:27Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2025-01-01 20 1 e0301155 10.1371/journal.pone.0301155 Look-alike modelling in violence-related research: A missing data approach. Estela Capelas Barbosa; Niels Blom; Annie Bunce https://doi.org/10.1371/journal.pone.0301155
title Look-alike modelling in violence-related research: A missing data approach.
url https://doi.org/10.1371/journal.pone.0301155