A new approach to estimate neighborhood socioeconomic status using supermarket transactions and GNNs

Abstract Ending poverty in all its forms everywhere remains the number one Sustainable Development Goal of the United Nations 2030 Agenda. Governments face challenges in measuring socioeconomic status with fine spatial resolution because traditional data collection methods, such as censuses and surv...

Bibliographic Details
Main Authors: Eduardo Cruz, Monica Villavicencio, Carmen Vaca, Lisette Espín-Noboa, Nervo Verdezoto
Format: Article
Language: English
Published: SpringerOpen 2025-01-01
Series: EPJ Data Science
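Subjects: Neighborhood socioeconomic status; Item embedding; Basket graph; Graph neural network; Spectral convolutional filter; Per capita income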
Online Access: https://doi.org/10.1140/epjds/s13688-024-00515-9
_version_ 1832594936705318912
author Eduardo Cruz
Monica Villavicencio
Carmen Vaca
Lisette Espín-Noboa
Nervo Verdezoto
author_sort Eduardo Cruz
collection DOAJ
description Abstract Ending poverty in all its forms everywhere remains the number one Sustainable Development Goal of the United Nations 2030 Agenda. Governments face challenges in measuring socioeconomic status with fine spatial resolution because traditional data collection methods, such as censuses and surveys, are time-consuming, labor-intensive, performed at long intervals, and cover only a limited population. This work is a data-driven study to analyze the digital traces left by humans in supermarket transactions and model the relationship between consumption behavior and the average per capita income, proposing a proxy to estimate socioeconomic status at the urban neighborhood level. We analyze more than 20 million supermarket shopping transactions in Guayaquil, the most populated city in Ecuador. Using customer consumption data, we created a basket graph and fed it into a graph neural network to predict neighborhood socioeconomic status. The model was trained with spectral and spatial convolutional filters using cross-validation to select the best approach for the prediction. The results show that the Chebyshev spectral convolutional filter has the highest predictive power to predict the socioeconomic status of the neighborhood, with $R^{2}=0.91$. Our proposed approach contributes to measuring socioeconomic status at the neighborhood level to support policymakers in making informed decisions about resource allocation according to the needs of different geographical areas.
format Article
id doaj-art-d04c11b13a8a4ecaa473bd3822dd4e08
institution Kabale University
issn 2193-1127
language English
publishDate 2025-01-01
publisher SpringerOpen
record_format Article
series EPJ Data Science
spelling doaj-art-d04c11b13a8a4ecaa473bd3822dd4e08 2025-01-19T12:13:54Z eng SpringerOpen EPJ Data Science 2193-1127 2025-01-01 141118 10.1140/epjds/s13688-024-00515-9
A new approach to estimate neighborhood socioeconomic status using supermarket transactions and GNNs
Eduardo Cruz (Escuela Superior Politecnica del Litoral)
Monica Villavicencio (Escuela Superior Politecnica del Litoral)
Carmen Vaca (Escuela Superior Politecnica del Litoral)
Lisette Espín-Noboa (Central European University)
Nervo Verdezoto (Cardiff University)
Abstract Ending poverty in all its forms everywhere remains the number one Sustainable Development Goal of the United Nations 2030 Agenda. Governments face challenges in measuring socioeconomic status with fine spatial resolution because traditional data collection methods, such as censuses and surveys, are time-consuming, labor-intensive, performed at long intervals, and cover only a limited population. This work is a data-driven study to analyze the digital traces left by humans in supermarket transactions and model the relationship between consumption behavior and the average per capita income, proposing a proxy to estimate socioeconomic status at the urban neighborhood level. We analyze more than 20 million supermarket shopping transactions in Guayaquil, the most populated city in Ecuador. Using customer consumption data, we created a basket graph and fed it into a graph neural network to predict neighborhood socioeconomic status. The model was trained with spectral and spatial convolutional filters using cross-validation to select the best approach for the prediction. The results show that the Chebyshev spectral convolutional filter has the highest predictive power to predict the socioeconomic status of the neighborhood, with $R^{2}=0.91$. Our proposed approach contributes to measuring socioeconomic status at the neighborhood level to support policymakers in making informed decisions about resource allocation according to the needs of different geographical areas.
https://doi.org/10.1140/epjds/s13688-024-00515-9
Neighborhood socioeconomic status
Item embedding
Basket graph
Graph neural network
Spectral convolutional filter
Per capita income
title A new approach to estimate neighborhood socioeconomic status using supermarket transactions and GNNs
title_sort new approach to estimate neighborhood socioeconomic status using supermarket transactions and gnns
topic Neighborhood socioeconomic status
Item embedding
Basket graph
Graph neural network
Spectral convolutional filter
Per capita income
url https://doi.org/10.1140/epjds/s13688-024-00515-9
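
The abstract in this record names a Chebyshev spectral convolutional filter and reports model quality as $R^{2}$. As a rough illustration only, the sketch below shows what one Chebyshev graph-convolution layer and an $R^{2}$ computation might look like in plain NumPy. It is not the authors' implementation: the function names (`scaled_laplacian`, `cheb_conv`, `r2_score`), the 4-node toy graph, the feature sizes, and the random weights are all hypothetical, and the record does not state which library, polynomial order, or graph construction the study used.

```python
import numpy as np

def scaled_laplacian(adj):
    """Rescale the symmetric normalized Laplacian so its spectrum lies in [-1, 1].

    Assumes a symmetric adjacency matrix with at least one edge.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros(n)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5
    # L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(lap).max()
    return 2.0 * lap / lam_max - np.eye(n)

def cheb_conv(x, lap_scaled, weights):
    """Compute sum_k T_k(L~) x W_k using the Chebyshev recurrence.

    x: (nodes, f_in) features; weights: list of (f_in, f_out) matrices,
    one per polynomial order k = 0, 1, ..., K-1.
    """
    t_prev = x                      # T_0(L~) x = x
    out = t_prev @ weights[0]
    if len(weights) == 1:
        return out
    t_curr = lap_scaled @ x         # T_1(L~) x = L~ x
    out = out + t_curr @ weights[1]
    for w in weights[2:]:
        # T_k = 2 L~ T_{k-1} - T_{k-2}
        t_next = 2.0 * lap_scaled @ t_curr - t_prev
        out = out + t_next @ w
        t_prev, t_curr = t_curr, t_next
    return out

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy input: a 4-node path graph with 3 features per node.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
weights = [rng.normal(size=(3, 2)) for _ in range(3)]  # order-2 polynomial
lap_scaled = scaled_laplacian(adj)
y = cheb_conv(x, lap_scaled, weights)  # output shape: (4, 2)
```

A filter with K weight matrices mixes information from nodes up to K-1 hops apart, which is one reason such filters are often chosen for graphs where local neighborhoods carry signal. In practice a trained layer would learn `weights` by gradient descent rather than sampling them at random as done here.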