A new approach to estimate neighborhood socioeconomic status using supermarket transactions and GNNs
Abstract Ending poverty in all its forms everywhere remains the number one Sustainable Development Goal of the United Nations 2030 Agenda. Governments face challenges in measuring socioeconomic status with fine spatial resolution because traditional data collection methods, such as censuses and surv...
Main Authors: | Eduardo Cruz, Monica Villavicencio, Carmen Vaca, Lisette Espín-Noboa, Nervo Verdezoto |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2025-01-01 |
Series: | EPJ Data Science |
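Subjects: | Neighborhood socioeconomic status; Item embedding; Basket graph; Graph neural network; Spectral convolutional filter; Per capita income |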
Online Access: | https://doi.org/10.1140/epjds/s13688-024-00515-9 |
_version_ | 1832594936705318912 |
---|---|
author | Eduardo Cruz, Monica Villavicencio, Carmen Vaca, Lisette Espín-Noboa, Nervo Verdezoto |
author_sort | Eduardo Cruz |
collection | DOAJ |
description | Abstract: Ending poverty in all its forms everywhere remains the number one Sustainable Development Goal of the United Nations 2030 Agenda. Governments face challenges in measuring socioeconomic status at fine spatial resolution because traditional data collection methods, such as censuses and surveys, are time-consuming and labor-intensive, are performed at long intervals, and cover only a limited population. This data-driven study analyzes the digital traces left by humans in supermarket transactions and models the relationship between consumption behavior and average per capita income, proposing a proxy to estimate socioeconomic status at the urban neighborhood level. We analyze more than 20 million supermarket shopping transactions in Guayaquil, the most populous city in Ecuador. Using customer consumption data, we created a basket graph and fed it into a graph neural network to predict neighborhood socioeconomic status. The model was trained with spectral and spatial convolutional filters, using cross-validation to select the best approach for the prediction. The results show that the Chebyshev spectral convolutional filter has the highest predictive power for neighborhood socioeconomic status, with $R^{2}=0.91$. Our proposed approach contributes to measuring socioeconomic status at the neighborhood level to support policymakers in making informed decisions about resource allocation according to the needs of different geographical areas. |
format | Article |
id | doaj-art-d04c11b13a8a4ecaa473bd3822dd4e08 |
institution | Kabale University |
issn | 2193-1127 |
language | English |
publishDate | 2025-01-01 |
publisher | SpringerOpen |
record_format | Article |
series | EPJ Data Science |
spelling | doaj-art-d04c11b13a8a4ecaa473bd3822dd4e08; 2025-01-19T12:13:54Z; eng; SpringerOpen; EPJ Data Science; ISSN 2193-1127; 2025-01-01; vol. 14, no. 1, pp. 1-18; 10.1140/epjds/s13688-024-00515-9; A new approach to estimate neighborhood socioeconomic status using supermarket transactions and GNNs; Eduardo Cruz (Escuela Superior Politecnica del Litoral), Monica Villavicencio (Escuela Superior Politecnica del Litoral), Carmen Vaca (Escuela Superior Politecnica del Litoral), Lisette Espín-Noboa (Central European University), Nervo Verdezoto (Cardiff University); https://doi.org/10.1140/epjds/s13688-024-00515-9; Neighborhood socioeconomic status, Item embedding, Basket graph, Graph neural network, Spectral convolutional filter, Per capita income |
title | A new approach to estimate neighborhood socioeconomic status using supermarket transactions and GNNs |
topic | Neighborhood socioeconomic status; Item embedding; Basket graph; Graph neural network; Spectral convolutional filter; Per capita income |
url | https://doi.org/10.1140/epjds/s13688-024-00515-9 |
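The abstract names a Chebyshev spectral convolutional filter and reports $R^{2}$ as the evaluation metric. As a rough illustration of those two ideas only, the following is a minimal NumPy sketch of the standard Chebyshev polynomial recurrence on a graph Laplacian, plus the coefficient of determination. It is not the authors' implementation: the function names (`scaled_laplacian`, `cheb_conv`, `r2_score`), the toy graph, the random features, and the random weights are all hypothetical and chosen purely for demonstration.

```python
import numpy as np


def scaled_laplacian(adj):
    """Return L_tilde = 2 * L / lambda_max - I, where L is the
    symmetric normalized Laplacian of the adjacency matrix `adj`."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros(n)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    lap = np.eye(n) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    lam_max = np.linalg.eigvalsh(lap).max()
    return 2.0 * lap / lam_max - np.eye(n)


def cheb_conv(x, l_tilde, weights):
    """Chebyshev filter of order K = len(weights):
    sum over k of T_k(l_tilde) @ x @ weights[k], using the recurrence
    T_0 = x, T_1 = l_tilde @ x, T_k = 2 * l_tilde @ T_{k-1} - T_{k-2}."""
    t_prev, t_curr = x, l_tilde @ x
    out = t_prev @ weights[0]
    if len(weights) > 1:
        out = out + t_curr @ weights[1]
    for w in weights[2:]:
        t_prev, t_curr = t_curr, 2.0 * l_tilde @ t_curr - t_prev
        out = out + t_curr @ w
    return out


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Toy example (hypothetical data): a 4-node path graph with 3 features per node.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
features = rng.normal(size=(4, 3))
weights = [rng.normal(size=(3, 1)) for _ in range(3)]  # order K = 3, untrained
node_outputs = cheb_conv(features, scaled_laplacian(adj), weights)
print(node_outputs.shape)
```

In a trained model the weight matrices would be learned and the filter would typically be followed by a nonlinearity and further layers; the sketch shows only the forward pass of a single filter.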