Adulteration detection in cactus seed oil: Integrating analytical chemistry and machine learning approaches

Bibliographic Details
Main Authors: Said El Harkaoui, Cristina Ortiz Cruz, Aaron Roggenland, Micha Schneider, Sascha Rohn, Stephan Drusch, Bertrand Matthäus
Format: Article
Language: English
Published: Elsevier 2025-01-01
Series: Current Research in Food Science
Online Access: http://www.sciencedirect.com/science/article/pii/S2665927125000176
Description
Summary: Economically motivated adulteration threatens both consumer rights and market integrity, particularly for high-value cold-pressed oils such as cactus seed oil (CO). This study proposes a machine learning model that integrates analytical measurements, data simulations, and classification techniques to detect adulteration of CO with refined sunflower oil (SO) and to determine the detection limit of adulteration without measuring a large number of different mixtures. First, pure CO and SO samples were analyzed for their fatty acid, triacylglycerol, and tocochromanol content using HPLC or GC. The resulting oil composition data served as the foundation for further simulations. Monte Carlo (MC) simulations outperformed Conditional Tabular Generative Adversarial Networks (CTGAN) in simulating realistic oil compositions, yielding lower Kullback-Leibler divergence values than CTGAN. The MC-simulated data were then used to generate larger datasets, a critical step for training and testing two classification models, Random Forest (RF) and Neural Networks (NN), because robust training cannot be achieved with small sample sizes. Both models achieved good classification accuracy; RF outperformed NN, reaching 94% accuracy on simulated datasets and 90% on real-world samples, with adulteration detectable at levels as low as 1%. RF also offers better interpretability and is computationally less demanding than NN, which makes it advantageous for authenticity verification in this study. Therefore, combining MC simulation with RF is proposed as a robust method for detecting CO adulteration. The proposed method, coded in Python and available as open source, offers a flexible framework for continuous adaptation with new data.
ISSN: 2665-9271
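
The simulation step described in the summary can be illustrated with a minimal sketch. It is not the authors' published code. It assumes that each component of an oil can be drawn from an independent normal distribution and that components mix linearly with the blend ratio. The component names, means, and standard deviations are hypothetical placeholders rather than measurements from the study, and all function names are invented for illustration. The Kullback-Leibler divergence is used here only to compare two simulated sets of one component, whereas the study uses it to judge how realistic the simulated compositions are.

```python
import math
import random

# Hypothetical placeholder profiles: component -> (mean, standard deviation),
# in % of total fatty acids. These are NOT measurements from the study.
PURE_CO = {"C16:0": (12.0, 0.5), "C18:1": (20.0, 1.0), "C18:2": (62.0, 1.5)}
PURE_SO = {"C16:0": (6.5, 0.4), "C18:1": (28.0, 2.0), "C18:2": (60.0, 2.0)}


def draw_oil(profile, rng):
    """Draw one simulated composition, one normal sample per component."""
    return {name: max(0.0, rng.gauss(mean, sd))
            for name, (mean, sd) in profile.items()}


def simulate_blend(fraction_so, rng):
    """Simulate one CO sample containing the given fraction of SO.

    Assumption: every component mixes linearly with the blend ratio.
    """
    co = draw_oil(PURE_CO, rng)
    so = draw_oil(PURE_SO, rng)
    return {k: (1.0 - fraction_so) * co[k] + fraction_so * so[k] for k in co}


def histogram(values, lo, hi, bins):
    """Bin values into probabilities, with a 0.5 pseudo-count per bin
    so that no bin has zero probability."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1
    total = len(values) + 0.5 * bins
    return [(c + 0.5) / total for c in counts]


def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
pure = [simulate_blend(0.0, rng)["C18:1"] for _ in range(n)]
pure_again = [simulate_blend(0.0, rng)["C18:1"] for _ in range(n)]
blend = [simulate_blend(0.5, rng)["C18:1"] for _ in range(n)]

p_pure = histogram(pure, 10.0, 40.0, 30)
kl_same = kl_divergence(p_pure, histogram(pure_again, 10.0, 40.0, 30))
kl_blend = kl_divergence(p_pure, histogram(blend, 10.0, 40.0, 30))
print(f"KL pure vs pure:  {kl_same:.4f}")
print(f"KL pure vs blend: {kl_blend:.4f}")
```

Under these assumptions, two independent draws of pure CO give a divergence close to zero, while a heavily adulterated blend gives a clearly larger one. Labelled compositions simulated this way could then serve as training data for a classifier such as Random Forest, which is the role the summary assigns to the MC-simulated datasets.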