Adulteration detection in cactus seed oil: Integrating analytical chemistry and machine learning approaches

Economically motivated adulteration threatens both consumer rights and market integrity, particularly with high-value cold-pressed oils like cactus seed oil (CO). This study proposes a machine learning model that integrates analytical measurements, data simulations, and classification techniques to...

Bibliographic Details
Main Authors: Said El Harkaoui, Cristina Ortiz Cruz, Aaron Roggenland, Micha Schneider, Sascha Rohn, Stephan Drusch, Bertrand Matthäus
Format: Article
Language: English
Published: Elsevier 2025-01-01
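Series: Current Research in Food Science
Subjects: Cactus seed oil; Authenticity; Machine learning; Conditional generative adversarial network; Monte-Carlo; Random Forest
Online Access: http://www.sciencedirect.com/science/article/pii/S2665927125000176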
collection DOAJ
description Economically motivated adulteration threatens both consumer rights and market integrity, particularly for high-value cold-pressed oils such as cactus seed oil (CO). This study proposes a machine learning model that integrates analytical measurements, data simulations, and classification techniques to detect adulteration of CO with refined sunflower oil (SO) and to determine the detectable limit of adulteration without measuring a large number of different mixtures. First, pure CO and SO samples were analyzed for their fatty acid, triacylglycerol, and tocochromanol content using HPLC or GC. The resulting oil composition data served as the foundation for further simulations. Monte Carlo (MC) simulations outperformed Conditional Tabular Generative Adversarial Networks (CTGAN) in simulating realistic oil compositions, yielding lower Kullback-Leibler divergence values than CTGAN. The MC-simulated data were then used to build larger datasets, a critical step for training and testing two classification models, Random Forest (RF) and Neural Networks (NN), because robust training cannot be achieved with small sample sizes. Both models achieved good classification accuracies. RF was more accurate than NN, reaching 94% on simulated datasets and 90% on real-world samples, with detectable adulteration levels as low as 1%. RF also offers better interpretability and is computationally less demanding than NN, which makes it advantageous for authenticity verification in this study. The study therefore proposes combining MC simulation with RF as a robust method for detecting CO adulteration. The proposed method, coded in Python and available as open source, offers a flexible framework for continuous adaptation with new data.
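The simulation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the marker names, the means and standard deviations, the independent-normal sampling, and the linear mixing rule are all assumptions made for illustration, and only the Python standard library is used. The sketch draws simulated pure oils, blends them at a given SO fraction, and compares the resulting distributions with a discrete Kullback-Leibler divergence.

```python
import math
import random

# Hypothetical mean/SD values (% of total fatty acids) for two markers.
# These are illustrative placeholders, NOT measurements from the study.
PURE_CO = {"linoleic": (62.0, 2.0), "oleic": (20.0, 1.5)}
PURE_SO = {"linoleic": (55.0, 3.0), "oleic": (32.0, 3.0)}


def draw_oil(profile, rng):
    """Draw one simulated pure-oil composition from independent normals."""
    return {k: max(0.0, rng.gauss(mu, sd)) for k, (mu, sd) in profile.items()}


def simulate_blend(fraction_so, rng):
    """Mix one simulated CO and one simulated SO sample linearly.

    Assumption: each marker scales linearly with the SO mass fraction.
    """
    co, so = draw_oil(PURE_CO, rng), draw_oil(PURE_SO, rng)
    return {k: (1 - fraction_so) * co[k] + fraction_so * so[k] for k in co}


def histogram(values, lo, hi, bins):
    """Normalized histogram with add-one smoothing so no bin is empty."""
    counts = [1] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(P || Q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


rng = random.Random(0)  # fixed seed for reproducibility
pure = [simulate_blend(0.0, rng)["oleic"] for _ in range(5000)]
pure_again = [simulate_blend(0.0, rng)["oleic"] for _ in range(5000)]
blend_20 = [simulate_blend(0.20, rng)["oleic"] for _ in range(5000)]

p = histogram(pure, 10, 45, 35)
kl_same = kl_divergence(p, histogram(pure_again, 10, 45, 35))
kl_blend = kl_divergence(p, histogram(blend_20, 10, 45, 35))
print(f"KL(pure || pure resample) = {kl_same:.4f}")
print(f"KL(pure || 20% SO blend)  = {kl_blend:.4f}")
```

In this toy setting a second draw of pure CO should give a divergence near zero, while the 20% blend should give a clearly larger one. Rows produced by `simulate_blend` could also be labeled with their SO fraction and passed to a classifier such as Random Forest, which is the role the abstract assigns to the simulated data.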
id doaj-art-fd340d4ca1c246baa08f07ca3b02c819
institution Kabale University
issn 2665-9271
spelling doaj-art-fd340d4ca1c246baa08f07ca3b02c819; 2025-01-31T05:12:23Z; eng; Elsevier; Current Research in Food Science; 2665-9271; 2025-01-01; volume 10; article 100986
affiliations Said El Harkaoui (corresponding author): Max Rubner-Institut, Federal Research Institute for Nutrition and Food, Department for Safety and Quality of Cereals, Schützenberg 12, 32756, Detmold, Germany; Department of Food Chemistry and Analysis, Institute of Food Technology and Food Chemistry, Technische Universität Berlin, Berlin, Germany; Department of Food Technology and Food Material Science, Institute of Food Technology and Food Chemistry, Technische Universität Berlin, Berlin, Germany
Cristina Ortiz Cruz: Max Rubner-Institut, Federal Research Institute for Nutrition and Food, Zentralabteilung, Haid-und-Neu-Str. 9, 76131, Karlsruhe, Germany; BMEL Project KIDA, AI consultancy, Germany
Aaron Roggenland: Max Rubner-Institut, Federal Research Institute for Nutrition and Food, Zentralabteilung, Schützenberg 12, 32756, Detmold, Germany; BMEL Project KIDA, AI consultancy, Germany
Micha Schneider: Johann Heinrich von Thünen Institute - Federal Research Institute for Rural Areas, Forestry and Fisheries, Bundesallee 50, 38116, Braunschweig, Germany; BMEL Project KIDA, AI consultancy, Germany
Sascha Rohn: Department of Food Chemistry and Analysis, Institute of Food Technology and Food Chemistry, Technische Universität Berlin, Berlin, Germany
Stephan Drusch: Department of Food Technology and Food Material Science, Institute of Food Technology and Food Chemistry, Technische Universität Berlin, Berlin, Germany
Bertrand Matthäus: Max Rubner-Institut, Federal Research Institute for Nutrition and Food, Department for Safety and Quality of Cereals, Schützenberg 12, 32756, Detmold, Germany
topic Cactus seed oil
Authenticity
Machine learning
Conditional generative adversarial network
Monte-Carlo
Random Forest