Privacy-Preserving Machine Learning (PPML) Inference for Clinically Actionable Models
Machine learning (ML) refers to algorithms (often models) that are learned directly from data, drawing on past experience. As algorithms constantly evolve with the exponential increase in computing power and the vast amounts of data being generated, the privacy of algorithms as well as of data becomes extremely important due to regulations and IP rights.
| Main Authors: | Baris Balaban, Seyma Selcan Magara, Caglar Yilgor, Altug Yucekul, Ibrahim Obeid, Javier Pizones, Frank Kleinstueck, Francisco Javier Sanchez Perez-Grueso, Ferran Pellise, Ahmet Alanay, Erkay Savas, Cetin Bagci, Osman Ugur Sezerman |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Homomorphic Encryption; Privacy-Preserving Machine Learning; XGBoost |
| Online Access: | https://ieeexplore.ieee.org/document/10878994/ |
| _version_ | 1850236146305466368 |
|---|---|
| author | Baris Balaban; Seyma Selcan Magara; Caglar Yilgor; Altug Yucekul; Ibrahim Obeid; Javier Pizones; Frank Kleinstueck; Francisco Javier Sanchez Perez-Grueso; Ferran Pellise; Ahmet Alanay; Erkay Savas; Cetin Bagci; Osman Ugur Sezerman |
| author_facet | Baris Balaban; Seyma Selcan Magara; Caglar Yilgor; Altug Yucekul; Ibrahim Obeid; Javier Pizones; Frank Kleinstueck; Francisco Javier Sanchez Perez-Grueso; Ferran Pellise; Ahmet Alanay; Erkay Savas; Cetin Bagci; Osman Ugur Sezerman |
| author_sort | Baris Balaban |
| collection | DOAJ |
| description | Machine learning (ML) refers to algorithms (often models) that are learned directly from data, drawing on past experience. As algorithms constantly evolve with the exponential increase in computing power and the vast amounts of data being generated, the privacy of algorithms as well as of data becomes extremely important due to regulations and IP rights. It is therefore vital to address the privacy and security concerns of both data and model, together with other performance metrics, when commercializing machine learning models. Our aim is to show that privacy-preserving machine learning inference methods can safeguard the intellectual property of models and prevent plaintext models from disclosing information about the sensitive data employed in training these ML models. Additionally, these methods protect the confidentiality of model users’ sensitive patient data. We accomplish this by performing a security analysis to determine an appropriate query limit for each user, using the European Spine Study Group’s (ESSG) adult spinal deformity dataset. We implement privacy-preserving tree-based machine learning inference and run two security scenarios (scenario A and scenario B), each containing four parts in which the number of synthetic data points used to enhance the accuracy of the attacker’s substitute model is progressively increased. In each scenario, a target model is trained on particular operation site(s), and substitute models are built with nine-time threefold cross-validation using the XGBoost algorithm on the remaining sites’ data to assess the security of the target model. First, we create box plots of the test sets’ accuracy, sensitivity, precision, and F-score metrics to compare the substitute models’ performance with the target model. Second, we compare the gain values of the target and substitute models’ features. Third, we provide an in-depth analysis, using a heatmap, of whether the target model’s split points are included in the substitute models. Finally, we compare the outputs of the public and privacy-preserving models and report intermediate timing results. The privacy-preserving XGBoost model’s predictions are identical to those of the original plaintext model in the two scenarios. The differences between the performance metrics of the best-performing substitute models and the target models are 0.27, 0.18, 0.25, and 0.26 for scenario A, and 0.04, 0, 0.04, and 0.03 for scenario B, for accuracy, sensitivity, precision, and F-score, respectively. The differences between the target model’s accuracy and the mean accuracy of the models in each scenario on the substitute models’ test dataset are 0.38 for scenario A and 0.14 for scenario B. Based on our findings, we conclude that machine learning models (i.e., our target models) may contribute to advances in the fields where they are deployed. Securing both the model and the user data protects the intellectual property of ML models and prevents leakage of the sensitive information used in training as well as of model users’ data. INDEX TERMS: Homomorphic encryption, privacy-preserving machine learning, XGBoost. (Illustrative code sketches of the substitute-model analysis, the gain and split-point comparison, and the plaintext-versus-encrypted equivalence check follow this record.) |
| format | Article |
| id | doaj-art-3ab0bf72b5904aad96bf6b91b27c3f3b |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-3ab0bf72b5904aad96bf6b91b27c3f3b; 2025-08-20T02:02:01Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; Vol. 13, pp. 37431-37456; DOI 10.1109/ACCESS.2025.3540261; IEEE document 10878994; Privacy-Preserving Machine Learning (PPML) Inference for Clinically Actionable Models. Authors: Baris Balaban (https://orcid.org/0000-0001-6573-402X), Seyma Selcan Magara (https://orcid.org/0000-0002-0427-811X), Caglar Yilgor, Altug Yucekul, Ibrahim Obeid, Javier Pizones (https://orcid.org/0000-0001-7394-9637), Frank Kleinstueck, Francisco Javier Sanchez Perez-Grueso, Ferran Pellise, Ahmet Alanay, Erkay Savas, Cetin Bagci, Osman Ugur Sezerman. Affiliations: Department of Biostatistics and Bioinformatics, Institute of Health Sciences, Acıbadem Mehmet Ali Aydınlar University, Istanbul, Türkiye (Balaban, Sezerman); Department of Computer Science and Engineering, Sabancı University, Istanbul, Türkiye (Magara, Savas); Department of Orthopedics and Traumatology, Acibadem University School of Medicine, Istanbul, Türkiye (Yilgor, Yucekul, Alanay); Clinique du Dos, Elsan Jean Villar Private Hospital, Bordeaux, France (Obeid); Spine Surgery Unit, Hospital Universitario La Paz, Madrid, Spain (Pizones, Sanchez Perez-Grueso); Department of Orthopedics and Neurosurgery, Spine Center Division, Schulthess Klinik, Zürich, Switzerland (Kleinstueck); Spine Surgery Unit, Hospital Universitari Vall d’Hebron, Barcelona, Spain (Pellise); Bilmed Computer and Software Company, Istanbul, Türkiye (Bagci). Abstract: identical to the description field above. https://ieeexplore.ieee.org/document/10878994/; Topics: Homomorphic Encryption; Privacy-Preserving Machine Learning; XGBoost |
| spellingShingle | Baris Balaban; Seyma Selcan Magara; Caglar Yilgor; Altug Yucekul; Ibrahim Obeid; Javier Pizones; Frank Kleinstueck; Francisco Javier Sanchez Perez-Grueso; Ferran Pellise; Ahmet Alanay; Erkay Savas; Cetin Bagci; Osman Ugur Sezerman; Privacy-Preserving Machine Learning (PPML) Inference for Clinically Actionable Models; IEEE Access; Homomorphic Encryption; Privacy-Preserving Machine Learning; XGBoost |
| title | Privacy-Preserving Machine Learning (PPML) Inference for Clinically Actionable Models |
| title_full | Privacy-Preserving Machine Learning (PPML) Inference for Clinically Actionable Models |
| title_fullStr | Privacy-Preserving Machine Learning (PPML) Inference for Clinically Actionable Models |
| title_full_unstemmed | Privacy-Preserving Machine Learning (PPML) Inference for Clinically Actionable Models |
| title_short | Privacy-Preserving Machine Learning (PPML) Inference for Clinically Actionable Models |
| title_sort | privacy preserving machine learning ppml inference for clinically actionable models |
| topic | Homomorphic Encryption; Privacy-Preserving Machine Learning; XGBoost |
| url | https://ieeexplore.ieee.org/document/10878994/ |
| work_keys_str_mv | AT barisbalaban privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT seymaselcanmagara privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT caglaryilgor privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT altugyucekul privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT ibrahimobeid privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT javierpizones privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT frankkleinstueck privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT franciscojaviersanchezperezgrueso privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT ferranpellise privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT ahmetalanay privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT erkaysavas privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT cetinbagci privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels AT osmanugursezerman privacypreservingmachinelearningppmlinferenceforclinicallyactionablemodels |
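
The security analysis summarized in the description field follows a model-extraction pattern: the attacker labels its own (and progressively more synthetic) data points by querying the target model, trains substitute XGBoost models with nine-time threefold cross-validation, and compares their test metrics against the target's. Below is a minimal sketch of that loop, assuming NumPy arrays and a hypothetical `query_target` callable standing in for the deployed model's prediction endpoint; neither name comes from the paper.

```python
# Sketch of the substitute-model (model-extraction) analysis described above:
# label attacker-held data with the target model's predictions, then train
# substitute XGBoost models with 9 repetitions of 3-fold cross-validation.
# `X_attacker` and `query_target` are illustrative placeholders, not the
# authors' code; a binary outcome is assumed.
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from xgboost import XGBClassifier

def evaluate_substitutes(X_attacker, query_target, n_splits=3, n_repeats=9, seed=0):
    """Train substitute models on target-labelled data; return per-fold metrics."""
    # The attacker never sees ground-truth labels: it uses the target's answers.
    y_stolen = query_target(X_attacker)

    cv = RepeatedStratifiedKFold(n_splits=n_splits, n_repeats=n_repeats,
                                 random_state=seed)
    metrics = []
    for train_idx, test_idx in cv.split(X_attacker, y_stolen):
        substitute = XGBClassifier(eval_metric="logloss")
        substitute.fit(X_attacker[train_idx], y_stolen[train_idx])
        pred = substitute.predict(X_attacker[test_idx])
        truth = y_stolen[test_idx]
        metrics.append({
            "accuracy": accuracy_score(truth, pred),
            "sensitivity": recall_score(truth, pred),  # sensitivity == recall
            "precision": precision_score(truth, pred),
            "f_score": f1_score(truth, pred),
        })
    return metrics  # 27 entries (9 repeats x 3 folds), ready for box plots
```

Appending more synthetic points to `X_attacker` before querying reproduces the four parts of each scenario; the gap between these fold metrics and the target model's own metrics is what yields the reported differences (e.g., 0.27 in accuracy for scenario A versus 0.04 for scenario B).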
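The description also mentions comparing the gain values of the target and substitute models' features and checking, via a heatmap, whether the target's split points reappear in the substitutes. Both quantities are readable from a trained booster with the standard xgboost API; the `split_overlap` helper below is an invented illustration of how heatmap cells could be computed, not the authors' implementation.

```python
# Extract per-feature gain and split thresholds from trained XGBoost models,
# as needed for the gain comparison and the split-point inclusion heatmap.
from xgboost import XGBClassifier

def feature_gains(model: XGBClassifier) -> dict:
    # Average gain of the splits that use each feature, across all trees.
    return model.get_booster().get_score(importance_type="gain")

def split_points(model: XGBClassifier) -> dict:
    # Map feature name -> sorted unique split thresholds across all trees.
    df = model.get_booster().trees_to_dataframe()
    splits = df[df["Feature"] != "Leaf"]  # leaf rows carry no threshold
    return {feat: sorted(grp["Split"].unique())
            for feat, grp in splits.groupby("Feature")}

def split_overlap(target: XGBClassifier, substitute: XGBClassifier,
                  tol: float = 1e-6) -> dict:
    """Fraction of the target's split points matched (within tol) by a substitute."""
    t_splits, s_splits = split_points(target), split_points(substitute)
    overlap = {}
    for feat, thresholds in t_splits.items():
        matched = [any(abs(t - s) <= tol for s in s_splits.get(feat, []))
                   for t in thresholds]
        overlap[feat] = sum(matched) / len(matched)
    return overlap  # one value per feature -> one heatmap cell per substitute
```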
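Finally, the record states that the privacy-preserving XGBoost model's predictions are identical to the plaintext model's. The paper implements its own homomorphic-encryption-based tree inference; purely as an illustration of the same equivalence check, here is how it could look with the open-source Concrete ML library. This is an assumption, not the authors' stack, and the exact method signatures vary between Concrete ML versions.

```python
# Illustrative plaintext-vs-encrypted equivalence check. Zama's Concrete ML
# stands in for the paper's custom HE pipeline; API names follow recent
# Concrete ML documentation and may differ in other versions.
import numpy as np
from concrete.ml.sklearn import XGBClassifier  # FHE-capable drop-in model

def predictions_match(X_train, y_train, X_query) -> bool:
    model = XGBClassifier(n_bits=6)  # quantized to keep the FHE circuit small
    model.fit(X_train, y_train)

    clear_pred = model.predict(X_query)               # plaintext inference
    model.compile(X_train)                            # build the FHE circuit
    fhe_pred = model.predict(X_query, fhe="execute")  # encrypted inference

    # The paper reports identical outputs in both scenarios; timing of the
    # encrypted path would be measured around the second predict call.
    return np.array_equal(clear_pred, fhe_pred)
```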