FI-NL2PY2SQL: Financial Industry NL2SQL Innovation Model Based on Python and Large Language Model

Large models have driven many breakthroughs in NL2SQL, and indeed revolutionary changes, yet customers still expect its accuracy to keep improving through optimization. This paper proposes a...

Bibliographic Details
Main Authors: Xiaozheng Du, Shijing Hu, Feng Zhou, Cheng Wang, Binh Minh Nguyen
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Future Internet
Subjects: LLM, NL2SQL, pre-training, prompt, Python
Online Access: https://www.mdpi.com/1999-5903/17/1/12
_version_ 1832588399411724288
author Xiaozheng Du
Shijing Hu
Feng Zhou
Cheng Wang
Binh Minh Nguyen
author_sort Xiaozheng Du
collection DOAJ
description Large models have driven many breakthroughs in NL2SQL, and indeed revolutionary changes, yet customers still expect its accuracy to keep improving through optimization. This paper proposes a new NL2SQL method based on a large language model (LLM) that can be adapted to an edge-cloud computing platform. Natural language is first converted into Python, and SQL is then generated through Python. To satisfy the traceability that financial industry regulation requires, the method uses the open-source large model DeepSeek. In tests on the BIRD dataset against most LLM-based NL2SQL models, execution accuracy (EX) is at least 2.73% higher than the original method, F1 is at least 3.72 higher, and the valid efficiency score (VES) is 6.34% higher. The authors conclude that the algorithm greatly improves NL2SQL accuracy in the financial industry and gives business personnel a robust way to access databases.
format Article
id doaj-art-291e3337dd1b4a28b3a10fe0cabc7c38
institution Kabale University
issn 1999-5903
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Future Internet
spelling doaj-art-291e3337dd1b4a28b3a10fe0cabc7c38
2025-01-24T13:33:33Z
eng
MDPI AG
Future Internet
1999-5903
2025-01-01
Volume 17, Issue 1, Article 12
10.3390/fi17010012
FI-NL2PY2SQL: Financial Industry NL2SQL Innovation Model Based on Python and Large Language Model
Xiaozheng Du (School of Computer Science, Fudan University, Shanghai 200438, China)
Shijing Hu (School of Computer Science, Fudan University, Shanghai 200438, China)
Feng Zhou (School of Artificial Intelligence, Shanghai Normal University Tianhua College, No. 1661 Shengxin North Road, Shanghai 201815, China)
Cheng Wang (Business Analysis BU, GienTech Technology Co., Ltd., Shanghai 200232, China)
Binh Minh Nguyen (School of Information and Communication Technology, Hanoi University of Science and Technology, No. 1 Dai Co Viet, Hai Ba Trung, Hanoi 100000, Vietnam)
https://www.mdpi.com/1999-5903/17/1/12
LLM
NL2SQL
pre-training
prompt
Python
title FI-NL2PY2SQL: Financial Industry NL2SQL Innovation Model Based on Python and Large Language Model
title_sort fi nl2py2sql financial industry nl2sql innovation model based on python and large language model
topic LLM
NL2SQL
pre-training
prompt
Python
url https://www.mdpi.com/1999-5903/17/1/12