Exploring Large Language Models’ Ability to Describe Entity-Relationship Schema-Based Conceptual Data Models
In the field of databases, Large Language Models (LLMs) have recently been studied for generating SQL queries from textual descriptions, while their use for conceptual or logical data modeling remains less explored. The conceptual design of relational databases commonly relies on the entity-relation...
| Main Authors: | Andrea Avignone, Alessia Tierno, Alessandro Fiori, Silvia Chiusano |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Information |
| Subjects: | relational database; large language models; database design; entity-relationship |
| Online Access: | https://www.mdpi.com/2078-2489/16/5/368 |
| _version_ | 1849327107039035392 |
|---|---|
| author | Andrea Avignone, Alessia Tierno, Alessandro Fiori, Silvia Chiusano |
| collection | DOAJ |
| description | In the field of databases, Large Language Models (LLMs) have recently been studied for generating SQL queries from textual descriptions, while their use for conceptual or logical data modeling remains less explored. The conceptual design of relational databases commonly relies on the entity-relationship (ER) data model, where translation rules enable mapping an ER schema into corresponding relational tables with their constraints. Our study investigates the capability of LLMs to describe in natural language a database conceptual data model based on the ER schema. Whether for documentation, onboarding, or communication with non-technical stakeholders, LLMs can significantly improve the process of explaining the ER schema by generating accurate descriptions about how the components interact as well as the represented information. To guide the LLM with challenging constructs, specific hints are defined to provide an enriched ER schema. Different LLMs have been explored (ChatGPT 3.5 and 4, Llama2, Gemini, Mistral 7B) and different metrics (F1 score, ROUGE, perplexity) are used to assess the quality of the generated descriptions and compare the different LLMs. |
| format | Article |
| id | doaj-art-93486bd208564c099ad75d9be00d018f |
| institution | Kabale University |
| issn | 2078-2489 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Information |
| spelling | doaj-art-93486bd208564c099ad75d9be00d018f; 2025-08-20T03:47:58Z; eng; MDPI AG; Information; 2078-2489; 2025-04-01; 16; 5; 368; 10.3390/info16050368; Exploring Large Language Models’ Ability to Describe Entity-Relationship Schema-Based Conceptual Data Models; Andrea Avignone, Alessia Tierno, Alessandro Fiori, Silvia Chiusano (all: Department of Control and Computer Engineering, Politecnico di Torino, 10129 Torino, Italy); https://www.mdpi.com/2078-2489/16/5/368; relational database; large language models; database design; entity-relationship |
| title | Exploring Large Language Models’ Ability to Describe Entity-Relationship Schema-Based Conceptual Data Models |
| topic | relational database; large language models; database design; entity-relationship |
| url | https://www.mdpi.com/2078-2489/16/5/368 |