Arch-Eval benchmark for assessing Chinese architectural domain knowledge in large language models
Abstract: The burgeoning application of Large Language Models (LLMs) in Natural Language Processing (NLP) has prompted scrutiny of their domain-specific knowledge processing, especially in the construction industry. Despite high demand, there is a scarcity of evaluative studies for LLMs in this area....
| Main Authors: | Jie Wu, Mincheng Jiang, Juntian Fan, Shimin Li, Hongtao Xu, Ye Zhao |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-04-01 |
| Series: | Scientific Reports |
| Subjects: | |
| Online Access: | https://doi.org/10.1038/s41598-025-98236-0 |
Similar Items
- Meticulous Thought Defender: Fine-Grained Chain-of-Thought (CoT) for Detecting Prompt Injection Attacks of Large Language Models
  by: Lijuan Shi, et al. Published: (2025-01-01)
- Measuring and Improving the Efficiency of Python Code Generated by LLMs Using CoT Prompting and Fine-Tuning
  by: Ramya Jonnala, et al. Published: (2025-01-01)
- SHIELD: an evaluation benchmark for face spoofing and forgery detection with multimodal large language models
  by: Yichen Shi, et al. Published: (2025-06-01)
- Capability-based training framework for generative AI in higher education
  by: Pablo Burneo-Arteaga, et al. Published: (2025-06-01)
- Correction: Capability-based training framework for generative AI in higher education
  by: Pablo Burneo-Arteaga, et al. Published: (2025-08-01)