Analysis of argument structure constructions in the large language model BERT

Bibliographic Details
Main Authors: Pegah Ramezani, Achim Schilling, Patrick Krauss
Format: Article
Language: English
Published: Frontiers Media S.A., 2025-01-01
Series: Frontiers in Artificial Intelligence
Subjects: argument structure constructions; linguistic constructions (CXs); large language models (LLMs); BERT; sentence representation; computational linguistics
Online Access: https://www.frontiersin.org/articles/10.3389/frai.2025.1477246/full
author Pegah Ramezani
Achim Schilling
Patrick Krauss
author_facet Pegah Ramezani
Achim Schilling
Patrick Krauss
author_sort Pegah Ramezani
collection DOAJ
description Understanding how language and linguistic constructions are processed in the brain is a fundamental question in cognitive computational neuroscience. In this study, we investigate the processing and representation of Argument Structure Constructions (ASCs) in the BERT language model, extending previous analyses conducted with Long Short-Term Memory (LSTM) networks. We utilized a custom GPT-4 generated dataset comprising 2000 sentences, evenly distributed among four ASC types: transitive, ditransitive, caused-motion, and resultative constructions. BERT was assessed using the various token embeddings across its 12 layers. Our analyses involved visualizing the embeddings with Multidimensional Scaling (MDS) and t-Distributed Stochastic Neighbor Embedding (t-SNE), and calculating the Generalized Discrimination Value (GDV) to quantify the degree of clustering. We also trained feedforward classifiers (probes) to predict construction categories from these embeddings. Results reveal that CLS token embeddings cluster best according to ASC types in layers 2, 3, and 4, with diminished clustering in intermediate layers and a slight increase in the final layers. Token embeddings for DET and SUBJ showed consistent intermediate-level clustering across layers, while VERB embeddings demonstrated a systematic increase in clustering from layer 1 to 12. OBJ embeddings exhibited minimal clustering initially, which increased substantially, peaking in layer 10. Probe accuracies indicated that initial embeddings contained no specific construction information, as seen in low clustering and chance-level accuracies in layer 1. From layer 2 onward, probe accuracies surpassed 90 percent, highlighting latent construction category information not evident from GDV clustering alone. Additionally, Fisher Discriminant Ratio (FDR) analysis of attention weights revealed that OBJ tokens had the highest FDR scores, indicating they play a crucial role in differentiating ASCs, followed by VERB and DET tokens. SUBJ, CLS, and SEP tokens did not show significant FDR scores. Our study underscores the complex, layered processing of linguistic constructions in BERT, revealing both similarities and differences compared to recurrent models like LSTMs. Future research will compare these computational findings with neuroimaging data during continuous speech perception to better understand the neural correlates of ASC processing. This research demonstrates the potential of both recurrent and transformer-based neural language models to mirror linguistic processing in the human brain, offering valuable insights into the computational and neural mechanisms underlying language understanding.
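
To make the pipeline in the description above concrete, here is a minimal sketch; it is not the authors' code. It assumes the HuggingFace transformers library with bert-base-uncased, substitutes scikit-learn's MLPClassifier for the paper's feedforward probes, uses a handful of toy sentences in place of the 2,000 GPT-4-generated ones, and implements a GDV-style score (z-scored, 0.5-scaled data; mean intra-class minus mean inter-class distance, divided by the square root of the dimensionality) following the definition in the GDV literature. The two-class FDR helper is likewise illustrative, since the abstract does not spell out the multi-class variant used.

```python
# Illustrative sketch only (assumptions noted above); not the authors' implementation.
import numpy as np
import torch
from scipy.spatial.distance import cdist, pdist
from sklearn.neural_network import MLPClassifier
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def cls_embeddings_per_layer(sentences):
    """CLS vectors per sentence, shape (13, n_sentences, 768):
    the embedding layer plus BERT's 12 transformer layers."""
    per_sentence = []
    with torch.no_grad():
        for s in sentences:
            hs = model(**tokenizer(s, return_tensors="pt")).hidden_states
            per_sentence.append(torch.stack(hs)[:, 0, 0, :])  # CLS = position 0
    return torch.stack(per_sentence, dim=1).numpy()

def gdv(points, labels):
    """GDV-style clustering score (more negative = better-separated classes):
    z-score each dimension, scale by 0.5, then take mean intra-class distance
    minus mean inter-class distance, normalized by sqrt(dimensionality)."""
    z = 0.5 * (points - points.mean(0)) / (points.std(0) + 1e-8)
    cs = np.unique(labels)
    intra = np.mean([pdist(z[labels == c]).mean() for c in cs])
    inter = np.mean([cdist(z[labels == a], z[labels == b]).mean()
                     for i, a in enumerate(cs) for b in cs[i + 1:]])
    return (intra - inter) / np.sqrt(points.shape[1])

def fisher_discriminant_ratio(x_a, x_b):
    """Two-class FDR for a scalar feature (e.g. one attention weight):
    squared mean difference over the sum of the class variances."""
    return (x_a.mean() - x_b.mean()) ** 2 / (x_a.var() + x_b.var() + 1e-8)

# Stand-in sentences, two per ASC type (the study used 500 per type).
sentences = [
    "The cat chased the mouse.", "The boy kicked the ball.",                   # transitive
    "She gave him a book.", "He sent her a letter.",                           # ditransitive
    "He pushed the cart into the barn.", "She threw the keys onto the roof.",  # caused-motion
    "She painted the wall red.", "He hammered the metal flat.",                # resultative
]
labels = np.repeat(np.arange(4), 2)

emb = cls_embeddings_per_layer(sentences)
for layer in range(1, emb.shape[0]):  # transformer layers 1..12
    X = emb[layer]
    probe = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
    probe.fit(X, labels)  # trivial on 8 sentences; shown only to fix the pipeline shape
    print(f"layer {layer:2d}  GDV {gdv(X, labels):+.3f}  "
          f"train acc {probe.score(X, labels):.2f}")
```

A real run would evaluate the probe on held-out sentences rather than its training set, and would request output_attentions=True from the model to obtain the attention weights that an FDR-style analysis operates on; both are omitted here to keep the sketch short.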
format Article
id doaj-art-d26624088675461ebc8f543976cd6a97
institution Kabale University
issn 2624-8212
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Artificial Intelligence
spelling doaj-art-d26624088675461ebc8f543976cd6a97
2025-01-31T06:39:58Z
eng
Frontiers Media S.A.
Frontiers in Artificial Intelligence
2624-8212
2025-01-01
Volume 8, article 1477246, doi: 10.3389/frai.2025.1477246
Analysis of argument structure constructions in the large language model BERT
Pegah Ramezani: Department of English and American Studies, University of Erlangen-Nuremberg, Erlangen, Germany; Pattern Recognition Lab, Cognitive Computational Neuroscience Group, University of Erlangen-Nuremberg, Erlangen, Germany
Achim Schilling: Pattern Recognition Lab, Cognitive Computational Neuroscience Group, University of Erlangen-Nuremberg, Erlangen, Germany; Neuroscience Lab, University Hospital Erlangen, Erlangen, Germany
Patrick Krauss: Pattern Recognition Lab, Cognitive Computational Neuroscience Group, University of Erlangen-Nuremberg, Erlangen, Germany; Neuroscience Lab, University Hospital Erlangen, Erlangen, Germany
(Abstract identical to the description field above.)
https://www.frontiersin.org/articles/10.3389/frai.2025.1477246/full
argument structure constructions; linguistic constructions (CXs); large language models (LLMs); BERT; sentence representation; computational linguistics
spellingShingle Pegah Ramezani
Achim Schilling
Patrick Krauss
Analysis of argument structure constructions in the large language model BERT
Frontiers in Artificial Intelligence
argument structure constructions
linguistic constructions (CXs)
large language models (LLMs)
BERT
sentence representation
computational linguistics
title Analysis of argument structure constructions in the large language model BERT
title_full Analysis of argument structure constructions in the large language model BERT
title_fullStr Analysis of argument structure constructions in the large language model BERT
title_full_unstemmed Analysis of argument structure constructions in the large language model BERT
title_short Analysis of argument structure constructions in the large language model BERT
title_sort analysis of argument structure constructions in the large language model bert
topic argument structure constructions
linguistic constructions (CXs)
large language models (LLMs)
BERT
sentence representation
computational linguistics
url https://www.frontiersin.org/articles/10.3389/frai.2025.1477246/full
work_keys_str_mv AT pegahramezani analysisofargumentstructureconstructionsinthelargelanguagemodelbert
AT achimschilling analysisofargumentstructureconstructionsinthelargelanguagemodelbert
AT patrickkrauss analysisofargumentstructureconstructionsinthelargelanguagemodelbert