Analysis of argument structure constructions in the large language model BERT

Bibliographic Details
Main Authors: Pegah Ramezani, Achim Schilling, Patrick Krauss
Format: Article
Language: English
Published: Frontiers Media S.A., 2025-01-01
Series: Frontiers in Artificial Intelligence
Subjects: argument structure constructions; linguistic constructions (CXs); large language models (LLMs); BERT; sentence representation; computational linguistics
Online Access: https://www.frontiersin.org/articles/10.3389/frai.2025.1477246/full
author Pegah Ramezani
Achim Schilling
Patrick Krauss
author_facet Pegah Ramezani
Achim Schilling
Patrick Krauss
author_sort Pegah Ramezani
collection DOAJ
description Understanding how language and linguistic constructions are processed in the brain is a fundamental question in cognitive computational neuroscience. In this study, we investigate the processing and representation of Argument Structure Constructions (ASCs) in the BERT language model, extending previous analyses conducted with Long Short-Term Memory (LSTM) networks. We utilized a custom GPT-4 generated dataset comprising 2000 sentences, evenly distributed among four ASC types: transitive, ditransitive, caused-motion, and resultative constructions. BERT was assessed using the various token embeddings across its 12 layers. Our analyses involved visualizing the embeddings with Multidimensional Scaling (MDS) and t-Distributed Stochastic Neighbor Embedding (t-SNE), and calculating the Generalized Discrimination Value (GDV) to quantify the degree of clustering. We also trained feedforward classifiers (probes) to predict construction categories from these embeddings. Results reveal that CLS token embeddings cluster best according to ASC types in layers 2, 3, and 4, with diminished clustering in intermediate layers and a slight increase in the final layers. Token embeddings for DET and SUBJ showed consistent intermediate-level clustering across layers, while VERB embeddings demonstrated a systematic increase in clustering from layer 1 to 12. OBJ embeddings exhibited minimal clustering initially, which increased substantially, peaking in layer 10. Probe accuracies indicated that initial embeddings contained no specific construction information, as seen in low clustering and chance-level accuracies in layer 1. From layer 2 onward, probe accuracies surpassed 90 percent, highlighting latent construction category information not evident from GDV clustering alone. Additionally, Fisher Discriminant Ratio (FDR) analysis of attention weights revealed that OBJ tokens had the highest FDR scores, indicating they play a crucial role in differentiating ASCs, followed by VERB and DET tokens. SUBJ, CLS, and SEP tokens did not show significant FDR scores. Our study underscores the complex, layered processing of linguistic constructions in BERT, revealing both similarities and differences compared to recurrent models like LSTMs. Future research will compare these computational findings with neuroimaging data during continuous speech perception to better understand the neural correlates of ASC processing. This research demonstrates the potential of both recurrent and transformer-based neural language models to mirror linguistic processing in the human brain, offering valuable insights into the computational and neural mechanisms underlying language understanding.
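
To make the pipeline in the description above concrete, here is a minimal sketch; it is not the authors' code. It assumes the HuggingFace transformers library with bert-base-uncased, substitutes scikit-learn's MLPClassifier for the paper's feedforward probes, uses a handful of toy sentences in place of the 2,000 GPT-4-generated ones, and implements a GDV-style score (z-scored, 0.5-scaled data; mean intra-class minus mean inter-class distance, divided by the square root of the dimensionality) following the definition in the GDV literature. The two-class FDR helper is likewise illustrative, since the abstract does not spell out the multi-class variant used.

```python
# Illustrative sketch only (assumptions noted above); not the authors' implementation.
import numpy as np
import torch
from scipy.spatial.distance import cdist, pdist
from sklearn.neural_network import MLPClassifier
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

def cls_embeddings_per_layer(sentences):
    """CLS vectors per sentence, shape (13, n_sentences, 768):
    the embedding layer plus BERT's 12 transformer layers."""
    per_sentence = []
    with torch.no_grad():
        for s in sentences:
            hs = model(**tokenizer(s, return_tensors="pt")).hidden_states
            per_sentence.append(torch.stack(hs)[:, 0, 0, :])  # CLS = position 0
    return torch.stack(per_sentence, dim=1).numpy()

def gdv(points, labels):
    """GDV-style clustering score (more negative = better-separated classes):
    z-score each dimension, scale by 0.5, then take mean intra-class distance
    minus mean inter-class distance, normalized by sqrt(dimensionality)."""
    z = 0.5 * (points - points.mean(0)) / (points.std(0) + 1e-8)
    cs = np.unique(labels)
    intra = np.mean([pdist(z[labels == c]).mean() for c in cs])
    inter = np.mean([cdist(z[labels == a], z[labels == b]).mean()
                     for i, a in enumerate(cs) for b in cs[i + 1:]])
    return (intra - inter) / np.sqrt(points.shape[1])

def fisher_discriminant_ratio(x_a, x_b):
    """Two-class FDR for a scalar feature (e.g. one attention weight):
    squared mean difference over the sum of the class variances."""
    return (x_a.mean() - x_b.mean()) ** 2 / (x_a.var() + x_b.var() + 1e-8)

# Stand-in sentences, two per ASC type (the study used 500 per type).
sentences = [
    "The cat chased the mouse.", "The boy kicked the ball.",                   # transitive
    "She gave him a book.", "He sent her a letter.",                           # ditransitive
    "He pushed the cart into the barn.", "She threw the keys onto the roof.",  # caused-motion
    "She painted the wall red.", "He hammered the metal flat.",                # resultative
]
labels = np.repeat(np.arange(4), 2)

emb = cls_embeddings_per_layer(sentences)
for layer in range(1, emb.shape[0]):  # transformer layers 1..12
    X = emb[layer]
    probe = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
    probe.fit(X, labels)  # trivial on 8 sentences; shown only to fix the pipeline shape
    print(f"layer {layer:2d}  GDV {gdv(X, labels):+.3f}  "
          f"train acc {probe.score(X, labels):.2f}")
```

A real run would evaluate the probe on held-out sentences rather than its training set, and would request output_attentions=True from the model to obtain the attention weights that an FDR-style analysis operates on; both are omitted here to keep the sketch short.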
format Article
id doaj-art-d26624088675461ebc8f543976cd6a97
institution Kabale University
issn 2624-8212
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Artificial Intelligence
spelling doaj-art-d26624088675461ebc8f543976cd6a97
2025-01-31T06:39:58Z
eng
Frontiers Media S.A.
Frontiers in Artificial Intelligence
2624-8212
2025-01-01
Volume 8, article 1477246, doi: 10.3389/frai.2025.1477246
Analysis of argument structure constructions in the large language model BERT
Pegah Ramezani: Department of English and American Studies, University of Erlangen-Nuremberg, Erlangen, Germany; Pattern Recognition Lab, Cognitive Computational Neuroscience Group, University of Erlangen-Nuremberg, Erlangen, Germany
Achim Schilling: Pattern Recognition Lab, Cognitive Computational Neuroscience Group, University of Erlangen-Nuremberg, Erlangen, Germany; Neuroscience Lab, University Hospital Erlangen, Erlangen, Germany
Patrick Krauss: Pattern Recognition Lab, Cognitive Computational Neuroscience Group, University of Erlangen-Nuremberg, Erlangen, Germany; Neuroscience Lab, University Hospital Erlangen, Erlangen, Germany
(Abstract identical to the description field above.)
https://www.frontiersin.org/articles/10.3389/frai.2025.1477246/full
argument structure constructions; linguistic constructions (CXs); large language models (LLMs); BERT; sentence representation; computational linguistics
spellingShingle Pegah Ramezani
Achim Schilling
Patrick Krauss
Analysis of argument structure constructions in the large language model BERT
Frontiers in Artificial Intelligence
argument structure constructions
linguistic constructions (CXs)
large language models (LLMs)
BERT
sentence representation
computational linguistics
title Analysis of argument structure constructions in the large language model BERT
title_full Analysis of argument structure constructions in the large language model BERT
title_fullStr Analysis of argument structure constructions in the large language model BERT
title_full_unstemmed Analysis of argument structure constructions in the large language model BERT
title_short Analysis of argument structure constructions in the large language model BERT
title_sort analysis of argument structure constructions in the large language model bert
topic argument structure constructions
linguistic constructions (CXs)
large language models (LLMs)
BERT
sentence representation
computational linguistics
url https://www.frontiersin.org/articles/10.3389/frai.2025.1477246/full
work_keys_str_mv AT pegahramezani analysisofargumentstructureconstructionsinthelargelanguagemodelbert
AT achimschilling analysisofargumentstructureconstructionsinthelargelanguagemodelbert
AT patrickkrauss analysisofargumentstructureconstructionsinthelargelanguagemodelbert