On Assessing the Performance of LLMs for Target-Level Sentiment Analysis in Financial News Headlines

The importance of sentiment analysis in the rapidly evolving financial markets is widely recognized for its ability to interpret market trends and inform investment decisions. This study delves into the target-level financial sentiment analysis (TLFSA) of news headlines related to stock. The study c...

Bibliographic Details
Main Authors: Iftikhar Muhammad, Marco Rospocher
Format: Article
Language: English
Published: MDPI AG 2025-01-01
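Series: Algorithms
Subjects: financial sector; target-level sentiment analysis; traditional methods; large language models; zero-shot learning
Online Access: https://www.mdpi.com/1999-4893/18/1/46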
_version_ 1832589411126083584
author Iftikhar Muhammad
Marco Rospocher
collection DOAJ
description The importance of sentiment analysis in the rapidly evolving financial markets is widely recognized for its ability to interpret market trends and inform investment decisions. This study delves into the target-level financial sentiment analysis (TLFSA) of news headlines related to stock. The study compares the performance in the TLFSA task of various sentiment analysis techniques, including rule-based models (VADER), fine-tuned transformer-based models (DistilFinRoBERTa and Deberta-v3-base-absa-v1.1) as well as zero-shot large language models (ChatGPT and Gemini). The dataset utilized for this analysis, a novel contribution of this research, comprises 1476 manually annotated Bloomberg headlines and is made publicly available (due to copyright restrictions, only the URLs of Bloomberg headlines with the manual annotations are provided; however, these URLs can be used with a Bloomberg terminal to reconstruct the complete dataset) to encourage future research on this subject. The results indicate that the fine-tuned Deberta-v3-base-absa-v1.1 model performs better across all evaluation metrics than other evaluated models in TLFSA. However, LLMs such as ChatGPT-4, ChatGPT-4o, and Gemini 1.5 Pro provide similar performance levels without the need for task-specific fine-tuning or additional training. The study contributes to assessing the performance of LLMs for financial sentiment analysis, providing useful insights into their possible application in the financial domain.
format Article
id doaj-art-7eebeb23f0b745dcae7d81185e7e7f18
institution Kabale University
issn 1999-4893
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Algorithms
spelling doaj-art-7eebeb23f0b745dcae7d81185e7e7f18
2025-01-24T13:17:36Z
eng
MDPI AG
Algorithms
1999-4893
2025-01-01
18
1
46
10.3390/a18010046
On Assessing the Performance of LLMs for Target-Level Sentiment Analysis in Financial News Headlines
Iftikhar Muhammad (Department of Economics, University of Verona, 37129 Verona, Italy)
Marco Rospocher (Department of Foreign Languages and Literatures, University of Verona, 37129 Verona, Italy)
https://www.mdpi.com/1999-4893/18/1/46
financial sector
target-level sentiment analysis
traditional methods
large language models
zero-shot learning
title On Assessing the Performance of LLMs for Target-Level Sentiment Analysis in Financial News Headlines
topic financial sector
target-level sentiment analysis
traditional methods
large language models
zero-shot learning
url https://www.mdpi.com/1999-4893/18/1/46