Benchmark for Evaluation of Danish Clinical Word Embeddings

In natural language processing, benchmarks are used to track progress and identify useful models. Currently, no benchmark for Danish clinical word embeddings exists. This paper describes the development of a Danish benchmark for clinical word embeddings. The clinical benchmark consists of ten da...

Bibliographic Details
Main Authors: Martin Sundahl Laursen, Jannik Skyttegaard Pedersen, Pernille Just Vinholt, Rasmus Søgaard Hansen, Thiusius Rajeeth Savarimuthu
Format: Article
Language: English
Published: Linköping University Electronic Press 2023-03-01
Series: Northern European Journal of Language Technology
Online Access: https://nejlt.ep.liu.se/article/view/4132
_version_ 1832591229598040064
author Martin Sundahl Laursen
Jannik Skyttegaard Pedersen
Pernille Just Vinholt
Rasmus Søgaard Hansen
Thiusius Rajeeth Savarimuthu
collection DOAJ
description In natural language processing, benchmarks are used to track progress and identify useful models. Currently, no benchmark for Danish clinical word embeddings exists. This paper describes the development of a Danish benchmark for clinical word embeddings. The clinical benchmark consists of ten datasets: eight intrinsic and two extrinsic. Moreover, we evaluate word embeddings trained on text from the clinical domain, general practitioner domain and general domain on the established benchmark. All the intrinsic tasks of the benchmark are publicly available.
format Article
id doaj-art-50822792d2d34dd6bce3936aa9066d3c
institution Kabale University
issn 2000-1533
language English
publishDate 2023-03-01
publisher Linköping University Electronic Press
record_format Article
series Northern European Journal of Language Technology
spelling doaj-art-50822792d2d34dd6bce3936aa9066d3c
2025-01-22T15:25:16Z
eng
Linköping University Electronic Press
Northern European Journal of Language Technology
2000-1533
2023-03-01
9
1
10.3384/nejlt.2000-1533.2023.4132
Benchmark for Evaluation of Danish Clinical Word Embeddings
Martin Sundahl Laursen (University of Southern Denmark)
Jannik Skyttegaard Pedersen (University of Southern Denmark)
Pernille Just Vinholt (Odense University Hospital)
Rasmus Søgaard Hansen (Odense University Hospital)
Thiusius Rajeeth Savarimuthu (University of Southern Denmark)
title Benchmark for Evaluation of Danish Clinical Word Embeddings
url https://nejlt.ep.liu.se/article/view/4132