-
1
PerGOLD: Identification of offensive language in Persian tweets: leveraging crowdsourcing
Published 2025-04-01
“…In this paper, we introduce PerGOLD, a new Persian General Offensive Language Dataset, in which we use an event-based data collection methodology to detect offensive language in Persian Twitter. …”
Article -
4
Are Natural Language Processing methods applicable to EPS forecasting in Poland?
Published 2025-02-01
Article -
5
Automated generation of discharge summaries: leveraging large language models with clinical data
Published 2025-05-01
“…Abstract: This study explores the use of open-source large language models (LLMs) to automate generation of German discharge summaries from structured clinical data. …”
Article -
6
UrduSER: A comprehensive dataset for speech emotion recognition in Urdu language
Mendeley Data
Published 2025-06-01
Article -
7
La frase scissa nell’insegnamento dell’italiano per gli studenti di scienze umanistiche
Published 2015-12-01
Article -
8
Sign language detection dataset: A resource for AI-based recognition systems
Mendeley Data
Published 2025-08-01
“…For diversity and realism, three participants were involved in data collection, each providing 1000 images per sign, resulting in a rich and diverse dataset. …”
Article -
9
ONUBAD: A comprehensive dataset for automated conversion of Bangla regional dialects into standard Bengali dialect
Mendeley Data
Published 2025-02-01
“…Despite significant research on the Bangla language in Natural Language Processing (NLP), there remains a notable resource deficit for its diverse regional dialects, such as those spoken in Chittagong, Sylhet, and Barisal. …”
Article -
10
Cognitive Computing with Large Language Models for Student Assessment Feedback
Published 2025-04-01
Article -
11
The Use of Large Language Models for the Analysis of Professional Competencies in the Regional Labor Market of the Republic of Belarus
Published 2025-06-01
“…An integrated approach to the analysis of professional competencies in the Belarusian labor market using large language models is presented. A methodology is proposed that includes data collection using web scrapers, preliminary processing using a multi-level cleaning system and normalization of text information, classification and analysis of competencies based on interaction with large language models. …”
Article -
12
Advancing automatic speech recognition for low-resource ghanaian languages: Audio datasets for Akan, Ewe, Dagbani, Dagaare, and Ikposo
Science Data Bank
Published 2025-08-01
“…To enhance the dataset’s utility in ASR and linguistic research, 10% of the audio recordings for each language were randomly selected and transcribed, resulting in approximately 100 h of transcription per language. …”
Article -
14
A curated crowdsourced dataset of Luganda and Swahili speech for text-to-speech synthesis
Mendeley Data
Published 2025-10-01
“…The final dataset contains over 19 h of Luganda and 15 h of Kiswahili recordings from six female speakers per language, each paired with a text transcription. …”
Article -
15
A dataset for classifying phrases and sentences into statements, questions, or exclamations based on sound pitch
Mendeley Data
Published 2025-08-01
“…The dataset contains equal representation from all three classes, about 4200 samples per class, and metadata such as speaker gender, age group, and sentence identifiers. The original audio files, alongside resources like Mel-Frequency Cepstral Coefficients (MFCCs) and waveform visualizations, can be found on Mendeley Data. …”
Article -
16
A Systematic Review of Cost-Effectiveness Studies Reporting Cost-per-DALY Averted.
Published 2016-01-01
“…Methods: We conducted a systematic review of cost-effectiveness studies reporting cost-per-DALY averted from 2000 through 2015. We developed the Global Health Cost-Effectiveness Analysis (GHCEA) Registry, a repository of English-language cost-per-DALY averted studies indexed in PubMed. …”
Article -
17
StegGPT: A Novel Foundation-Model-Based Character-Level Linguistic Steganography Method Utilizing Large Language Models
Published 2025-01-01
“…This study addresses the critical need for robust safeguarding of sensitive data stored on personal computing devices and during data transmissions, alongside the increasing need for secure digital interactions. …”
Article -
18
English- and Spanish-speaking U.S. adults’ perceptions of the most common reasons for abortion: a study of open-ended data before and after Dobbs v. Jackson
Published 2025-07-01
“…Methods: We analyzed open-ended data from two waves of a 2022 longitudinal survey (n = 681 participants; n = 2,043 responses per wave; n = 4,086 total responses) collected before and after the Dobbs decision in English and Spanish via Ipsos’s KnowledgePanel®. …”
Article -
19
Il controllo di autorità come sfida per la filologia. Frequently Asked Question
Published 2024-09-01
Article -
20