Showing 1 - 20 results of 426 for search 'per data language', query time: 0.11s
  1. 1

    PerGOLD: Identification of offensive language in Persian tweets: leveraging crowdsourcing by Fatemeh Jafarinejad, Marziea Rahimi, Maryam Khodabakhsh, Seyedehfatemeh Karimi

    Published 2025-04-01
    “…In this paper, we introduce PerGOLD, a new Persian General Offensive Language Dataset, in which we use an event-based data collection methodology to detect offensive language in Persian Twitter. …”
    Article
  2. 2
  3. 3
  4. 4
  5. 5

    Automated generation of discharge summaries: leveraging large language models with clinical data by Matthias Ganzinger, Nicola Kunz, Pascal Fuchs, Cornelia K. Lyu, Martin Loos, Martin Dugas, Thomas M. Pausch

    Published 2025-05-01
    “…Abstract: This study explores the use of open-source large language models (LLMs) to automate generation of German discharge summaries from structured clinical data. …”
    Article
  6. 6
  7. 7
  8. 8

    Sign language detection dataset: A resource for AI-based recognition systems (Mendeley Data) by Bindu Garg, Manisha Kasar, Priyanka Paygude, Amol Dhumane, Srinivas Ambala, Jitendra Rajpurohit, Abhay Sharma, Vidula Meshram, Amber Vats, Achyut Kashyap

    Published 2025-08-01
    “…For diversity and realism, three participants were involved in data collection, each providing 1000 images per sign, resulting in a rich and diverse dataset. …”
    Article
  9. 9

    ONUBAD: A comprehensive dataset for automated conversion of Bangla regional dialects into standard Bengali dialect (Mendeley Data) by Nusrat Sultana, Rumana Yasmin, Bijon Mallik, Mohammad Shorif Uddin

    Published 2025-02-01
    “…Despite significant research on the Bangla language in Natural Language Processing (NLP), there remains a notable resource deficit for its diverse regional dialects, such as those spoken in Chittagong, Sylhet, and Barisal. …”
    Article
  10. 10
  11. 11

    The Use of Large Language Models for the Analysis of Professional Competencies in the Regional Labor Market of the Republic of Belarus by I. N. Kalinouskaya

    Published 2025-06-01
    “…An integrated approach to the analysis of professional competencies in the Belarusian labor market using large language models is presented. A methodology is proposed that includes data collection using web scrapers, preliminary processing using a multi-level cleaning system and normalization of text information, classification and analysis of competencies based on interaction with large language models. …”
    Article
  12. 12

    Advancing automatic speech recognition for low-resource Ghanaian languages: Audio datasets for Akan, Ewe, Dagbani, Dagaare, and Ikposo (Science Data Bank) by Isaac Wiafe, Jamal-Deen Abdulai, Akon Obu Ekpezu, Raynard Dodzi Helegah, Elikem Doe Atsakpo, Charles Nutrokpor, Fiifi Baffoe Payin Winful, Kafui Kwashie Solaga

    Published 2025-08-01
    “…To enhance the dataset’s utility in ASR and linguistic research, 10% of the audio recordings for each language were randomly selected and transcribed, resulting in approximately 100 h of transcription per language. …”
    Article
  13. 13
  14. 14

    A curated crowdsourced dataset of Luganda and Swahili speech for text-to-speech synthesis (Mendeley Data) by Andrew Katumba, Sulaiman Kagumire, Joyce Nakatumba-Nabende, John Quinn, Sudi Murindanyi

    Published 2025-10-01
    “…The final dataset contains over 19 h of Luganda and 15 h of Kiswahili recordings from six female speakers per language, each paired with a text transcription. …”
    Article
  15. 15

    A dataset for classifying phrases and sentences into statements, questions, or exclamations based on sound pitch (Mendeley Data) by Ayub Othman Abdulrahman, Shanga Ismail Othman, Gazo Badran Yasin, Meer Salam Ali

    Published 2025-08-01
    “…The dataset contains equal representation from all three classes, about 4200 samples per class, and metadata such as speaker gender, age group, and sentence identifiers. The original audio files, alongside resources like Mel-Frequency Cepstral Coefficients (MFCCs) and waveform visualizations, can be found on Mendeley Data. …”
    Article
  16. 16

    A Systematic Review of Cost-Effectiveness Studies Reporting Cost-per-DALY Averted by Peter J Neumann, Teja Thorat, Yue Zhong, Jordan Anderson, Megan Farquhar, Mark Salem, Eileen Sandberg, Cayla J Saret, Colby Wilkinson, Joshua T Cohen

    Published 2016-01-01
    “…Methods: We conducted a systematic review of cost-effectiveness studies reporting cost-per-DALY averted from 2000 through 2015. We developed the Global Health Cost-Effectiveness Analysis (GHCEA) Registry, a repository of English-language cost-per-DALY averted studies indexed in PubMed. …”
    Article
  17. 17

    StegGPT: A Novel Foundation-Model-Based Character-Level Linguistic Steganography Method Utilizing Large Language Models by Omer Farooq Ahmed Adeeb, Seyed Jahanshah Kabudian

    Published 2025-01-01
    “…This study addresses the critical need for robust safeguarding of sensitive data stored on personal computing devices and during data transmissions, alongside the increasing need for secure digital interactions. …”
    Article
  18. 18

    English- and Spanish-speaking U.S. adults’ perceptions of the most common reasons for abortion: a study of open-ended data before and after Dobbs v. Jackson by Lucrecia Mena-Meléndez, Xiana Bueno, Kyla M. Cary, Nana Amma Asamoah, Brandon L. Crawford, Ronna C. Turner, Kristen N. Jozkowski

    Published 2025-07-01
    “…Methods: We analyzed open-ended data from two waves of a 2022 longitudinal survey (n = 681 participants; n = 2,043 responses per wave; n = 4,086 total responses) collected before and after the Dobbs decision in English and Spanish via Ipsos’s KnowledgePanel®. …”
    Article
  19. 19
  20. 20