A dataset dedicated to the training of large-language models for agronomic management practices and production in Norwegian agriculture


Bibliographic Details
Main Authors: Olena Bugaiova, Kristian Nikolai Jæger Hansen
Format: Article
Language: English
Published: Elsevier 2025-04-01
Series: Data in Brief
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340925000587
Description
Summary: This dataset covers agricultural management practices and production in Norway and was collected from the websites Nibio.no, Plantevernleksikonet.no, and nlr.no. All of the data is in Norwegian. It is stored as JSON files (raw format) and covers topics pertinent to Norwegian agriculture, such as crop rotation, soil health, plant protection, and sustainable farming techniques. The data was collected with three Python scripts, each adapted to one of the websites. The cleaned text data is valuable for training or evaluating natural language processing (NLP) models in an experimental context in Norway, or for adapting large language models (LLMs) to the domain of Norwegian agriculture in the Norwegian language.
ISSN: 2352-3409
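
The summary states that the data is distributed as JSON files, but this record does not describe their schema. As a minimal sketch under that limitation, the Python below collects every non-empty string from every JSON file in a directory without assuming any field names. The function names `iter_strings` and `load_corpus` and the directory argument are hypothetical and are not taken from the dataset or its collection scripts.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every non-empty string found anywhere in a parsed JSON value."""
    if isinstance(node, str):
        text = node.strip()
        if text:
            yield text
    elif isinstance(node, dict):
        # Keys are skipped; only values are treated as text (an assumption).
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)
    # Numbers, booleans, and null carry no text and are ignored.


def load_corpus(data_dir: str) -> list[str]:
    """Read every *.json file under data_dir and return the strings found."""
    texts: list[str] = []
    for path in sorted(Path(data_dir).rglob("*.json")):
        # UTF-8 is assumed so that Norwegian characters (æ, ø, å) survive.
        with path.open(encoding="utf-8") as handle:
            texts.extend(iter_strings(json.load(handle)))
    return texts
```

Once the actual structure of the files is known, the generic traversal would normally be replaced by direct access to the relevant text fields, so that URLs, dates, and other metadata strings are not mixed into the training text.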