A pipeline for processing hyperspectral images, with a case of melanin-containing barley grains as an example

Analysis of hyperspectral images is of great interest in plant studies. Nowadays, this analysis is used more and more widely, so the development of hyperspectral image processing methods is an urgent task. This paper presents a hyperspectral image processing pipeline that includes: preprocessing, ba...

Bibliographic Details
Main Authors: I. D. Busov, M. A. Genaev, E. G. Komyshev, V. S. Koval, T. E. Zykova, A. Y. Glagoleva, D. A. Afonnikov
Format: Article
Language: English
Published: Siberian Branch of the Russian Academy of Sciences, Federal Research Center Institute of Cytology and Genetics, The Vavilov Society of Geneticists and Breeders 2024-07-01
Series: Вавиловский журнал генетики и селекции
Subjects: hyperspectral images; machine learning; statistical analysis; barley grains; pigment composition
Online Access: https://vavilov.elpub.ru/jour/article/view/4187
author I. D. Busov
M. A. Genaev
E. G. Komyshev
V. S. Koval
T. E. Zykova
A. Y. Glagoleva
D. A. Afonnikov
collection DOAJ
description Analysis of hyperspectral images is of great interest in plant studies. Nowadays, this analysis is used more and more widely, so the development of hyperspectral image processing methods is an urgent task. This paper presents a hyperspectral image processing pipeline that includes preprocessing, basic statistical analysis, visualization of a multichannel hyperspectral image, and solving classification and clustering problems using machine learning methods. The current version of the package implements the following methods: construction of a confidence interval of an arbitrary level for the difference of sample means; testing whether the intensity distributions of spectral lines are similar for two sets of hyperspectral images, using the Mann–Whitney U test and Pearson's chi-squared goodness-of-fit test; visualization in two-dimensional space using the dimensionality reduction methods PCA, ISOMAP and UMAP; classification using linear or ridge regression, random forest and CatBoost; clustering of samples using the EM algorithm. The software pipeline is implemented in Python using the Pandas, NumPy, OpenCV, SciPy, Sklearn, Umap, CatBoost and Plotly libraries. The source code is available at: https://github.com/igor2704/Hyperspectral_images. The pipeline was applied to identify melanin pigment in the shell of barley grains based on hyperspectral data. Visualization based on the PCA, UMAP and ISOMAP methods, as well as the use of clustering algorithms, showed that grain samples with and without pigmentation can be linearly separated with high accuracy based on hyperspectral data. The analysis revealed statistically significant differences in the distribution of median intensities between images of grains with and without pigmentation. Thus, it was demonstrated that hyperspectral images can be used to determine the presence or absence of melanin in barley grains with high accuracy. The flexible and convenient tool created in this work will significantly increase the efficiency of hyperspectral image analysis.
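The statistical and machine-learning steps named in the description can be illustrated with a minimal sketch. This is not the package's own API: the function names `mean_diff_ci` and `make_synthetic_spectra`, the synthetic "light" and "dark" spectra, the band index, and the use of a Welch-type interval are all assumptions made for illustration. The EM clustering step is represented here by scikit-learn's `GaussianMixture`, which is fitted with the EM algorithm.

```python
import numpy as np
from scipy import stats
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def mean_diff_ci(a, b, level=0.95):
    """Welch-type confidence interval for mean(a) - mean(b) (illustrative choice)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    va, vb = a.var(ddof=1) / a.size, b.var(ddof=1) / b.size
    se = np.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    half_width = stats.t.ppf(0.5 + level / 2, df) * se
    diff = a.mean() - b.mean()
    return diff - half_width, diff + half_width


def make_synthetic_spectra(n_per_class=40, n_bands=50, seed=0):
    """Hypothetical stand-in data: one median-intensity spectrum per image."""
    rng = np.random.default_rng(seed)
    base = np.linspace(0.2, 0.8, n_bands)
    light = base + rng.normal(0, 0.02, (n_per_class, n_bands))
    dark = 0.5 * base + rng.normal(0, 0.02, (n_per_class, n_bands))
    return light, dark


light, dark = make_synthetic_spectra()
band = 25  # arbitrary spectral band chosen for the example

# Confidence interval for the difference of sample means in one band
ci_low, ci_high = mean_diff_ci(light[:, band], dark[:, band], level=0.95)

# Mann-Whitney U test on the same band
u_stat, p_value = stats.mannwhitneyu(
    light[:, band], dark[:, band], alternative="two-sided"
)

# Two-dimensional embedding with PCA, then clustering with a Gaussian
# mixture model (fitted by the EM algorithm)
X = np.vstack([light, dark])
coords = PCA(n_components=2).fit_transform(X)
labels = GaussianMixture(n_components=2, random_state=0).fit_predict(coords)

print(f"95% CI for mean difference: ({ci_low:.3f}, {ci_high:.3f})")
print(f"Mann-Whitney p-value: {p_value:.2e}")
print("Cluster sizes:", np.bincount(labels))
```

With real data, each row of `X` would be replaced by a spectrum extracted from a segmented grain image; ISOMAP or UMAP could be substituted for PCA in the embedding step.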
format Article
id doaj-art-9e66e51ab9d04ebdb291e0fe935d8873
institution Kabale University
issn 2500-3259
language English
publishDate 2024-07-01
publisher Siberian Branch of the Russian Academy of Sciences, Federal Research Center Institute of Cytology and Genetics, The Vavilov Society of Geneticists and Breeders
record_format Article
series Вавиловский журнал генетики и селекции
spelling doaj-art-9e66e51ab9d04ebdb291e0fe935d8873
2025-02-01T09:58:13Z
eng
Siberian Branch of the Russian Academy of Sciences, Federal Research Center Institute of Cytology and Genetics, The Vavilov Society of Geneticists and Breeders
Вавиловский журнал генетики и селекции
2500-3259
2024-07-01
Volume 28, issue 4, pages 443-455
DOI 10.18699/vjgb-24-50
1479
A pipeline for processing hyperspectral images, with a case of melanin-containing barley grains as an example
I. D. Busov (Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University)
M. A. Genaev (Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University)
E. G. Komyshev (Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences)
V. S. Koval (Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences)
T. E. Zykova (Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University)
A. Y. Glagoleva (Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences)
D. A. Afonnikov (Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences; Novosibirsk State University)
https://vavilov.elpub.ru/jour/article/view/4187
hyperspectral images; machine learning; statistical analysis; barley grains; pigment composition
title A pipeline for processing hyperspectral images, with a case of melanin-containing barley grains as an example
topic hyperspectral images
machine learning
statistical analysis
barley grains
pigment composition
url https://vavilov.elpub.ru/jour/article/view/4187