Good practice for assignment of breeds and populations—a review

To organize the methodologies found in recent papers on the development of genomic breed/population assignment tools, this review highlights good practice for developing such tools. After an appropriate quality control of markers and the building of a repre...

Bibliographic Details
Main Authors: H. Wilmot, N. Gengler
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-02-01
Series: Frontiers in Animal Science
Subjects: breed composition; classification; clustering; admixture; purebred; crossbred
Online Access: https://www.frontiersin.org/articles/10.3389/fanim.2025.1508081/full
collection DOAJ
description To organize the methodologies found in recent papers on the development of genomic breed/population assignment tools, this review highlights good practice for developing such tools. After appropriate quality control of the markers and the building of a representative reference population, a genomic breed/population assignment tool can be developed in three main steps: 1) the selection of discriminant markers; 2) the development of a model that accurately assigns animals to their breed/population of origin, the so-called classification step; and 3) the validation of the developed model on new animals to evaluate its performance in real conditions. The first step can be skipped when a mid- or low-density chip is used, depending on the methodology used for assignment. When selection of single-nucleotide polymorphisms (SNPs) is necessary, we advise using one-stage methodologies and defining a threshold for this selection. Machine learning can then be used to develop the model itself, based on the selected or available markers. To tune the model, we recommend cross-validation. Finally, new animals, not used in the first two steps, should be used to evaluate the performance of the model (e.g., with balanced accuracy and probabilities), including its computation time.
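The three steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method of the review: the genotypes are simulated, the marker filter (difference in mean allele count between breeds) and the nearest-centroid classifier are simple stand-ins chosen for brevity, and every function name and threshold value is hypothetical. The sketch shows the workflow only: marker selection with an explicit threshold, cross-validation to tune that threshold, and a final check on animals that were never used for selection or training.

```python
import random
from statistics import mean

def simulate(n_per_breed, freqs, rng):
    """Simulated genotypes (allele counts 0/1/2 per SNP) and breed labels."""
    X, y = [], []
    for breed, p in freqs.items():
        for _ in range(n_per_breed):
            X.append([(rng.random() < f) + (rng.random() < f) for f in p])
            y.append(breed)
    return X, y

def breed_means(X, y, snps):
    """Mean allele count per breed for the given SNP indices."""
    return {b: [mean([x[j] for x, lab in zip(X, y) if lab == b]) for j in snps]
            for b in sorted(set(y))}

def select_snps(X, y, threshold):
    """Step 1 (stand-in filter): keep SNPs whose between-breed gap exceeds threshold."""
    all_snps = list(range(len(X[0])))
    cols = list(zip(*breed_means(X, y, all_snps).values()))
    return [j for j in all_snps if max(cols[j]) - min(cols[j]) > threshold]

def predict(centroids, x, snps):
    """Step 2 (stand-in model): assign to the breed with the nearest centroid."""
    def dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(snps, c))
    return min(centroids, key=lambda b: dist(centroids[b]))

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-breed recalls."""
    recalls = []
    for b in sorted(set(y_true)):
        idx = [i for i, lab in enumerate(y_true) if lab == b]
        recalls.append(sum(y_pred[i] == b for i in idx) / len(idx))
    return mean(recalls)

def cross_validate(X, y, threshold, k=5):
    """k-fold CV; SNPs are re-selected inside each fold to avoid leakage."""
    scores = []
    for f in range(k):
        fold = list(range(f, len(X), k))
        held = set(fold)
        X_tr = [x for i, x in enumerate(X) if i not in held]
        y_tr = [lab for i, lab in enumerate(y) if i not in held]
        snps = select_snps(X_tr, y_tr, threshold)
        if not snps:  # nothing selected: count the fold as a failure
            scores.append(0.0)
            continue
        cent = breed_means(X_tr, y_tr, snps)
        pred = [predict(cent, X[i], snps) for i in fold]
        scores.append(balanced_accuracy([y[i] for i in fold], pred))
    return mean(scores)

rng = random.Random(42)
n_snps = 200
base = [rng.uniform(0.2, 0.8) for _ in range(n_snps)]
freqs = {"A": base[:], "B": base[:]}
for j in range(20):  # assumption: the first 20 SNPs differ strongly between breeds
    freqs["A"][j], freqs["B"][j] = 0.9, 0.1

X_ref, y_ref = simulate(40, freqs, rng)  # reference population
X_new, y_new = simulate(20, freqs, rng)  # new animals, used only in step 3

# Tune the selection threshold by cross-validation on the reference population.
best = max([0.2, 0.5, 1.0], key=lambda t: cross_validate(X_ref, y_ref, t))
snps = select_snps(X_ref, y_ref, best)
model = breed_means(X_ref, y_ref, snps)

# Step 3: validation on animals not used for selection or training.
pred = [predict(model, x, snps) for x in X_new]
score = balanced_accuracy(y_new, pred)
print(f"threshold={best}, SNPs kept={len(snps)}, balanced accuracy={score:.2f}")
```

Re-selecting the markers inside each cross-validation fold keeps the held-out animals out of the selection step, which is the same reason the final evaluation uses animals absent from the first two steps.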
format Article
id doaj-art-edbfc2e5d0f044eab87819b3eb6e503d
institution Kabale University
issn 2673-6225
language English
publishDate 2025-02-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Animal Science
spelling doaj-art-edbfc2e5d0f044eab87819b3eb6e503d
2025-02-06T07:09:41Z
eng
Frontiers Media S.A.
Frontiers in Animal Science
ISSN 2673-6225
2025-02-01
Volume 6
DOI 10.3389/fanim.2025.1508081
Article 1508081
H. Wilmot: National Fund for Scientific Research (F.R.S.-FNRS), Brussels, Belgium; TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
N. Gengler: TERRA Teaching and Research Centre, Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
title Good practice for assignment of breeds and populations—a review
topic breed composition
classification
clustering
admixture
purebred
crossbred