Expression‐based machine learning models for predicting plant tissue identity
Abstract Premise The selection of Arabidopsis as a model organism played a pivotal role in advancing genomic science. The competing frameworks to select an agricultural‐ or ecological‐based model species were rejected, in favor of building knowledge in a species that would facilitate genome‐enabled...
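Main Authors: | Sourabh Palande, Jeremy Arsenault, Patricia Basurto‐Lozada, Andrew Bleich, Brianna N. I. Brown, Sophia F. Buysse, Noelle A. Connors, Sikta Das Adhikari, Kara C. Dobson, Francisco Xavier Guerra‐Castillo, Maria F. Guerrero‐Carrillo, Sophia Harlow, Héctor Herrera‐Orozco, Asia T. Hightower, Paulo Izquierdo, MacKenzie Jacobs, Nicholas A. Johnson, Wendy Leuenberger, Alessandro Lopez‐Hernandez, Alicia Luckie‐Duque, Camila Martínez‐Avila, Eddy J. Mendoza‐Galindo, David Cruz Plancarte, Jenny M. Schuster, Harry Shomer, Sidney C. Sitar, Anne K. Steensma, Joanne Elise Thomson, Damián Villaseñor‐Amador, Robin Waterman, Brandon M. Webster, Madison Whyte, Sofía Zorilla‐Azcué, Beronda L. Montgomery, Aman Y. Husbands, Arjun Krishnan, Sarah Percival, Elizabeth Munch, Robert VanBuren, Daniel H. Chitwood, Alejandra Rougon‐Cardoso |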
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2025-01-01 |
Series: | Applications in Plant Sciences |
Subjects: | Arabidopsis; flowering plants; gene expression; machine learning; model species; tissue identity |
Online Access: | https://doi.org/10.1002/aps3.11621 |
_version_ | 1832542888975663104 |
---|---|
author | Sourabh Palande, Jeremy Arsenault, Patricia Basurto‐Lozada, Andrew Bleich, Brianna N. I. Brown, Sophia F. Buysse, Noelle A. Connors, Sikta Das Adhikari, Kara C. Dobson, Francisco Xavier Guerra‐Castillo, Maria F. Guerrero‐Carrillo, Sophia Harlow, Héctor Herrera‐Orozco, Asia T. Hightower, Paulo Izquierdo, MacKenzie Jacobs, Nicholas A. Johnson, Wendy Leuenberger, Alessandro Lopez‐Hernandez, Alicia Luckie‐Duque, Camila Martínez‐Avila, Eddy J. Mendoza‐Galindo, David Cruz Plancarte, Jenny M. Schuster, Harry Shomer, Sidney C. Sitar, Anne K. Steensma, Joanne Elise Thomson, Damián Villaseñor‐Amador, Robin Waterman, Brandon M. Webster, Madison Whyte, Sofía Zorilla‐Azcué, Beronda L. Montgomery, Aman Y. Husbands, Arjun Krishnan, Sarah Percival, Elizabeth Munch, Robert VanBuren, Daniel H. Chitwood, Alejandra Rougon‐Cardoso |
author_sort | Sourabh Palande |
collection | DOAJ |
description | Abstract. Premise: The selection of Arabidopsis as a model organism played a pivotal role in advancing genomic science. The competing frameworks to select an agricultural‐ or ecological‐based model species were rejected, in favor of building knowledge in a species that would facilitate genome‐enabled research. Methods: Here, we examine the ability of models based on Arabidopsis gene expression data to predict tissue identity in other flowering plants. Comparing different machine learning algorithms, models trained and tested on Arabidopsis data achieved near perfect precision and recall values, whereas when tissue identity is predicted across the flowering plants using models trained on Arabidopsis data, precision values range from 0.69 to 0.74 and recall from 0.54 to 0.64. Results: The identity of belowground tissue can be predicted more accurately than other tissue types, and the ability to predict tissue identity is not correlated with phylogenetic distance from Arabidopsis. k‐nearest neighbors is the most successful algorithm, suggesting that gene expression signatures, rather than marker genes, are more valuable to create models for tissue and cell type prediction in plants. Discussion: Our data‐driven results highlight that the assertion that knowledge from Arabidopsis is translatable to other plants is not always true. Considering the current landscape of abundant sequencing data, we should reevaluate the scientific emphasis on Arabidopsis and prioritize plant diversity. |
format | Article |
id | doaj-art-445b8cf3380b4722a063fff37af1dfb6 |
institution | Kabale University |
issn | 2168-0450 |
language | English |
publishDate | 2025-01-01 |
publisher | Wiley |
record_format | Article |
series | Applications in Plant Sciences |
spelling | doaj-art-445b8cf3380b4722a063fff37af1dfb62025-02-03T12:21:34ZengWileyApplications in Plant Sciences2168-04502025-01-01131n/an/a10.1002/aps3.11621Expression‐based machine learning models for predicting plant tissue identitySourabh Palande0Jeremy Arsenault1Patricia Basurto‐Lozada2Andrew Bleich3Brianna N. I. Brown4Sophia F. Buysse5Noelle A. Connors6Sikta Das Adhikari7Kara C. Dobson8Francisco Xavier Guerra‐Castillo9Maria F. Guerrero‐Carrillo10Sophia Harlow11Héctor Herrera‐Orozco12Asia T. Hightower13Paulo Izquierdo14MacKenzie Jacobs15Nicholas A. Johnson16Wendy Leuenberger17Alessandro Lopez‐Hernandez18Alicia Luckie‐Duque19Camila Martínez‐Avila20Eddy J. Mendoza‐Galindo21David Cruz Plancarte22Jenny M. Schuster23Harry Shomer24Sidney C. Sitar25Anne K. Steensma26Joanne Elise Thomson27Damián Villaseñor‐Amador28Robin Waterman29Brandon M. Webster30Madison Whyte31Sofía Zorilla‐Azcué32Beronda L. Montgomery33Aman Y. Husbands34Arjun Krishnan35Sarah Percival36Elizabeth Munch37Robert VanBuren38Daniel H. 
Chitwood39Alejandra Rougon‐Cardoso40Department of Computational Mathematics, Science and Engineering Michigan State University East Lansing Michigan USADepartment of Computer Science and Engineering Michigan State University East Lansing Michigan USALaboratorio Internacional de Investigación sobre el Genoma Humano (LIIGH) Universidad Nacional Autónoma de México Juriquilla Querétaro MexicoDepartment of Plant Biology Michigan State University East Lansing Michigan USADepartment of Plant Biology Michigan State University East Lansing Michigan USADepartment of Plant Biology Michigan State University East Lansing Michigan USADepartment of Horticulture Michigan State University East Lansing Michigan USADepartment of Computational Mathematics, Science and Engineering Michigan State University East Lansing Michigan USAEcology, Evolution, and Behavior Program Michigan State University East Lansing Michigan USAUnidad de Investigación Médica en Inmunología e Infectología Instituto Mexicano del Seguro Social Ciudad de México MexicoLaboratory of Agrigenomic Sciences, Escuela Nacional de Estudios Superiores Unidad León Universidad Nacional Autónoma de México León Guanajuato MexicoDepartment of Horticulture Michigan State University East Lansing Michigan USAPosgrado en Ciencias Biológicas Universidad Nacional Autónoma de México Ciudad de México MexicoDepartment of Plant Biology Michigan State University East Lansing Michigan USADepartment of Plant, Soil, and Microbial Sciences Michigan State University East Lansing Michigan USADepartment of Biochemistry and Molecular Biology Michigan State University East Lansing Michigan USAEcology, Evolution, and Behavior Program Michigan State University East Lansing Michigan USAEcology, Evolution, and Behavior Program Michigan State University East Lansing Michigan USALaboratorio Internacional de Investigación sobre el Genoma Humano (LIIGH) Universidad Nacional Autónoma de México Juriquilla Querétaro MexicoLaboratory of Agrigenomic Sciences, 
Escuela Nacional de Estudios Superiores Unidad León Universidad Nacional Autónoma de México León Guanajuato MexicoColección Nacional de Aves, Posgrado en Ciencias Biológicas, Instituto de Biología Universidad Nacional Autónoma de México Ciudad de México MexicoLaboratory of Agrigenomic Sciences, Escuela Nacional de Estudios Superiores Unidad León Universidad Nacional Autónoma de México León Guanajuato MexicoDepartamento de Botánica, Posgrado en Ciencias Biológicas, Instituto de Biología Universidad Nacional Autónoma de México Ciudad de México MexicoMolecular Plant Sciences Program Michigan State University East Lansing Michigan USADepartment of Computer Science and Engineering Michigan State University East Lansing Michigan USADepartment of Plant, Soil, and Microbial Sciences Michigan State University East Lansing Michigan USADepartment of Plant Biology Michigan State University East Lansing Michigan USAMolecular Plant Sciences Program Michigan State University East Lansing Michigan USAPrograma de Posgrado en Ciencias Biológicas, Facultad de Ciencias Universidad Nacional Autónoma de México Ciudad de México MexicoDepartment of Plant Biology Michigan State University East Lansing Michigan USADepartment of Plant Biology Michigan State University East Lansing Michigan USADepartment of Plant, Soil, and Microbial Sciences Michigan State University East Lansing Michigan USAPrograma de Posgrado en Ciencias Biológicas, Escuela Nacional de Estudios Superiores (ENES) Unidad Morelia, Universidad Nacional Autónoma de México Morelia Michoacán MexicoDepartment of Biology Grinnell College Grinnell Iowa USADepartment of Biology University of Pennsylvania Philadelphia Pennsylvania USADepartment of Biomedical Informatics, Center for Health AI University of Colorado Anschutz Medical Campus Aurora Colorado USADepartment of Computational Mathematics, Science and Engineering Michigan State University East Lansing Michigan USADepartment of Computational Mathematics, Science and Engineering 
Michigan State University East Lansing Michigan USADepartment of Horticulture Michigan State University East Lansing Michigan USADepartment of Computational Mathematics, Science and Engineering Michigan State University East Lansing Michigan USALaboratory of Agrigenomic Sciences, Escuela Nacional de Estudios Superiores Unidad León Universidad Nacional Autónoma de México León Guanajuato MexicoAbstract Premise The selection of Arabidopsis as a model organism played a pivotal role in advancing genomic science. The competing frameworks to select an agricultural‐ or ecological‐based model species were rejected, in favor of building knowledge in a species that would facilitate genome‐enabled research. Methods Here, we examine the ability of models based on Arabidopsis gene expression data to predict tissue identity in other flowering plants. Comparing different machine learning algorithms, models trained and tested on Arabidopsis data achieved near perfect precision and recall values, whereas when tissue identity is predicted across the flowering plants using models trained on Arabidopsis data, precision values range from 0.69 to 0.74 and recall from 0.54 to 0.64. Results The identity of belowground tissue can be predicted more accurately than other tissue types, and the ability to predict tissue identity is not correlated with phylogenetic distance from Arabidopsis. k‐nearest neighbors is the most successful algorithm, suggesting that gene expression signatures, rather than marker genes, are more valuable to create models for tissue and cell type prediction in plants. Discussion Our data‐driven results highlight that the assertion that knowledge from Arabidopsis is translatable to other plants is not always true. 
Considering the current landscape of abundant sequencing data, we should reevaluate the scientific emphasis on Arabidopsis and prioritize plant diversity.https://doi.org/10.1002/aps3.11621Arabidopsisflowering plantsgene expressionmachine learningmodel speciestissue identity |
title | Expression‐based machine learning models for predicting plant tissue identity |
topic | Arabidopsis; flowering plants; gene expression; machine learning; model species; tissue identity |
url | https://doi.org/10.1002/aps3.11621 |