X-Mapper: fast and accurate sequence alignment via gapped x-mers
Abstract: Sequence alignment is foundational to many bioinformatic analyses. Many aligners start by splitting sequences into contiguous, fixed-length seeds, called k-mers. Alignment is faster with longer, unique seeds, but more accurate with shorter seeds avoiding mutations. Here, we introduce X-Mapp...
Main Authors: | Jeffry M. Gaston, Eric J. Alm, An-Ni Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2025-01-01 |
Series: | Genome Biology |
Subjects: | Bioinformatics; Sequence alignment algorithms; K-mer; Microbial sequencing |
Online Access: | https://doi.org/10.1186/s13059-024-03473-7 |
_version_ | 1832571634454626304 |
---|---|
author | Jeffry M. Gaston, Eric J. Alm, An-Ni Zhang |
author_facet | Jeffry M. Gaston, Eric J. Alm, An-Ni Zhang |
author_sort | Jeffry M. Gaston |
collection | DOAJ |
description | Abstract: Sequence alignment is foundational to many bioinformatic analyses. Many aligners start by splitting sequences into contiguous, fixed-length seeds, called k-mers. Alignment is faster with longer, unique seeds, but more accurate with shorter seeds avoiding mutations. Here, we introduce X-Mapper, aiming to offer high speed and accuracy via dynamic-length seeds containing gaps, called gapped x-mers. We observe 11–24-fold fewer suboptimal alignments analyzing a human reference and 3–579-fold lower inconsistency across bacterial references than other aligners, improving on 53% and 30% of reads aligned to non-target strains and species, respectively. Other seed-based analysis algorithms might benefit from gapped x-mers too. |
format | Article |
id | doaj-art-1800fe219b634d5c96529b77404f556b |
institution | Kabale University |
issn | 1474-760X |
language | English |
publishDate | 2025-01-01 |
publisher | BMC |
record_format | Article |
series | Genome Biology |
spelling | doaj-art-1800fe219b634d5c96529b77404f556b; 2025-02-02T12:27:06Z; eng; BMC; Genome Biology; 1474-760X; 2025-01-01; 261127; 10.1186/s13059-024-03473-7; X-Mapper: fast and accurate sequence alignment via gapped x-mers; Jeffry M. Gaston [0]; Eric J. Alm [1]; An-Ni Zhang [2]; Google; Department of Biological Engineering, Massachusetts Institute of Technology; Department of Biological Engineering, Massachusetts Institute of Technology; Abstract: Sequence alignment is foundational to many bioinformatic analyses. Many aligners start by splitting sequences into contiguous, fixed-length seeds, called k-mers. Alignment is faster with longer, unique seeds, but more accurate with shorter seeds avoiding mutations. Here, we introduce X-Mapper, aiming to offer high speed and accuracy via dynamic-length seeds containing gaps, called gapped x-mers. We observe 11–24-fold fewer suboptimal alignments analyzing a human reference and 3–579-fold lower inconsistency across bacterial references than other aligners, improving on 53% and 30% of reads aligned to non-target strains and species, respectively. Other seed-based analysis algorithms might benefit from gapped x-mers too.; https://doi.org/10.1186/s13059-024-03473-7; Bioinformatics; Sequence alignment algorithms; K-mer; Microbial sequencing |
spellingShingle | Jeffry M. Gaston; Eric J. Alm; An-Ni Zhang; X-Mapper: fast and accurate sequence alignment via gapped x-mers; Genome Biology; Bioinformatics; Sequence alignment algorithms; K-mer; Microbial sequencing |
title | X-Mapper: fast and accurate sequence alignment via gapped x-mers |
title_sort | x mapper fast and accurate sequence alignment via gapped x mers |
topic | Bioinformatics; Sequence alignment algorithms; K-mer; Microbial sequencing |
url | https://doi.org/10.1186/s13059-024-03473-7 |
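
The abstract contrasts contiguous, fixed-length seeds (k-mers) with seeds that contain gaps. The short Python sketch below illustrates only that general idea and is not X-Mapper's algorithm: it uses a fixed mask (a classic spaced seed) rather than the dynamic-length gapped x-mers the record describes. The function names, the mask `11011`, and the toy sequences are all hypothetical choices made for illustration.

```python
def kmers(seq: str, k: int) -> list[str]:
    """Return every contiguous, fixed-length seed (k-mer) of seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def gapped_seeds(seq: str, mask: str) -> list[str]:
    """Return seeds read through a mask: '1' keeps a base, '0' skips it.

    Assumption: a fixed mask stands in for a gapped seed here; the
    dynamic-length seeds described in the abstract are not modelled.
    """
    span = len(mask)
    keep = [i for i, m in enumerate(mask) if m == "1"]
    return [
        "".join(seq[start + i] for i in keep)
        for start in range(len(seq) - span + 1)
    ]


# Toy data: the read differs from the reference by one substitution
# at index 3 (T -> A).
reference = "ACGTACGTTA"
read = "ACGAACGTTA"

# Seeds shared by the read and the reference under each scheme.
shared_contiguous = set(kmers(reference, 5)) & set(kmers(read, 5))
shared_gapped = set(gapped_seeds(reference, "11011")) & set(
    gapped_seeds(read, "11011")
)

print(sorted(shared_contiguous))
print(sorted(shared_gapped))
```

In this toy case every contiguous 5-mer that covers the substituted base fails to match, whereas the gapped seed starting at index 1 still matches because its skipped position falls on the substitution. This is one way to see why a gap in a seed can tolerate a mutation that a contiguous seed of the same span cannot.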