Software / mapseq

Description

MAPseq is a set of fast and accurate sequence read classification tools designed to assign taxonomy and OTU classifications to ribosomal RNA sequences. Classification relies on a provided reference set of full-length ribosomal RNA sequences for which taxonomies are known, and for which a set of high-quality OTU clusters has been previously generated.

For each query sequence, MAPseq uses a fast k-mer search followed by sequence alignment. Two confidence estimation approaches (identity cutoff and unique hit) are combined to produce an accurate confidence estimate for the OTU or taxonomic assignment. The identity-cutoff confidence uses identity cutoffs estimated previously for each taxonomic level. The unique-hit confidence compares the alignment score of the top hit with the scores of the next-best hits that carry different taxonomic labels: the larger the difference, the more confident the assignment.
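The idea can be illustrated with a minimal Python sketch. This is not MAPseq's implementation: all function names are hypothetical, the shared k-mer count stands in for a real alignment score, and both the confidence formulas and the use of the minimum to combine them are assumptions chosen only for illustration.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return the set of all k-mers contained in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references, k):
    """Map each k-mer to the set of reference ids that contain it."""
    index = defaultdict(set)
    for ref_id, seq in references.items():
        for kmer in kmers(seq, k):
            index[kmer].add(ref_id)
    return index


def candidate_hits(query, index, k, top_n=5):
    """Rank references by the number of distinct k-mers shared with the query.

    Assumption: the shared k-mer count is used here in place of an
    alignment score, which a real tool would compute in a second step.
    """
    counts = Counter()
    for kmer in kmers(query, k):
        for ref_id in index.get(kmer, ()):
            counts[ref_id] += 1
    return counts.most_common(top_n)


def unique_hit_confidence(hits, labels):
    """Relative score gap between the top hit and the best hit with a different label.

    hits is a list of (ref_id, score) pairs sorted by descending score.
    Hypothetical formula, not the one used by MAPseq.
    """
    if not hits:
        return 0.0
    top_id, top_score = hits[0]
    for ref_id, score in hits[1:]:
        if labels[ref_id] != labels[top_id]:
            return (top_score - score) / top_score
    return 1.0  # no competing label among the candidates


def identity_confidence(identity, cutoff):
    """Hypothetical: full confidence at or above the cutoff, scaled down below it."""
    return 1.0 if identity >= cutoff else identity / cutoff


def combined_confidence(identity, cutoff, hits, labels):
    """Combine both estimates; taking the minimum is an arbitrary illustrative choice."""
    return min(identity_confidence(identity, cutoff),
               unique_hit_confidence(hits, labels))


# Toy reference set with made-up sequences and labels.
references = {"r1": "ACGTACGTAC", "r2": "ACGTTTTTTT", "r3": "GGGGGGGGGG"}
labels = {"r1": "taxon_A", "r2": "taxon_B", "r3": "taxon_C"}

index = build_index(references, k=4)
hits = candidate_hits("ACGTACGT", index, k=4)
confidence = combined_confidence(0.98, 0.97, hits, labels)
```

In this toy example the query shares four distinct 4-mers with r1 and one with r2, so the top hit is clearly separated from the best hit carrying a different label, and the unique-hit term is what limits the combined value.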

Download

MAPseq v1.2.3 (2 Oct 2018)

[source] Linux/MacOSX available on GitHub

precompiled binaries:
[linux binary] [linux binary (kernel 3.13)] [macosx binary]

[user manual]


Bugs, comments, suggestions: [send us an email] [write on GitHub]

Reference

Matias Rodrigues JF, Schmidt TSB, Tackmann J, and von Mering C (2017) MAPseq: highly efficient k-mer search with confidence estimates, for rRNA sequence analysis. Bioinformatics. http://doi.org/10.1093/bioinformatics/btx517

History

1.2.3 (2 Oct 2018)
- Fixed the last sequence of the database not loading when it was not followed by a newline. Added an assert for empty sequences in the database.
- Fixed duplicate output lines for the same query sequence with long queries (>1200 bp). Only the highest-scoring hit is now reported.

1.2.2 (30 Oct 2017)
- Fixed multithreaded race condition causing issues on some systems.

1.2.1 (23 Oct 2017)
- Updated mapref to v2.2. Fixed several issues with v2.0.
- Dropped LTP taxonomy due to low coverage.

1.2 (16 July 2017)
- Updated mapref to v2.0, now includes 1.5 million sequences.
- Added assert checks.

1.1 (24 April 2017)
- Several improvements and bug fixes, updated to latest NCBI taxonomy.

1.0 (14 October 2016)
- First release of MAPseq.

License

The MAPseq code is available under the GPL v3 license. The eutils supporting library included in the distribution is available under a separate license. Commercial use of the unmodified version and of the binary version is allowed. Please read the COPYING file included in the package for further details.

Supplementary data

MAPseq supplementary data