Bowtie.indexer (v2)

Builds a Bowtie 2 (v. 2.1.0) index from a set of DNA sequences.

Author: Ben Langmead and Steven L. Salzberg, Johns Hopkins Bloomberg School of Public Health and the University of Maryland CBCB; Marc-Danie Nazaire, Broad Institute

Contact: gp-help@broadinstitute.org

Algorithm Version: 2.1.0

Summary

Bowtie.indexer builds a Bowtie 2 index from a set of DNA sequences. This module takes a single file, or a ZIP archive of sequence files, in FASTA format and outputs a set of six files in a ZIP archive. Together these files constitute the index. For more information on the FASTA format, see the NIH description at http://www.ncbi.nlm.nih.gov/BLAST/fasta.shtml.
This document is adapted from the Bowtie documentation for release 2.1.0. For more information about Bowtie.indexer_2.1.0, see the Bowtie website. Bowtie.indexer_2.1.0 was created at the Johns Hopkins Bloomberg School of Public Health and the University of Maryland Center for Bioinformatics and Computational Biology.

References

Ferragina P, Manzini G. Opportunistic data structures with applications.  In FOCS '00 Proceedings of the 41st Annual Symposium on Foundations of Computer Science. Washington, DC: IEEE Computer Society; 2000. http://portal.acm.org/citation.cfm?id=796543

Langmead B, Salzberg S. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357-359.

Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. http://genomebiology.com/2009/10/3/R25

Links

Bowtie 2: http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
Bowtie 2 indexer documentation: http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#the-bowtie2-build-indexer

Parameters

Name | Description | Command line flag
fasta file * | A file or zip of files containing sequences in FASTA format |
random seed * | The seed for the random number generator | --seed
index name * | The base name of the index files to write. By default uses the name of the input file. |

* - required
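
As a rough illustration of how the parameters above could map onto a bowtie2-build command line, the sketch below assembles an argument list. The function names build_command and default_index_name are hypothetical and are not part of the module. The rule used to derive the default index name (strip common FASTA and compression suffixes from the input file name) is an assumption, since this document only says the default comes from the input file name. The --seed flag and the comma-separated list of reference files follow the bowtie2-build usage.

```python
from pathlib import Path

# Suffixes stripped when deriving a default index name (an assumption).
STRIPPED_SUFFIXES = {".fa", ".fasta", ".fna", ".gz", ".zip"}


def default_index_name(fasta_file):
    """Hypothetical default: the input file name without FASTA/compression suffixes."""
    name = Path(fasta_file).name
    while Path(name).suffix.lower() in STRIPPED_SUFFIXES:
        name = Path(name).stem
    return name


def build_command(fasta_files, seed, index_name=None):
    """Assemble a bowtie2-build argument list from the module parameters.

    fasta_files: list of FASTA paths (bowtie2-build takes them comma-separated).
    seed:        value passed to --seed.
    index_name:  base name of the index files; derived from the first input if omitted.
    """
    if not fasta_files:
        raise ValueError("at least one FASTA file is required")
    if index_name is None:
        index_name = default_index_name(fasta_files[0])
    return ["bowtie2-build", "--seed", str(seed), ",".join(fasta_files), index_name]
```

The returned list can be passed to subprocess.run on a system where bowtie2-build is installed and on the PATH.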

Input Files

  1. fasta.file
    A single file in FASTA format (optionally gzip-compressed, i.e., .gz), or a ZIP archive of multiple FASTA files.
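
The three accepted input forms (plain FASTA, gzip-compressed FASTA, ZIP archive of FASTA files) can be told apart before indexing. The following is a minimal sketch, not the module's actual input handling: count_fasta_records and fasta_members are hypothetical helper names, and gzip detection by the two-byte magic number is one possible approach.

```python
import gzip
import zipfile


def count_fasta_records(path):
    """Count '>' header lines in a plain or gzip-compressed FASTA file."""
    # gzip streams start with the magic bytes 0x1f 0x8b.
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def fasta_members(path):
    """Return the FASTA entries an input refers to.

    For a ZIP archive, the names of its file members; otherwise the path itself.
    """
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            return [name for name in archive.namelist() if not name.endswith("/")]
    return [str(path)]
```

A quick record count is a cheap sanity check that an input really is FASTA before a long index build is started.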

Output Files

  1. Six files make up the index and are output in a ZIP archive (<index name>.zip). The file names have the following formats:

    • <index name>.1.bt2

    • <index name>.2.bt2

    • <index name>.3.bt2

    • <index name>.4.bt2

    • <index name>.rev.1.bt2

    • <index name>.rev.2.bt2
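
To confirm that an output archive holds all six files listed above, a small check along these lines could be used. This is a sketch under assumptions: expected_index_files and missing_index_files are hypothetical names, and the check compares base names only, so it does not care whether the files sit in a subdirectory inside the archive.

```python
import io
import zipfile

# The six suffixes listed in the Output Files section.
INDEX_SUFFIXES = (".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2")


def expected_index_files(index_name):
    """Return the six file names expected for a given index base name."""
    return [index_name + suffix for suffix in INDEX_SUFFIXES]


def missing_index_files(zip_source, index_name):
    """Return the expected index files that are absent from the ZIP archive."""
    with zipfile.ZipFile(zip_source) as archive:
        present = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return [name for name in expected_index_files(index_name) if name not in present]


# Demonstration with an in-memory archive containing all six (empty) files.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for file_name in expected_index_files("hg19"):
        archive.writestr(file_name, b"")
buffer.seek(0)
print(missing_index_files(buffer, "hg19"))  # prints []
```

The index name "hg19" is only an example; any base name given as the index name parameter works the same way.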

Platform Dependencies

Task Type:
RNA-seq

CPU Type:
any

Operating System:
Macintosh, Linux

Language:
Perl;C++

Version Comments

Version | Release Date | Description
2 | 2013-06-14 | Updated to use Bowtie 2.1.0
1 | 2011-01-04 |