IGV User Guide

This guide describes the Integrative Genomics Viewer (IGV).


User Interface

Main Window

The following figure shows data from The Cancer Genome Atlas:

The tool bar provides access to commonly used functions. The menu bar and pop-up menus (not shown) provide access to all other functions.

The red box on the chromosome ideogram indicates which portion of the chromosome is displayed. When zoomed out to display the full chromosome, the red box disappears from the ideogram.

The ruler reflects the visible portion of the chromosome. The tick marks indicate chromosome locations. The span lists the number of bases currently displayed.

IGV displays data in horizontal rows called tracks. Typically, each track represents one sample or experiment. This example shows segmented copy number data.

IGV also displays features, such as genes, in tracks. By default, IGV displays data in one panel and features in another, as shown here. Drag-and-drop a track name to move a track from one panel to another. Combine data and feature panels by selecting the option to display all tracks in a single panel on the General tab of the Preferences window.
Track names are listed in the far left panel. Legibility of the names depends on the height of the tracks; i.e., the smaller the track the less legible the name.
An optional attribute panel displays sample/track attributes represented as colored blocks, where each unique value is assigned a unique color. 

 

 



 

Menu Bar

Menu Command Description
File Load from File Displays genomic data from one or more files. more...
  Load from URL Displays genomic data from a file identified by URL. more...
  Load from Server Displays genomic data from the IGV data server. more...
  New Session Unloads all currently loaded data, as if you exited and restarted IGV. more...
  Open Session Opens a previously saved session file. more...
  Save Session Saves your current settings to a named session file. more...
  Reload Session Reloads all the tracks in the current session. The viewing parameters and other settings remain the same.
  Save Image Saves a snapshot of the IGV window to a graphics file, omitting the menu bar and tool bar. Specify the image file format by setting the filename extension in the file save dialog to .png or .svg.
  Exit Closes IGV.
Genomes Load Genome from File Loads a genome into IGV from your file system.
  Load Genome from URL Loads a genome into IGV from a web URL.
  Load Genome from Server Loads a genome into IGV from the IGV data server. more...
  Remove Genomes Remove one or more genomes from the Genomes dropdown menu. The currently displayed reference genome cannot be removed.
  Create .genome File A utility for creating an IGV .genome file that packages the sequence and optionally other attributes. 
View Preferences Opens a tabbed menu of data display preferences. more...
  Color Legends Displays color legends for track data, which may be modified. more...
  Show Name Panel Shows/hides the track name panel.
  Set Name Panel Width Resets the track name panel width.
  Show Attribute Display Shows/hides the sample/track attribute panel. more...
  Select Attributes to Show Shows/hides selected individual attributes columns in the attribute panel. more...
  Show Header Panel Shows/hides the panel with the chromosome location information and ruler.
  Reorder Panels Allows the user to reorder the display panels in the IGV window.
  Go to View and select loci visited in your navigation history.
Tracks Sort Tracks Sorts track data. more...
  Group Tracks Groups track data. more...
  Filter Tracks Filters track data. more...
  Fit Data to Window Sets the track height to display all of the data, or as much data as possible. more...
  Set Track Height Sets the track height to a specified value. more...
Regions Region Navigator Opens the region navigator. more...
  Gene Lists Opens the gene lists window. more...
  Export Regions Saves currently defined regions of interest to a BED file. If no regions of interest are defined, no BED file is created. more...
  Import Regions Imports regions of interest from a BED file. more...
Tools Run Batch Script Executes a series of sequential tasks. Users can load a .txt file that contains a list of commands, one per line, that will be run by IGV. The accepted commands are the same as the IGV Port Commands.
  Run igvtools Launches the igvtools interface window. more...
  Find Motif Search for a particular nucleotide sequence in the reference genome. more...
  BLAT Run a BLAT search on an entered query sequence. more...
  Combine Data Tracks Combines two selected numeric tracks to dynamically create a new track. Operators include add, subtract, multiply, and divide. For example, when multiplying two tracks, a locus with data values of 10 and 2 in the separate tracks will have a value of 20 in the new track.
GenomeSpace Load File from GenomeSpace Load a file into IGV from your GenomeSpace directory. more...
  Load Genome from GenomeSpace Load a genome into IGV from your GenomeSpace directory. more...
  Save Session to GenomeSpace Save current IGV session to your GenomeSpace directory. more...
  Load Session from GenomeSpace Load a previous session from your GenomeSpace directory. more...
  Logout Log out of GenomeSpace
  Register Register a new account at GenomeSpace
Help User Guide... Displays the IGV User Guide.
  Help Forum... In your default web browser, opens the home page for the igv-help forum.
  About IGV Displays IGV version and build number.
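
As a rough illustration of the Run Batch Script entry above, the following Python sketch builds the text of a batch file with one command per line. The commands shown (new, genome, load, snapshotDirectory, goto, snapshot) are assumed from the IGV batch/port command set, which this page does not list; the file name, loci, directory, and the helper name build_batch_script are hypothetical placeholders.

```python
def build_batch_script(genome, data_file, loci, snapshot_dir):
    """Return IGV batch commands, one per line, as a single string."""
    lines = [
        "new",                                # start from an empty session
        f"genome {genome}",                   # select the reference genome
        f"load {data_file}",                  # load one data file
        f"snapshotDirectory {snapshot_dir}",  # directory for saved images
    ]
    for locus in loci:
        lines.append(f"goto {locus}")         # jump to a locus or gene name
        lines.append("snapshot")              # save an image of the view
    return "\n".join(lines) + "\n"


# Hypothetical inputs, for illustration only.
script = build_batch_script(
    genome="hg19",
    data_file="example.bam",
    loci=["chr1:1,000,000-1,001,000", "EGFR"],
    snapshot_dir="snapshots",
)
print(script)
```

Saving the printed text to a .txt file and selecting it with Tools>Run Batch Script should run the commands in order.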

 

Tool Bar

Genome drop-down menu to select and load a genome. more...
Chromosome drop-down menu to select and zoom to a chromosome. more...
Search box. Displays the chromosome location being shown. To jump to a different location, enter the locus or gene name and click Go. more...

Zooms out to the whole genome view. more...

  Moves backward and forward through your navigation history, like the back and forward buttons in a web browser.

Refreshes the display.

Defines a region of interest on the chromosome. more...

Reduces the row height on all tracks to fit them all into the window - or as many as possible; will also expand tracks to fill the view, if needed.
  Controls the popup information behavior. Options include displaying the information as the cursor hovers over an item, or when the item is clicked. The popup can also be disabled.

Shows/hides a vertical ruler line that follows the cursor in the data panels.

Zooms in and out on a chromosome. Sometimes referred to as the "railroad track." more...

 

Pop-up Menus

To select tracks and display the pop-up menu, do one of the following:

  • Right-click a track to select it and display the pop-up menu.
  • Right-click an attribute value to select all tracks with that attribute value and display the pop-up menu. Tip: Keep in mind that right-clicking an attribute may select tracks that are not visible in the data panel. Scroll down the data panel to view all the selected tracks.
  • Control-click track names (Mac: Command-click) to select the tracks, then right-click one of the selections to display the pop-up menu.

Commands in the track pop-up menu change the display options for the selected tracks. Most changes made via the pop-up menu are lost when you exit IGV unless you save the session. In a few cases, changing the pop-up menu also changes an option in the Preferences window; these changes are persistent.

The type of data displayed in the selected tracks determines which commands appear in the pop-up menu. This page lists commands by track type: data track, feature track, and alignment track. Use your browser's search function to find a particular command.

Data Track

Data tracks display numeric values. For an example, click File>Load from Server and select The Cancer Genome Atlas.GBM.Expression.GBM Batch 1-8 Centered and Normalized (hg18). The following commands appear in the pop-up menu for data tracks:

Command Description
Track Settings  
Rename Track Renames a track. more...
Change Track Color (Positive/Negative Values) Changes the track color for selected tracks. more...
Change Track Height Changes the track height for selected tracks. more...
Change Font Size Changes the font size for selected tracks.
Type of Graph
Heatmap
Bar Chart
Points
Line Plot
Changes the way IGV displays track data. more...
Windowing Function
10th Percentile
Median
Mean
90th Percentile
Maximum
 
 

Changes the value represented by each pixel of track data.

At all but the lowest zoom levels, each pixel represents a significant amount of data. IGV divides the data to be displayed into "windows" of equal length each corresponding to a single pixel, summarizes the values across each window, and then displays the summarized values in the track. Select the function IGV will use to summarize the values.

Data Range  
Set Data Range Changes the minimum, baseline, and maximum values of the graph used to display track data. more...
Set Heatmap Scale Changes the data range and color of the heatmaps used to display track data. more...
Log scale Plots the chart for that track on a log scale.
Autoscale Toggles the autoscaling function for a given track.  With autoscaling enabled, IGV automatically adjusts the plot Y scale to the data range currently in view.  As the user pans and moves, this scaling continually adjusts.  
Show Data Range Toggles whether the numeric range of the values in the view for a given track is displayed; works for charts other than heatmaps.
Track Management  
Create Overlay Track Merge the selected tracks so that they are displayed on top of one another.
Separate Tracks Only enabled for overlaid tracks. Restores them to separate tracks.
Remove Track(s) Removes selected track(s) from the display. more...
Save image Saves the data visible in the IGV panel to an image file. Specify the image file format by setting the filename extension in the file save dialog to .png, .jpeg, .jpg, or .svg. 
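
The Windowing Function entry above can be sketched in Python as follows. This is a simplified illustration, not IGV's implementation: it assumes one data value per base, windows of equal length in values rather than genomic coordinates, and a nearest-rank percentile. The names summarize, percentile, and SUMMARIES are hypothetical.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


# One summary function per menu entry in the Windowing Function list.
SUMMARIES = {
    "10th Percentile": lambda v: percentile(v, 10),
    "Median": statistics.median,
    "Mean": statistics.mean,
    "90th Percentile": lambda v: percentile(v, 90),
    "Maximum": max,
}


def summarize(values, n_pixels, function="Mean"):
    """Split values into n_pixels equal-length windows and summarize each."""
    summarize_window = SUMMARIES[function]
    window = math.ceil(len(values) / n_pixels)  # values per pixel
    return [
        summarize_window(values[i:i + window])
        for i in range(0, len(values), window)
    ]


data = [1, 2, 3, 4, 5, 6, 7, 8]
print(summarize(data, 4, "Mean"))
print(summarize(data, 4, "Maximum"))
```

With eight values drawn into four pixels, each pixel summarizes two values, so the choice of function decides whether a narrow peak survives zooming out (Maximum) or is averaged away (Mean).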

 

Feature Track

Feature tracks identify genomic features. For an example, see the Gene track, which IGV loads when you select a genome. The following commands appear in the pop-up menu for feature tracks:

Command Description
Rename Track Renames a track. more...
Change Track Color Changes the track color for selected tracks. more...
Change Track Height Changes the track height for selected tracks. more...
Change Font Size Changes the font size of the feature labels.

Collapsed

Expanded

Squished

Displays overlapping features, such as different transcripts of a gene, on one line or multiple lines or condensed (squished).  more...

Copy Details to Clipboard Copies the pop-up text for the selected feature to the system clipboard so that you can paste the information into other applications.
Copy Sequence Copies the sequence of the selected feature to the system clipboard so that you can paste the information into other applications.
BLAT Sequence

The selected feature, defined by feature start and end bounds, is used in a BLAT (BLAST-like Alignment Tool) search within the given genome as detailed on the BLAT Search page.

  • Search with sequences of up to 8 kb in length.
  • The default BLAT server is hosted at the UCSC Genome Browser and supports most UCSC derived genomes including human and mouse genomes. Change the default search server in Advanced Preferences.
Set Feature Visibility Window Specifies the threshold, in kilobases, for IGV to display features in the window. In other words, if you set this at 50 kb, IGV will only display features after you have zoomed in to display 50 kb or less in the IGV window.
Remove Track(s) Removes selected track(s) from the display. more...
Save image Saves the data visible in the IGV panel to an image file. Specify the image file format by setting the filename extension in the file save dialog to .png, .jpeg, .jpg, or .svg. 

 

Alignment Track

Alignment tracks display alignments (more here). For an example, select the Human hg19 genome from the genome dropdown menu in the toolbar, and then click File>Load from Server and select an alignment from the 1000 Genomes project. Tip: Zoom in to view alignments and the alignment track pop-up menu.

Command Description
Rename track Renames a track.
Copy read details to clipboard When you hover over a read, the tool tip displays information about the read. This option copies that information and the read sequence to the clipboard.
Group alignments Groups alignments by read strand, first-in-pair strand, sample, read group, chromosome of mate, pair orientation, supplementary flag, or tag.
Sort alignments

Sorts alignments by start location, strand, base, mapping quality, sample, read group, or insert size as defined in the SAM/BAM file format and detailed below in Color alignments.

When sorting by base, alignments which span a base with a splice, i.e. do not actually cover the base, now appear at the bottom.

Additionally, repeat the most recent sort with hotkey ctrl-s.

Color alignments

Colors alignments by the following options:

  • No color displays reads in gray. Low mapping quality reads are still represented in unshaded white.
  • Insert size details are here.
    • Blue is for inserts that are smaller than expected.
    • Red is for inserts that are larger than expected.
    • Inter-chromosomal rearrangements are color-coded by chromosome.
  • Pair orientation details are here.
    • Shades of green to blue show structural events of inversions, duplications, and translocations within a chromosome.
  • Insert size and pair orientation. Translocations on the same chromosome follow the color-coding schema for pair orientation, whereas translocations between two chromosomes follow the color-coding schema for insert size. If a read pair has both unexpected orientation and insert-size, the orientation color schema is used.
  • Read strand in pastels, red for positive rightward (5' to 3') DNA strand, blue for negative leftward (reverse-complement) DNA strand, and grey for unpaired mate, mate not mapped, or otherwise unknown status.
  • First-of-pair strand assignment is dependent on RNA transcript directionality and is useful for directional libraries. Displays reads or read pairs in which the forward read is first (F1 or F1R2) in red and reads or read pairs in which the reverse read is first (R1 or R1F2) in blue. Unknown status is in gray.
    • For a given transcript, non-directional libraries will show a mix of red and blue reads aligning to the locus.
    • Directional libraries will show reads of one color in the direction matching the transcript orientation.
  • Read group is designated in the SAM format file header section under @RG. E.g., for Illumina reads, RG typically groups reads from a lane.
  • Sample is a tag designated in the SAM format file header section under @RG that specifies sample information. E.g., a sample may be split and run across multiple sequencing lanes represented by different read groups but the same sample tag.
  • Tag allows custom input of a two-letter tag as designated in the SAM format file header section. E.g., CN designates the name of the sequencing center producing the read.
    • The UCSC YC tag is supported for user-defined colored tags in RGB intensities as detailed here.
  • Bisulfite mode with six options: CG, CHH, CHG, HCG, GCH, and WCG. Details are here. The rules for what constitute a mismatch to the reference genome are adjusted to account for the expected C to T (and for the reverse complement G to A) conversions. A red C or G indicates a protected site, such as by methyl modification, while a blue T or A indicates bisulfite conversion of an unprotected site.
Shade base by quality Uses the color intensity of a mismatched base to indicate its quality score: the darker the shading the higher the score. Changing this option also changes the option on the Alignments tab of the Preferences window.
Show mismatched bases By default, mismatched bases are displayed as colored letters on a gray bar that represents the read. To change the default color scheme, see Modify the prefs.properties file.
Show all bases Select this option to display all bases in the read. To change the default color scheme, see Modify the prefs.properties file.
View as pairs For more information on this option, see this page.
Go to mate Jumps to the region of the paired read (if any).
View mate region in split screen

Opens another panel centered on the mate of the clicked alignment.

Set insert size options Controls color-coding of paired reads based on the inferred insert size.
Re-pack alignments Sorts alignments to minimize gaps at the top of the track.
Show coverage track When selected, IGV displays the matching coverage track for the alignment track.
Load coverage data

Loads coverage data for an alignment track. To generate coverage data, use igvtools. more...

Loading an alignment track from the IGV data server (File > Load from Server) automatically loads the matching coverage data.

Collapsed

Expanded

Squished

Changes the height of the reads to adjust the amount of information displayed.
Select by name Opens a window so you can enter the name of a read. IGV will highlight that read with a colored border.  Note that IGV does not change the view, so if the read is not currently visible this option will have no apparent effect.
Clear selections

Clears the outlines that show paired reads.

  • Control+click (Mac: Command+click) a read to outline the read and its paired mate in the same color. Colors are arbitrary but unique to each pair. A black outline indicates that the selected read has no mate.
  • To clear the outline for a paired read, Control+click (Command+click) either read.
  • To clear all outlines, right-click and select Clear selections.
Copy read sequence Copies the nucleotide sequence of the selected read or region of interest to the clipboard.
BLAT read sequence

The selected read is used in a BLAT (BLAST-like Alignment Tool) search within the given genome as detailed on the BLAT Search page.

  • Search with sequences of up to 8 kb in length.
  • The default BLAT server is hosted at the UCSC Genome Browser and supports most UCSC derived genomes including human and mouse genomes. Change the default search server in Advanced Preferences.
Copy consensus sequence

Calculates the consensus sequence for the region in view and copies the information to the clipboard. The method for calculating the consensus is taken from Cavener, Nucleic Acids Res. 15, 1353-1361, 1987.

1. If the frequency of a single nucleotide at a specific position is greater than 50% and greater than twice the number of the second most frequent nucleotide it is assigned as the consensus nucleotide.

2. If the sum of the frequencies of two nucleotides is greater than 75% (but neither meet the criteria for a single nucleotide assignment) they are assigned as co-consensus nucleotides.

3. If no single nucleotide or pair of nucleotides meets the criteria,  assign an 'N'.

Information copied to the clipboard includes:

  • Locus of the copied sequence (i.e., region currently in view)
  • The consensus sequence.
  • A matrix with the details of all nucleotide counts used to calculate the consensus sequence. Rows in the matrix correspond to the bases along the sequence. The values in a row show the counts for each type of nucleotide at that locus. A header row above the matrix indicates the order of the nucleotide columns (A, C, G, T, and N).
Sashimi Plot Open a Sashimi-style plot window. more...
Show Coverage Track Show or hide the associated coverage track.
Show Splice Junction Track Show or hide the associated splice junction track.
Hide Track Hide the alignment track.  
Save image Saves the data visible in the IGV panel to an image file. Specify the image file format by setting the filename extension in the file save dialog to .png, .jpeg, .jpg, or .svg. 
Export Alignments Export alignments in view to a SAM file.
Remove Track Remove track permanently
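
The three consensus rules listed under Copy consensus sequence can be sketched in Python as below. This is a minimal illustration under stated assumptions: per-position counts are given as a dictionary with at least two nucleotides, and a co-consensus pair is written as "A/G". How IGV itself encodes co-consensus nucleotides in the copied sequence is not stated on this page, so that notation and the function names are hypothetical.

```python
def consensus_call(counts):
    """Apply the three consensus rules to one position.

    counts maps nucleotide to read count, e.g. {"A": 8, "C": 1, "G": 1, "T": 0}.
    """
    total = sum(counts.values())
    if total == 0:
        return "N"
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    (first, n1), (second, n2) = ranked[0], ranked[1]
    # Rule 1: one nucleotide above 50% and more than twice the runner-up.
    if n1 / total > 0.5 and n1 > 2 * n2:
        return first
    # Rule 2: the top two nucleotides together exceed 75%.
    if (n1 + n2) / total > 0.75:
        return "/".join(sorted([first, second]))
    # Rule 3: no nucleotide or pair qualifies.
    return "N"


def consensus_sequence(count_rows):
    """Return one consensus call per position in the region."""
    return [consensus_call(row) for row in count_rows]


rows = [
    {"A": 8, "C": 1, "G": 1, "T": 0},  # clear majority
    {"A": 5, "C": 1, "G": 4, "T": 0},  # two nucleotides share the position
    {"A": 3, "C": 3, "G": 2, "T": 2},  # no consensus
]
print(consensus_sequence(rows))
```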

 

Preferences

To display the Preferences window, click View>Preferences. Preferences are preserved across sessions. To override preferences during a session, use the track display pop-up menu. Each section on this page describes the options on a tab of the Preferences window: General, Tracks, Mutations, Charts, Alignments, Probes, Proxy, Advanced, and IonTorrent.

To restore default preferences or modify other default settings not listed here, see the Modify the prefs.properties file page.

General

Select to distinguish regions with zero values (white) from regions with missing data (gray). Clear (default) to display both regions in the same way (white). Affects only bar charts and scatter plots.
Select to display all tracks in a single panel. Clear (default) to display data tracks (e.g., expression data) in one panel and feature tracks (e.g., genes) in another.
Select (default) to show attributes and attribute values to the left of the data panel. Clear to hide the attributes. This option and View>Show Attribute Display have the same effect on attribute display.
Select to outline the boundaries of regions of interest in black. Clear (default) to leave them without black boundaries.
Zoom in on search results. When selected (default) the zoom level is automatically adjusted so that the target feature fills the view after a successful search. If not checked, the target feature of a search is centered in the view but the zoom level is unaffected.
Change this to change the resolution (in base pairs) at which the sequence track becomes visible.

Change this to define the size of the flanking region in base pairs. To specify the flanking region as a percentage of feature length, enter the percentage as a negative number. IGV adds the flank before and after a feature locus when you zoom to a feature, or when you view gene/loci lists in multiple panels.

Click here to change the background color of the IGV display.
Use this to set a default font size for labeling tracks and features.
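
The flanking region setting described above accepts either a size in base pairs or, as a negative number, a percentage of the feature length. A minimal Python sketch of that convention follows. The function names, the rounding, and the clamping of the start coordinate at 0 are assumptions made for illustration, not documented IGV behavior.

```python
def flank_size(feature_start, feature_end, setting):
    """Return the flank in base pairs for a given preference value.

    A positive setting is a fixed number of base pairs; a negative setting
    is read as a percentage of the feature length.
    """
    if setting < 0:
        length = feature_end - feature_start
        return round(length * (-setting) / 100)
    return setting


def flanked_locus(feature_start, feature_end, setting):
    """Add the flank before and after the feature (start clamped at 0)."""
    flank = flank_size(feature_start, feature_end, setting)
    return max(0, feature_start - flank), feature_end + flank


print(flanked_locus(10_000, 12_000, 2000))  # fixed 2,000 bp flank
print(flanked_locus(10_000, 12_000, -10))   # 10% of a 2,000 bp feature
```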

Tracks

Default track height for bar charts, scatter plots, and line plots.
Default track height for all other tracks.

Name of an attribute in the sample information file. IGV uses the corresponding attribute value as the track name.

Select to expand feature tracks by default. You may have to restart IGV for this to take effect.

Collapsed:

Expanded:

Select (default) to show the "expand/collapse" triangular icon on feature tracks.

Collapsed with the icon:                          Expanded with the icon:

                           

Select to normalize tracks containing coverage data in .tdf files that were created using igvtools.  This normalization option multiplies each value by [1,000,000 / (totalReadCount)]. 

  • This is only available for .tdf files created using igvtools builds dated 1/28/2010 or later.  Earlier versions of igvtools did not record the total read count.
  • This selection takes effect with new sessions loaded afterwards.
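
As a worked example of the normalization factor above, the following Python sketch multiplies each coverage value by 1,000,000 / totalReadCount. The function name and the sample numbers are hypothetical; only the formula comes from this guide.

```python
def normalize_coverage(values, total_read_count):
    """Scale coverage values to reads per million total reads."""
    factor = 1_000_000 / total_read_count
    return [value * factor for value in values]


# A file with 2,000,000 total reads halves every coverage value.
print(normalize_coverage([10, 20, 0], 2_000_000))
```

Scaling this way makes coverage tracks from libraries of different sequencing depth comparable on the same axis.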

Mutations

Obsolete
Obsolete
Obsolete

Select to color-code mutation data. To view and change the mutation coloring scheme, click the Choose Colors button. The Edit button next to Mutation in View>Color Legends also displays the same dialog. 

The default colors are shown below (Screenshot 2015.02.18).

Charts

Select to add a border at the top of the track.
Select to add a border at the bottom of the track.

Select (default) to color the top and bottom borders (if any). Clear to show the borders in black regardless of the track color. Tip: To change the track color, use the track display pop-up menu.

Select to label the track with its name, provided the track is at least 25 pixels high.
Select to label the y-axis with its data range.
Select (default) to allow charts (barchart, scatterplot, and lineplot) to automatically adjust the plot Y scale to the data range currently in view.  As the user pans and moves, this scaling continually adjusts.  Clear to turn autoscaling off.  There is an option in the popup menu to enable autoscaling for a single track.

Select (default) to show the range of the data.

Select to show all features in heatmaps.

The following figures illustrate these track display options.

  • Color borders selected (default):
  • All options selected:

 

Alignments

Sets the threshold at which IGV displays reads. Reads are visible only when IGV is zoomed in to display a number of bases less than or equal to this threshold.

Downsampling

IGV displays a specified number of randomly sampled alignments configured by the downsampling parameters instead of keeping all of them in memory. The coverage track, displaying total coverage at a region, is unaffected; that is, it always shows unsampled values. The default setting keeps up to 100 reads per 50 nt window, and paired reads are downsampled as a set.

  • Downsample reads: Deselect to turn off downsampling.
  • Max read count: The maximum value that can be set is 100,000 reads per window. IGV uses reservoir sampling, so that all reads are kept if the read count is less than Max read count. If the read count is greater, the probability of any given read being sampled is equal to (Max read count) / (actual read count).
  • per window size (bases): Sampling is performed over windows, using the window size specified here.
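
The downsampling parameters above can be illustrated with a small Python sketch of reservoir sampling applied per window. This is a simplified model, not IGV's code: reads are represented only by their start positions, each read is assigned to a window by integer division of its start, and the rule that paired reads are downsampled as a set is not modeled. The function names are hypothetical.

```python
import random


def reservoir_sample(reads, max_read_count, rng=None):
    """Keep at most max_read_count reads, each with equal probability."""
    if rng is None:
        rng = random.Random()
    reservoir = []
    for index, read in enumerate(reads):
        if index < max_read_count:
            reservoir.append(read)       # fill the reservoir first
        else:
            slot = rng.randint(0, index)  # inclusive on both ends
            if slot < max_read_count:
                reservoir[slot] = read   # replace a previously kept read
    return reservoir


def downsample(read_starts, max_read_count=100, window_size=50, rng=None):
    """Sample reads independently within each window of window_size bases."""
    windows = {}
    for start in read_starts:
        windows.setdefault(start // window_size, []).append(start)
    return {
        window: reservoir_sample(reads, max_read_count, rng)
        for window, reads in sorted(windows.items())
    }


starts = [0] * 150 + [60] * 10  # 150 reads in window 0, 10 in window 1
sampled = downsample(starts, rng=random.Random(0))
print({window: len(reads) for window, reads in sampled.items()})
```

When a window holds fewer reads than the maximum, every read is kept; otherwise each read ends up in the sample with probability (Max read count) / (actual read count), matching the description above.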

Downsampled regions are marked by black rectangles spanning the downsampled region just under the coverage track as shown in the screenshot below. When zoomed out, the black rectangles may appear like a continuous black line.

Filter and Shading Options

For more information, see the Sequence Alignment/Map Format Specification.

  • Coverage allele-fraction threshold: Sets the mismatch threshold at which bases on an alignment coverage track are colored. The default is 0.2, i.e., if a nucleotide differs from the reference sequence in greater than 20% of reads, IGV colors the bar in the coverage bar chart in proportion to the read count of each base (ACGT). The threshold for an individual track can be changed from the pop-up menu.
  • Quality weight allele fraction: Deselect to disable quality weighting of the allele fraction when displaying single nucleotide mismatches in the coverage track for sequencing data. Prior to IGV v2.3.46, allele fractions automatically displayed quality-weighted fractions. The tooltip metrics reflect nonadjusted fractions. The numbers preceding + and - correspond to strand-specific counts, and sum to the total.
  • Filter duplicate reads: Clear to display alignments marked as duplicate reads. In DNA-Seq alignments these PCR or optical duplicates are often marked and filtered. In RNA-Seq alignments considerations differ.
  • Filter vendor failed reads: Clear to display reads that fail the vendor quality check.
  • Filter secondary alignments: Select to hide all secondary alignments for a read and display only the primary alignment. A read may map ambiguously to multiple locations, e.g. due to repeats. Only one of the multiple read alignments is considered primary, and this decision may be arbitrary. All other alignments have the secondary alignment flag. 
  • Filter supplementary alignments: Select to hide all supplementary alignments for a read and display only the representative alignment. A chimeric alignment that is represented as a set of linear alignments that do not have large overlaps typically has one linear alignment that is considered the representative alignment. Others are called supplementary and have a supplementary alignment flag.
  • Filter alignments by read group: Select to hide alignments that match the read groups listed in the filter file. The filter file is a text file that lists read groups, one per line.  This option means that IGV does not load the alignments associated with these read groups. Specify a URL or absolute file path to the file. Read groups are designated in the SAM format file header section under @RG.
  • Mapping quality threshold: Sets a threshold on alignment mapping quality. Only alignments with mapping quality greater than or equal to this threshold are shown.
  • Shade mismatched bases by quality: Select (default) to shade mismatched bases by quality, with lower quality being more transparent. A commonly used base quality metric is the Phred quality score, represented as Q, as detailed in Wikipedia.

Phred quality scores are logarithmically linked to error probabilities with typical ranges between Q2 and Q63. For example, Q10–Q15 correspond to 10%–3% base call error probabilities (cutoff range recommended by Trimmomatic), Q20 to 1% error probability or 99% base call accuracy probability (conservative cutoff recommended by GATK), and Q60 corresponds to a one in a million error probability or 99.9999% base call accuracy probability. Differences in Q for the Sanger versus Solexa/Illumina GA platforms are graphed in this Wikipedia entry that shows diverging probabilities for scores less than Q13 for the different platforms. The same entry discusses Phred+33 versus Phred+64 systems of scoring, with the former being more prevalent for recent platforms.

  • Flag insertions larger than: Select to flag reads containing insertions larger than the specified number of bases. An insertion within a read is denoted with a purple I as detailed in Viewing Alignments. When this feature is activated, the I symbol is colored red for insertions larger than the size specified.
  • Show center line: Select so that, when zoomed in sufficiently, IGV displays a line at the center of the display. At higher resolutions, the center line becomes two lines that frame the aligned bases at the center of the display, as shown in the figure above.
  • Show coverage track: Select to display a coverage track for each alignment track. The coverage track is visible only when alignments are visible. It displays a gray bar chart showing the depth of the reads at each locus. If a nucleotide differs from the reference sequence in greater than 20% of quality-weighted reads, IGV colors the bar in proportion to the read count of each base (A, C, G, T). Modifying this option affects the display of subsequently loaded alignment tracks. Note, to change the threshold from the default 20%, see Coverage allele-fraction threshold.
  • Show soft-clipped bases: Select to show the soft clipped sections of the read.
  • Flag unmapped pairs: Select to draw a red box around any paired alignment whose mate is not mapped.
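
The Phred relationship mentioned under Shade mismatched bases by quality can be made concrete with a short Python sketch. It uses the standard definition Q = -10 log10(P) and the Phred+33 character encoding named in that paragraph; the function names are hypothetical.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred quality score Q to a base-call error probability."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Convert a base-call error probability back to a Phred score."""
    return -10 * math.log10(p)


def decode_phred33(quality_string):
    """Decode a Phred+33 quality string into integer Q scores."""
    return [ord(character) - 33 for character in quality_string]


for q in (10, 15, 20, 60):
    print(q, phred_to_error_probability(q))
```

Q10 and Q15 come out at roughly 10% and 3% error, Q20 at 1%, and Q60 at one in a million, which matches the figures quoted in the paragraph.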

Splice Junction Track Options

  • Show junction track: Select to display the splice junction track.
  • Min flanking width: The minimum amount of nucleotide coverage required on both sides of a junction for a read to be associated with the junction. This affects the coverage of displayed junctions, and the display of junctions covered only by reads with small flanking regions.
  • Min junction coverage:  The minimum number of reads associated with a junction required for the junction to be displayed.
Sets default size thresholds for color-coded flagging of paired end alignments. Only paired end alignments with insert sizes outside these thresholds are flagged. Select Compute to compute selected values from the actual size distribution of each library.
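
The insert size thresholds can be related to the Color alignments scheme (blue for smaller than expected, red for larger than expected) with the Python sketch below. Everything specific here is an assumption made for illustration: the threshold values in the example, the gray color for pairs within the expected range, the use of the absolute insert size, and the percentile values used when thresholds are computed from the size distribution.

```python
import math


def insert_size_color(insert_size, min_threshold, max_threshold):
    """Classify a read pair by its inferred insert size."""
    size = abs(insert_size)  # the SAM template length field is signed
    if size < min_threshold:
        return "blue"        # smaller than expected
    if size > max_threshold:
        return "red"         # larger than expected
    return "gray"            # within the expected range (assumed color)


def compute_thresholds(insert_sizes, lower_pct=0.5, upper_pct=99.5):
    """Derive thresholds from a library's insert sizes (nearest-rank).

    The percentile defaults are hypothetical placeholders.
    """
    ordered = sorted(abs(size) for size in insert_sizes)

    def nearest_rank(pct):
        rank = max(1, math.ceil(len(ordered) * pct / 100))
        return ordered[rank - 1]

    return nearest_rank(lower_pct), nearest_rank(upper_pct)


print(insert_size_color(30, 50, 1000))
print(insert_size_color(-1500, 50, 1000))
print(compute_thresholds(range(1, 1001)))
```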

 

Proxy

Sets proxy parameters for connecting to the Internet.  IGV will use this to load hosted genomes and hosted data sets.
Select and enter values if a username and password are required for the proxy.
Clears all proxy settings.

Advanced

Select this option to enable a port on which IGV listens for commands and http requests. Enabling the port allows control of IGV from a web browser. more...
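
With the port enabled, a script can send commands to a running IGV over a plain socket. The Python sketch below is one way this might look. It assumes IGV is listening on port 60151 (the usual default, which this page does not state), that each command is sent as one newline-terminated line, and that IGV answers each command with a single reply line. The function names and the commands in the usage comment are illustrative.

```python
import socket


def encode_commands(commands):
    """Join commands into newline-terminated text, one command per line."""
    return "".join(command.strip() + "\n" for command in commands)


def send_igv_commands(commands, host="127.0.0.1", port=60151, timeout=10.0):
    """Send commands one at a time and collect the reply line for each."""
    replies = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        for line in encode_commands(commands).splitlines(keepends=True):
            stream.write(line)
            stream.flush()
            replies.append(stream.readline().strip())
    return replies


print(encode_commands(["genome hg19", "goto EGFR"]), end="")

# Usage (requires a running IGV with the port enabled):
# send_igv_commands(["genome hg19", "goto EGFR"])
```

Waiting for each reply before sending the next command keeps the script in step with IGV, since loading data or a genome can take longer than the commands that follow.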

Select this option to edit URLs for the IGV data and genome servers. These settings are rarely changed. 

IGV caches each genome that it loads. On rare occasions, it may be necessary to clear the cached genome file to display an updated version of the genome. Click Clear Genome Cache to do this.

Genome Server URL is the URL for the genome server that populates the genome drop-down list.

Data Registry URL is the URL for the hosted data sets registry (populates File>Load from Server dialog).

Keep this selected to allow IGV to automatically check for updated genomes. Clear to disable this automatic check.

IGV 2.2, released December 18, 2012, and newer versions allow disabling anti-aliasing. This can significantly improve performance in some circumstances when running with X-Windows.

IGV 2.3.46, released March 2015, allows BLAT sequence searches of features, aligned reads, and selected regions of interest of up to 8 kb in length via the right-click pop-up menu. Use this setting to change the server that hosts the genome against which BLAT searches are run. The default is the BLAT server hosted by UCSC's Genome Browser. Most UCSC-derived genomes are supported, including human and mouse genomes.

 

UPDATE:  See here for new BLAT customization options as of release 2.10.3.

This allows you to move your default IGV directory.

 

Color Legends

By default, IGV uses heatmaps to display certain types of data (see Default Display). Use the Color Legends window to change the default colors for these heatmaps.

To change the default colors:

  1. Click View>Color Legends to display the Color Legends window.
  2. Click a heatmap legend to set its color and range.
  3. For LOH and Mutation data, click a colored box to change its color.

Alternatively, set default mutation data colors by modifying the prefs.properties file.

 

Keyboard Shortcuts

There are some useful keyboard shortcuts you can use in IGV.

Shortcut Description
ctrl-R Defines the region currently in view as a region of interest.
ctrl-F/ctrl-B Skips forward to the next feature or back to the previous feature.
ctrl-shift-F/ctrl-shift-B If you have the feature track expanded and have selected one of the rows, this will skip forward to the next exon or back to the last exon.

alt-left/alt-right (Windows), cmd-[ and cmd-] (Mac) Moves back and forward through your IGV history.
Arrow keys Pans left, right, up, and down in the current chromosome.
Home/End keys Skips to the page top or bottom of the current view, then pages right or left respectively.
PageUp/PageDown keys Pages up and down the current view.

 

Navigating the View

Zooming

A number of different controls are provided to zoom the view to larger or smaller regions of the genome, from a whole-genome view that lays out all the chromosomes side by side, to a single whole chromosome, and all the way to base-pair resolution.

Click the "home" icon to zoom out to the whole-genome view.

From the whole-genome view, zoom to a chromosome by clicking its label.

Select a chromosome from the drop-down menu to zoom to it, or select "All" to zoom out to the whole-genome view.

To zoom in and out on a chromosome:

Zoom in:

  • Press the + key.
  • Double-click or shift-click the track data.
  • Click a zoom level on the zoom slider.
  • Click the plus (+) icon on the zoom slider.
  • Click and drag on the genome ruler to select an area to which to zoom.

Zoom out:

  • Press the - key.
  • Alt-click (Mac: option-click) the track data.
  • Click a zoom level on the zoom slider.
  • Click the minus (-) icon on the zoom slider.
 

Scrolling and Panning

Vertical scroll of data tracks:

  • Scroll bar in the IGV window
  • Click and drag the track data
  • Page Up and Page Down keys
  • Up and down arrow keys

Horizontal pan across the genome *:

  • Click and drag the track data
  • Click the chromosome ideogram to scroll to that location
  • Click the ruler to center that location
  • Left and right arrow keys
  • Home and End keys (scroll by screen width)

* You cannot pan horizontally across the genome when IGV is displaying the whole genome or a whole chromosome; you must be zoomed further in.

Searching

Use the search box to find and go to a genomic locus:

A track name cannot be searched for in the search box. To find specific tracks, use menu item Tracks>Filter Tracks.

Jumping

If you have a feature track loaded (e.g., Gene track, BED, or GFF file), you can jump from one feature to the next.

  1. Click on the track name to select the track that contains the features that you want to find.
  2. Jump from feature to feature:
    • Press Ctrl+F to jump forward to the next feature.
    • Press Ctrl+B to jump backward to the previous feature.

    IGV positions the start of the next (or previous) feature at the center of the display.

You can also jump from one exon to the next. To exon-jump, select a feature track and press Shift+Ctrl+F to center the next exon in your view, Shift+Ctrl+B to move back one exon.

Back and Forward Buttons

The back and forward buttons in the toolbar allow you to move backward and forward through your views of the genome the way you move back and forward in a web browser.

Loading a Genome

Genomes are selected from the genome drop-down list on the upper-left of the IGV window.  Initially, this list contains a single item, Human hg18 or Human hg19, depending on the version of IGV.  To add other genomes to the list, see the sections below on "Selecting a Hosted Genome" and "Loading Other Genomes".

Selecting a Hosted Genome

IGV provides a number of hosted genomes.  If the genome you want is not in the drop-down list, either click on the More... entry at the bottom of the list, or select Genomes>Select Hosted Genome from the menu. This will bring up a list of all the hosted genomes, listed in alphabetical order. Scroll down to find the one you want, or use the 'Filter' to narrow down the list.

Loading Other Genomes

If you have the FASTA file for your reference genome sequence, you can load it by clicking Genomes > Load Genome from File or Genomes > Load Genome from URL. In this case, the gene annotations will not be loaded automatically, but if you have the gene annotation file, you can load it like any other data file via the Load from commands in the File menu. To automatically load gene annotations, as well as an optional cytoband file, you can create a genome JSON file as described below.

FASTA files can be plain text or block gzipped, and must be indexed with a .fai as defined by the Samtools suite (www.htslib.org). If the file is plain text (not block gzipped) and not indexed, IGV will attempt to index it.  IGV remembers the location of the FASTA file and the file will appear in the drop-down list until it is removed as described below.

Removing a Genome

To remove a genome from the IGV menu:

Creating a Genome JSON File 

In special cases it might be desirable to create a genome JSON file to define the reference. This option enables additional files to be associated with the FASTA reference sequence file, such as annotation track files. The genome JSON format is described in the IGV GitHub wiki. The file name should have a ".json" extension. Once created, it can be loaded from the Genomes menu. (This replaces the old .genome file, which has been deprecated.)

External Control of IGV

Controlling IGV through a Port

IGV can optionally listen for HTTP requests over a port. This option is turned off by default but can be enabled from the Advanced tab of the Preferences window.

Note:  IGV will write a response back to the port socket upon completion of each command.  It is good practice to read this response before sending the next command.   Failure to do so can overflow the socket buffer and cause IGV to freeze.   See the example below for the recommended pattern.

Commands

See https://github.com/igvteam/igv/wiki/Batch-commands for a complete list of port and batch commands.

Example

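Example Java code:

        import java.io.BufferedReader;
        import java.io.IOException;
        import java.io.InputStreamReader;
        import java.io.PrintWriter;
        import java.net.Socket;

        public class IgvPortExample {
            public static void main(String[] args) throws IOException {
                // Connect to the port on which IGV is listening; the streams are closed automatically.
                try (Socket socket = new Socket("127.0.0.1", 60151);
                     PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                     BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {

                    // Send one command at a time and read the response before sending the next.
                    out.println("load na12788.bam,n12788.tdf");
                    String response = in.readLine();
                    System.out.println(response);

                    out.println("genome hg18");
                    response = in.readLine();
                    System.out.println(response);

                    out.println("goto chr1:65,827,301");
                    //out.println("goto chr1:65,839,697");
                    response = in.readLine();
                    System.out.println(response);

                    out.println("snapshotDirectory /screenshots");
                    response = in.readLine();
                    System.out.println(response);

                    out.println("snapshot");
                    response = in.readLine();
                    System.out.println(response);
                }
            }
        }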
 

Running IGV with a batch file

As of version 1.5, a user can load a text file to execute a series of sequential tasks by using Tools>Run Batch Script. The user loads a TXT file that contains a list of commands, one per line, that will be run by IGV. Arguments are delimited by spaces (NOTE: not tabs). Lines beginning with # or // are skipped. See https://github.com/igvteam/igv/wiki/Batch-commands for a complete list of batch commands.

Example

new
genome hg18
load  myfile.bam
snapshotDirectory mySnapshotDirectory
goto chr1:65,289,335-65,309,335
sort position
collapse
snapshot
goto chr1:113,144,120-113,164,120
sort base
collapse
snapshot
goto chr4:68,457,006-68,467,006
sort strand
collapse
snapshot

The example script does the following:

  1. Starts a new session and sets the genome.
  2. Loads a file and sets the snapshot directory.
  3. Jumps to a specified locus.
  4. Sorts, collapses, and then takes a snapshot of the screen.
  5. Repeats steps 3 and 4 for other loci, using a different sort option each time.
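
When the list of loci is long, a script like the one above can be generated rather than typed. The following is a minimal Java sketch, not part of IGV: the class BatchScriptBuilder and its build method are hypothetical names, and the sketch emits only the commands that appear in the example script (new, genome, load, snapshotDirectory, goto, sort, collapse, snapshot), with arguments separated by single spaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of IGV): builds the text of a batch script
// that visits several loci and takes one snapshot at each.
class BatchScriptBuilder {

    // loci maps each locus string to the sort option to apply at that locus,
    // for example "chr1:65,289,335-65,309,335" -> "position".
    static String build(String genome, String dataFile, String snapshotDir,
                        Map<String, String> loci) {
        StringBuilder sb = new StringBuilder();
        sb.append("new\n");
        sb.append("genome ").append(genome).append('\n');
        sb.append("load ").append(dataFile).append('\n');
        sb.append("snapshotDirectory ").append(snapshotDir).append('\n');
        for (Map.Entry<String, String> locus : loci.entrySet()) {
            sb.append("goto ").append(locus.getKey()).append('\n');
            sb.append("sort ").append(locus.getValue()).append('\n');
            sb.append("collapse\n");
            sb.append("snapshot\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the loci in insertion order.
        Map<String, String> loci = new LinkedHashMap<>();
        loci.put("chr1:65,289,335-65,309,335", "position");
        loci.put("chr1:113,144,120-113,164,120", "base");
        loci.put("chr4:68,457,006-68,467,006", "strand");
        System.out.print(build("hg18", "myfile.bam", "mySnapshotDirectory", loci));
    }
}
```

Saving the printed text to a TXT file gives a script that can be run with Tools>Run Batch Script.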

 

Creating HTML Links to IGV

Load Files with HTML Links

Data and session files can be loaded into IGV from a web browser or other application supporting hyperlinks.  This makes use of the listener port.  To use this option you must enable the port listener on the Advanced tab of the Preferences window, accessed from the View menu.   Links can be created to load data or jump to a locus as follows.

http://localhost:port/load?file=URL&locus=locus&genome=genome&merge=[true|false|ask]&name=name

http://localhost:port/goto?locus=locus

  • The file parameter value can be a URL or a comma-delimited list of URLs to most IGV-supported data file types (exceptions listed below), or a session file.  
  • The merge parameter (optional) controls whether the loaded data is merged with the existing IGV session or loaded into a new session.
    • If true, the data specified in the link will be added to the existing IGV session. This is the default behavior if the parameter is not specified.
    • If false, any data currently loaded will be unloaded after clicking the link.  
    • If ask, a dialog will pop up to ask the user whether or not to unload the current session before loading new tracks. (Available as of IGV 2.15.4)

The merge parameter is ignored if loading a session; loading a session will always unload the previous session.

  • The name parameter (optional) specifies a name or names for the track.  If multiple tracks are loaded as a comma-delimited list, the name parameter value should also be a comma-delimited list of the same size.   The name parameter is ignored if loading a session.

Examples:

http://localhost:60151/load?file=http://www.broadinstitute.org/igvdata/annotations/hg18/conservation/pi.12mer.wig.tdf&locus=egfr&genome=hg18&merge=ask

http://localhost:60151/load?file=http://www.broadinstitute.org/igvdata/exampleFiles/gbm_session.xml

http://localhost:60151/goto?locus=egfr
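
Links of this form can also be assembled in code. The sketch below is a minimal Java illustration, not an IGV API: IgvLinkBuilder, loadLink, and gotoLink are hypothetical names. Parameter values are inserted as-is, as in the examples above; whether a file URL that itself contains characters such as & or spaces needs percent-encoding is not covered here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of IGV): assembles links in the
// http://localhost:port/load?... and http://localhost:port/goto?... forms.
class IgvLinkBuilder {

    // Optional parameters may be passed as null to leave them out of the link.
    static String loadLink(int port, String file, String locus,
                           String genome, String merge) {
        List<String> params = new ArrayList<>();
        params.add("file=" + file);
        if (locus != null) params.add("locus=" + locus);
        if (genome != null) params.add("genome=" + genome);
        if (merge != null) params.add("merge=" + merge);
        return "http://localhost:" + port + "/load?" + String.join("&", params);
    }

    static String gotoLink(int port, String locus) {
        return "http://localhost:" + port + "/goto?locus=" + locus;
    }

    public static void main(String[] args) {
        System.out.println(loadLink(60151,
                "http://www.broadinstitute.org/igvdata/exampleFiles/gbm_session.xml",
                null, null, null));
        System.out.println(gotoLink(60151, "egfr"));
    }
}
```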

Session File Format

IGV produces a session file in XML format when a user clicks on File>Save Session. You can also create a session file manually.  The XML format (IGV version 1.5) is described below.

Session XML Hierarchy
  • <Global>
    • <Resources>
      • <Resource>
    • <Panel>
      • <Track>
        • <DataRange>
Description of Session Components

Required - These elements are required in a session file.

  • <Global>: The root XML element.
    • genome= The genome id (e.g., hg18).
    • locus= The initial genomic region to be viewed (chromosome:start-end or gene name).
    • version= The session version (this must equal '3').
  • <Resources>: An enclosing element for all Resource elements.
  • <Resource>: Contains information about the data sources to be loaded, including data files, feature files, sample information files, and DAS servers.
    • name= The name of the track (single track files only).
    • path= The location of the data source (file path or URL).
    • url= Defines a URL for external links associated with a feature track. Any '$$' in this string will be substituted with the name of the current feature.

Optional - These elements are optional in a session file and are used to determine the placement of tracks and visual style choices. They are included in an XML file produced when you save a session in IGV, but are typically not included in an XML file that is created manually.

  • <Panel>:
    • name= A panel identifier used internally by IGV. 
    • height= The default height for the panel.
  • <Track>:
    • color= The default color for the data in the track.
    • expand= Whether the track is initially expanded or not.
    • height= The default height of the track.
    • id= A track identifier used internally by IGV.
    • name= The display name for the track.
    • renderer= The renderer used to display the data.
    • visible= Whether the track is visible or has been filtered out.
    • windowFunction= The function to be used when displaying data
  • <DataRange>:  Defines the y axis for tracks when rendered as charts.
    • baseline= Line is drawn at this y value. Default is 0.
    • maximum= Maximum y value.
    • minimum= Minimum y value. Default is 0.
    • type= Scale type, "LINEAR" or "LOG"
    • drawBaseline (not used)
    • flipAxis (not used)
Session File Example

The XML below is an example of a minimal session file.

------------------------------------------------------------------------------------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<Global genome="hg18" locus="EGFR" version="3">
    <Resources>
        <Resource name="RNA Genes" path="http://www.broadinstitute.org/igvdata/tcga/gbm/GBM_batch1-8_level3_exp.txt.recentered.080820.gct.tdf"/>
        <Resource name="RNA Genes" path="http://www.broadinstitute.org/igvdata/annotations/hg18/rna_genes.bed"/>
        <Resource name="sno/miRNA" path="http://www.broadinstitute.org/igvdata/tcga/gbm/Sample_info.txt"/>
    </Resources>
</Global>
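
A minimal session file containing only the required elements can be written by a short program. The following Java sketch is one way to do so under stated assumptions: MinimalSessionWriter and its methods are hypothetical names (not part of IGV), each resource has a unique name, and only the required Global, Resources, and Resource elements are produced.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of IGV): writes a minimal session file
// containing only the required elements.
class MinimalSessionWriter {

    // Escapes characters that may not appear literally in an XML attribute value.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // resources maps each track name to the path or URL of its data source.
    static String sessionXml(String genome, String locus, Map<String, String> resources) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<Global genome=\"").append(escape(genome))
          .append("\" locus=\"").append(escape(locus))
          .append("\" version=\"3\">\n");
        sb.append("  <Resources>\n");
        for (Map.Entry<String, String> r : resources.entrySet()) {
            sb.append("    <Resource name=\"").append(escape(r.getKey()))
              .append("\" path=\"").append(escape(r.getValue())).append("\"/>\n");
        }
        sb.append("  </Resources>\n");
        sb.append("</Global>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> resources = new LinkedHashMap<>();
        resources.put("RNA Genes",
                "http://www.broadinstitute.org/igvdata/annotations/hg18/rna_genes.bed");
        System.out.print(sessionXml("hg18", "EGFR", resources));
    }
}
```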

 

Viewing the Reference Genome

Sequence Track Options

When zoomed in sufficiently, the reference genome Sequence track appears at the top of the lower panel above the Genes track, if any, in the IGV display as shown in the Screenshot (2015.04.01). The sequence is represented by colored bars or colored letters, depending on zoom level, with adenine in green, cytosine in blue, guanine in yellow, and thymine in red (A, C, G, T). To change this default nucleotide coloring scheme see the Modify the prefs.properties file page.

IGV displays the sequence of bases as they appear in the FASTA file for the reference genome. In addition to the upper case letters A, C, G, and T, you may see lower case letters for these bases, and also N / n. Lower case letters often mark repeated regions, and N/n may represent ambiguous nucleotides. However, the convention for the use of case and N is not completely standardised, and depends on the creator of the genome sequence. A useful discussion about this can be found on Bioinformatics StackExchange (see for example response #11 on this thread).

Flipping the Strand

You can change the strand that is displayed by clicking on the arrow in the title to the left of the track. Note that the sequence and the arrow are only displayed when zoomed in to a sufficiently small region. 

  • Alternatively, right-click on Sequence track to select Flip strand from the pop-up menu.

 

The direction of the arrow indicates which strand is currently displayed. An arrow pointing left indicates that the negative strand is showing. This strand will show the complement nucleotides and reverse complement translations.

Sequence Translation

With the reference genome sequence track, you can optionally display a 3-band track that shows a 3-frame translation of the amino acid sequence for the corresponding nucleotide sequence. The translation is shown for the strand indicated.

  • Right-click on Sequence track to select Show translation from the pop-up menu and to select a Translation Table.
  • Selecting Save image from the right-click pop-up menu saves the lower display panel containing the Sequence track as an image. Specify the image file format by setting the filename extension in the file save dialog to .png, .jpeg, .jpg, or .svg. 

Amino acids are displayed as blocks colored in alternating shades of gray. Methionines are colored green, and all stop codons are colored red. When you zoom all the way in, the amino acid symbols will appear. 

You can toggle the display of this translation track by clicking once, anywhere in the sequence or translation track, or by toggling Show Translation in the track popup menu.

Feature Track Options

Viewing Options for the Feature Track

There are 3 different options for viewing the feature track.  These allow you to display overlapping features, such as different transcripts of a gene, on one line or multiple lines.

To change the view of the feature track, right-click on the feature track and select one of the options:

Collapsed:

Squished:

Expanded:

Exon Jumping

This feature is similar to feature jumping.  To feature-jump, you select a feature track and press Ctrl-F for forward, Ctrl-B for back.  To exon-jump, you select a feature track and press SHIFT-Ctrl-F to center the next exon in your view, SHIFT-Ctrl-B to move back one exon.

GFF style tags for BED files

The "name" field (column 4) of a BED file can contain GFF3 style key-value attribute tags by specifying "gffTags=on" on the track line.  These attributes will be displayed in the mouse hover popup text.
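
GFF3-style attribute tags are semicolon-separated key=value pairs. The Java sketch below shows how such a name field breaks down into the key-value pairs that would appear in the pop-up text. It is only an illustration: GffTagParser is a hypothetical name, this is not IGV's own parser, the sample value in main is invented, and percent-encoded characters are left undecoded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not IGV's parser): splits a GFF3-style attribute
// string such as "ID=gene1;Name=EGFR" into ordered key-value pairs.
class GffTagParser {

    static Map<String, String> parse(String nameField) {
        Map<String, String> tags = new LinkedHashMap<>();
        for (String pair : nameField.split(";")) {
            int eq = pair.indexOf('=');
            // Skip fragments that have no key before the '=' sign.
            if (eq > 0) {
                tags.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        // Invented sample value for column 4 of a BED record.
        Map<String, String> tags = parse("ID=gene1;Name=EGFR;Note=example");
        for (Map.Entry<String, String> t : tags.entrySet()) {
            System.out.println(t.getKey() + " = " + t.getValue());
        }
    }
}
```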

Loading Data and Attributes

Data and genomic annotations can be loaded from local files, HTTP URLs, or an IGV data server.

Load from File

Load data files by browsing for files on the local file system or other file systems you have mounted. See File Formats for information about the file formats IGV accepts.

To load data from the file system:

  1. Select File>Load from File. IGV displays the Select Files window.
  2. Select one or more data files or sample information files, then click OK.

Load from URL

To load data from an HTTP URL:

  1. Select File>Load from URL.
  2. Enter the HTTP or FTP URL for a data file or sample information file.
  3. If the file is indexed, enter the index file name in the field provided.
  4. Click OK.

To load a file from Google Cloud Storage, enter the path to the file with the "gs://" prefix. For example, a path of gs://genomics-public-data/platinum-genomes/bam/NA12877_S1.bam will display the reads for sample NA12877 from Illumina Platinum Genomes.

Notes:

Load from Server

To load data from the IGV data server:

  1. Select File>Load from Server. The Available Datasets window appears:
  2. Expand the tree to see the datasets.
  3. Select one or more datasets. Click the check box to the left of a dataset to select it.
    Warning: Selecting a folder selects all of its subfolders and all of the datasets in those folders. This can potentially be a huge amount of data. To be sure you are loading only the data you want, open collapsed folders and select only the datasets of interest.
  4. Click OK.

Removing Tracks 

To remove all tracks and attributes:

To remove specific tracks, do one of the following:

 Creating a Chromosome Name Alias File

One of the common causes for a data loading failure is a mismatch in chromosome names between the data file and the IGV genome it is being viewed against.  

The workaround is to create a tab-delimited "alias" file to specify alternate names for a chromosome.  The second column contains the corresponding name in the genome assembly you are viewing (e.g., chr1 for our "hg38" genome).  For instance, the first 2 lines of an alias file might look like this:

chr1 <tab> 1 <tab> CM000663.2 <tab> NC_000001.11
chr2 <tab> 2 <tab> CM000664.2 <tab> NC_000002.12

Name the file with the pattern <genome ID>_alias.tab, for example, hg38_alias.tab.  Place this file in the IGV genomes directory. The default location for this folder is <user home>/igv/genomes; it can be changed in Preferences -> Advanced.

Note: Certain well-known aliases are built into IGV and do not require an alias file.  These include mappings that involve adding or removing the prefix "chr" to the name, for example  1 -> chr1 and chr1 -> 1.    Also, NCBI identifiers that start with "gi|" and follow the pattern illustrated in the example above are automatically mapped. 
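
To check an alias file before placing it in the genomes directory, it can help to read it back and see which names end up grouped together. The following Java sketch does that under a stated assumption: it treats all names on one line as alternate names for the same chromosome and does not single out any column as the genome's own name. ChromosomeAliases is a hypothetical name, and this is not how IGV itself reads the file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of IGV): reads tab-delimited alias lines and
// maps every name to the full list of names found on the same line.
class ChromosomeAliases {

    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> synonyms = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) continue;   // ignore blank lines
            List<String> names = new ArrayList<>();
            for (String field : line.split("\t")) {
                String name = field.trim();
                if (!name.isEmpty()) names.add(name);
            }
            for (String name : names) synonyms.put(name, names);
        }
        return synonyms;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "chr1\t1\tCM000663.2\tNC_000001.11",
                "chr2\t2\tCM000664.2\tNC_000002.12");
        // Prints the names that share a line with "CM000664.2".
        System.out.println(parse(lines).get("CM000664.2"));
    }
}
```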

Viewing Data

Default Display

When you load genomic data, IGV displays the data in horizontal rows called tracks. Typically, each track represents one sample or experiment. For each track, IGV displays the track identifier, one or more attributes, and the data.

When loading a data file, IGV uses the file extension to determine the file format (see File Formats), which in turn sets the data type and default display options.    Some common file formats, assumed data type, and display options are listed below.

File Format Determines Data Type
File Format Data Type
seg Segmented copy number
bam, cram Sequence alignments
bed, gtf, gff3, psl, bigbed Genome annotations
wig, bedgraph, bigwig, tdf Quantitative data
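
The mapping in this table can be expressed as a simple lookup on the file extension. The Java sketch below covers only the extensions listed above; DataTypeByExtension is a hypothetical name, and IGV's actual format detection handles many more formats and cases (for example compressed files) that are not modeled here.

```java
import java.util.Locale;

// Hypothetical helper (not part of IGV): maps a file name to the data type
// listed in the table above, based only on the final extension.
class DataTypeByExtension {

    static String dataType(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        int dot = lower.lastIndexOf('.');
        String ext = dot >= 0 ? lower.substring(dot + 1) : "";
        switch (ext) {
            case "seg":
                return "Segmented copy number";
            case "bam":
            case "cram":
                return "Sequence alignments";
            case "bed":
            case "gtf":
            case "gff3":
            case "psl":
            case "bigbed":
                return "Genome annotations";
            case "wig":
            case "bedgraph":
            case "bigwig":
            case "tdf":
                return "Quantitative data";
            default:
                return "Unknown";   // extension not listed in the table
        }
    }

    public static void main(String[] args) {
        System.out.println(dataType("myfile.bam"));
        System.out.println(dataType("rna_genes.bed"));
    }
}
```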

 

Changing the Display

You can override IGV's default display options in several ways:

  • Use the track pop-up menu to change the appearance of selected tracks.
  • Use the Preferences window to set display preferences for all tracks.
  • Use the Color Legends window to set the default data range and color for heatmaps, which IGV uses to display segmented copy number.
  • Add a track line at the top of a data file to specify the display settings for the data.

This section describes a few commonly used display options that apply to all (or most) tracks: graph type, data range, track color, track height, and track names.  For a complete list of display options, review the options available in the pop-up menus, Preferences window, Color Legends window, and the menu bar (View and Tracks menus).

Graph Type

Most tracks are displayed using one of four graph types (the following graphs show the same data):

Heatmap:  
Bar chart:  
Points (Scatter plot):  
Line plot:

IGV determines the default graph type for a track as described in Default Display.

To change the graph type of selected tracks:

  • Right-click a track and select a graph type from the pop-up menu.

Data Range

The data range for a track provides the minimum, baseline, and maximum value for the graph, and also whether the scale is linear or logarithmic. IGV determines the default data range for a track as described in Default Display.

To change the data range for selected heat map tracks:

  • Right-click a track and select Set Heatmap Scale from the pop-up menu.  The heatmap scale can be set per track.

To change the data range for other selected tracks:

  • Right-click a track and select Set Data Range from the pop-up menu.

Changing the data range can significantly affect the data display: 

minimum, baseline, maximum Result
 0,0,3  
-1.5,0,1.5  
 -5,0,5  

Track Color

To change the track color for selected heat map tracks:

  • Right-click a track and select Set Heatmap Scale from the pop-up menu.

To change the track color for tracks that are displayed as something other than a heatmap (i.e., bar chart, scatter plot, or line plot):

  • Right-click a track and from the pop-up menu select either Change Track Color (Positive Values) or Change Track Color (Negative Values).

Track Height

To change the height of selected tracks:

To change the height of all tracks:

  • Select Tracks>Set Track Height and enter a value.

To fit the data to the window:

  • Select Tracks>Fit to Window.
    IGV displays all tracks. If necessary, it sets the track height to 1 pixel and scrolls the data.

Track Names

By default, IGV displays track names to the left of the attribute panel. Legibility of the track names depends on track height (for example, track names will not be legible when track height is 1 pixel).

To select the attribute IGV uses as the track name:

To display the track name as a track label:

To rename a track:

  • Right-click a track or a track name, then select Rename Track in the pop-up menu.

You can only rename one track at a time. You can preserve track name changes only by saving the session.

Segmented Data

File Format

For segmented copy number data, use the SEG format.

Display Notes

  • By default, IGV displays segmented data as a blue-to-red heatmap where the data range is -1.5 to 1.5. If loaded segmented data appears in tracks colored all red, check the data values and modify the data range as necessary.
  • To change track display options, use the track pop-up menu. The commands that appear in the pop-up menu are those relevant to any data track.

 

GWAS Data

IGV can display genome-wide association study (GWAS) data as a "Manhattan plot", color-coded by chromosome.   Data formats are described here.

The plot represents the significance of the association between a SNP or haplotype and the trait being measured.  The Y-axis shows -log10 transformed P values, which represent the strength of association.
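
The -log10 transformation means that smaller P values plot higher on the Y-axis: a P value of 1e-8 plots at about 8, and a P value of 0.05 at about 1.3. The following Java sketch shows the arithmetic only; GwasScale and negLog10 are hypothetical names, and the sketch says nothing about how IGV reads or renders GWAS files.

```java
// Hypothetical helper (not part of IGV): converts a P value to the
// -log10 scale used on the Y-axis of a Manhattan plot.
class GwasScale {

    static double negLog10(double p) {
        if (p <= 0.0 || p > 1.0) {
            throw new IllegalArgumentException("P value must be in (0, 1]: " + p);
        }
        return -Math.log10(p);
    }

    public static void main(String[] args) {
        double[] pValues = {0.05, 1e-5, 1e-8};
        for (double p : pValues) {
            System.out.println(p + " -> " + negLog10(p));
        }
    }
}
```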

The size of the data points in the plot and their height on the left-hand side of the data pane relate directly to their significance: the larger the point and the higher the point on the scale, the more significant the association with the trait.  You can see the point size difference in the following screenshot of data on chromosome 1.

As in other parts of IGV, hovering over a data point allows you to see a pop-up containing the data specifically associated with that point.  You can see the pop-up for the topmost data point in this image. Note that the point's position on the scale on the left is associated with its P value. 

GWAS Pop-up Menu

 The following commands appear in the pop-up menu for GWAS tracks:

Command Description
Rename Track Renames the track.
Remove Track Removes the selected track from the display.
Set Data Range... Changes the minimum, baseline, and maximum values of the scale used for the GWAS data.
Change Track Height... Changes the display height of the track.

Color Scheme (Chromosome color, Single color, Alternating color) Changes the display to use different color schemes for the chromosome color-coding. The chromosome color scheme (default) uses the colors defined by IGV. The single color scheme changes all the chromosomes to display in a single color (blue by default). The alternating color scheme uses two colors (blue and gold by default) that alternate through the chromosomes.
Set primary color... Set the color for the single color scheme and for one of the colors in the alternating color scheme.
Set alternating color... Set the alternating color in the alternating color scheme.
Set minimum point size... Set the minimum data point display size.
Set maximum point size... Set the maximum data point display size.
Save image... Save the current display as an image file. Specify the file format by setting the filename extension in the file save dialog to .png, .jpeg, .jpg, or .svg. 

 

RNA Secondary Structure

IGV can be used to visualize RNA secondary structures in arcs connecting base pairs (the linear format). Alternative structures, where one nucleotide is involved in more than one base pair, and pseudo knots, where arcs cross, can be accommodated.

To visualize the structures, the base pairing information should be stored in a BED format file, which is quite similar to the commonly used 'connect format' described by the mfold program (Zuker 2003, PMID: 12824337). The file must include a track line that specifies graphType=arc, e.g., "track graphType=arc". Each record line must contain the first three columns of a BED file: chrom, start, and end, where the start and end represent the base pair. Note that the start position follows standard BED file convention and is zero-based (the first base on a sequence is position 0). The following small example represents a hypothetical stem loop:


track graphType=arc
chr1 10 25 stemloop1
chr1 11 24 stemloop1
chr1 12 23 stemloop1
chr1 13 22 stemloop1
chr1 14 21 stemloop1
chr1 15 20 stemloop1

Additional examples can be found in the supplement of the following paper: Lu Z, Zhang QC, Lee B, Flynn RA, Smith MA, Robinson JT, Davidovich C, Gooding AR, Goodrich KJ, Mattick JS, Mesirov JP, Cech TR, Chang HY. RNA Duplex Map in Living Cells Reveals Higher-Order Transcriptome Structure. Cell. 2016 May 12.
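
In the stem loop example, each successive base pair moves the start up by one and the end down by one. The Java sketch below generates such records for a simple stem; StemLoopBed and its method are hypothetical names (not part of IGV), and the fields are joined with tabs on the assumption that the file is tab-delimited, whereas the example above is shown with spaces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of IGV): writes the track line and one
// record per base pair for a simple stem, pairing inward from both ends.
class StemLoopBed {

    static List<String> lines(String chrom, int outerStart, int outerEnd,
                              int pairs, String name) {
        // The innermost pair must still have its start before its end.
        if (pairs < 1 || outerStart + pairs - 1 >= outerEnd - pairs + 1) {
            throw new IllegalArgumentException("pairs do not fit between start and end");
        }
        List<String> out = new ArrayList<>();
        out.add("track graphType=arc");
        for (int i = 0; i < pairs; i++) {
            out.add(chrom + "\t" + (outerStart + i) + "\t" + (outerEnd - i) + "\t" + name);
        }
        return out;
    }

    public static void main(String[] args) {
        // Reproduces the six base pairs of the example: 10-25, 11-24, ..., 15-20.
        for (String line : lines("chr1", 10, 25, 6, "stemloop1")) {
            System.out.println(line);
        }
    }
}
```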

Viewing Alignments

This page introduces viewing alignment data and associated tracks in the following sections:

Related topics on other pages cover more detailed topics:

File Formats

Aligned reads from sequencing can be loaded into IGV in the BAM format, SAM format, or CRAM format.

Both BAM and SAM files are described on the Samtools project page http://www.htslib.org and in the 2014 article titled Sequence Alignment/Map Format Specification by the SAM/BAM Format Specification Working Group.

IGV requires that BAM and CRAM files have an associated index file.

If you receive a .bam file from a sequencing facility, you will usually also get the corresponding index file. If you need to create the index yourself, there are multiple tools available for indexing BAM files, including igvtools, the samtools package, and the Picard.SortSam module in GenePattern.  

Tracks

Loading an alignment file creates up to 3 associated tracks:  
By default the Alignment and Coverage tracks are initially displayed.  This setting can be altered from the Alignments tab of the Preferences window.   Also, showing or hiding individual tracks can be controlled with track popup menus.
 
Note:  If hiding the Alignment track by default you might also consider the setting to create all tracks in a single panel.  This is in the General tab of the Preferences window. 
 
The Coverage and Alignment tracks are described below.  The Splice Junction Track is covered on a separate page.

Coverage Track

By default IGV dynamically calculates and displays the default coverage track for an alignment file. When IGV is zoomed to the alignment read visibility threshold (by default, 30 KB), the coverage track displays the depth of the reads displayed at each locus as a gray bar chart. If a nucleotide differs from the reference sequence in greater than 20% of quality weighted reads, IGV colors the bar in proportion to the read count of each base (ACGT).

  • Override the default threshold for an individual coverage track by right-clicking the track and selecting Set allele frequency threshold. For example, set the value to 0.3 to change the threshold to 30%. 
  • To change the default for all coverage tracks, set the value Coverage allele-fraction threshold in the Alignments tab under View > Preferences. The preferences also include the option to disable quality weighting.
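
The threshold test can be sketched as a comparison between the mismatched fraction at a locus and the threshold. The Java sketch below is a simplified illustration, not IGV's implementation: CoverageColoring is a hypothetical name, and the method takes per-base counts that the caller has already weighted (for example by base quality, which is an assumption about what "quality-weighted" means). Plain read counts can be passed to model the case where quality weighting is disabled.

```java
// Hypothetical helper (not IGV's implementation): decides whether a coverage
// bar would be colored by base, given weighted counts for A, C, G, T.
class CoverageColoring {

    // weightedCounts holds the counts for A, C, G, T in that order;
    // refIndex is the position of the reference base in that array.
    static boolean exceedsThreshold(double[] weightedCounts, int refIndex,
                                    double threshold) {
        double total = 0.0;
        double mismatch = 0.0;
        for (int i = 0; i < weightedCounts.length; i++) {
            total += weightedCounts[i];
            if (i != refIndex) mismatch += weightedCounts[i];
        }
        // With no coverage there is nothing to color.
        return total > 0.0 && mismatch / total > threshold;
    }

    public static void main(String[] args) {
        double[] counts = {70, 30, 0, 0};   // reference base A, 30% of reads show C
        System.out.println(exceedsThreshold(counts, 0, 0.2));
    }
}
```

With the default threshold of 0.2, a locus where 30% of the weighted reads differ from the reference is colored, while one where exactly 20% differ is not, because the fraction must be greater than the threshold.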

View count details by hovering the mouse over a coverage bar. Copy the count details to your computer's clipboard from the right-click menu.

Pre-computed Coverage Data

The dynamically calculated coverage data can be augmented by loading pre-computed  coverage data from a file.  When this option is used the track displays coverage at all zoom levels including at the whole genome and chromosome view. To generate the extended coverage data file ending in TDF extension, use igvtools. The resulting file can be associated with the alignment track by file naming convention or loaded independently from the track popup-menu. 

Visibility Range Threshold and Downsampling

IGV reduces memory usage in the following two ways to improve performance of viewing alignments.

You can adjust the above settings in the Alignment Preferences panel. For example, for lower coverage data, you can provide a larger visibility range threshold. Or for deep coverage, you might want to provide a smaller visibility range threshold and adjust the downsampling to show more reads.

Areas where reads have been downsampled are marked with a black rectangle just under the coverage track. The coverage track still represents coverage for all the reads.

In the example shown, the downsampled regions are marked by seven black rectangles just under the coverage track.

 

Alignment Track

This section gives an overview of the alignment track. For options available from the alignment track menu, including grouping, sorting and coloring options, see the alignments section of the pop-up menu page.

    

Detecting Structural Variants

IGV uses color and other visual markers to highlight potential genetic alterations in reads against a reference sequence. Genetic alterations include single nucleotide variations, structural variations, and aneuploidy. Structural variations include insertions, deletions, inversions, tandem duplications, translocations, and other more complex rearrangements. Interpretation of some of these variations is discussed briefly in this section and the next. Interpreting Color by Insert Size and Interpreting Color by Pair Orientation give more detailed explanations of read colors.

An additional factor to take into consideration when judging potential genetic alterations is quality of reads and quality of mapping. IGV uses transparency to indicate quality.

Colors and transparency are used at two levels within alignments: (1) for mapped reads, and (2) for individual bases within reads.

  Color Transparency
mapped reads see Paired-End Alignments section mapping quality
individual bases Mismatched bases read quality (phred) score

Color and Transparency for Individual bases

By default, read bases that match the reference are displayed in gray. Read bases that do not match are color coded, and insertions and deletions within reads relative to the reference are marked. Insertions are indicated by a purple I and deletions are indicated with a black dash. In addition, mismatched bases are assigned a transparency value proportional to the read quality at that base, known as the phred score. This has the effect of de-emphasizing low-quality base calls.
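
To make the link between phred score and transparency concrete, the sketch below converts a phred score Q to its error probability, 10^(-Q/10), and maps Q to an opacity value. The conversion is the standard phred definition. The linear opacity ramp, its thresholds (5 and 20), and the minimum opacity are hypothetical values chosen for illustration and are not taken from IGV.

```python
def phred_to_error_probability(q):
    """Standard phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)

def base_alpha(q, q_min=5, q_max=20, alpha_min=0.1):
    """Hypothetical opacity for a mismatched base with phred score q.

    Scores at or below q_min are drawn almost transparent, scores at or
    above q_max fully opaque, with a linear ramp in between.
    """
    if q <= q_min:
        return alpha_min
    if q >= q_max:
        return 1.0
    return alpha_min + (1.0 - alpha_min) * (q - q_min) / (q_max - q_min)

for q in (2, 10, 20, 30):
    print(q, phred_to_error_probability(q), base_alpha(q))
```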

Transparency for Mapped Reads

Note that alignments that are displayed with light gray borders and transparent or white fill, as shown in the screenshot, have a mapping quality equal to zero. Interpretation of this mapping quality depends on the mapping aligner as some commonly used aligners use this convention to mark a read with multiple alignments. In such a case, the read also maps to another location with equally good placement. It is also possible the read could not be uniquely placed but the other placements do not necessarily give equally good quality hits. 

Insertions

In a gapped alignment, IGV indicates insertions with respect to the reference with a purple I, or with a red I for insertions larger than a cutoff that the user enables and specifies. Hover over the insertion symbol to view the inserted bases.

 

Deletions

In a gapped read, IGV indicates deletions with respect to the reference with a black bar.

Coloring and Sorting Alignments

Users can also specify read colors and sort reads by various options, including start location, strand, nucleotide, mapping quality, sample tag, or read group tag. For a description of all user-specified color and sort options, see the alignment track pop-up menu.

For example, to sort alignments:

  1. Right-click a track for the pop-up menu.
  2. Select a Sort option from the pop-up menu. IGV sorts the alignments that intersect the locus at the center of the track, no matter where the cursor was located for the right-click. To display a line down the center of the track, set the Show center line property in the Alignment Preferences panel.

Sorting rearranges rows so that alignments that intersect the center of the display appear in the order specified.  This can cause the alignment layout away from the center line to appear sparse.   To restore the layout to an optimally packed configuration, select Re-pack alignments from the pop-up menu.

Repeat the most recent sort with hotkey ctrl-s.

Paired-End Alignments

IGV provides several features for working with paired-end alignments. This section covers viewing reads as pairs, coloring of mapped paired reads, and the split-screen view. Interpretation of colors is discussed briefly here and in more detail in Interpreting Color by Insert Size and Interpreting Color by Pair Orientation.

View As Pairs

By default, IGV displays reads individually because they pack compactly. Select View as pairs from the right-click menu to display pairs together with a line joining the ends as shown in the image below. The hover element details (2) are also displayed either for a single read in normal view (left) or for a pair of reads in paired reads view (right).

Coloring of Mapped Paired Reads

IGV colors paired-end alignments in two ways.

Control+click (Mac: Command+click) a read to outline the read and its paired mate in the same color. Colors are arbitrary but unique to each pair. A black outline indicates that the selected read has no mate.

  • Control+click (Command+click) either read to clear the outline.
  • Right-click and select Go to Mate Region to jump to the paired mate.
    • If the paired reads have a large insert size, the paired mate will not be highlighted. Turn on the Color by insert size and pair orientation option from the popup menu to confirm as described below.
  • Right-click and select Clear Selections to clear all outlines.

Outlined paired reads are preserved when switched to View as pairs option. However, outlining reads only works in the unpaired view and not in the paired view.

Hover over or click a read to view information about the read, including the location of its paired mate.

IGV colors (1) paired-end reads with an inferred insert size smaller or larger than expected; (2) reads whose mate is aligned to a different chromosome; and (3) paired-end alignments with deviant pair orientation. Note that coloring by insert size is a feature designed originally for DNA alignments against the genome. It is based on set base pair values or computed from the size distribution of a library.

  • See Interpreting Color by Insert Size for more detail.
    • Blue is for inserts that are smaller than expected
    • Red is for inserts that are larger than expected.
    • Inter-chromosomal rearrangements are color-coded by chromosome.
  • See Interpreting Color by Pair Orientation for more detail.
    • Shades of green, teal, and dark blue show structural events of inversions, duplications, and translocations.
    • Color assignments depend on sequencing platform.
  • Other Color by options are described in the alignment track pop-up menu options

Translocations on the same chromosome can be detected by color-coding for pair orientation, whereas translocations between two chromosomes can be detected by coloring by insert size. See both by selecting the Color alignments by> insert size and pair orientation option.

Split Screen View

Split screen views can be invoked on-the-fly from paired-end alignment tracks. Right-click over an alignment and select View mate region in split screen from the drop-down list. If the alignment you clicked does not have a mapped mate, this option is grayed out. You can select this option for multiple alignments and view multiple panels side by side.

To return to a normal single screen view, right-click on the locus name at the top of the panel you wish to keep and select Switch to standard view. Alternatively, double-click the locus name.
 
You can control the view of each panel independently. Pan by click-dragging in the panel;  double-click to zoom in and alt-click to zoom out.

 

Interpreting Color by Insert Size

Coloring by insert size is for DNA alignments and is not designed to indicate RNA-Seq paired read mate distances. It is based on set base pair values or computed from the size distribution of a library against the reference genome as defined in the Alignment Preferences Panel.

The inferred insert size can be used to detect structural variants, such as:

  • deletions
  • insertions
  • inter-chromosomal rearrangements

IGV uses color coding to flag anomalous insert sizes. When you select Color alignments>by insert size in the popup menu, the default coloring scheme is:

  • Red for an inferred insert size that is larger than expected (possible evidence of a deletion)
  • Blue for an inferred insert size that is smaller than expected (possible evidence of an insertion)
  • A chromosome-specific color for paired-end reads whose mates are found on a different chromosome
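
The coloring rules above amount to a small decision procedure, sketched here in Python. The function name and the expected-size bounds (50 and 1000 bp) are hypothetical; in IGV the bounds come from set base pair values or from the library's size distribution, as defined in the Alignment Preferences panel.

```python
def insert_size_color(read_chrom, mate_chrom, insert_size,
                      min_expected=50, max_expected=1000):
    """Return a color label for one paired-end read (illustrative only).

    min_expected and max_expected are hypothetical bounds on the
    expected insert size in base pairs.
    """
    if mate_chrom != read_chrom:
        # Mate maps elsewhere: colored according to the mate's chromosome.
        return "chromosome color for " + mate_chrom
    size = abs(insert_size)
    if size > max_expected:
        return "red"    # larger than expected, possible deletion
    if size < min_expected:
        return "blue"   # smaller than expected, possible insertion
    return "gray"       # within the expected range

print(insert_size_color("chr1", "chr1", 4500))
print(insert_size_color("chr1", "chr1", 30))
print(insert_size_color("chr1", "chr6", 0))
```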

Deletions

In a deletion a section of DNA is absent in the subject genome compared to the reference genome.

When pairs from a section of DNA spanning the deletion are aligned to the genome the inferred insert size will be larger than expected.  This is due to the deleted section of the genome, not present in the subject.  Schematically this can be visualized as follows:

So in the case of a deletion, the inferred insert size is GREATER THAN the expected insert size.   In IGV such an event might look like the following.

Reads that are colored red have larger than expected inferred sizes, and therefore indicate possible deletions.

Insertions

In the case of an insertion, a section of DNA is present in the subject genome that is not represented in the reference genome.

The effect on the distance between aligned pairs is the opposite of that seen with a deletion; the "inferred insert size" is smaller than expected.

The maximum size of an insertion detectable by insert size anomaly is limited by the size of the fragments.  They must be long enough to span the insertion and include sequences on both ends that are mapped to the reference.  The maximum detectable size is approximately equal to:

fragment length - (2x read length)

Detection of this event is therefore more likely with larger fragment libraries, such as Illumina mate-pair (not paired-end) and SOLiD.
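
As a worked example of the formula above, the following sketch computes the approximate upper bound for two libraries. The fragment and read lengths are illustrative values, not measurements from any particular dataset.

```python
def max_detectable_insertion(fragment_length, read_length):
    """Approximate largest insertion detectable by insert size anomaly:
    fragment length - (2 x read length), never below zero."""
    return max(0, fragment_length - 2 * read_length)

# A 500 bp paired-end fragment with 100 bp reads.
print(max_detectable_insertion(500, 100))
# A 3000 bp mate-pair fragment with 100 bp reads.
print(max_detectable_insertion(3000, 100))
```

With these numbers, the paired-end library can span insertions of up to about 300 bp, while the mate-pair library can span insertions of up to about 2800 bp.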

In the example above reads that are colored blue have smaller than expected inferred sizes, and therefore indicate insertions.

Inter-chromosomal Rearrangement

IGV codes inserts for inter-chromosomal rearrangements.  For instance, in this case, one end is on chromosome 1 and the other is on chromosome 6.

Interpreting Color by Pair Orientation

The orientation of paired reads can be used to detect structural events including:

  • inversions
  • duplications
  • translocations

By selecting Color alignments>by pair orientation, you can flag anomalous pair orientations in IGV.

Orientation is defined in terms of read-strand: left versus right, and first read versus second read of a pair.

(figure courtesy of Bob Handsaker)

These categories only apply where both mates map to the same chromosome.

Inversions

An inversion is a large section of DNA that is reversed in the subject genome compared to the reference genome.

When an inversion shows up in paired-end reads, the reads are distinctively variant from the reference genome.

This appears in IGV as shown below.

Inverted Duplication

When a large section of DNA is duplicated and inserted into the genome in a reversed configuration compared to the original sequence, this is called an inverted duplication.

There will be overlapping left and right reads, and there will likely be altered coverage depth/copy number.

This appears in IGV as shown below.

Tandem Duplication

When a large section of DNA is duplicated and inserted into the genome next to the original sequence, this is called a tandem duplication.

 The reads will not only be duplicated, but also be arranged as shown below.

IGV will display this rearrangement as shown below.

Translocation on the Same Chromosome

When a large section of DNA is removed from one location and inserted elsewhere, that is a translocation. 

Translocations on the same chromosome can be detected by color-coding for pair orientation, whereas translocations between two chromosomes can be detected by coloring by insert size.

Interpreting Color by Bisulfite Mode

IGV v2.1 (released April 2012) and onwards offer a coloring by bisulfite mode option from the right-click pop-up menu for alignments. The six offered modes are summarized in the table, and are explained further on this page.

For a general overview of viewing alignments in IGV, see Viewing Alignments.

Coloring by bisulfite mode supports visualization of DNA libraries that have undergone bisulfite conversion and sequencing. The mode supports visualization of alignments from the following and similar techniques:

  • BS-Seq, bisulfite sequencing
  • RRBS-Seq, reduced representation bisulfite sequencing
  • TAB-Seq, Tet-assisted bisulfite sequencing
  • NOMe-Seq
IGV Bisulfite mode description relevance
CG
  • CpG
  • Canonical methylation target site in eukaryotes
CHH and CHG
  • H represents any nucleotide but guanine (H comes after G)
  • CHH = CAA, CAT, CAC; CTT, CTA, CTC; CCC, CCA, CCT
  • CHG = CAG, CTG, CCG

Additional methylation sites:

  • For mammals, in a cell-type dependent manner. Pervasive in human embryonic stem cells (Lister 2009). For example, in H1 stem cells non-CG methylation comprises almost 25% of all cytosines at which DNA methylation is identified, whereas in IMR90 cells 99.98% of methylation is mCG.
  • In plants serve as additional canonical methylation target sites.
HCG
  • ACG, TCG, CCG
  • inclusive of WCG
GCH
  • GCA, GCT, GCC
WCG
  • W represents A or T (Weak)
  • ACG, TCG
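
The sequence contexts in the table can be checked programmatically. The sketch below reports which modes match a cytosine at a given position of a forward-strand sequence, using the IUPAC codes H (A, C, or T) and W (A or T). The function and table names are hypothetical, and the sketch only considers cytosines on the forward strand; for the reverse strand, apply it to the reverse complement.

```python
# IUPAC codes used by the bisulfite modes.
IUPAC = {"C": "C", "G": "G", "H": "ACT", "W": "AT"}

# mode -> (pattern, index of the target cytosine within the pattern)
MODES = {
    "CG": ("CG", 0),
    "CHH": ("CHH", 0),
    "CHG": ("CHG", 0),
    "HCG": ("HCG", 1),
    "GCH": ("GCH", 1),
    "WCG": ("WCG", 1),
}

def contexts(seq, i):
    """Return the list of modes whose pattern matches the cytosine at seq[i]."""
    seq = seq.upper()
    if seq[i] != "C":
        return []
    hits = []
    for mode, (pattern, c_index) in MODES.items():
        start = i - c_index
        if start < 0 or start + len(pattern) > len(seq):
            continue  # not enough flanking sequence to decide
        window = seq[start:start + len(pattern)]
        if all(base in IUPAC[code] for base, code in zip(window, pattern)):
            hits.append(mode)
    return hits

print(contexts("ACGT", 1))
print(contexts("GCAT", 1))
print(contexts("CCG", 1))
```

The last call illustrates that CCG is covered by HCG but not by WCG.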

When bisulfite sequence tracks are initially loaded, default coloring of mismatches against the reference will show red T's and green A's. When coloring is switched to bisulfite mode, two new coloring schema are applied and together allow you to visually distinguish read strand and bisulfite conversion status.

  1. Reads are colored by DNA strand. For paired reads, this is the same as coloring by first-of-pair strand.
    • Gray for forward reads or forward read first pairs (F1 or F1R2)
    • Sage for reverse reads or reverse read first pairs (R1 or R1F2)
  2. The chosen mode is highlighted in reads with a red or blue nucleotide corresponding to the position of the cytosine in the reference genome.
    • For forward reads, a red C denotes a nonconverted cytosine, implying methyl or other protection, while a blue T denotes a bisulfite converted cytosine.
    • For reverse reads, a red G denotes a nonconverted cytosine, implying methyl or other protection, while a blue A denotes a bisulfite converted cytosine.
    • In zoomed-out views, colored nucleotides are represented by colored lines.
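
The second coloring rule can be summarized as a lookup on read strand, reference base, and read base. The sketch below is an illustration with a hypothetical function name. It assumes bases are given in the reference-forward orientation in which IGV displays them, and it does not check whether the site matches the selected mode's sequence context.

```python
def bisulfite_color(ref_base, read_base, read_is_forward):
    """Return "red", "blue", or None for one displayed base.

    Forward reads report on reference C (C = unconverted, T = converted).
    Reverse reads report on reference G (G = unconverted, A = converted).
    """
    if read_is_forward:
        target, unconverted, converted = "C", "C", "T"
    else:
        target, unconverted, converted = "G", "G", "A"
    if ref_base != target:
        return None   # not a cytosine position for this strand
    if read_base == unconverted:
        return "red"  # protected, for example by methylation
    if read_base == converted:
        return "blue"  # bisulfite converted
    return None       # some other mismatch

print(bisulfite_color("C", "T", read_is_forward=True))
print(bisulfite_color("G", "G", read_is_forward=False))
```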

Because not all mode matching sites are biologically relevant in the context of methylation, bisulfite experiments compare changes in methylation between a control sample and the variable. When comparing two samples, a change in methylation status will be marked by a difference in color for a given site. Red to blue indicates loss of methylation, or hypomethylation; blue to red indicates increased protection by methylation, or hypermethylation, as shown for the tumor sample in the screenshot below which visualizes data from Berman et al (2012).

Bisulfite sequencing (BS-Seq) identifies sites of DNA methylation

Coloring by bisulfite mode in IGV allows for visualization of alignments of BS-Seq reads, a DNA-modification technique used to distinguish sites of DNA methylation and hydroxymethylation in epigenetic studies. Alignments in IGV are against a reference genome of correct sequence as coloring is based on deviations from the reference sequence. Read alignment may have been against a bisulfite-transformed genome sequence, in which case genomic coordinates would still be for that of the original reference genome.

In DNA methylation, the methyl CH3 group is added to the cytosine base at the carbon 5 position (5-meC) in a sequence-context dependent manner. In mammals this context is typically CpG dinucleotides, and in plants this is CpG, CpHpG, and CpHpH di- and tri-nucleotides. These correspond to the CG, CHG, and CHH bisulfite coloring modes in IGV. The IUPAC ambiguity code H represents any nucleotide but guanine.

Promoter methylation is typically associated with repression, while genic methylation correlates with transcriptional activity.

BS-Seq exploits the different sensitivities of cytosine and 5-meC to bisulfite

Bisulfite modification exploits the different sensitivities of cytosine and 5-meC to deamination by bisulfite under acidic conditions. Cytosine undergoes conversion to uracil whereas 5-meC is unmodified and remains intact. The uracil is subsequently converted to thymine after PCR amplification while 5-meC residues remain cytosines.

A number of the different genome-wide methylome technologies use bisulfite chemistry and this IGV mode applies to those that in addition sequence the bisulfite converted DNA, such as by Illumina high-throughput sequencing. These include whole-genome bisulfite sequencing (WGBS) and reduced-representation-bisulfite sequencing (RRBS), both of which provide single-nucleotide resolution.

RRBS targets bisulfite sequencing to an enriched population of the genome, while WGBS purportedly determines the methylation state of every cytosine in the target sequence. However, as with any technique, limitations exist, including the inability to discriminate 5-meC from 5-hydroxymethylcytosine (5-hmeC), a modification that was discovered to be pervasive in mammalian DNA in 2009 (Yu, Cell 2012).

Multiple techniques are used to distinguish 5-hmeC from 5-meC. Of relevance to coloring by bisulfite mode in IGV is TAB-Seq (Tet-assisted bisulfite sequencing), in which 5-hmeC sites are protected by glucosylation prior to bisulfite conversion. Because 5-meC sites remain unprotected from mTet1 oxidation to 5-carboxylcytosine (5-caC), and subsequent bisulfite conversion, only 5-hmeC site cytosines remain unchanged in reads (Yu, Nature Protocols 2012).

The following figure diagrams the nucleotide conversions that occur for a methylated versus unmethylated locus during bisulfite conversion and PCR, and IGV's corresponding coloring of these sites in CG bisulfite mode.

For a given DNA fragment, four strands arise after treatment and PCR amplification. These are the original top strand (OT), the original bottom strand (OB), and strands which are complementary to OT and OB  (CTOT and CTOB). IGV visualizes reads in one direction, and for the given direction reads from the opposite strand are automatically displayed as the reverse complement. Therefore, OT and CTOT reads are displayed in the reference-forward direction (gray) while OB and CTOB reads are displayed in the reverse direction (sage) and are differentially colored as indicated.

To sort reads by strand, use the right-click pop-up menu on the alignment track.

You can also infer the read-strand by the specific nucleotides that are highlighted by the mode.  OT and CTOT yield methylation information for cytosines on the top strand (C and T highlighted), while OB and CTOB will give methylation information for the paired complement, that is for guanines paired to the methylatable cytosines (G and A highlighted).

NOMe-Seq additionally determines nucleosome positioning

In addition to detecting methylation states, bisulfite conversion is used in footprinting studies, for example to determine nucleosome positioning in yeast and mammalian cells.

The additional IGV color modes--HCG, GCH, and WCG (diagram)--are relevant to NOMe-Seq, a genome-wide nucleosome footprinting and methylome sequencing method (Kelly 2012). This method obtains nucleosome positioning information based on the GpC methyltransferase M.CviPI accessibility to GpC sites, and at the same time obtains endogenous DNA methylation information from CpG sites.

  • GCH cytosines are used to plot enzyme accessibility, that is, nucleosome protection or occupancy.
  • HCG cytosines are used for endogenous methylation. GCG is excluded due to ambiguity between endogenous and enzymatic methylation.
    • The authors note that in their context, GCGs represent less than 0.24% of the genome and make up only 5.6% of all GC dinucleotides.
    • In addition, 93.4% of GCG trinucleotides have a GCH within 20 bp (and half of these within 5 bp) from which nucleosome occupancy information can be derived.
  • Authors use WCG instead of HCG in certain calculations to exclude off-target activity of M.CviPI on CCG sites.

References

Berman, Benjamin P, Daniel J Weisenberger, Joseph F Aman, Toshinori Hinoue, Zachary Ramjan, Yaping Liu, Houtan Noushmehr, et al. 2012. “Regions of Focal DNA Hypermethylation and Long-Range Hypomethylation in Colorectal Cancer Coincide with Nuclear Lamina-Associated Domains.” Nature Genetics 44 (1): 40–46. doi:10.1038/ng.969.

Kelly, Theresa K, Yaping Liu, Fides D Lay, Gangning Liang, Benjamin P Berman, and Peter A Jones. 2012. “Genome-Wide Mapping of Nucleosome Positioning and DNA Methylation within Individual DNA Molecules.” Genome Research 22 (12): 2497–2506. doi:10.1101/gr.143008.112.

Lister, Ryan, Mattia Pelizzola, Robert H Dowen, R David Hawkins, Gary Hon, Julian Tonti-Filippini, Joseph R Nery, et al. 2009. “Human DNA Methylomes at Base Resolution Show Widespread Epigenomic Differences.” Nature 462 (7271). Nature Publishing Group: 315–22. doi:10.1038/nature08514.

Stirzaker, Clare, Phillippa C. Taberlay, Aaron L. Statham, and Susan J. Clark. 2014. “Mining Cancer Methylomes: Prospects and Challenges.” Trends in Genetics 30 (2). Elsevier Ltd: 75–84. doi:10.1016/j.tig.2013.11.004.

Yu, Miao, Gary C Hon, Keith E Szulwach, Chun-Xiao Song, Peng Jin, Bing Ren, and Chuan He. 2012. “Tet-Assisted Bisulfite Sequencing of 5-Hydroxymethylcytosine.” Nature Protocols 7 (12): 2159–70. doi:10.1038/nprot.2012.137.

Yu, Miao, Gary C Hon, Keith E Szulwach, Chun-Xiao Song, Liang Zhang, Audrey Kim, Xuekun Li, et al. 2012. “Base-Resolution Analysis of 5-Hydroxymethylcytosine in the Mammalian Genome.” Cell 149 (6): 1368–80. doi:10.1016/j.cell.2012.04.027.

Splice Junctions

IGV supplements each alignment track with (1) a coverage track and (2) if selected in the Alignment Preferences panel, a default splice junctions track. This page describes the default junctions track as well as independently loaded junctions data in the standard .bed format. See Sashimi Plot for how to derive and manipulate interactive junction visualizations within IGV. 

When enabled, IGV dynamically computes the junctions track from alignment data. The junctions track displays arcs connecting alignment blocks from a single read.  For RNA data these connections normally arise from splice junctions, thus the name Splice Junction Track.
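
One way to picture how junctions arise from a single read is to walk its CIGAR string and record each skipped region (the N operation in the SAM format), which separates two alignment blocks. The sketch below does this for one read. It is an illustration with hypothetical names, and treating only N operations as junctions is an assumption of the example rather than a statement of how IGV computes the track.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def junctions_from_cigar(pos, cigar):
    """Return (start, end) reference intervals skipped by N operations.

    pos is the 1-based leftmost mapped position, as in the SAM POS field.
    Returned coordinates are 1-based and inclusive.
    """
    junctions = []
    ref_pos = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            junctions.append((ref_pos, ref_pos + length - 1))
        if op in "MDN=X":
            # Only these operations consume reference bases.
            ref_pos += length
    return junctions

print(junctions_from_cigar(100, "50M200N50M"))
```

In this example the read covers positions 100 to 149, skips 150 to 349, and resumes at 350, so a single arc would span the skipped interval.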

Each splice junction is represented by an arc from the beginning to the end of the junction.

  • When available, IGV uses the "XS" tag provided by the aligner to determine strandedness. If the tag is missing, strandedness is inferred from the read strand. For paired-end data, the strand of the alignment marked "first in pair" is used.
  • Junctions from the + strand are colored red and extend above the center line. 
  • Junctions from the – strand are blue and extend below the center line.
  • The height of the arc, and its thickness, are proportional to the depth of read coverage up to 50 reads (first image).
    • Display a more proportionate representation by selecting Autoscale from the right-click menu (second image).
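
The display rules in this list can be sketched as two small functions: one that picks the junction strand, and one that derives color, direction, and relative arc size from strand and depth. The names and the normalized 0 to 1 scale are hypothetical. The cap of 50 reads is the default described above.

```python
def junction_strand(xs_tag, first_of_pair_is_reverse):
    """Use the XS tag when present, otherwise fall back to the read strand.

    For single-end data, pass the read's own strand as
    first_of_pair_is_reverse.
    """
    if xs_tag in ("+", "-"):
        return xs_tag
    return "-" if first_of_pair_is_reverse else "+"

def arc_style(strand, depth, cap=50):
    """Return (color, direction, relative size) for one junction arc."""
    color = "red" if strand == "+" else "blue"
    direction = "above" if strand == "+" else "below"
    scale = min(depth, cap) / cap  # saturates once depth reaches the cap
    return color, direction, scale

strand = junction_strand(None, first_of_pair_is_reverse=True)
print(strand, arc_style(strand, 25))
print(arc_style("+", 6606))
```

Because the scale saturates at the cap, a junction with 6606 reads is drawn the same size as one with 50 reads unless Autoscale is selected.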

Hovering the mouse over a junction or clicking on it displays coverage information. The first screenshot shows the coverage detail panels for each of the three components of two splice junctions on opposite strands.

  • Read depth for each end of the junction is displayed. For the red junction below, starting flank depth is 109 reads and ending flank depth is at 6606 reads.
  • Other details for a given junction's three hover elements are the same.

 

Right-click pop-up menu options for Junction tracks

Menu options are as detailed for the Feature tracks menu with the following additions or differences.

Command Description

Collapsed
Expanded
Squished

Tracks are collapsed by default. The expanded mode breaks up the junctions track to multiple junctions tracks to minimize visual overlap. IGV does not interpret isoform information.
Autoscale The height of the arc, and its thickness, are proportional to the depth of read coverage.
  • By default, all junctions with more than 50 reads have the same thickness.
  • Select Autoscale to display a more proportionate representation.
Sashimi Plot Displays junctions information for regions within the current IGV view in a new panel with additional options. See Sashimi Plot for details.
Export Features Download junctions track from IGV as a .bed file.

Example showing differential splicing

  • Start IGV and make sure Show junction track is checked in the Alignment Preferences panel and the Visibility range threshold is set to 500.
  • Load the Human hg19 genome.
  • Select File > Load from Server. In the popup window select Tutorials > RNA-Seq (Body Map) Heart and Liver.
  • Enter SLC25A3 in the search bar to see an instance where the third exon is differentially spliced for the two tissues (Screenshot 2015.4.15).
    • Here we have colored alignments by XS tag. The library was unstranded, and XS tag values were assigned to reads crossing junctions (in pink) using a predefined transcriptome index.

Enable junctions view for .bed files

The splice junction view can also be loaded independently of alignments by using a modified bed format, derived from the "junctions.bed" file produced by the TopHat program. Display details are as described in the section above.

  • This view is enabled by including a track line that specifies either name=junctions or graphType=junctions.
  • TopHat's "junctions.bed" file includes a track line specifying name=junctions by default, so no action is required for these files.

Junction files should be in the standard .bed format.  The score field is used to indicate depth of coverage.
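
A minimal sketch of writing such a file is shown below. It emits a track line with name=junctions followed by six-column BED records whose score column holds the read depth. The helper name, the record names (JUNC1, JUNC2), and the coordinates are made up for illustration. TopHat's own junctions.bed files carry additional block columns, and whether IGV needs them for a given display is not covered here.

```python
def junctions_bed(junctions):
    """Format junctions as BED text with a junctions track line.

    junctions: iterable of (chrom, start, end, depth, strand) tuples,
    with BED-style 0-based start and exclusive end (assumed input format).
    """
    lines = ["track name=junctions"]
    for i, (chrom, start, end, depth, strand) in enumerate(junctions, 1):
        fields = [chrom, str(start), str(end), "JUNC%d" % i, str(depth), strand]
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"

text = junctions_bed([
    ("chr12", 98987500, 98989200, 109, "+"),
    ("chr12", 98989400, 98991600, 42, "+"),
])
print(text, end="")
# To save: open("example.junctions.bed", "w").write(text)
```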

 

Sashimi Plot

Sashimi plots visualize splice junctions from aligned RNA-seq data and a gene annotation track. IGV displays the Sashimi plot in a separate window and allows for more manipulation of the plots than the junctions track does.

  1. To view a Sashimi plot of your alignment data, first zoom out the view to contain the entire region of interest as scrolling and zooming in the Sashimi plot will be limited to this initial region.
  2. Right click on the alignment track to bring up the pop-up menu, and select Sashimi Plot.
  3. Select one feature track to serve as the annotation.
    1. If there is only one possible feature track, e.g., the default RefSeq genes track loaded with the reference genome, then it is automatically loaded.
    2. If you loaded additional feature tracks, IGV presents a dialog for you to select one for the new plot.
  4. IGV prompts you to select which alignment tracks you would like to view as Sashimi plots. Select any number and press OK.

The Sashimi plot is displayed in a separate window. The coverage for each alignment track is plotted as a bar graph. Arcs representing splice junctions connect exons. Arcs display the number of reads split across the junction (junction depth). Genomic coordinates and the gene annotation track are shown below the junction tracks.

  • Hovering the mouse over each of the exons in the feature annotation track displays additional information in a yellow tooltip. 
  • Zoom in using the + button at top, and scroll by click-dragging the panel.
  • To view only those junctions which overlap a particular exon, select that exon by clicking on it.
    • Multiple exons can be selected using ctrl + <click> and they will be highlighted as white boxes.
    • To clear selections, click on a blank area of the annotations section of the panel.

Static images of Sashimi plots can also be generated outside IGV with sashimi_plot, a Python tool which is part of the MISO package. Read more about sashimi_plot here.

Popup Menu Options

Command Description

Set Exon Coverage Max

  • Set the minimum and maximum data range for the track to display.
  • Option to log scale.
  • Data range is shown in brackets in the top left of each track.
  • This option can be set on individual tracks.

Set Junction Coverage Min

  • Set minimum junction depth to include in the display.
  • This option can be set on individual tracks.
Set Junction Coverage Max
  • The thickness of each junction line will be proportional to the coverage, up to this value.
  • This option can be set on individual tracks.
Set Color
  • Change the color of the track.
  • This option can be set on individual tracks.
Show Exon Coverage Data
  • Selected by default.
  • Deselect to remove exon coverage data and data track range labels.

Text
Circle
None

  • Text is default and displays the junction depth in text number for each arc, as shown in the screenshot above.
  • Circle replaces the text with a solid circle amenable to labeling.
  • None removes all labels.
Combine Strands
Forward Strand
Reverse Strand

A junction's strandedness is determined by the BAM file XS tag value for the split read. How you assigned the XS tag values to the reads determines whether you potentially display novel junctions or display junctions reflecting previously determined junction annotations. See the Splice Junctions page for more details.

  • Combine Strands is default and shows both + and – strand junctions.
  • Forward Strand displays only + strand junctions.
  • Reverse Strand displays only – strand junctions.
Save Image
  • Save the Sashimi plot to an image file. Specify the file format by setting the filename extension in the file save dialog to .png, .jpeg, .jpg, or .svg. 

 

Viewing Variants

VCF (variant call format) and MAF (mutation annotation format) files describe variations in sequence. Links above take you to details for each format in the File Formats guide. Links below detail visualizing each type of file in IGV.

MAF Mutation Files

MAF (mutation annotation format) files display mutations. IGV recognizes text-based files with .maf, .maf.txt  file extensions as mutation files. 

IGV will visualize each individual sample's mutation data as a single track.

  • The all chromosomes view summarizes mutations in a coverage track (Screenshot below, 2015.02.18).
  • Zooming in, individual chromosome views and more detailed views mark sites of mutations with open rectangles. Default settings display these in black-and-white.
  • Color code mutations by mutation type (Screenshot above, 2015.02.18) by checking the Color code mutations box under View>Preferences>Mutations. See the Preferences page for more details.
  • Mouse-over or click on a mutation to bring up an information panel on the specific mutation. This panel displays the information provided in the mutation file columns, in order, up to an area limit.
  • A site where both alleles are mutated, or a site that is mutated in multiple samples in a track that combines multiple samples, is displayed as a rectangle with a horizontal line through the middle.

 

VCF Variant Files

VCF stands for Variant Call Format; this file format is used to encode genetic variant sites and genotypes. The format is further described at https://samtools.github.io/hts-specs/

Viewing a VCF File with Genotypes

The section on the top displays each variant site.
If the file also includes sample data (optional), each sample is displayed as a row of genotypes at the variant sites specified at the top. Dark blue = heterozygous, Cyan = homozygous variant, Grey = reference.  
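
The genotype color key can be expressed as a short classification of the VCF GT field. The sketch below is illustrative and uses hypothetical names. It handles both unphased (/) and phased (|) genotypes, and it returns no color for missing calls because the key above does not define one.

```python
import re

COLORS = {
    "heterozygous": "dark blue",
    "homozygous variant": "cyan",
    "reference": "grey",
}

def genotype_class(gt):
    """Classify a VCF GT string such as "0/1", "1|1", or "./."."""
    alleles = re.split(r"[/|]", gt)
    if any(a == "." for a in alleles):
        return "no call"
    if all(a == "0" for a in alleles):
        return "reference"
    if len(set(alleles)) == 1:
        return "homozygous variant"
    return "heterozygous"

def genotype_color(gt):
    """Return the color name from the key above, or None for a missing call."""
    return COLORS.get(genotype_class(gt))

for gt in ("0/0", "0/1", "1|1", "1/2", "./."):
    print(gt, genotype_class(gt), genotype_color(gt))
```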

If a file has more than 10 genotypes, the VCF file will be opened in its own pane, with a scroll bar, as shown below.

VCF Popup Menu

To see the options for changing the view of your VCF file, right-click on a variant.  Some of the options are specific to the variant selected. Find more details on the menu options on the Pop-up Menu page. 

To change the genomic window size at which VCF data is loaded, right-click and select Set Feature Visibility Window...

To change the color coding of the plot, select Color By>Allele.

The Sort Variant By options allow you to sort the set by a trait of a specific variant. Select the same sort a second time for the same variant to reverse the order; for example, sorting by depth orders from high to low, and selecting the depth sort again orders from low to high.

The Display Mode changes what you can see of the data:

Expanded shows the genotypes at the usual row height, with the sample names in the first column.

Squished shows the genotypes with the rows compressed to maximize the data visible on the page.

You can also adjust the height of the squished row by right-clicking and selecting Change Squished Row Height. You can change the height of the rows in the window provided.

 

Viewing a VCF File Without Genotypes

If you open a VCF file that does not contain genotypes data, the view will be different, displaying only the bars marking the calls, as shown below.

If the file does contain the sample genotypes, you can hide the genotype information and display only the sites (as in the view below) by unchecking Show Genotypes in the right-click popup menu.

Multi-Locus View

By default, IGV displays one contiguous genomic region, but multiple loci can also be displayed side-by-side in split panes. There is no set limit on the number of loci, but if the IGV window is split into too many panes, each one will be too small to be useful. Enter multi-locus view by:

or

or

or

To change the size of the flanking region around the gene displayed, before loading data go to View>Preferences>General>Feature flanking region and enter the base pairs or percent to display on either side of each locus.

The following screenshot shows a multi-locus view of segmented copy number data. The IGV data panel has been split into 6 separate vertical panels displaying the regions containing the genes KRAS, MYC, RAC1, RAC2, RAC3, and RAF1. All the panels display the same set of data tracks. 

 

Removing or Rearranging Panels

To remove a panel, right-click on the panel header and select Remove panel.

Panels can be rearranged by drag and drop.  Click on the white header bar at the top of the panel and drag it to its new position.  For example, in the figure below KRAS has been dropped between RAC1 and RAC2.

Changing the View in a Panel

The zoom slider in the toolbar is disabled in multi-locus view. However, you can double-click in a panel to zoom in the view in that panel (or alt-click to zoom out). Click-dragging will also pan the view in the panel.  

To return to the original view of the locus specified, right-click the name header at the top of the panel you want to reset and select Reset panel to '[gene name or locus]'.

To return to the normal single-locus view, double-click the name header at the top of any of the panels, or right-click in a header and select Switch to standard view.

Sorting Tracks by Panel

Right-click in the panel header to bring up the sort menu. This menu will vary depending on data type.

The following image illustrates what happens if you select Sort by amplification in the KRAS panel.

 

Gene Lists

To view a gene list or define a new one, select Regions>Gene Lists...


This opens a window for selecting an existing list or creating a new list. 

To view an existing gene list in multi-locus view, select a name in the List column of any Group and click View. IGV informs you of items that cannot be mapped to the current reference genome and continues on to display loci with matches.

My Gene Lists

You can click Import to upload a text file containing your own gene list. Lists of genes or loci can be loaded in GMT, GRP, and BED format. For example, you can find and download GMT files from the Molecular Signatures Database.

You can also click New to create a new gene list. This opens a dialog in which you can enter a name, description, and your list of genes or regions.



New and imported lists will appear under the My lists group and are saved for continued future access in the lists subfolder in the igv folder installed in your home directory.
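
As an illustration of preparing a list for Import, the following Python sketch writes a one-line GMT file. It assumes the usual GMT layout (one tab-delimited line per gene set: set name, description, then the genes). The file name my_genes.gmt, the set name, and the description are hypothetical; the gene names are the ones used in the multi-locus example earlier in this guide.

```python
import os
import tempfile

def write_gmt(path, gene_sets):
    """Write gene sets as GMT: one tab-delimited line per set (name, description, genes...)."""
    with open(path, "w") as handle:
        for name, (description, genes) in gene_sets.items():
            handle.write("\t".join([name, description, *genes]) + "\n")

# Hypothetical set name and description; genes taken from the multi-locus example.
gene_sets = {
    "example_list": ("illustrative list", ["KRAS", "MYC", "RAC1", "RAC2", "RAC3", "RAF1"]),
}
gmt_path = os.path.join(tempfile.mkdtemp(), "my_genes.gmt")
write_gmt(gmt_path, gene_sets)

with open(gmt_path) as handle:
    print(handle.read())
```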

Regions of Interest

Regions of interest (ROI) are intervals defined by the user. They are marked in red below the ruler. Hovering the mouse over this red region displays lines that demarcate the ROI down the panels. Clicking on the red highlight pops up a menu for options that include sorting by various data-specific metrics and copying to the clipboard or BLAT searching the corresponding reference sequence (Screenshot 2015.05.05).

This page outlines three ways to define a region of interest--(1) by mouse, (2) by keyboard shortcut, and (3) by using the Region Navigator. The fourth section describes right-click menu options.

Define by mouse

On the tool bar, click the Define a Region of Interest icon:

In the data panel, single click the start of the region and then the end of the region. Do NOT click and drag. IGV displays lines delimiting the region of interest for the first and second click, then marks the region in red under the ruler.

This option works for single bases so long as the reference sequence resolves bases in view. The reference sequence appears above the annotation track when zoomed in as described in the Sequence Track Options page.

Define by keyboard shortcut

Display the region of interest to fill the entire view and press Control + R.

Region Navigator features

To open the Region Navigator select Regions>Region Navigator from the menu bar. The Region Navigator lists defined regions of interest (ROI) in a sortable table as shown in the Screenshot (2015.05.06).

The Description field is blank until filled by you. To input a description, either right-click on the ROI and select Edit description from the menu or double-click the field in the Region Navigator.

The following table summarizes the features available from the Region Navigator.

To select an ROI from the list, click on it. Select multiple ROIs from the list by holding down a keyboard key and clicking by mouse, e.g. Shift + mouse-click for consecutive rows or [Mac/PC] Command/Control + mouse-click to select individual rows.

Region Navigator feature: Description

Define ROIs
  • Add: Add the currently displayed region in its entirety to the list.
  • Delete: Remove the selected ROI from the list.
  • Double-click cells under Start, End, or Description: The cell will be boxed as shown in the Screenshot above. Edit cell content.

Sort list
  • Show All Chrs: Uncheck or check to limit the list to loci on the displayed chromosome or all chromosomes.
  • Click a column header, e.g. Chr: Sort table by ascending or descending alphanumeric order.
  • Search and Clear Search: Type a search term on which to filter the displayed list. To remove the filter, click Clear Search.

Navigate to ROIs
  • View: Navigate the display to the selected ROI. If multiple ROIs are selected in the navigator, the loci display in split panes.
  • Zoom to Region: Uncheck to keep the current zoom level when navigating to a new ROI. Check to ensure IGV adjusts the zoom level to display the entire ROI when navigating to the new ROI.

Region of Interest options

Click the red bar under the ruler to display the region of interest (ROI) context menu for the following options.

Menu option: Description

  • Sort: Sort based on data values within the ROI. Sort options vary with data type and may not be available for regions of interest for certain file types, e.g. alignment or VCF tracks, for which sort options are available via feature pop-up menu.
  • Scatter Plot: Available for continuous value data, e.g. gene expression, copy number, and methylation data. See the Scatter Plots section for details.
  • Zoom: Center and zoom the display to the ROI.
  • Edit description: Input a short description for the ROI.
  • Copy sequence: Copy the reference sequence to the clipboard.
  • BLAT sequence: BLAT search the section of the reference sequence against the entire reference genome. See the BLAT Search page for details.
  • Delete: Removes the region of interest.

 

Sample Attributes

Attributes can be associated with tracks and used for filtering, sorting, and grouping data.  By default all tracks have at least 3 attributes: Data File, Data Type, and Name. To display additional attributes, load a sample attribute file.  IGV displays attribute names and values in the attributes panel.

Color-Coded Attribute Values

IGV uses color-coded blocks to represent the attribute values.

Showing and Hiding Attributes

To show or hide selected attributes:

  1. Click View>Select Attributes to Show. IGV displays a list of attributes.
  2. Select (or clear) an attribute’s check box to show (or hide) the attribute.
  3. Click OK. IGV updates the display to show only the selected attributes.

To show or hide all attributes:

 

Sorting, Grouping, and Filtering

By default, IGV displays tracks in the order in which they are loaded (i.e., the order of the data in the files). Alternatively, it is possible to sort the tracks by attribute, region of interest, or track list. You can also group or filter tracks.

Sorting by Attribute

If tracks are grouped, IGV sorts the tracks in each group. To sort groups by attribute, first sort the ungrouped tracks by the desired attributes, then group the tracks.

To sort tracks based on an attribute value:

Alternatively, use the Sort Tracks command for additional options:

  1. Click Tracks>Sort Tracks. IGV displays the Sort window:
  2. Select the attributes to sort by and whether to sort based on ascending or descending values.
  3. Click OK.

Sorting by Region of Interest

If tracks are grouped, IGV sorts the tracks in each group. It then sorts the groups using a composite score for the group, which IGV defines as the maximum score from the tracks in that group.

To sort tracks in the data panel based on a region of interest:

  1. Define a region of interest on the genome, as shown below.
  2. Click the red bar above the defined region and select an option from the pop-up menu:
    • Sort by amplification: Affects tracks of copy number data. Sorts tracks based on copy number values in this region, from highest to lowest.
    • Sort by deletion: Affects tracks of copy number data. Sorts tracks based on copy number values in this region, from lowest to highest.
    • Sort by breakpoint amplitudes: Affects tracks of copy number data.  Sorts tracks by the sum of the absolute value of copy number changes within the region.
    • Sort by expression: Affects tracks of gene expression data. Sorts tracks based on gene expression in this region, from highest to lowest.
    • Sort by value: Sorts tracks based on the values of the track data in this region, from highest to lowest.
    • Delete: Removes this region-of-interest annotation.
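
The grouped sorting rule described at the start of this section can be sketched in Python. This is an illustration under stated assumptions, not IGV's implementation: the function name, group names, track names, and scores are hypothetical, every group is assumed to contain at least one track, and scores are ordered from highest to lowest as in Sort by amplification.

```python
def sort_grouped_tracks(groups):
    """groups: dict of group name -> list of (track name, score within the region).

    Tracks are sorted inside each group (highest score first). Groups are then
    ordered by a composite score, taken as the maximum track score in the group.
    """
    sorted_groups = {
        name: sorted(tracks, key=lambda track: track[1], reverse=True)
        for name, tracks in groups.items()
    }
    group_order = sorted(
        sorted_groups,
        key=lambda name: max(score for _, score in sorted_groups[name]),
        reverse=True,
    )
    return [(name, sorted_groups[name]) for name in group_order]

# Hypothetical copy number scores for four tracks in two groups.
example = {
    "primary": [("s1", 0.2), ("s2", 1.5)],
    "secondary": [("s3", 2.1), ("s4", -0.3)],
}
for group_name, tracks in sort_grouped_tracks(example):
    print(group_name, tracks)
```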

Sorting and Filtering by Track List

To display selected tracks in a specific order:

  1. Create a sample information file that contains two columns. The first column must be labeled 'Array' and lists the tracks that you want to display. The second column can be labeled 'Order' (or any other label) and lists the sort order for the tracks.
    For example, the following sample information file provides a track list for use with segmented_data_080520.seg:
    Array Order
    primary_GBM_10 a
    primary_GBM_20 b
    primary_GBM_30 c
    primary_GBM_40 d
    primary_GBM_50 e
    primary_GBM_60 f
    primary_GBM_70 g
  2. Load the data file and the sample information file that you created in step 1.
  3. Apply a filter to display only those tracks that contain a value for the Order attribute. To do so, select Tracks>Filter Tracks and apply the following filter:
    Order is not equal to <leave the text field blank>
  4. Sort the tracks based on the Order attribute. To do so, select Tracks>Sort Tracks and select the Order attribute.
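
The sample information file from step 1 can also be generated by a script. The following Python sketch writes the example track list shown above; it assumes the two columns are separated by a tab, and the output file name track_order.txt is hypothetical.

```python
import os
import tempfile

# Track names and sort keys from the example in step 1.
tracks = [f"primary_GBM_{n}" for n in range(10, 80, 10)]
orders = "abcdefg"

path = os.path.join(tempfile.mkdtemp(), "track_order.txt")  # hypothetical file name
with open(path, "w") as handle:
    handle.write("Array\tOrder\n")              # first column must be labeled 'Array'
    for track, order in zip(tracks, orders):
        handle.write(f"{track}\t{order}\n")

with open(path) as handle:
    print(handle.read())
```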

Grouping Tracks by Attribute

To group tracks by attribute:

Filtering Track Data

You can filter track data to display only tracks that meet certain criteria.

To filter tracks:

  1. Select Tracks>Filter Tracks. IGV displays the Filter Tracks window:
     
  2. Enter a filter criterion by selecting an attribute from the first drop-box and an operator from the second drop-box and entering an attribute value in the text box.
  3. Optionally, add additional criteria:
    • Click the plus (+) button to enter another filter criterion.
    • Click the minus (-) button to remove a filter criterion.
  4. Determine how the criteria should be combined by choosing one of the following, at the top of the window:
    • Match all of the following to combine the criteria using a logical AND
    • Match any of the following to combine the criteria using a logical OR
  5. Click OK to apply the filter.
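
The way criteria are combined in step 4 can be sketched in Python. This is only an illustration of the AND/OR logic, not IGV code: the function names are hypothetical, tracks are modeled as plain dictionaries of attribute values, and only two operators are modeled ("is not equal to" appears in this guide; "is equal to" is assumed).

```python
def matches(track, attribute, operator, value):
    """Evaluate one filter criterion against a track's attributes."""
    actual = track.get(attribute, "")
    if operator == "is equal to":        # assumed operator name
        return actual == value
    if operator == "is not equal to":
        return actual != value
    raise ValueError(f"unsupported operator: {operator}")

def filter_tracks(tracks, criteria, match_all=True):
    """Keep tracks meeting all criteria (logical AND) or any criterion (logical OR)."""
    combine = all if match_all else any
    return [t for t in tracks if combine(matches(t, *c) for c in criteria)]

# Hypothetical tracks; the second one has no value for the Order attribute.
tracks = [
    {"Name": "primary_GBM_10", "Order": "a"},
    {"Name": "other_sample", "Order": ""},
]
kept = filter_tracks(tracks, [("Order", "is not equal to", "")])
print([t["Name"] for t in kept])
```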

To clear the filter:

  1. Select Tracks>Filter Tracks.
  2. At the top of the Filter Tracks window, click Show all tracks, then click OK.

Saving and Restoring Sessions

Saving and Restoring

You can save the current state of an IGV session to a named session file. You can use that file to restore the IGV session yourself or share it with colleagues, as long as they have access to the session file and any data files that were loaded when the session file was saved. For example, if the data files are loaded into IGV from a shared directory and the IGV session file is saved to that shared directory, anyone with access to the directory can restore the saved IGV session.

To save a session:

  1. Click File>Save Session.
  2. In the Save Session window, select a directory and session file name and click OK.

To restore a saved session:

  1. Click File>Open Session.
  2. In the Open Session window, select a session file and click OK. IGV ends the current session and restores the saved session.

Session File

IGV Version 1.5 and Greater

Overview

Sessions are an integral part of IGV, allowing users to share their data and views with other users simply and accurately.  Session files describe the session in XML. If you wish to manually create or edit a session file, use the information below to better understand the components of each session file.

Session XML Hierarchy
Description of Session Components

Required - These elements are required in a session file. All session files must follow XML standards.

Optional  - These elements are optional in a session file and are added by IGV to help determine the placement of the data and visual style choices.

Session Example

The XML below is an example of a simple Session created by IGV

------------------------------------------------------------------------------------------------------------------------------

<?xml version="1.0" encoding="UTF-8"?>
<Global genome="hg18" locus="All" version="3">
    <Resources>
        <Resource url="http://genome.cse.ucsc.edu/cgi-bin/hgTrackUi?g=rnaGene" label="RNA Genes" name="RNA Genes" path="http://www.broadinstitute.org/igvdata/annotations/hg18/rna_genes.bed"/>
        <Resource url="http://genome.cse.ucsc.edu/cgi-bin/hgTrackUi?g=wgRna" label="sno/miRNA" name="sno/miRNA" path="http://www.broadinstitute.org/igvdata/annotations/hg18/sno_mirna.bed"/>
    </Resources>
    <Panel height="445" name="DataPanel" width="1000">
        <Track color="0,0,178" colorScale="ContinuousColorScale;0.0;20.0;255,255,255;0,0,178" displayName="Non coding RNA" expand="false" height="45" id="http://www.broadinstitute.org/igvdata/annotations/hg18/rna_genes.bed" name="RNA Genes" renderer="BASIC_FEATURE" visible="true" windowFunction="count">
            <DataRange baseline="0.0" drawBaseline="true" flipAxis="false" maximum="20.0" minimum="0.0" type="LINEAR"/>
        </Track>
        <Track color="0,0,178" colorScale="ContinuousColorScale;0.0;20.0;255,255,255;0,0,178" displayName="sno miRNA" expand="false" height="45" id="http://www.broadinstitute.org/igvdata/annotations/hg18/sno_mirna.bed" name="sno/miRNA" renderer="BASIC_FEATURE" visible="true" windowFunction="count">
            <DataRange baseline="0.0" drawBaseline="true" flipAxis="false" maximum="20.0" minimum="0.0" type="LINEAR"/>
        </Track>
    </Panel>
    <Panel height="65" name="FeaturePanel" width="1000">
        <Track color="0,0,178" colorScale="ContinuousColorScale;0.0;20.0;255,255,255;0,0,178" displayName="RefSeq genes" expand="false" height="30" id="Genes" name="Genes" renderer="BASIC_FEATURE" visible="true" windowFunction="count">
            <DataRange baseline="0.0" drawBaseline="true" flipAxis="false" maximum="20.0" minimum="0.0" type="LINEAR"/>
        </Track>
    </Panel>
</Global>
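
Before editing a session file by hand, it can help to list what it references. The following Python sketch reads a trimmed copy of the example above with the standard library XML parser and prints the genome, the resource paths, and the track names in each panel. The element and attribute names are taken from the example; the trimmed XML string and the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the session example above (most visual attributes removed).
SESSION_XML = """<?xml version="1.0" encoding="UTF-8"?>
<Global genome="hg18" locus="All" version="3">
    <Resources>
        <Resource name="RNA Genes" path="http://www.broadinstitute.org/igvdata/annotations/hg18/rna_genes.bed"/>
    </Resources>
    <Panel height="445" name="DataPanel" width="1000">
        <Track id="http://www.broadinstitute.org/igvdata/annotations/hg18/rna_genes.bed" name="RNA Genes"/>
    </Panel>
    <Panel height="65" name="FeaturePanel" width="1000">
        <Track id="Genes" name="Genes"/>
    </Panel>
</Global>
"""

root = ET.fromstring(SESSION_XML.encode("utf-8"))
print("genome:", root.get("genome"), "locus:", root.get("locus"))

# Data files the session depends on; these must be reachable to restore it.
resource_paths = [resource.get("path") for resource in root.iter("Resource")]
print("resources:", resource_paths)

# Track names grouped by the panel that displays them.
tracks_by_panel = {
    panel.get("name"): [track.get("name") for track in panel.findall("Track")]
    for panel in root.findall("Panel")
}
print("tracks:", tracks_by_panel)
```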

 

Server Configuration

Configuring a Genome Server

Upload a new genome to your own server to share with others:

  1. Use the steps described in Loading a Genome to create an initial .genome file and sequence directory.  The sequence directory contains a file for each chromosome/contig sequence in the genome. 
  2. Copy the sequence directory to a web-accessible directory.
  3. Copy the .genome file (which is a zip archive) to a temporary directory and unzip it.
  4. Remove the .genome file from the temporary directory.
  5. Open the property.txt file in a text editor and edit the following properties:
    • id: Unique identifier for the genome (for example, "hg18"). This property is input to some commands in igvtools. 
    • name: This is the name that appears in the pull-down genome list in IGV.
    • sequenceLocation: http://<path to sequence directory> (the location of the sequence directory from steps 1 and 2)
  6. Zip all the files in the temporary directory, including the edited property.txt file, and name the resulting archive <id>.genome.  This naming convention is mandatory for igvtools.
  7. Copy the <id>.genome file to a web-accessible directory.
  8. To make your genome appear in the users' pull-down list, create a genomes list file. The IGV default genomes list file can be used as a starting point. Each line in the genomes list is formatted as follows:
    Format <name>  *tab*  <URL of the .genome file>  *tab*  <id>
    Example Human hg18         http://www.broadinstitute.org/igvdata/genomes/hg18.genome         hg18
  9. To test in IGV, change the genome URL in the Advanced preferences tab (View>Preferences) to: http://<path to your genomes.txt file>.  You must quit IGV and restart for this preference to take effect. The genome should appear in the drop-down list.
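
The genomes list entry from step 8 can be produced with a short script. The Python sketch below builds one tab-delimited line and checks the <id>.genome naming convention from step 6; the function name is hypothetical and the values are those of the example above.

```python
def genome_list_line(name, genome_url, genome_id):
    """Return one genomes list line: <name> TAB <URL of the .genome file> TAB <id>."""
    file_name = genome_url.rsplit("/", 1)[-1]
    if file_name != f"{genome_id}.genome":
        raise ValueError(f"expected the archive to be named {genome_id}.genome, got {file_name}")
    return "\t".join([name, genome_url, genome_id])

line = genome_list_line(
    "Human hg18",
    "http://www.broadinstitute.org/igvdata/genomes/hg18.genome",
    "hg18",
)
print(line)
```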

 

Configuring a Data Server

 

By default, the File>Load from Server option in IGV provides access to public datasets stored on the IGV data server. You can host your own web accessible datasets by creating server registry and configuration files.

To create a custom Load from Server menu:

  1. For each top level node in the hierarchy, create an XML file that describes the datafiles accessible under that node.  Each datafile is specified by a Resource element which has 2 attributes: a name to be displayed to the user and a URL to the file.   The resources can be organized into categories, which in turn can be nested to form a tree structure.  An example follows.

    <?xml version="1.0" encoding="UTF-8"?>
    <Global name="Example Project"  infolink="http://www.broadinstitute.org/igv/" version="1">
        <Category name="Category">
            <Category name="Subcategory One">
            <Resource name="pi.12mer.tdf"
                      path="http://www.broadinstitute.org/igvdata/annotations/hg18/conservation/pi.12mer.wig.tdf" />
            </Category>
            <Category name="Subcategory One">
            <Resource name="omega.12mer.tdf"
                      path="http://www.broadinstitute.org/igvdata/annotations/hg18/conservation/omega.12mer.tdf" />
            </Category>
        </Category>
    </Global>

    The infolink attribute, which displays an information link for the resource, may be included in any <Global> or <Category> element.
     
  2. Create a registry (text) file that lists the XML files. For example:

    http://www.broadinstitute.org/igvdata/annotations/hg18/hg18_annotations.xml
    http://www.broadinstitute.org/igvdata/tcga_external.xml
    http://www.broadinstitute.org/igvdata/mmgp.xml
    http://www.broadinstitute.org/igvdata/epigenetics_public.xml
    http://www.broadinstitute.org/igvdata/1KG/1KG.xml
    http://www.mycompany.org/igvdata/example_project.xml

  3. In IGV, on the Advanced tab of the Preferences window, update the Data Registry URL to point to your registry file.

IGV points to exactly one data registry file. If you'd like your data server to provide access to the public datasets on the IGV data server, include them in your registry file, as shown above.
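
For servers with many data files, the XML from step 1 can be generated rather than typed. The following Python sketch builds a small resource tree with the element and attribute names used in the example above (Global, Category, Resource, name, path). The function name and the category name "Conservation" are hypothetical, and nested categories are omitted to keep the sketch short.

```python
import xml.etree.ElementTree as ET

def build_resource_tree(project_name, categories):
    """categories: dict of category name -> list of (display name, file URL)."""
    root = ET.Element("Global", name=project_name, version="1")
    for category_name, resources in categories.items():
        category = ET.SubElement(root, "Category", name=category_name)
        for display_name, url in resources:
            ET.SubElement(category, "Resource", name=display_name, path=url)
    return root

tree_root = build_resource_tree(
    "Example Project",
    {
        "Conservation": [  # hypothetical category name
            ("omega.12mer.tdf",
             "http://www.broadinstitute.org/igvdata/annotations/hg18/conservation/omega.12mer.tdf"),
        ],
    },
)
print(ET.tostring(tree_root, encoding="unicode"))
```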

 

Password Protected Directories

Overview

When a user enters a password-protected URL, IGV prompts for a user name and password. If the username/password combination is incorrect, IGV will continue to ask the user to authenticate until the combination is entered correctly or the user clicks Cancel.

Setting up password protection on an Apache server

There are many ways to set up a password-protected site.  The following describes one method of handling this on an Apache server using an ".htaccess" file.

Setting up a password requires:

  • an Access File
  • a Password File

The Access File (.htaccess) is located in the restricted directory.  It should contain the following information:

AuthUserFile /home/[path]/.htpasswd
AuthName "Private IGV Folder"
AuthType Basic
Require valid-user

The first line should contain the path to the Password File.

The Password File (.htpasswd) should be placed in a directory that is accessible internally, not through the web. This can be the home directory, but it must be a location that is not externally visible. An example password file might look like this:

user1:kJs1GPxWtLet2

The file contains the usernames and passwords for all authenticated users, with one user per line. In the example line, the username is "user1" and the password entry is "kJs1GPxWtLet2", which is the encrypted form of the human-readable word "password".

To make the authentication lines, users can contact IT staff or use one of several websites that help generate them. The one used for this line was http://www.kxs.net/support/htaccess_pw.html. This website provides a string that can be used in the .htpasswd file.

To test, use a web browser to access a file in the password-protected directory. You should be prompted for a username and password.

Example

The following files are password protected using the procedure described above. Try loading them into IGV using File>Load from URL. You will be prompted for a username and password; enter "guest" for the username and "password" for the password.

  • http://www.broadinstitute.org/igvdata/private/SignalK562H3k36me3.tdf
  • http://www.broadinstitute.org/igvdata/private/cpgIslands.hg18.bed
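
With AuthType Basic, the client sends the username and password in an HTTP Authorization header, encoded as base64. The Python sketch below shows how that header is formed for the guest account in the example; the function name is hypothetical, and the sketch builds the header only without contacting the server.

```python
import base64

def basic_auth_header(username, password):
    """Build the HTTP Basic Authorization header for the given credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}

header = basic_auth_header("guest", "password")
print(header)
```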

 

 

Motif Finder


Search for a particular nucleotide sequence in the reference genome. The results are displayed as features in two new tracks. By default, the results from the positive strand are displayed in blue, and results from the negative strand in red. To change the color, right-click on the track and select Change Track Color... from the pop-up menu.

Steps:

1. Bring up the motif finder dialog, via Tools>Find Motif...

2. Enter the sequence for which to search, using one of the following three formats:

3. Enter names for the feature tracks that will show where the sequence matches the positive and negative strands of the reference genome. 

Example

Because we entered a short sequence, the search returns a large number of hits. Looking at the results directly upstream of the gene GBP4, we see one match on the positive strand and two on the negative strand. Note that by default, the search result tracks are displayed in Expanded mode, so you can see overlapping matches.
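
The idea of reporting hits on both strands can be sketched in Python. This is a simplified illustration, not IGV's implementation: it handles plain A/C/G/T sequences only, the function names are hypothetical, and negative-strand hits are found by searching the forward sequence for the reverse complement of the motif. Positions are 0-based starts on the forward sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(sequence):
    """Return the reverse complement of an A/C/G/T sequence."""
    return sequence.translate(COMPLEMENT)[::-1]

def find_motif(reference, motif):
    """Return (positive_strand_starts, negative_strand_starts) for a motif."""
    reference, motif = reference.upper(), motif.upper()

    def starts(pattern):
        hits = []
        index = reference.find(pattern)
        while index != -1:
            hits.append(index)
            index = reference.find(pattern, index + 1)  # advance by 1 to keep overlapping matches
        return hits

    return starts(motif), starts(reverse_complement(motif))

positive, negative = find_motif("ATTGCCAATTG", "TTG")
print("positive strand:", positive)
print("negative strand:", negative)
```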

 

igvtools

The igvtools utility provides a set of tools for pre-processing data files. File names must contain an accepted file extension, e.g. test-xyz.bam. Tools include:

The igvtools utility can be run (1) directly from IGV or (2) from the command line, after downloading it as a separate utility.

Running igvtools from the Command Line

Downloading igvtools

The igvtools utilities can be downloaded from the Downloads page on the IGV website. 

Starting with shell scripts

The igvtools utilities can be invoked, with or without the graphical user interface (GUI), from one of the following scripts:

     igvtools (command-line version for Linux and  MacOS 10.x)
     igvtools_gui (GUI version for Linux and  MacOS 10.x)
     igvtools_gui.command (alternate double-clickable GUI version for MacOS 10.x)

     igvtools.bat (command-line version for Windows)
     igvtools_gui.bat (GUI version for Windows)

Once the GUI version has been launched, the commands and options are the same as when you run igvtools from the IGV interface.

The general form of the command-line version is:

     igvtools [command] [options] [arguments]     --or--     igvtools.bat [command] [options] [arguments]

Recognized commands, options, arguments, and file types are described below.

Starting with Java

igvtools can also be started directly using Java.  This option allows more control over Java parameters, such as the maximum memory to allocate.  In the example shown below, igvtools is started with 1500 MB of memory allocated and launched in the location where you have unpacked IGVTools.

      java -Xmx1500m --module-path=lib @igv.args --module=org.igv/org.broad.igv.tools.IgvTools [command] [options] [arguments]

To start with a GUI the command is 

     java -Xmx1500m --module-path=lib @igv.args --module=org.igv/org.broad.igv.tools.IgvTools gui

Note that the command line has become more complex with Java 11 compared to Java 8.  We recommend the shell scripts above for most users.

Memory settings

The igvtools scripts allocate a fixed amount of memory. If this is more than the amount available on your platform, you will get an error along the lines of "Could not start the Virtual Machine". If this happens, you will need to edit the scripts to reduce the amount of memory requested, or use the Java startup option. The memory is set via the "-Xmx" parameter. For example, -Xmx1500m requests 1500 MB and -Xmx1g requests 1 gigabyte.

For more information about other settings, see the downloaded file igvtools_readme.txt.

Commands

toTDF

The toTDF command converts a sorted data input file to a binary tiled data (.tdf) file. Use this command to pre-process large datasets for improved IGV performance. 

Supported input file formats are: .wig, .cn, .snp, .igv, and .gct.

Note: This tool was previously known as tile

Usage:

          igvtools toTDF [options]  [inputFile] [outputFile] [genome]

Required arguments:

          inputFile    The input file (see supported formats above).

          outputFile   Binary output file.  Must end in ".tdf".

          genome      A genome id or path to a .chrom.sizes or .genome file.  Default is hg18.

Options:

  -z num

Specifies the maximum zoom level to precompute. The default value is 7 and is sufficient for most files. To reduce file size at the expense of IGV performance this value can be reduced.

  -f list

A comma delimited list specifying window functions to use when reducing the data to precomputed tiles. Possible values are min, max, and mean. By default only the mean is calculated.

  -p file

Specifies a .bed file to be used to map probe identifiers to locations. This option is useful when preprocessing .gct files. The .bed file should contain 4 columns:
          chr start end name
where name is the probe name in the .gct file.

Example:

          igvtools toTDF -z 5  copyNumberFile.cn copyNumberFile.tdf hg18

Notes:

Data file formats, with the exception of .gct files, must be sorted by start position.  Files can be sorted with the sort command described below.  Attempting to preprocess an unsorted file will result in an error.
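
When toTDF is run from a script, it can be convenient to assemble the command line programmatically. The following Python sketch builds the argument list from the options described above (-z, -f, -p) and checks the .tdf output requirement. The function name and keyword names are hypothetical; the sketch only prints the command and does not run igvtools.

```python
import shlex

def totdf_command(input_file, output_file, genome,
                  max_zoom=None, window_functions=None, probe_bed=None):
    """Assemble an 'igvtools toTDF' argument list from the documented options."""
    if not output_file.endswith(".tdf"):
        raise ValueError("outputFile must end in .tdf")
    command = ["igvtools", "toTDF"]
    if max_zoom is not None:
        command += ["-z", str(max_zoom)]
    if window_functions:
        command += ["-f", ",".join(window_functions)]
    if probe_bed:
        command += ["-p", probe_bed]
    return command + [input_file, output_file, genome]

command = totdf_command("copyNumberFile.cn", "copyNumberFile.tdf", "hg18", max_zoom=5)
print(shlex.join(command))
```

If the igvtools script is on the PATH, the resulting list can be passed to subprocess.run(command, check=True).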

Count

The count command computes average feature density over a specified window size across the genome. Common usages include computing coverage for alignment files and counting hits in ChIP-seq experiments. By default, the resulting file will be displayed as a bar chart when loaded into IGV.

Supported input file formats are: .sam, .bam, .aligned, .psl, .pslx, and .bed.

Usage:

          igvtools count [options] [inputFile] [outputFile] [genome]

Required arguments:

          inputFile    The input file (see supported formats above).

         outputFile   The output file, which can be binary .tdf or ASCII .wig format.

The output filename must end in ".tdf" or ".wig", or be the special string "stdout". To indicate that you want to output both a .tdf and a .wig file, list both output filenames as a single string, separated by a comma with no other delimiters. If the output file is named "stdout" the output will be written to the standard output stream in .wig format.

          genome      A genome id or path to a .chrom.sizes or .genome file.  Default is hg18.

Options:

-z, --maxZoom num

Specifies the maximum zoom level to precompute.

-w, --windowSize num

The window size over which coverage is averaged. Defaults to 25 bp.

 -e, --extFactor num

The read or feature is extended by the specified distance in bp prior to counting. This option is useful for ChIP-seq and RNA-seq applications. The value is generally set to the average fragment length of the library minus the average read length.

--preExtFactor num

The read is extended upstream from the 5' end by the specified distance.

--postExtFactor num

Effectively overrides the read length, defines the downstream extent from the 5' end. Intended for use with preExtFactor.

-f, --windowFunctions list

A comma delimited list specifying window functions to use when reducing the data to precomputed tiles. Possible values are min, max, mean, median, p2, p10, p90, and p98. The "p" values represent percentile, so p2=2nd percentile, etc.

--strands [arg]

By default, counting is combined among both strands. This setting outputs the count for each strand separately. Legal argument values are 'read' or 'first': 'read' separates the count by read strand, and 'first' uses the strand of the first read in the pair. Results are saved in a separate column for .wig output, and a separate track for TDF output.

--bases

Count the occurrence of each base (A,G,C,T,N). Takes no arguments. Results are saved in a separate column for .wig output, and a separate track for TDF output.

--query [querystring]

Only count a specific region. Query string has syntax <chr>:<start>-<end>. e.g. chr1:100-1000. Input file must be indexed. 

--minMapQuality [mqual]

Set the minimum mapping quality of reads to include. Default is 0.

--includeDuplicates

Include duplicate alignments in the count. By default, duplicates are not counted. Takes no arguments.

--pairs

Compute coverage from paired alignments counting the entire insert as covered. When using this option only reads marked "proper pairs" are used.

Example:

          igvtools count -z 5 -w 25 -e 250 alignments.bam  alignments.cov.tdf  hg18

Notes: 

The input file must be sorted by start position. See the sort command below.
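
To make the roles of the window size (-w) and the extension factor (-e) concrete, the following Python sketch averages per-base read depth over fixed windows. It is a conceptual illustration only, not the igvtools implementation: the function name is hypothetical, reads are given as half-open (start, end) intervals on a single contig, every read is extended at its end regardless of strand, and the read coordinates are made up.

```python
def windowed_coverage(reads, region_length, window_size=25, ext_factor=0):
    """Average per-base depth in consecutive windows of window_size bases."""
    depth = [0] * region_length
    for start, end in reads:
        # Extend each read by ext_factor bases before counting (strand ignored here).
        for position in range(max(start, 0), min(end + ext_factor, region_length)):
            depth[position] += 1
    averages = []
    for window_start in range(0, region_length, window_size):
        window = depth[window_start:window_start + window_size]
        averages.append(sum(window) / len(window))
    return averages

reads = [(0, 10), (5, 15)]  # hypothetical read intervals
print(windowed_coverage(reads, region_length=50))
print(windowed_coverage(reads, region_length=50, ext_factor=10))
```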

Index

Creates an index for an alignment or feature file. Index files are required for loading alignment files into IGV, and can significantly improve performance for large feature files. Note that the index file is not directly loaded into IGV. Rather, IGV looks for the index file when the alignment or feature file is loaded. This command does not take an output file argument. Instead, the filename is generated by appending ".sai" (for alignments) or ".idx" (for features) to the input filename, because IGV relies on this naming convention to find the index. The input file must be sorted by start position (see the sort command, below).

Supported input file formats are: .sam, .bam, .aligned, .vcf, .psl, and .bed.

Usage:

          igvtools index [inputFile]

Notes:

The "sai" index is an IGV-specific format; it does not work with samtools or any other application.
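
The index naming convention described above can be sketched in Python as follows. The function name index_filename and the kind argument are hypothetical; the caller states whether the input is an alignment or a feature file, because this guide does not spell out how each file extension is classified.

```python
# Suffixes taken from the naming convention described in this section.
INDEX_SUFFIXES = {"alignment": ".sai", "feature": ".idx"}


def index_filename(input_path, kind):
    """Return the index filename that IGV would look for next to input_path.

    kind must be "alignment" or "feature" (a hypothetical parameter).
    """
    if kind not in INDEX_SUFFIXES:
        raise ValueError(f"kind must be one of {sorted(INDEX_SUFFIXES)}, got {kind!r}")
    # The suffix is appended to the full input filename, not substituted for its extension.
    return input_path + INDEX_SUFFIXES[kind]
```

Because the suffix is appended rather than substituted, the index must stay in the same location as the input file and keep this name for IGV to find it.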

Sort

Sorts the input file by start position, as required.

Supported input file formats are: .cn, .igv, .sam, .bam, .aligned, .psl, .bed, and .vcf.

Usage:

          igvtools sort [options] [inputFile] [outputFile]

Required arguments:

          inputFile 

          outputFile 

The special string "stdout" can be used as [outputFile], in which case the output will be written to the standard output stream instead of a file.

Options:

  -t tmpdir 

Specify a temporary working directory. For large input files, this directory is used to store intermediate results of the sort. The default is the user's temp directory.

  -m maxRecords 

The maximum number of records to keep in memory during the sort. The default value is 500000. Increase this number if you receive "too many open files" errors. Decrease it if you experience "out of memory" errors.
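
The trade-off between the two errors can be illustrated with a generic external merge sort, sketched below in Python. This is an assumption about the general technique, consistent with the description of tmpdir and maxRecords, and is not the igvtools implementation. The names external_sort and position_key are hypothetical. A smaller max_records uses less memory but produces more temporary files, all of which are open at once during the merge.

```python
import heapq
import os
import tempfile


def position_key(line):
    """Sort key: chromosome name, then numeric start (tab-delimited, start in column 2)."""
    fields = line.split("\t")
    return fields[0], int(fields[1])


def external_sort(lines, max_records, tmpdir=None, key=position_key):
    """Sort newline-terminated lines, holding at most max_records in memory per chunk.

    Returns (sorted_lines, number_of_temporary_files).
    """
    chunk_paths = []
    chunk = []

    def flush():
        # Sort the in-memory chunk and spill it to a temporary file.
        chunk.sort(key=key)
        fd, path = tempfile.mkstemp(dir=tmpdir, suffix=".chunk", text=True)
        with os.fdopen(fd, "w") as handle:
            handle.writelines(chunk)
        chunk_paths.append(path)
        chunk.clear()

    for line in lines:
        chunk.append(line)
        if len(chunk) >= max_records:
            flush()
    if chunk:
        flush()

    # Merge phase: every chunk file is open simultaneously.
    handles = [open(path) for path in chunk_paths]
    try:
        # Collected into a list for brevity; a real tool would stream to the output file.
        merged = list(heapq.merge(*handles, key=key))
    finally:
        for handle in handles:
            handle.close()
        for path in chunk_paths:
            os.remove(path)
    return merged, len(chunk_paths)
```

In this sketch, halving max_records roughly doubles the number of chunk files, which is why raising the value helps with "too many open files" errors and lowering it helps with "out of memory" errors.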

Version

  Prints the igvtools version number to the console.

Running igvtools from the IGV Interface

Select Tools>Run igvtools to open the igvtools window.  This window allows you to run the toTDF, Count, Sort, and Index tools:

  1. Select the tool that you want from the Command drop-down list.  The toTDF tool is selected by default.
  2. Specify the necessary files and/or genome.
  3. Select any of the alternative options you want.
  4. Click Run.

Information about the run will appear in the Messages box.  Note that if you exit the IGV application, any tool that is in progress will be terminated. 

toTDF

The toTDF tool converts a sorted data input file to a binary tiled data (.tdf) file.  Use this tool to pre-process large datasets for improved IGV performance.

Select:

  • the Input File (supported formats are .wig, .cn, .snp, .igv, and .gct)
  • the Output File (must end in .tdf)
  • the Genome (default is whatever genome is selected in IGV)

Options you can change include:

  • Zoom Levels: specifies the maximum zoom level to precompute.  The default is 7; this is sufficient for most files. This value can be reduced to reduce file size, though it will impair IGV performance.
  • Window Functions: selects the window functions to use when reducing data to precomputed tiles.  Mean is the default, but you can also select Min, Max, or Median, as well as percentiles of the data.
  • Probe to Loci Mapping: specifies a .bed file to be used to map probe identifiers to locations.  This is useful when preprocessing .gct files.  The .bed file should contain 4 columns: chr start end name (where name is the probe name in the .gct file).
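
As an illustration of the 4-column probe mapping file, the following Python sketch reads such lines into a lookup table from probe name to location. The function name load_probe_map and the probe names in the usage note are hypothetical, and skipping blank lines is an assumption; the sketch does not reproduce how igvtools itself reads the file.

```python
def load_probe_map(bed_lines):
    """Map probe name -> (chr, start, end) from 4-column BED lines: chr start end name."""
    probe_to_locus = {}
    for line_number, line in enumerate(bed_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) < 4:
            raise ValueError(f"line {line_number}: expected 4 columns, got {len(fields)}")
        chrom, start, end, name = fields[:4]
        probe_to_locus[name] = (chrom, int(start), int(end))
    return probe_to_locus
```

For example, a line such as "chr1 100 200 probe_A" (a made-up probe) maps the identifier probe_A in the .gct file to chr1:100-200.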

Count

Count computes average feature density over a specified window size across the genome. Common usages include computing coverage for alignment files and counting hits in ChIP-seq experiments.  By default, the resulting file will be displayed as a bar chart when loaded into IGV.  To display feature intensity in IGV, the density must be computed with this tool, and the resulting file must be named <feature track filename>.tdf.

Select:

  • the Input File, which must be sorted by start position (see the Sort tool, below). Supported file formats are .sam, .bam, .aligned, .psl, .pslx, and .bed.
  • the Output File (must end in .tdf or .wig)
  • the Genome (default is whatever genome is selected in IGV)

Options you can change include:

  • Zoom Levels: specifies the maximum zoom level to precompute.  The default is 7; this is sufficient for most files. This value can be reduced to reduce file size, though it will impair IGV performance.
  • Window Functions: selects the window functions to use when reducing data to precomputed tiles.  Mean is the default, but you can also select Min, Max, or Median, as well as percentiles of the data.
  • Window Size: specifies the window size over which coverage is averaged, in base pairs; default is 25 bp
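
The effect of the window size can be sketched as follows. This Python example averages per-base coverage over fixed-size windows; it is a simplified illustration, not the igvtools algorithm. The function name windowed_density and the 0-based half-open coordinates are assumptions.

```python
def windowed_density(features, window_size, region_length):
    """Mean per-base coverage in each consecutive window of window_size bases.

    features: iterable of (start, end) in 0-based, half-open coordinates (assumed).
    """
    n_windows = -(-region_length // window_size)  # ceiling division
    covered_bases = [0] * n_windows
    for start, end in features:
        pos = max(start, 0)
        end = min(end, region_length)
        while pos < end:
            window = pos // window_size
            # Count only the part of the feature that falls inside this window.
            stop = min((window + 1) * window_size, end)
            covered_bases[window] += stop - pos
            pos = stop
    densities = []
    for window, total in enumerate(covered_bases):
        # The last window may be shorter than window_size.
        width = min(window_size, region_length - window * window_size)
        densities.append(total / width)
    return densities
```

A larger window smooths the signal and yields fewer values; a smaller window keeps more detail at the cost of a larger output file.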

Index

This command creates an index for an alignment file or a feature file.  Index files are required for loading alignment files into IGV, and can significantly improve performance for large feature files. Note that you do not directly load the index file into IGV. Rather, IGV looks for a corresponding index file when the alignment or feature file is loaded.  This command does not take an output file argument. Instead, the filename is generated by appending ".sai" (for alignments) or ".idx" (for features) to the input filename, because IGV relies on this naming convention to find the index. The input file must be sorted by start position (see the Sort tool, below).

Select the Input File.  Supported file formats are .sam, .bam, .aligned, .vcf, .psl, and .bed.

Sort

Sort sorts the input file by start position.

Select:

  • the Input File. Supported file formats: .cn, .igv, .sam, .bam, .aligned, .psl, .bed, and .vcf.
  • the Output File

Options you can change include:

  • Temp Directory: specifies a temporary working directory.  For large input files, this directory will be used to store intermediate results of the sort. The default is the user's Temp directory.
  • Max Records: specifies the maximum number of records to keep in memory during the sort.  The default value is 500000.  Increase this number if you receive "too many open files" errors.   Decrease it if you experience "out of memory" errors.

 

BLAT search

You can run a BLAT (BLAST-like Alignment Tool) search on a user-specified sequence, feature, alignment, or region of interest. The query sequence can be up to 8 kb in length.

The default search engine is the BLAT server hosted at the UCSC Genome Browser. UCSC's BLAT search supports most UCSC-derived genomes, including the human and mouse genomes. To use a different BLAT server, change the setting in Advanced Preferences.

BLAT Feature Track

Each query sequence appears as a new Blat feature track in the lower panel of IGV's display. The Screenshot (2015.04.01) shows five different Blat feature tracks for the following sequences:

  a. Red highlighted read
  b. Blue highlighted RNA-Seq read spanning an intron
  c. An exon feature
  d. An ROI covering an intronic region
  e. An ROI spanning a region covering examples a–d.

Manipulate this track just like other feature tracks as outlined in the Feature Tracks section of the Pop-up Menus page.

 

BLAT Results Panel

Results are presented in a new window that displays the query sequence, location of hits, match score, and other metrics as shown in the Screenshot (2015.04.01). Hits are listed in descending order of alignment score.

For the example hit highlighted in the Screenshot above, the original search sequence is returned as the top hit. The read used in the search was an aligned RNA-Seq read spanning an intron (example b), which the BLAT results show is a singly gapped alignment as indicated by the 1 under the column T gap count.

For example, for ROI2 (marked 1 above), clicking on the second hit in the results panel (marked 2 in Screenshot below) navigates the view away from chromosome 19 to the hit locus on chromosome 22 (marked 3). This same region contains a hit for example c, a BLAT search done with an exon feature. Because the exon feature has a higher alignment score than ROI2, its Blat feature is shaded darker.