Using IGV Through GenePattern

Posted on Thursday, August 30, 2012 at 12:23PM by The GenePattern Team

Overview

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated datasets. It supports a wide variety of data types including sequence alignments, microarrays, and genomic annotations. Until recently, this tool was only available outside of GenePattern, though it did accept GenePattern file formats. IGV can now be launched from a GenePattern module, which is available on the GenePattern Server and can also be downloaded from the GenePattern Repository.

With this new development, users can pass their GenePattern result files directly to IGV through GenePattern.

IGV in GenePattern

The GenePattern IGV module launches the same application that is available from the IGV website. If you use both the IGV client (launched from the IGV website or from your desktop) and GenePattern, you are running the same version of IGV, complete with your preferences, home directory, saved genomes, and other saved IGV settings. For all users, this also means you get the latest version of IGV each time you run it, whether from GenePattern or from the IGV client.

As mentioned above, having IGV in GenePattern now allows you to pass your GenePattern data files directly to IGV in the same way you would use a result file as input file for any other module. For instance, you'll now notice that (on servers where IGV is installed) output from a run of GISTIC will have IGV as a next option in the dropdown for the result file.

Supported File Formats

IGV supports many of the common GenePattern file formats, including CBS, CN, GCT*, RES*, GISTIC, SEG, and LOH files. For more information about supported file types, see the file format documentation on the IGV website.

You can also upload any other IGV-supported file type as you would any other input file; i.e., via Upload or URL.

*Note: To view GCT or RES files properly in IGV, some preprocessing is needed. See "Viewing GCT and RES files" below for details.

Configuring IGV in GenePattern

When you run IGV from within GenePattern, you are provided with optional configuration parameters that tell IGV how to display your data.

Currently there are two such parameters: "genome" and "locus".

The "genome" parameter allows you to select the genome which corresponds to your data file. If you choose not to specify a genome, IGV will launch with hg19 if this is the first time you've run IGV. If you've run IGV before, it will launch with the last genome you were viewing.

The "locus" parameter allows you to specify a locus or range of interest for your data. For example, you could specify chr5:90,339,000-90,349,000 and IGV would launch with your data and that region of chromosome 5 displayed. If you instead wanted to look for the gene EGFR, you would simply type "EGFR" into the text box. If you choose not to specify a locus or gene and this is the first time you've run IGV, IGV will launch with chromosome 1 selected. If you've run IGV before, it will launch with the chromosome you last viewed.
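To make the two accepted forms of the "locus" value concrete, here is a minimal Python sketch that tells a coordinate range apart from a gene name. It is an illustration only and not how IGV or GenePattern parses the parameter; the function name parse_locus and the returned dictionary layout are hypothetical.

```python
import re

# Matches a range such as chr5:90,339,000-90,349,000 (commas are optional).
LOCUS_RANGE = re.compile(
    r"^(?P<chrom>[^:\s]+):(?P<start>\d[\d,]*)-(?P<end>\d[\d,]*)$"
)


def parse_locus(text):
    """Classify a locus string as a coordinate range or a plain name.

    Hypothetical helper for illustration; IGV's own parsing may differ.
    """
    text = text.strip()
    match = LOCUS_RANGE.match(text)
    if match is None:
        # Anything that is not a range is treated as a name, e.g. "EGFR".
        return {"kind": "name", "name": text}

    # Strip the thousands separators before converting to integers.
    start = int(match.group("start").replace(",", ""))
    end = int(match.group("end").replace(",", ""))
    if start > end:
        raise ValueError("locus start must not be greater than end")
    return {
        "kind": "range",
        "chrom": match.group("chrom"),
        "start": start,
        "end": end,
    }
```

With this sketch, the range from the example above yields chromosome "chr5" with integer start and end positions, while "EGFR" is passed through as a name.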

Viewing GCT and RES files

In order to properly view a GCT or RES file in IGV, some preprocessing is required.

The default display option for a GCT or RES file is the Heatmap. For the heatmap to make sense, the data must be row-centered and scaled, and may also need a threshold applied. Currently the workflow for this is as follows.

1. Run PreprocessDataset in GenePattern

If the data contains negative (non-log transformed) values, run it through PreprocessDataset. The default threshold there is 20.

2. Run data through IGVTools

Currently IGVTools is a stand-alone utility providing a set of tools for pre-processing data files.

For the preprocessing of unscaled GCT and RES files an option called "formatExp" is provided. It takes a non-log expression file and performs the following steps. (Note that these are the steps used for our internal expression data prior to viewing in IGV.)

  1. take log2 of data
  2. compute median and subtract from each log2 probe value (i.e. center on the median)
  3. compute the MAD (mean absolute deviation)
  4. divide each log2 probe value by the MAD
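
The four steps above can be sketched in Python as follows. This is an illustration only, not the igvtools source. The function name is hypothetical, and the sketch assumes the steps are applied to one row of strictly positive values at a time (the text does not say whether the median and MAD are computed per row or per column). It follows the wording above literally and uses the mean of the absolute deviations from the median as the MAD.

```python
import math
import statistics


def format_expression_row(values):
    """Log-transform, median-center, and MAD-scale one row of values.

    Hypothetical helper that mirrors the four listed steps.
    Raises ValueError if any value is zero or negative, because
    log2 is undefined there.
    """
    # Step 1: take log2 of the data.
    logged = [math.log2(v) for v in values]

    # Step 2: compute the median and subtract it from each value.
    median = statistics.median(logged)
    centered = [x - median for x in logged]

    # Step 3: compute the MAD, taken here as the mean absolute deviation
    # from the median, following the wording of the list above.
    mad = statistics.fmean([abs(x) for x in centered])

    # Step 4: divide each centered value by the MAD.
    # Assumption: if the MAD is zero (all values equal), return the
    # centered values unchanged rather than dividing by zero.
    if mad == 0:
        return centered
    return [x / mad for x in centered]
```

For example, the row [1, 2, 4, 8, 16] has log2 values 0 through 4, a median of 2, and a mean absolute deviation of 1.2, so the middle value maps to 0 and the others are spread symmetrically around it.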

You can download this version of igvtools. For the latest version of igvtools, go to the IGV Downloads webpage.

After unzipping, igvtools can be used on the command line to transform the RES or GCT file as described above. The command line follows. (If you are on a Windows platform use "igvtools.bat" instead of "igvtools".)

./igvtools formatExp inputFile outputFile

To run this on your preprocessed dataset, save the resulting .preprocessed file and provide it as "inputFile" in the command line.
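
If you prefer to script this step, the following Python sketch builds the same command line and picks "igvtools.bat" on Windows. The function name, its parameters, and the file names in the usage comment are hypothetical; only the "formatExp inputFile outputFile" form is taken from the command shown above.

```python
import os
import sys


def build_formatexp_command(input_file, output_file, igvtools_dir=".", windows=None):
    """Return the igvtools formatExp command as an argument list.

    Hypothetical helper: igvtools_dir is assumed to be the directory
    where igvtools was unzipped. When windows is None, the platform
    is detected from sys.platform.
    """
    if windows is None:
        windows = sys.platform.startswith("win")
    # Use the batch file on Windows, the shell script elsewhere.
    script = "igvtools.bat" if windows else "igvtools"
    return [os.path.join(igvtools_dir, script), "formatExp", input_file, output_file]


# Example usage (file names are placeholders; uncomment to run igvtools):
# import subprocess
# cmd = build_formatexp_command("mydata.preprocessed.gct", "mydata.formatted.gct")
# subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting problems when file names contain spaces.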

3. Run IGV with Scaled and Centered Data

Take the output from IGVTools and provide it as input to IGV.

Using IGV

Once you have launched IGV, you can configure, drag, zoom, save, and so on, just as you normally would in IGV. For more information on how to use IGV, please visit the IGV website.

For questions and comments about IGV, please send an email to the IGV team at igv-help@broadinstitute.org.

For questions and comments about GenePattern, please send an email to the GenePattern team at gp-help@broadinstitute.org.
