Known Issues

From GeneSetEnrichmentAnalysisWiki

Revision as of 23:47, 28 December 2012 by Liberzon (Talk | contribs)



GSEA version 2

Java heap space / OutOfMemoryError

Problem: When running an analysis in GSEA, the following error occurs:

---- Stack Trace ----

# of exceptions: 1

------Java heap space------
java.lang.OutOfMemoryError: Java heap space

Cause: Either too little memory was allocated to GSEA, or the analysis needs more memory than your machine has available.

Solution: Try one or more of the following:

  1. Start GSEA by clicking the Launch button on the Downloads page of the GSEA web site.
  2. Run GSEA on a more powerful computer.
  3. Use no more than 1,000 permutations.
  4. Use individual collections, or even subcollections, of gene sets instead of all gene sets from the entire MSigDB database.
  5. First, collapse gene identifiers to symbols using the Chip2Chip tool, then run GSEA on the collapsed dataset.
    When running GSEA on the collapsed dataset, make sure that 'Collapse dataset(s)' = false
  6. First, create a rank-ordered list of genes outside GSEA, then run GSEA on the ranked list using the GSEAPreranked tool.
  7. Run GSEA from the command line and use the -Xmx option to specify a sufficient maximum amount of memory for the program.
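As an illustration of step 7, the sketch below assembles a java command line that raises the maximum heap size. The -Xmx and -cp options are standard JVM options. The jar file name (gsea2.jar) and the main class name (xtools.gsea.Gsea) are assumptions that should be checked against your own GSEA installation, and the GSEA analysis parameters are deliberately left out.

```python
def build_gsea_command(jar_path, max_heap_mb,
                       main_class="xtools.gsea.Gsea", extra_args=()):
    """Assemble a java command line with a larger maximum heap.

    jar_path and main_class are assumptions for illustration; use the
    values that match your GSEA installation.
    """
    if max_heap_mb <= 0:
        raise ValueError("max_heap_mb must be positive")
    # -Xmx<N>m sets the JVM maximum heap size to N megabytes.
    return ["java", f"-Xmx{max_heap_mb}m", "-cp", str(jar_path),
            main_class, *extra_args]


# Hypothetical example: request a 2 GB heap.
command = build_gsea_command("gsea2.jar", 2048)
print(" ".join(command))
```

The value passed to -Xmx should stay below the physical memory of the machine; if it does not, the second cause listed above (machine limits) still applies.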


Firewall / FTP connection issues

Problem: When you try to access the CHIP annotation files or the Gene Set Database / MSigDB Browser, you see an error to the effect:

"Error listing Broad website//  Connection reset//"

Cause: You are probably behind a network firewall, or some other network configuration, that prevents you from accessing FTP servers on port 500. The Broad chip files and gene sets are hosted on a publicly accessible Broad FTP server. The GSEA Java desktop program tries to reach that FTP site to give you easy access to the files, but your network configuration blocks the connection.

Solution: Try the following:

  1. Go to GSEA program main page > Options > Preferences and make sure that 'Connect over the internet' is checked.
  2. See if you can temporarily disable your firewall when using GSEA.
  3. Consult your local network administrator to see if they have any suggestions or prior experience with such issues.
  4. Download the gene set (GMT) and CHIP files to your local file system as follows:
    1. Download this large zip file to your local file system. It contains all current (as of September, 2010) CHIP files. Expand it with a program like WinZip or unzip. The equivalent zip file for the v2.5 release (April 8, 2008) is also available for download.
    2. Start GSEA.
    3. In GSEA, turn off the internet connection mode. Click Options>Preferences.

      On the General preferences page, clear the 'Connect over the Internet' option and click OK.

    4. Use the Load Data page to load the local annotation files and gene set files.
    5. On the Run GSEA page, select the local annotation files and gene set files rather than using the files from the GSEA website.

"No probe called" error

Problem: When you run GSEA, sometimes the following errors appear in the log file:

ERROR - No Probe called: USP9X /// USP9Y on this chip (chip name is >GENE_SYMBOL<)
ERROR - Turning off subsequent error notifications

Solution: You can ignore these errors. The three slashes (///) indicate that the chip file contains an ambiguous mapping, typical of Affymetrix notation, where a probe set on the chip cannot be mapped to exactly one HUGO gene symbol. GSEA reports the error and ignores such ambiguous probes.
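To see in advance which entries GSEA will skip, you can look for the /// separator yourself. The following is a minimal sketch, assuming the symbol field has already been read from the chip file as a plain string; the function names are hypothetical and not part of GSEA.

```python
def split_ambiguous_symbols(symbol_field):
    """Split a symbol field such as 'USP9X /// USP9Y' into its parts."""
    return [part.strip() for part in symbol_field.split("///") if part.strip()]


def is_ambiguous(symbol_field):
    """True when the field names more than one gene symbol."""
    return len(split_ambiguous_symbols(symbol_field)) > 1


print(split_ambiguous_symbols("USP9X /// USP9Y"))
print(is_ambiguous("TP53"))
```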

Avoid collapsing ranked list of features to gene symbols

Collapsing a dataset to symbols means that GSEA takes the expression dataset and collapses probes to gene symbols before computing the ranking metric values. In this mode, GSEA has two ways to handle multiple expression values that correspond to the same gene symbol: by default it retains the maximal expression value; alternatively, it uses the median expression value. Both choices are reasonable for gene expression values.

In Pre-Ranked mode, however, GSEA is given ranks already computed by an unspecified procedure. With "Collapse dataset to gene symbols"="true", the GSEAPreranked tool always picks the largest positive value among the several ranking metric values for the same gene. This can produce unanticipated results, because the assumptions that hold for gene expression do not necessarily apply to an arbitrary ranking metric in a pre-ranked list, and the resulting ordered list might differ substantially from the input values. Therefore, collapsing a ranked list is appropriate if and only if all of its features are unique and have a one-to-one correspondence to human gene symbols.

We thus recommend making the ranked list with human gene symbols as gene identifiers and running GSEAPreranked with the parameter "Collapse dataset to gene symbols"="false".
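The effect described above can be shown with a small worked example. This is a sketch that assumes the collapse simply keeps the largest value per symbol, as the text describes; it is not the GSEA implementation, and the gene scores are made up for illustration.

```python
from collections import Counter


def collapse_by_max(ranked):
    """Keep the largest ranking value for each gene symbol."""
    best = {}
    for symbol, score in ranked:
        if symbol not in best or score > best[symbol]:
            best[symbol] = score
    return best


def duplicated_symbols(ranked):
    """Return the symbols that occur more than once, sorted by name."""
    counts = Counter(symbol for symbol, _ in ranked)
    return sorted(symbol for symbol, n in counts.items() if n > 1)


# Made-up ranked list in which TP53 appears twice.
ranked = [("TP53", -3.2), ("TP53", 0.4), ("MYC", 2.1)]
print(duplicated_symbols(ranked))  # ['TP53']
print(collapse_by_max(ranked))     # {'TP53': 0.4, 'MYC': 2.1}
```

Here TP53 was strongly negative in the input but ends up mildly positive after collapsing, so it moves to a different part of the ranked list. Checking that duplicated_symbols returns an empty list before running GSEAPreranked is one way to confirm that the list is safe to use with collapsing turned off.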


Browse MSigDB doesn't load custom database XML files

Problem: You have created your own custom database XML file but it does not load into the GSEA browser.
Cause: This functionality is not yet implemented. The browser can load only the msigdb_v2.5.xml or msigdb_v3.xml files from our FTP server. Any other path or URL generates an error.

Solution: To use your own gene sets, arrange them as GMT or GMX files as described in the data formats documentation.
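As a rough guide to the GMT layout, the sketch below writes gene sets as tab-delimited lines, one set per line: the set name, a description, and then the gene symbols. The function names, the set name, and the "na" placeholder description are assumptions for illustration; consult the data formats documentation for the authoritative description.

```python
def format_gmt_line(name, description, genes):
    """Format one gene set as a tab-delimited GMT line."""
    fields = [name, description, *genes]
    if not genes:
        raise ValueError(f"gene set {name!r} has no genes")
    if any("\t" in field or "\n" in field for field in fields):
        raise ValueError("fields must not contain tabs or newlines")
    return "\t".join(fields)


def write_gmt(path, gene_sets):
    """Write a mapping of set name -> (description, genes) to a GMT file."""
    with open(path, "w", newline="\n") as handle:
        for name, (description, genes) in gene_sets.items():
            handle.write(format_gmt_line(name, description, genes) + "\n")


# Hypothetical custom gene set.
print(format_gmt_line("MY_CUSTOM_SET", "na", ["TP53", "MYC", "EGFR"]))
```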


Error in memory.size

Problem: When running the example programs provided for R, the following error occurs:

[1] " *** Running GSEA Analysis..."
Error in memory.size(size) : don't be silly!: your machine has a 4Gb address limit

Cause: This is produced by the following line early in the GSEA.1.R file:

memory.limit(6000000000)

This line sets the memory limit to a large value as a workaround for a platform problem in an earlier R version.

Solution: The easiest fix is just to comment out that line:

# memory.limit(6000000000)

With the line commented out, R uses its default memory limit. If the program then runs out of memory, change the line to set the limit to the maximum size, in megabytes, available on your machine:

memory.limit(<maximum size in MB available>)

16 warnings on R version 2.5 or higher

Problem: When running the example programs provided for R, the following warnings occur:

1: '\%' is an unrecognized escape in a character string
2: unrecognized escape removed from "Tag \%"
3: '\%' is an unrecognized escape in a character string
4: unrecognized escape removed from "Gene \%"
5: '\%' is an unrecognized escape in a character string
6: unrecognized escape removed from "\%"
7: '\%' is an unrecognized escape in a character string
8: unrecognized escape removed from " \%)"
9: '\.' is an unrecognized escape in a character string
10: '\.' is an unrecognized escape in a character string
11: unrecognized escapes removed from "\.report\."
12: '\.' is an unrecognized escape in a character string
13: '\.' is an unrecognized escape in a character string
14: unrecognized escapes removed from "\.report\."
15: '\.' is an unrecognized escape in a character string
16: unrecognized escape removed from "\."

Solution: These warnings occur when you have R version 2.5 or higher installed. To fix them, remove the single backslashes in front of these characters; there is no need to escape them in R versions 2.5 and higher.
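If you prefer not to edit the R scripts by hand, the change can be scripted. The following is a minimal sketch, assuming each line of the R source is processed as a plain string; the function name is hypothetical. It removes a single backslash in front of % or . and leaves doubled backslashes (such as "\\.") untouched. It does not check whether the match sits inside a string literal, so review the result before using it.

```python
import re

# A backslash before % or . that is not itself preceded by a backslash.
_UNNEEDED_ESCAPE = re.compile(r"(?<!\\)\\([%.])")


def strip_unneeded_escapes(r_source_line):
    """Drop single backslashes in front of % and . in a line of R source."""
    return _UNNEEDED_ESCAPE.sub(r"\1", r_source_line)


print(strip_unneeded_escapes(r'"Tag \%"'))
print(strip_unneeded_escapes(r'"\.report\."'))
```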

GSEA on Linux

Browser links do not work under Linux

Problem: When running the GSEA desktop application under Linux, buttons and links that would normally open a browser window do not open the browser window.

Work-around: After running an analysis, you cannot click on the Success link to display the result. However, you can go to the directory that contains the analysis report output and open the index.html file in that directory.
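Locating the report can also be scripted. The sketch below assumes only what the work-around states, namely that the analysis output directory contains an index.html file; the function name and the directory path in the comment are hypothetical.

```python
from pathlib import Path


def find_report_indexes(output_dir):
    """Return index.html files under output_dir, newest first."""
    matches = Path(output_dir).rglob("index.html")
    return sorted(matches, key=lambda p: p.stat().st_mtime, reverse=True)


# Example use (hypothetical path), opening the newest report in a browser:
#   import webbrowser
#   reports = find_report_indexes("/home/me/gsea_home/output")
#   if reports:
#       webbrowser.open(reports[0].resolve().as_uri())
```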
