Known Issues

From GeneSetEnrichmentAnalysisWiki

Revision as of 07:50, 25 September 2016 by Eby (Talk | contribs)


GSEA version 2

Java heap space / OutOfMemoryError

Problem: When running an analysis in GSEA, the following error occurs:

---- Stack Trace ----

# of exceptions: 1

Java heap space------
java.lang.OutOfMemoryError: Java heap space

Cause: Either GSEA was started with too little memory allocated to Java, or the analysis needs more memory than your machine can provide.

Solution: Try one or more of the following:


  1. Start GSEA by clicking the Launch button on the Downloads page of the GSEA web site, and choose the option with the largest memory allocation that your Java installation supports. As of Jan 18, 2014, the options are:
    • 1GB (for 32 or 64-bit Java)
    • 2GB (for 64-bit Java only)
    • 4GB (for 64-bit Java only)
  2. Run GSEA on a more powerful computer.
  3. Use no more than 1,000 permutations.
  4. Use individual collections, or even subcollections, of gene sets instead of all gene sets from the entire MSigDB database.
  5. First collapse gene identifiers to symbols using the Chip2Chip tool, then run GSEA on the collapsed dataset.
    When running GSEA on the collapsed dataset, make sure that 'Collapse dataset(s)' = false.
  6. First create a rank-ordered list of genes outside GSEA, then run GSEA on the ranked list using the GSEAPreranked tool.
  7. Run GSEA from the command line and use the -Xmx option to specify a sufficient maximum amount of memory for the program.
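
The last option can be sketched as follows. This is only an illustration, assuming that you have a GSEA JAR file on disk and that it can be started with `java -jar`; the file name `gsea2.jar` and the helper `build_gsea_command` are placeholders, not names taken from the GSEA distribution. `-Xmx` itself is a standard JVM option that sets the maximum heap size.

```python
import shlex
from typing import List


def build_gsea_command(jar_path: str, heap: str = "4g") -> List[str]:
    """Build a java command line that starts a JAR with a larger maximum heap.

    jar_path is a placeholder for wherever your GSEA JAR lives.
    heap uses the JVM's own syntax, e.g. "1g", "2g", "4g" or "1024m".
    """
    # -Xmx<size> sets the maximum Java heap size for this JVM process.
    return ["java", f"-Xmx{heap}", "-jar", jar_path]


# "gsea2.jar" is a hypothetical file name used only for illustration.
cmd = build_gsea_command("gsea2.jar", heap="2g")
print(shlex.join(cmd))  # prints: java -Xmx2g -jar gsea2.jar
```

The printed line can be pasted into a terminal, or the list can be passed to `subprocess.run`. A 32-bit Java installation will typically not accept heap sizes much above 1 GB, which matches the launch options listed above.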


Firewall / FTP connection issues

Problem: When you try to access the GMT gene set data files, the CHIP annotation files, or the MSigDB Browser, you see an error such as:

Error listing Broad website
Connection rese

Cause: This usually happens when your computer is behind a firewall or another network configuration that prevents it from accessing FTP servers. The Broad CHIP files and gene sets are on a public Broad FTP server. The GSEA program tries to connect to the FTP site and use the files, but the network configuration blocks access.

Solution: Try the following:


  1. In the navigation bar at the top of the GSEA program window, go to Options and make sure that Connect over the internet is checked.
  2. See if you can temporarily disable your firewall when using GSEA.
  3. Consult with your local network administrator to see if they have any suggestions or prior experience with the FTP access issues.
  4. Download the gene set (GMT) and CHIP files to your local file system as follows:
    1. Go to the GSEA Downloads page
    2. Scroll down the table to Other resources and click on download zip file.

      This file contains all data files for the latest version of MSigDB. To get older, archival releases, scroll up and click the link to download the corresponding zip file. Unzip the file.

    3. Start GSEA. In the navigation bar at the top of the GSEA program window, go to Options and turn off the internet connection mode.
    4. Use the Load Data page to load the local CHIP and GMT files.
    5. On the Run GSEA page, select the local annotation files and gene set files rather than using the files from the GSEA website.
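
Before changing firewall settings, it can help to confirm that outbound FTP is in fact blocked from your machine. The following is a minimal sketch, not part of GSEA: the function name `can_reach` is made up for this example, the host name must be supplied by you (this page does not give the FTP server's address), and port 21 is assumed because it is the standard FTP control port.

```python
import socket


def can_reach(host: str, port: int = 21, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only tests the initial connection. FTP data transfers use
    additional connections, so True does not guarantee downloads work.
    """
    try:
        # create_connection resolves the host name and connects, raising
        # OSError on refusal, timeout, or a failed name lookup.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_reach("ftp.example.org")` (a placeholder host) returns False on a network that blocks outbound FTP. If the check fails, the local-file procedure in step 4 avoids the FTP server entirely.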

"No probe called" error

Problem: When you run GSEA, sometimes the following errors appear in the log file:

ERROR - No Probe called: USP9X /// USP9Y on this chip (chip name is >GENE_SYMBOL<)
ERROR - Turning off subsequent error notifications

Solution: You can ignore these errors. The three slashes (///) indicate that the chip file contains ambiguous mappings, typical of Affymetrix notation, where a probe set on the chip cannot be mapped to exactly one HUGO gene symbol. GSEA displays this error and ignores such ambiguous probes.
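
If you want to know how many identifiers are affected before running GSEA, the ambiguous entries can be listed with a few lines of code. This is an illustrative sketch only; `find_ambiguous_symbols` is a hypothetical helper, and it assumes you have already read the symbol column of your chip file or dataset into a list of strings.

```python
from typing import Iterable, List


def find_ambiguous_symbols(symbols: Iterable[str]) -> List[str]:
    """Return the entries that contain the '///' ambiguity separator."""
    return [s for s in symbols if "///" in s]


# Example input, using the identifier from the error message above.
symbols = ["TP53", "USP9X /// USP9Y", "MYC"]
print(find_ambiguous_symbols(symbols))  # prints: ['USP9X /// USP9Y']
```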

Avoid collapsing ranked list of features to gene symbols

Collapsing a dataset to symbols means that GSEA takes the expression dataset and collapses probes to symbols before computing the ranking metric values. When done this way, GSEA has two ways to handle multiple expression values that correspond to the same gene symbol: by default it retains the maximum expression value; alternatively, it uses the median expression value. Both choices are reasonable when applied to gene expression values.

In Pre-Ranked mode, however, GSEA is given ranks already computed by an unspecified procedure. With "Collapse dataset to gene symbols"="true", the GSEAPreranked tool always picks the largest positive value among the ranking metric values for the same gene. This can produce unanticipated results, because the assumptions that hold for gene expression do not necessarily apply to an arbitrary ranking metric in the pre-ranked list, and the resulting ordered ranked list might differ substantially from the input values. Therefore, collapsing a ranked list is appropriate if and only if all its features are unique and have a one-to-one correspondence to human gene symbols.

We thus recommend making the ranked list with human gene symbols as gene identifiers and running GSEAPreranked with the parameter "Collapse dataset to gene symbols"="false".
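
One way to follow this recommendation is to resolve duplicate gene symbols yourself, with a rule that suits your ranking metric, before handing the list to GSEAPreranked. The sketch below is an assumption-laden illustration, not GSEA code: the helpers `collapse_ranked` and `to_rnk` are hypothetical, the "largest absolute value" rule is just one possible choice, and the output is assumed to be a two-column, tab-delimited ranked list (gene symbol, score).

```python
from typing import Callable, Dict, Iterable, List, Tuple


def collapse_ranked(
    rows: Iterable[Tuple[str, float]],
    key: Callable[[float], float] = abs,
) -> List[Tuple[str, float]]:
    """Keep one score per gene symbol: the score that maximizes key(score).

    The default key=abs keeps the score with the largest magnitude.
    The result is sorted by score, highest first.
    """
    best: Dict[str, float] = {}
    for gene, score in rows:
        if gene not in best or key(score) > key(best[gene]):
            best[gene] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


def to_rnk(ranked: Iterable[Tuple[str, float]]) -> str:
    """Format (gene, score) pairs as tab-delimited lines."""
    return "\n".join(f"{gene}\t{score}" for gene, score in ranked) + "\n"


rows = [("TP53", 1.0), ("TP53", -5.0), ("MYC", 2.5)]
print(collapse_ranked(rows))                     # [('MYC', 2.5), ('TP53', -5.0)]
print(collapse_ranked(rows, key=lambda s: s))    # [('MYC', 2.5), ('TP53', 1.0)]
```

The second call mimics the "largest positive value" behaviour described above: the strongly negative score for TP53 is discarded in favour of a weak positive one, which is the kind of unanticipated result the recommendation is meant to avoid.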


Browse MSigDB doesn't load custom database XML files

Problem: You have created your own custom database XML file but it does not load into the GSEA browser.

Cause: This functionality is not yet implemented. The browser can load only the msigdb_v2.5.xml or msigdb_v3.xml files from our FTP server. Anything else in the path or URL will generate an error.

Solution: To use your own gene sets, please arrange them as GMT or GMX files as described here.
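
As a rough illustration of the GMT layout, each row holds one gene set: a name, a description, and then the member genes, all separated by tabs. The sketch below assumes that layout; `format_gmt` and the example set are hypothetical and not part of GSEA.

```python
from typing import Dict, List, Tuple


def format_gmt(gene_sets: Dict[str, Tuple[str, List[str]]]) -> str:
    """Format gene sets as GMT text.

    gene_sets maps a set name to (description, list of gene symbols).
    Each output line is: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    lines = []
    for name, (description, genes) in gene_sets.items():
        lines.append("\t".join([name, description, *genes]))
    return "\n".join(lines) + "\n"


# Hypothetical custom gene set used only for illustration.
custom = {"MY_SET": ("my custom set", ["TP53", "MYC", "USP9X"])}
gmt_text = format_gmt(custom)
```

Writing `gmt_text` to a file with a `.gmt` extension gives a file that can be loaded from the Load Data page like any other local gene set file.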


Error in memory.size

Problem: When running the example programs provided for R, the following error occurs:

[1] " *** Running GSEA Analysis..."
Error in memory.size(size) : don't be silly!: your machine has a 4Gb address limit

Cause: This is produced by the following line early in the GSEA.1.R file:

memory.limit(6000000000)

This line sets the memory limit to a large size as a workaround for a platform problem with an earlier R version.

Solution: The easiest fix is just to comment out that line:

# memory.limit(6000000000)

This will allocate the default amount of memory. If the program runs out of memory after this change, replace the line with a call that sets the limit to the maximum size, in megabytes, available on your machine:

memory.limit(max. size in Mbytes available)

16 warnings on R version 2.5 or higher

Problem: When running the example programs provided for R, the following warnings occur:

1: '\%' is an unrecognized escape in a character string
2: unrecognized escape removed from "Tag \%"
3: '\%' is an unrecognized escape in a character string
4: unrecognized escape removed from "Gene \%"
5: '\%' is an unrecognized escape in a character string
6: unrecognized escape removed from "\%"
7: '\%' is an unrecognized escape in a character string
8: unrecognized escape removed from " \%)"
9: '\.' is an unrecognized escape in a character string
10: '\.' is an unrecognized escape in a character string
11: unrecognized escapes removed from "\.report\."
12: '\.' is an unrecognized escape in a character string
13: '\.' is an unrecognized escape in a character string
14: unrecognized escapes removed from "\.report\."
15: '\.' is an unrecognized escape in a character string
16: unrecognized escape removed from "\."

Solution: These warnings occur with R version 2.5 and higher. To fix them, remove the single backslashes in front of the % and . characters in the affected strings; these characters do not need to be escaped in R 2.5 and higher.
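
If you prefer not to edit the strings by hand, the backslashes can be removed with a small script. The following is a sketch under stated assumptions: `strip_unneeded_escapes` is a hypothetical helper, it only touches a single backslash directly before % or ., and it leaves doubled backslashes (such as `\\.`) alone. Review the result before using it, since the pattern is applied to the whole file and not only to string literals.

```python
import re

# A backslash that is not itself preceded by a backslash,
# followed by either '%' or '.'.
_UNNEEDED_ESCAPE = re.compile(r"(?<!\\)\\([%.])")


def strip_unneeded_escapes(r_source: str) -> str:
    """Remove single backslashes in front of '%' and '.' in R source text."""
    return _UNNEEDED_ESCAPE.sub(r"\1", r_source)


# The R fragments below are written as Python raw strings.
print(strip_unneeded_escapes(r'"Tag \%"'))      # prints: "Tag %"
print(strip_unneeded_escapes(r'"\.report\."'))  # prints: ".report."
```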

GSEA on Linux

Browser links do not work under Linux

Problem: When running the GSEA desktop application under Linux, buttons and links that would normally open a browser window do not open the browser window.

Work-around: After running an analysis, you cannot click on the Success link to display the result. However, you can go to the directory that contains the analysis report output and open the index.html file in that directory.
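
The work-around can also be scripted. This is a minimal sketch, assuming only that the report directory contains an index.html file as described above; `report_index` and `open_report` are hypothetical helper names, and whether a browser actually opens depends on your desktop configuration.

```python
import webbrowser
from pathlib import Path


def report_index(report_dir: str) -> str:
    """Return a file:// URI for index.html inside the given report directory."""
    index = Path(report_dir) / "index.html"
    if not index.is_file():
        raise FileNotFoundError(f"no index.html found in {report_dir}")
    # resolve() makes the path absolute, which as_uri() requires.
    return index.resolve().as_uri()


def open_report(report_dir: str) -> bool:
    """Ask the default browser to open the report; returns webbrowser's result."""
    return webbrowser.open(report_index(report_dir))
```

Calling `open_report` with the path of your analysis output directory hands the report to the system's default browser. If no browser starts, printing the value returned by `report_index` gives a URI that can be pasted into a browser's address bar.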
