Difference between revisions of "Known Issues"

From GeneSetEnrichmentAnalysisWiki
 
(119 intermediate revisions by 3 users not shown)
Line 1: Line 1:
<a href="http://www.broad.mit.edu/gsea/">GSEA Home</a> <a href="../../software/software_index.html">Software</a> | <a href="../../msigdb/msigdb_index.html">MSigDB</a> | [[Main_Page|Documentation]] | <a href="../../resources/resources_index.html">Resources</a><br />
+
[http://www.broadinstitute.org/gsea/ GSEA Home] |
<br />
+
[http://www.broadinstitute.org/gsea/downloads.jsp Downloads] |
<br />
+
[http://www.broadinstitute.org/gsea/msigdb/ Molecular Signatures Database] |
<h3>Error in memory.size when running GSEA-R</h3>
+
[http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Main_Page Documentation] |
<p><strong>Problem</strong>: When running the example programs provided for R, the following error occurs:<br />
+
[http://www.broadinstitute.org/gsea/contact.jsp Contact]
 +
<br>
 +
 
 +
<h1>GSEA version 2</h1>
 +
<h3>Java heap space / OutOfMemoryError </h3>
 +
<p><strong>Problem</strong>: When running an analysis in GSEA, the following error occurs:</p>
 +
<p><tt>        ---- Stack Trace ----<br />
 +
# of exceptions: 1<br />
 +
------Java heap space------<br />
 +
java.lang.OutOfMemoryError: Java heap space        </tt> </p>
 +
<p><strong>Cause</strong>: The error is either due to improper memory allocation,        or because you have reached the limits on your machine. </p>
 +
<p><strong>Solutions</strong>:        </p>
 +
<ol>
 +
    <li>Start GSEA by clicking the <span style="font-variant: small-caps;">Launch</span> button on the [http://www.broadinstitute.org/gsea/downloads.jsp Downloads] page of the GSEA web site and choose an option with a larger memory allocation. As of Jan 18, 2014, the options are:
 +
<ul><li>1GB (for 32 or 64-bit java)</li>
 +
<li>2GB (for 64-bit java only)</li>
 +
<li>4GB (for 64-bit java only)</li>
 +
</ul>
 +
</li>
 +
    <li>Run GSEA on a more powerful computer.</li>
 +
    <li>Use no more than 1,000 permutations.</li>
 +
    <li>Use individual collections or even subcollections of gene sets instead of all gene sets from the entire MSigDB database.</li>
 +
    <li>First, collapse gene identifiers to symbols using the [http://www.broadinstitute.org/gsea/doc/GSEAUserGuideTEXT.htm#_Chip2Chip_Page Chip2Chip] tool, then run GSEA on the collapsed data set.<br />
 +
    When running GSEA on the collapsed dataset, make sure that <tt>'Collapse dataset(s)' = false</tt></li>
 +
    <li>First, create a rank-ordered list of genes outside GSEA, then run GSEA on the ranked list using the [http://www.broadinstitute.org/gsea/doc/GSEAUserGuideTEXT.htm#_GSEAPreranked_Page GSEAPreranked tool].</li>
 +
  <li>Use the <tt>-Xmx</tt> option to specify a sufficient maximum amount of memory for the program by [http://www.broadinstitute.org/gsea/doc/GSEAUserGuideTEXT.htm#_Running_GSEA_from running GSEA from the command line].</li>
 +
</ol>
 +
<p>&nbsp;</p>
 +
<hr width="100%" size="2" hr="" />
 +
 
 +
<h3>Firewall / FTP connection issues</h3>
 +
<p><strong>Problem</strong>: When you try to access the <strong>GMT</strong> gene set data files, <strong>CHIP</strong> annotation files or the <strong>MSigDB Browser</strong>, you see an error to the effect: </p>
 +
<p><font color="#ff0000">Error listing Broad website<br>Connection reset</font></p>
 +
<p><strong>Cause</strong>:  This usually happens when your computer is behind a firewall or another network configuration that prevents it from accessing FTP servers. The Broad chip files and gene sets are on a public Broad FTP server. The GSEA program tries to connect to the FTP site and use the files but the network configuration blocks the access.</p>
 +
<p><strong>Solutions:</strong><br />
 
</p>
 
</p>
<p> [1] &quot; *** Running GSEA Analysis...&quot;<br />
+
<ol>
Error in memory.size(size) : don't be silly!: your machine has a 4Gb address limit<br />
+
    <li>In the navigation bar at the top of the GSEA program window, go to <strong>Options</strong> and make sure that <strong>Connect over the internet</strong> is checked.</li>
<br />
+
    <li> See if you can temporarily disable your firewall when using GSEA.</li>
<strong>Solution</strong>: This is produced by the following line early in the GSEA.1.R file:<br />
+
    <li>Consult with your local network administrator to see if they have any suggestions or prior experience with FTP access issues.</li>
 +
    <li>Download the gene set (GMT) and CHIP files to your local file system as follows:
 +
    <ol>
 +
        <li>Go to the [http://www.broadinstitute.org/gsea/downloads.jsp GSEA Downloads] page</li>
 +
        <li>Scroll down the table to <strong>Other resources</strong> and click on <strong>download zip file</strong>.
 +
<p>This file contains all data files for the latest version of MSigDB. To get older, archival releases, scroll up and click the link to download the corresponding zip file. Unzip the file.</p>
 +
        <li>Start GSEA. In the navigation bar at the top of the GSEA program window, go to <strong>Options</strong> and turn off the internet connection mode.</li>
 +
        <li>Use the Load Data page to load the local CHIP and GMT files.</li>
 +
        <li>On the Run GSEA page, select the local annotation files and gene set files rather than using the files from the GSEA website.</li>
 +
    </ol>
 +
    </li>
 +
</ol>
 +
<hr width="100%" size="2" />
 +
 
 +
<h3>&quot;No probe called&quot; error</h3>
 +
<p><strong>Problem</strong>: When you run GSEA, sometimes the following errors appear in the log file:</p>
 +
<p><tt>ERROR - No Probe called: USP9X /// USP9Y on this chip (chip name is &gt;GENE_SYMBOL&lt;)<br />
 +
ERROR - Turning off subsequent error notifications</tt></p>
 +
<p><strong>Solution</strong>: You can ignore these errors. The three slashes (<tt>///</tt>) indicate that the chip file contains ambiguous mappings, typical for Affymetrix notation, where a probe set on the chip cannot be mapped to exactly one HUGO gene symbol. GSEA displays this error and ignores such ambiguous probes.
 
</p>
 
</p>
<pre> memory.limit(6000000000)</pre>
+
<h3>Avoid collapsing ranked list of features to gene symbols</h3>
<p> This line set the memory limit to a large size as a work around to a platform problem with an earlier R version. <br />
+
<p>Collapsing a dataset to symbols means that GSEA takes the expression dataset and collapses probes to symbols <strong>before</strong> computing the ranking metric values. When done this way, GSEA has two ways to deal with multiple expression values corresponding to the same gene symbol. By default, it retains the maximal expression value; alternatively, it uses the median expression value. Both choices make reasonable sense when applied to gene expression values. In the Pre-Ranked mode, however, GSEA is faced with ranks <strong>already computed</strong> by an unspecified procedure. With <tt>"Collapse dataset to gene symbols"="true"</tt>, the Pre-Ranked GSEA tool will <strong>always</strong> pick the largest positive value among several instances of ranking metric values for the same gene. This can sometimes produce unanticipated results because the original assumptions for gene expression do not necessarily apply to an arbitrary ranking metric in the pre-ranked list, so the ordered ranked list might differ substantially from the input values. Therefore, collapsing a ranked list is appropriate if and only if all its features are unique and have a one-to-one correspondence to human gene symbols. </p>
The easiest fix is just to comment out that line:<br />
+
<p><strong><font color="red">We thus recommend making the ranked list with human gene symbols as gene identifiers and running GSEAPreranked  with the parameter </font><tt>"Collapse dataset to gene symbols"="false"</tt></strong>. </p>
 +
<p>&nbsp;</p>
 +
 
 +
<h3>Browse MSigDB doesn't load custom database XML files</h3>
 +
<p><strong>Problem</strong>: You have created your own custom database XML file but it does not load into the GSEA browser.<br/>
 +
 
 +
At this time, this functionality is not yet implemented. The browser can load only msigb_v2.5.xml or msigdb_v3.xml files from our FTP server. Anything else in the path or URL will generate an error.</p>
 +
 
 +
<p><strong>Solution</strong>: To use your own gene sets, please arrange them as GMT or GMX files as described [http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#Gene_Set_Database_Formats here].</p>
 +
 
 +
 
 +
 
 +
 
 +
<h1>GSEA-R</h1>
 +
<h3>Error in memory.size</h3>
 +
<p><strong>Problem</strong>: When running the example programs provided for R, the following error occurs: </p>
 +
<p><tt>        [1] &quot; *** Running GSEA Analysis...&quot;<br />
 +
Error in memory.size(size) : don't be silly!: your machine has a 4Gb address limit        </tt> </p>
 +
<p><strong>Cause</strong>: This is produced by the following line early in the GSEA.1.R file:</p>
 +
<p><tt> memory.limit(6000000000)</tt></p>
 +
<p> This line sets the memory limit to a large size as a workaround for a platform problem with an earlier R version.</p>
 +
<p><strong>Solution</strong>: The easiest fix is just to comment out that line:<br />
 
</p>
 
</p>
<pre> #      memory.limit(6000000000)</pre>
+
<p><tt> #      memory.limit(6000000000)</tt></p>
<p> This will allocate the default amount of memory. If after this change the program runs out of memory, change the line to:<br />
+
<p> This will allocate the default amount of memory. If after this change the program runs out of memory, change the line to:</p>
</p>
+
<p><tt> memory.limit(max. size in Mbytes available)</tt></p>
<pre> memory.limit(max. size in Mbytes available)<br /><br /><br /></pre>
 
<hr width="100%" size="2" />
 
<h3>Firewall / FTP connection issues for CHIP annotations or Gene Set Databases with (GSEA v2)<br />
 
</h3>
 
<strong>Problem</strong>: When you try to access the <strong>CHIP</strong> annotation files or the <strong>Gene Set Database</strong> / <strong>MSigDB Browser</strong> you see an error to the effect: <font color="#ff0000">&quot;Error listing Broad website//&nbsp; Connection reset//&quot;</font><br />
 
<br />
 
<strong>Cause</strong>: This is probably because you are behind a network firewall or some other network configuration that prevents you from accessing FTP servers on port 500. The Broad chip files and gene sets are placed on a publicly accessible Broad FTP server. The GSEA Java Desktop program tries to access the Broad FTP site to provide you easy access to the files but the network configuration blocks access.<br />
 
<br />
 
<strong>Work-around:</strong> (1) See if you can temporarily disable your firewall when using GSEA (2) Consult with your local network administrator to see if they have any suggestions or prior experience with such issues (3) <strong>Download </strong>the .CHIP, GeneSet databases and MSigDB XML file from the link below to your local file system. Expand it with WinZIP and then load the files into the program as *local files* rather than over the network.<br />
 
<pre>This large ZIP file contains ALL current (as of March 19, 2007) .CHIP annotations, GENE_SET databases and MSigDB.xml file:<br /><br /><a href="http://www.broad.mit.edu/cancer/software/gsea/resources/files_to_download_locally_on_firewall_issues.zip">www.broad.mit.edu/cancer/software/gsea/resources/files_to_download_locally_on_firewall_issues.zip</a><br /><br />We are working on a URL based access for the next release.<br /><br /></pre>
 
 
<hr width="100%" size="2" />
 
<hr width="100%" size="2" />
<h3>Error running leading edge analysis (GSEA v2)<br />
+
 
</h3>
+
<h3>16 warnings on R version 2.5 or higher</h3>
<strong>Problem</strong>: When you select a report for leading edge analysis, the following error sometimes occurs:<br />
+
<p><strong>Problem: </strong>When running the example programs provided for R, the following warnings occur:</p>
<br />
+
<p><tt>              1: '\%' is an unrecognized escape in a character string<br />
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp; java.lang.NullPointerException<br />
+
2: unrecognized escape removed from &quot;Tag \%&quot;<br />
at org.genepattern.gsea.LeadingEdgeWidget.setData(EIKM)<br />
+
3: '\%' is an unrecognized escape in a character string<br />
at xapps.gsea.LeadingEdgeReportWidget.setData(EIKM)<br />
+
4: unrecognized escape removed from &quot;Gene \%&quot;<br />
at xapps.gsea.LeadingEdgeReportWidget$1.run(EIKM)<br />
+
5: '\%' is an unrecognized escape in a character string<br />
at java.lang.Thread.run(Unknown Source) <strong><br />
+
6: unrecognized escape removed from &quot;\%&quot;<br />
<br />
+
7: '\%' is an unrecognized escape in a character string<br />
Solution</strong>: Corrected in GSEA v2.0.1.<br />
+
8: unrecognized escape removed from &quot; \%)&quot;<br />
 +
9: '\.' is an unrecognized escape in a character string<br />
 +
10: '\.' is an unrecognized escape in a character string<br />
 +
11: unrecognized escapes removed from &quot;\.report\.&quot;<br />
 +
12: '\.' is an unrecognized escape in a character string<br />
 +
13: '\.' is an unrecognized escape in a character string<br />
 +
14: unrecognized escapes removed from &quot;\.report\.&quot;<br />
 +
15: '\.' is an unrecognized escape in a character string<br />
 +
16: unrecognized escape removed from &quot;\.&quot;        </tt> </p>
 +
<p><strong>Solution: </strong> These warnings occur when you have R version 2.5 or higher installed. To fix them, remove the single backslashes in front of these characters, as there is no need to escape them in R versions 2.5 and higher.
 
<hr width="100%" size="2" />
 
<hr width="100%" size="2" />
<h3>java.lang.OutOfMemoryError (GSEA v1)<br />
+
 
</h3>
+
<h1>GSEA on Linux</h1>
<span style="font-weight: bold;">Problem</span>: On the Mac, you can run GSEA from the command line, but when you attempt to use the GSEA application from the desktop you receive errors similar to the following:<br />
+
<h3>Browser links do not work under Linux</h3>
<br />
+
<strong>Problem</strong>: When running the GSEA desktop application under Linux, buttons and links that would normally open a browser window do not open the browser window.<br />
---- Full Error Message ---- <br />
 
na <br />
 
---- Stack Trace ---- <br />
 
# of exceptions: 1 <br />
 
------null------ <br />
 
java.lang.OutOfMemoryError <br />
 
<br />
 
<span style="font-weight: bold;">Solution</span>: Corrected in GSEA v2. In GSEA v1, this is a memory issue with the gsea installer on the Mac. As a workaround, use the following command to launch the GSEA application rather than double clicking the icon: <br />
 
<br />
 
java -Xmx1800m xapps.gsea.Main<br />
 
<br />
 
<hr width="100%" size="2" />
 
<h3>java.lang.NullPointerException (GSEA v1)<br />
 
</h3>
 
<span style="font-weight: bold;">Problem</span>: By default, a gene set enrichment analysis uses phenotype permutations. If you have too few samples for phenotype permutation, the following error occurs:<br />
 
<br />
 
---- Stack Trace ----<br />
 
# of exceptions: 1<br />
 
------null------<br />
 
java.lang.NullPointerException<br />
 
&nbsp;&nbsp;&nbsp; at edu.mit.broad.genome.alg.DatasetStatsCore.calc2ClassCategoricalMetricMarkerScores(DatasetStatsCore.java:236)<br />
 
&nbsp;&nbsp;&nbsp; at edu.mit.broad.genome.alg.markers.PermutationTestBuilder.&lt;init&gt;(PermutationTestBuilder.java:94)<br />
 
&nbsp;&nbsp;&nbsp; at edu.mit.broad.genome.alg.gsea.KSTests.shuffleTemplate_canned_templates(KSTests.java:360)<br />
 
&nbsp;&nbsp;&nbsp; at edu.mit.broad.genome.alg.gsea.KSTests.shuffleTemplate(KSTests.java:291)<br />
 
&nbsp;&nbsp;&nbsp; at edu.mit.broad.genome.alg.gsea.KSTests.executeGsea(KSTests.java:156)<br />
 
&nbsp;&nbsp;&nbsp; at edu.mit.broad.genome.alg.gsea.KSTests.executeGsea(KSTests.java:130)<br />
 
&nbsp;&nbsp;&nbsp; at xtools.gsea.AbstractGsea2Tool.execute_one(AbstractGsea2Tool.java:103)<br />
 
&nbsp;&nbsp;&nbsp; at xtools.gsea.AbstractGsea2Tool.execute_one_with_reporting(AbstractGsea2Tool.java:137)<br />
 
&nbsp;&nbsp;&nbsp; at xtools.gsea.Gsea.execute(Gsea.java:111)<br />
 
&nbsp;&nbsp;&nbsp; at edu.mit.broad.xbench.tui.TaskManager$ToolRunnable.run(TaskManager.java:468)<br />
 
&nbsp;&nbsp;&nbsp; at java.lang.Thread.run(Unknown Source)<br />
 
 
<br />
 
<br />
<span style="font-weight: bold;">Solution</span>: Corrected in GSEA v2. In GSEA v1, use gene_set permutation rather than phenotype permutation. For more information, see the description of the <em>Permutation type</em> parameter on the [http://www.broad.mit.edu/cancer/software/gsea/doc/GSEAUserGuideFrame.htm?Run_GSEA_Page Run GSEA Page] in the <em style="">GSEA User Guide</em>.
+
<strong>Work-around</strong>: After running an analysis, you cannot click on the Success link to display the result. However, you can go to the directory that contains the analysis report output and open the index.html file in that directory.

Latest revision as of 03:50, 25 September 2016

GSEA Home | Downloads | Molecular Signatures Database | Documentation | Contact

GSEA version 2

Java heap space / OutOfMemoryError

Problem: When running an analysis in GSEA, the following error occurs:

---- Stack Trace ----
# of exceptions: 1
------Java heap space------
java.lang.OutOfMemoryError: Java heap space

Cause: The error is either due to improper memory allocation, or because you have reached the limits on your machine.

Solutions:

  1. Start GSEA by clicking the Launch button on the Downloads page of the GSEA web site and choose an option with a larger memory allocation. As of Jan 18, 2014, the options are:
    • 1GB (for 32 or 64-bit java)
    • 2GB (for 64-bit java only)
    • 4GB (for 64-bit java only)
  2. Run GSEA on a more powerful computer.
  3. Use no more than 1,000 permutations.
  4. Use individual collections or even subcollections of gene sets instead of all gene sets from the entire MSigDB database.
  5. First, collapse gene identifiers to symbols using the Chip2Chip tool, then run GSEA on the collapsed data set.
    When running GSEA on the collapsed dataset, make sure that 'Collapse dataset(s)' = false
  6. First, create a rank-ordered list of genes outside GSEA, then run GSEA on the ranked list using the GSEAPreranked tool.
  7. Use the -Xmx option to specify a sufficient maximum amount of memory for the program by running GSEA from the command line.
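
As a rough illustration of the last solution, the sketch below assembles a Java command line with an explicit heap ceiling. It is only a sketch under stated assumptions: the jar name gsea.jar is hypothetical, and the main class xapps.gsea.Main is taken from an older revision of this page, so check the GSEA User Guide for the class name and classpath that match your version.

```python
def build_gsea_command(heap_mb, classpath, main_class="xapps.gsea.Main"):
    """Return an argv list that starts the JVM with a maximum heap of heap_mb megabytes.

    classpath and main_class are assumptions for illustration only.
    """
    if heap_mb <= 0:
        raise ValueError("heap_mb must be a positive number of megabytes")
    # -Xmx<N>m sets the maximum Java heap size to N megabytes.
    return ["java", f"-Xmx{heap_mb}m", "-cp", classpath, main_class]


# Hypothetical jar name; replace it with the path to your GSEA jar file.
command = build_gsea_command(2048, "gsea.jar")
print(" ".join(command))  # prints: java -Xmx2048m -cp gsea.jar xapps.gsea.Main
```

The resulting list can be passed to subprocess.run, or the printed line can be pasted into a terminal. Pick a heap value that fits within the physical memory of the machine.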

 


Firewall / FTP connection issues

Problem: When you try to access the GMT gene set data files, CHIP annotation files or the MSigDB Browser, you see an error to the effect:

Error listing Broad website
Connection reset

Cause: This usually happens when your computer is behind a firewall or another network configuration that prevents it from accessing FTP servers. The Broad chip files and gene sets are on a public Broad FTP server. The GSEA program tries to connect to the FTP site and use the files but the network configuration blocks the access.

Solutions:

  1. In the navigation bar at the top of the GSEA program window, go to Options and make sure that Connect over the internet is checked.
  2. See if you can temporarily disable your firewall when using GSEA.
  3. Consult with your local network administrator to see if they have any suggestions or prior experience with FTP access issues.
  4. Download the gene set (GMT) and CHIP files to your local file system as follows:
    1. Go to the GSEA Downloads page
    2. Scroll down the table to Other resources and click on download zip file.

      This file contains all data files for the latest version of MSigDB. To get older, archival releases, scroll up and click the link to download the corresponding zip file. Unzip the file.

    3. Start GSEA. In the navigation bar at the top of the GSEA program window, go to Options and turn off the internet connection mode.
    4. Use the Load Data page to load the local CHIP and GMT files.
    5. On the Run GSEA page, select the local annotation files and gene set files rather than using the files from the GSEA website.

"No probe called" error

Problem: When you run GSEA, sometimes the following errors appear in the log file:

ERROR - No Probe called: USP9X /// USP9Y on this chip (chip name is >GENE_SYMBOL<)
ERROR - Turning off subsequent error notifications

Solution: You can ignore these errors. The three slashes (///) indicate that the chip file contains ambiguous mappings, typical for Affymetrix notation, where a probe set on the chip cannot be mapped to exactly one HUGO gene symbol. GSEA displays this error and ignores such ambiguous probes.
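
If you want to see in advance which entries would trigger this message, a check along the following lines may help. It is a minimal sketch that assumes you have already read the gene symbol column of your chip file into a list; the function name is hypothetical and this is not how GSEA itself performs the check.

```python
AMBIGUOUS_SEPARATOR = "///"


def split_ambiguous(symbols):
    """Partition symbols into (unambiguous, ambiguous) lists, preserving input order."""
    unambiguous, ambiguous = [], []
    for symbol in symbols:
        # An entry such as "USP9X /// USP9Y" maps one probe set to several genes.
        if AMBIGUOUS_SEPARATOR in symbol:
            ambiguous.append(symbol)
        else:
            unambiguous.append(symbol)
    return unambiguous, ambiguous


kept, skipped = split_ambiguous(["TP53", "USP9X /// USP9Y", "BRCA1"])
print(kept)     # ['TP53', 'BRCA1']
print(skipped)  # ['USP9X /// USP9Y']
```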

Avoid collapsing ranked list of features to gene symbols

Collapsing a dataset to symbols means that GSEA takes the expression dataset and collapses probes to symbols before computing the ranking metric values. When done this way, GSEA has two ways to deal with multiple expression values corresponding to the same gene symbol. By default, it retains the maximal expression value; alternatively, it uses the median expression value. Both choices make reasonable sense when applied to gene expression values. In the Pre-Ranked mode, however, GSEA is faced with ranks already computed by an unspecified procedure. With "Collapse dataset to gene symbols"="true", the Pre-Ranked GSEA tool will always pick the largest positive value among several instances of ranking metric values for the same gene. This can sometimes produce unanticipated results because the original assumptions for gene expression do not necessarily apply to an arbitrary ranking metric in the pre-ranked list, so the ordered ranked list might differ substantially from the input values. Therefore, collapsing a ranked list is appropriate if and only if all its features are unique and have a one-to-one correspondence to human gene symbols.

We thus recommend making the ranked list with human gene symbols as gene identifiers and running GSEAPreranked with the parameter "Collapse dataset to gene symbols"="false".
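
The effect described above can be reproduced with a toy ranked list. The sketch below is a simplified stand-in for the collapse step, not GSEA's own code: it assumes the features are already gene symbols, and the gene names and scores are made up for illustration.

```python
def collapse_by_max(ranked):
    """Keep the largest metric value per gene, then sort by that value, descending."""
    best = {}
    for gene, score in ranked:
        if gene not in best or score > best[gene]:
            best[gene] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# GENE_B appears twice: once weakly positive, once strongly negative.
ranked = [("GENE_A", 2.0), ("GENE_B", 0.5), ("GENE_B", -3.0), ("GENE_C", -1.0)]
collapsed = collapse_by_max(ranked)
print(collapsed)  # [('GENE_A', 2.0), ('GENE_B', 0.5), ('GENE_C', -1.0)]
```

In the input, the strongest signal for GENE_B is the negative value -3.0, which would place it at the bottom of the list. After collapsing, only 0.5 survives and GENE_B moves to the middle, which is the kind of unanticipated reordering the recommendation is meant to avoid.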

 

Browse MSigDB doesn't load custom database XML files

Problem: You have created your own custom database XML file but it does not load into the GSEA browser.
At this time, this functionality is not yet implemented. The browser can load only msigb_v2.5.xml or msigdb_v3.xml files from our FTP server. Anything else in the path or URL will generate an error.

Solution: To use your own gene sets, please arrange them as GMT or GMX files as described here.
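
For orientation, a GMT file is tab-delimited with one gene set per line: the set name, a description, then the member genes. The sketch below writes that layout from a dictionary. The function name, the "na" placeholder description and the example set are assumptions for illustration; the Data formats page remains the reference for the exact requirements.

```python
def format_gmt(gene_sets, description="na"):
    """Render {set_name: [genes]} as GMT text: name, description, genes, tab-separated."""
    lines = []
    for name, genes in gene_sets.items():
        lines.append("\t".join([name, description, *genes]))
    # One gene set per line, each line terminated by a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical gene set used only to show the layout.
gmt_text = format_gmt({"MY_CUSTOM_SET": ["TP53", "BRCA1", "MYC"]})
print(gmt_text, end="")
```

Writing gmt_text to a file with a .gmt extension gives a file that can be loaded through the Load Data page.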



GSEA-R

Error in memory.size

Problem: When running the example programs provided for R, the following error occurs:

[1] " *** Running GSEA Analysis..."
Error in memory.size(size) : don't be silly!: your machine has a 4Gb address limit

Cause: This is produced by the following line early in the GSEA.1.R file:

memory.limit(6000000000)

This line sets the memory limit to a large size as a workaround for a platform problem with an earlier R version.

Solution: The easiest fix is just to comment out that line:

# memory.limit(6000000000)

This will allocate the default amount of memory. If after this change the program runs out of memory, change the line to:

memory.limit(max. size in Mbytes available)


16 warnings on R version 2.5 or higher

Problem: When running the example programs provided for R, the following warnings occur:

1: '\%' is an unrecognized escape in a character string
2: unrecognized escape removed from "Tag \%"
3: '\%' is an unrecognized escape in a character string
4: unrecognized escape removed from "Gene \%"
5: '\%' is an unrecognized escape in a character string
6: unrecognized escape removed from "\%"
7: '\%' is an unrecognized escape in a character string
8: unrecognized escape removed from " \%)"
9: '\.' is an unrecognized escape in a character string
10: '\.' is an unrecognized escape in a character string
11: unrecognized escapes removed from "\.report\."
12: '\.' is an unrecognized escape in a character string
13: '\.' is an unrecognized escape in a character string
14: unrecognized escapes removed from "\.report\."
15: '\.' is an unrecognized escape in a character string
16: unrecognized escape removed from "\."

Solution: These warnings occur when you have R version 2.5 or higher installed. To fix them, remove the single backslashes in front of these characters, as there is no need to escape them in R versions 2.5 and higher.
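
If there are many such strings, the edit can be scripted. The following is a rough sketch, not an official fix: it removes a single backslash that directly precedes % or . in a line of R source and leaves doubled backslashes alone. The function name is hypothetical, and the result should be reviewed by hand before the script is run, since the replacement is purely textual.

```python
import re

# A single backslash before '%' or '.', not itself preceded by another backslash.
_UNNEEDED_ESCAPE = re.compile(r"(?<!\\)\\([%.])")


def strip_unneeded_escapes(r_source_line):
    """Drop the backslash from \\% and \\. so R no longer warns about the escape."""
    return _UNNEEDED_ESCAPE.sub(r"\1", r_source_line)


print(strip_unneeded_escapes(r'"Tag \%"'))       # "Tag %"
print(strip_unneeded_escapes(r'"\.report\."'))   # ".report."
print(strip_unneeded_escapes(r'"a\\.b"'))        # "a\\.b" (doubled backslash is kept)
```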


GSEA on Linux

Browser links do not work under Linux

Problem: When running the GSEA desktop application under Linux, buttons and links that would normally open a browser window do not open the browser window.

Work-around: After running an analysis, you cannot click on the Success link to display the result. However, you can go to the directory that contains the analysis report output and open the index.html file in that directory.
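
The work-around can also be scripted. The sketch below only locates index.html in a report directory. The helper name is hypothetical, and the temporary directory stands in for a real analysis output folder.

```python
import tempfile
from pathlib import Path


def report_index(report_dir):
    """Return the path to index.html inside report_dir, or None if it is absent."""
    candidate = Path(report_dir) / "index.html"
    return candidate if candidate.is_file() else None


# Demonstration with a throwaway directory standing in for a GSEA report folder.
with tempfile.TemporaryDirectory() as demo_dir:
    empty_result = report_index(demo_dir)
    (Path(demo_dir) / "index.html").write_text("<html></html>")
    found = report_index(demo_dir)
    found_name = found.name if found is not None else None

print(empty_result)  # None
print(found_name)    # index.html
```

With a real report directory, the returned path can be opened from Python with webbrowser.open(path.resolve().as_uri()), or simply opened from the file manager.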