GSEA v4.2.x Release Notes

From GeneSetEnrichmentAnalysisWiki
Revision as of 22:41, 1 March 2022 by Eby (talk | contribs)

GSEA Desktop v4.2.3 (Mar 2022)

GSEA v4.2.3 is a security release, removing Log4J entirely from the code base. All users are encouraged to update!

This release also fixes an additional bug in the weighted_p1.5 scoring mode. If you have used this mode in the past, we recommend re-running your analysis with GSEA 4.2.3 to evaluate the possible differences. Minimum dataset size warnings have also been added, noting that GSEA should be run with data from all expressed genes rather than a reduced subset or "Top DEGs" list.

GSEA Desktop v4.2.2 (Jan 2022)

GSEA v4.2.2 is a security release, updating to Log4J 2.17.1. All users are encouraged to update!

GSEA Desktop v4.2.1 (Dec 2021)

GSEA v4.2.1 is a security release, updating to Log4J 2.17.0. All users are encouraged to update!

There is also one minor bug fix: the TXT parser no longer fails when no Description column is present. There are no other changes.

GSEA Desktop v4.2.0 (Dec 2021)

The GSEA v4.2.0 release includes a number of improvements and bug fixes, including:

  • Added a Spearman Correlation metric for continuous phenotypes.
  • Added a new Absolute Max of Probes collapse mode.
  • Updated to Log4J 2.16.0. Note, however, that we do not believe any version of GSEA Desktop is affected by the vulnerability in earlier Log4J versions, because it is a desktop application and does not expose any input forms to users over the web. If you are exposing GSEA through a website or other networked server, we recommend that you update to 4.2.0 immediately.
  • Added a feature to allow saving the resulting dataset when the Collapse or Remap_Only options are set for a GSEA analysis. If the 'Create GCT files' option under Advanced Fields is set to true, the dataset will be saved as a GCT in the edb sub-folder of the analysis result directory.
  • Modified to save the console log to a 'gsea.log' file in 'gsea_home'.
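
As a rough illustration of the two new analysis options in the list above, the following Python sketch computes a Spearman correlation (the Pearson correlation of rank vectors, with tied values sharing an average rank) and an absolute-max collapse. It is not GSEA's code: the function names are hypothetical, and the collapse rule shown (per sample, keep the probe value with the largest magnitude and its original sign) is an assumption about how Absolute Max of Probes behaves.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: the Pearson correlation of the two rank vectors.

    Constant input (zero rank variance) is not handled in this sketch.
    """
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def abs_max_collapse(probe_rows):
    """Assumed rule: per sample (column), keep the largest-magnitude probe value, sign intact."""
    return [max(column, key=abs) for column in zip(*probe_rows)]
```

Here `spearman(expression_values, phenotype_values)` would score one gene against a continuous phenotype, and `abs_max_collapse` takes the rows of all probes that map to the same gene.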

The file parsers and computations also handle missing values in the input datasets better. GSEA ignores missing values in general, but there were certain situations where this was not the case. These happened primarily around missing tab fields and explicit NA or NaN input values, but the handling of missing values has been improved overall.

  • Added more prominent warnings in the logs, the UI, and the reports when there are missing values in the input.
  • Modified the GCT, TXT, RNK, and PCL parsers to better handle these cases. NA values were formerly not treated as missing and would cause a numeric parsing error. The same was true of quoted empty values. These are now treated simply as missing values and ignored.
  • Fixed bugs in most metric calculations where the missing values were not ignored as intended. This affected all metrics except signal-to-noise (S2N, the default) and tTest.
  • Fixed the collapse calculations to also ignore missing values among the individual probes in the same way as the metric computations. This can affect the calculation of mean or median, for example.
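
The missing-value behavior described above can be sketched in a few lines of Python. This is an illustration only, not the GSEA parser: `parse_value`, `mean_ignoring_missing`, and the `MISSING_TOKENS` set are hypothetical names, and the exact set of tokens GSEA treats as missing is assumed from the cases named in these notes (empty fields, quoted empty values, NA, NaN).

```python
import math

# Assumed set of tokens treated as missing, compared case-insensitively.
MISSING_TOKENS = {"", "na", "nan"}


def parse_value(token):
    """Parse one tab-delimited field, returning NaN for missing entries."""
    cleaned = token.strip().strip('"').strip()
    if cleaned.lower() in MISSING_TOKENS:
        return math.nan
    return float(cleaned)


def mean_ignoring_missing(values):
    """Mean over the non-missing values; NaN if every value is missing."""
    present = [v for v in values if not math.isnan(v)]
    return sum(present) / len(present) if present else math.nan


# One dataset row with an explicit NA, a quoted empty value, and an empty tab field.
fields = '1.5\tNA\t""\t\t2.5'.split("\t")
row = [parse_value(f) for f in fields]
row_mean = mean_ignoring_missing(row)  # averages only 1.5 and 2.5
```

Representing missing entries as NaN and filtering them before each calculation keeps the parser and the metric or collapse code in agreement about what counts as missing.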

Likewise, there are also updates to provide warnings about explicit infinite values in the input dataset. Such values can cause unexpected results during computation or plotting and are not recommended. Infinite values in the input will, however, be handled and used as-is in the metric computations.

Infinite values coming out of the metric computations will be adjusted to a small value when using the various "weighted" scoring modes, to avoid interfering with the rest of the enrichment results and any subsequent reporting. This has the effect of de-emphasizing that particular gene in any scoring.

This adjustment has historically been applied to the "weighted" scoring modes but was not previously documented. For the "weighted" mode, the value is adjusted to 0.01. For the "weighted_p1.5" and "weighted_p2" modes it is adjusted to 0.000001. The adjustment is not applied to the Classic K-S scoring mode since the expression values are not directly used with this mode.
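
A minimal Python sketch of this adjustment, using the replacement values quoted above. The function and table names are hypothetical, and the sketch replaces both positive and negative infinity with the same constant, which is an assumption since the notes do not say how the sign is treated.

```python
import math

# Replacement values as stated in these release notes.
# The classic mode is deliberately absent: no adjustment is applied there.
INFINITE_REPLACEMENT = {
    "weighted": 0.01,
    "weighted_p1.5": 0.000001,
    "weighted_p2": 0.000001,
}


def adjust_infinite(score, scoring_mode):
    """Replace an infinite metric score with the small value for the given mode."""
    if math.isinf(score) and scoring_mode in INFINITE_REPLACEMENT:
        # Assumption: the sign of the infinity is discarded.
        return INFINITE_REPLACEMENT[scoring_mode]
    return score
```

For example, `adjust_infinite(float("inf"), "weighted")` returns 0.01, while the same call with `"classic"` returns the infinite value unchanged.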

A similar adjustment is also made to infinite values during plotting to avoid errors from the charting library being unable to render such values.

Warnings are also provided for Infinite or NaN values coming out of metric computations (resulting from division-by-zero or taking the root of a negative value, for example).
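
A check of this kind could look like the following Python sketch, which scans computed metric scores and collects one warning per non-finite value. The function name and the message wording are hypothetical and do not reproduce GSEA's actual log output.

```python
import math


def non_finite_warnings(scores):
    """Return one warning string per gene whose metric score is NaN or infinite."""
    warnings = []
    for gene, score in scores.items():
        if math.isnan(score):
            warnings.append(f"{gene}: metric score is NaN")
        elif math.isinf(score):
            warnings.append(f"{gene}: metric score is infinite")
    return warnings
```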

The vast majority of datasets should be unaffected by these changes as such values should be relatively rare. If you have run analyses on datasets with missing, NA, NaN, or Infinite values and are concerned about changes to the results, we recommend re-running the analysis with GSEA 4.2.0 to evaluate the possible differences.

Beyond that, there are a number of miscellaneous improvements and bug fixes. Chief among these are:

  • Fixed a bug in the calculation of the weighted_p1.5 scoring mode. If you have used this mode in the past, we recommend re-running your analysis with GSEA 4.2.0 to evaluate the possible differences.
  • Changed the FDR q-value scale on the NES vs Significance plot. This was formerly 0-100 but has been changed to 0.0-1.0 to match the values in the report table.
  • Added minimum-sample warnings and errors for the continuous phenotype metrics. Fixed a bug where the minimum-sample check was not applied with gene_set permutation mode.
  • Added a warning about use of the FDR when only one gene set is being analyzed. Reported FDRs are not an accurate representation of the actual false discovery rate when derived from a single gene set.
  • Modified the launcher scripts to fix some issues with recent Java 11 releases on newer versions of macOS and to better support symlinks on Mac and Linux.
  • Fixed bugs with GMT caching and the gene set subset-select feature on Windows.
  • Fixed a bug with some UI parameter widgets handling empty values.
  • Fixed a bug where the analysis RPT file was not saved if there was an error.
  • Fixed a bug with GMT & CHIP sorting for MSigDB point releases.
  • Fixed some issues with blank fields in the CHIP parser.
  • Fixed some bugs in the GCT & TXT export functions.
  • Improved the error message for a missing phenotype selection.
  • Updated the CHIP Download link in the Help menu to use our new location.
  • Fixed a UI dialog-centering bug.
  • Added GSEA & MSigDB citation info to the report.
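
For readers unfamiliar with the scoring modes named in these notes, the following Python sketch shows the weighted running-sum enrichment score as described in the published GSEA method, where an exponent p is applied to the absolute metric score of each gene in the set (p = 0 for classic, 1 for weighted, 1.5 for weighted_p1.5, 2 for weighted_p2). It is an illustration under that description, not the GSEA Desktop implementation, and it does not reproduce the weighted_p1.5 bug fixed in this release. The function name and signature are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Weighted Kolmogorov-Smirnov-like running sum over a ranked gene list.

    ranked_genes: gene names sorted by descending metric score
    scores: metric score of each gene, in the same order
    gene_set: collection of gene names being tested
    p: exponent applied to |score| for genes in the set
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running = 0.0
    best = 0.0
    for s, h in zip(scores, hits):
        # Step up by the gene's share of the total hit weight, or down by 1/misses.
        running += abs(s) ** p / hit_total if h else -1.0 / n_miss
        # Keep the running-sum value with the largest deviation from zero.
        if abs(running) > abs(best):
            best = running
    return best
```

With p = 1.5, genes with large absolute scores contribute disproportionately to the running sum, which is why a change in that calculation can shift results and why re-running affected analyses is recommended.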