MSigDB v6.0 Release Notes

From GeneSetEnrichmentAnalysisWiki
Revision as of 11:19, 10 April 2017 by Liberzon

This page describes the changes made to the gene set collections for Release 6.0 of the Molecular Signatures Database (MSigDB).

Change of license terms

MSigDB v6.0 is now available under a Creative Commons license, with additional terms for some sub-collections of gene sets. Note, however, that earlier versions of MSigDB remain under the older license terms. See the License Term page for details.

Updates to C2:CGP - curated gene sets

  • Added two sets (submitted by Dr. Quintens Roel, Belgian Nuclear Research Centre).
  • Updated the information for one set at the request of its contributor, Dr. Dimitris Anastassiou (Columbia University).
  • Corrected errors in two BASSO_HAIRY_CELL_LEUKEMIA sets.

Changes to C3 - motif gene sets

We updated the brief and full descriptions for the C3 gene sets. We also changed gene set names to eliminate problem characters:

  • Commas (,) were replaced with underscores (_).
  • Dashes (-) were removed.
  • Dollar signs ($) were removed, along with the preceding "V".

For example:

  • The MSigDB 5.2 gene set named RCGCANGCGY_V$NRF1_Q6 is now named RCGCANGCGY_NRF1_Q6 in MSigDB v6.0.
  • The MSigDB 5.2 gene set named V$MYOD_01 is now named MYOD_01 in MSigDB v6.0.
  • The MSigDB 5.2 gene set named TACTTGA,MIR-26A,MIR-26B is now named TACTTGA_MIR26A_MIR26B in MSigDB v6.0.
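The three renaming rules above can be sketched as a small Python helper, which may be useful when updating scripts that still refer to MSigDB 5.2 names. This is an illustration derived only from the rules and examples listed here, not the procedure used to build the database; the function name `rename_c3_gene_set` is hypothetical.

```python
def rename_c3_gene_set(old_name: str) -> str:
    """Map an MSigDB 5.2 C3 gene set name to its v6.0 form.

    Hypothetical helper that applies the three rules from these notes:
    "V$" is removed, commas become underscores, and dashes are removed.
    """
    new_name = old_name.replace("V$", "")  # dollar sign plus the preceding "V"
    new_name = new_name.replace(",", "_")  # commas -> underscores
    new_name = new_name.replace("-", "")   # dashes removed
    return new_name


if __name__ == "__main__":
    for old in ("RCGCANGCGY_V$NRF1_Q6", "V$MYOD_01", "TACTTGA,MIR-26A,MIR-26B"):
        print(old, "->", rename_c3_gene_set(old))
```

Applied to the three examples above, the helper reproduces the v6.0 names shown. Names outside C3 are not covered by these rules and should not be passed through it.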

Changes to C5 collection - gene ontology

Grouping sets by gene ontology (GO) terms can potentially produce identical sets for different GO terms. We have a procedure in place to detect and resolve these cases. Unfortunately, in the previous release of the database, several identical sets slipped through. Alerted by astute users of MSigDB (thank you!), we improved the procedure and were able to identify and eliminate 249 duplicate sets.
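For readers who want to check their own collections for the same problem, one simple way to detect identical sets is to group set names by their exact gene membership. The sketch below shows that idea only; it is an assumption about how such a check could be written, not the procedure used for MSigDB, and the function name and the set and gene names in the example are made up.

```python
from collections import defaultdict


def find_identical_sets(gene_sets):
    """Return groups of set names whose gene members are exactly the same.

    `gene_sets` maps a set name to an iterable of gene identifiers.
    Hypothetical helper: member order and repeated members are ignored.
    """
    groups = defaultdict(list)
    for name, genes in gene_sets.items():
        # frozenset is hashable, so identical memberships share one key
        groups[frozenset(genes)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    # Made-up names, for illustration only
    example = {
        "GO_TERM_A": ["GENE1", "GENE2", "GENE3"],
        "GO_TERM_B": ["GENE3", "GENE2", "GENE1"],
        "GO_TERM_C": ["GENE1", "GENE4"],
    }
    print(find_identical_sets(example))
```

Deciding which member of each duplicate group to keep is a separate curation question that this sketch does not address.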

Other changes

The Downloads page provides GMT files with gene sets composed of HUGO gene symbols and Entrez gene IDs. GMT files with sets composed of original gene identifiers are no longer provided. The original identifiers remain available in the XML file of the database and on the individual gene set pages online.
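GMT files are tab-delimited text: each line holds a gene set name, a description field, and then the member genes. A minimal sketch of reading such a file in Python follows; the function name `parse_gmt` and the sample lines are hypothetical, and the code has not been checked against any particular MSigDB download.

```python
def parse_gmt(lines):
    """Parse GMT-formatted lines into a dict of set name -> list of genes.

    Hypothetical helper. Each line is expected to be tab-delimited:
    name, description, then one or more gene identifiers.
    """
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = [gene for gene in genes if gene]  # drop empty trailing fields
    return gene_sets


if __name__ == "__main__":
    # Made-up sample lines, for illustration only
    sample = [
        "EXAMPLE_SET_A\tdescription A\tGENE1\tGENE2\tGENE3\n",
        "EXAMPLE_SET_B\tdescription B\tGENE2\tGENE4\n",
    ]
    for name, genes in parse_gmt(sample).items():
        print(name, len(genes))
```

To read a downloaded file, an open file handle can be passed directly, for example `with open(path) as handle: sets = parse_gmt(handle)`, where `path` points to a GMT file obtained from the Downloads page.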