MSigDB v7.1 Release Notes

From GeneSetEnrichmentAnalysisWiki
Revision as of 16:01, 20 March 2020 by Acastanza

This page describes the changes made to the gene set collections for Release 7.1 of the Molecular Signatures Database (MSigDB). This is a minor release that includes updates to gene symbol mappings, updated data from external resources, and new datasets for potential transcription factor and microRNA regulatory target genes.

Note: Due to substantial changes in MSigDB, it is recommended that users migrate to GSEA 4.0.0+ when utilizing MSigDB 7.0+ resources.
Advisory: Users of MSigDB 7.1 are strongly advised to always use the GSEA "Collapse dataset to gene symbols" feature with the provided Symbol Remapping chip file if their dataset was generated with a transcriptome other than Ensembl v99/GENCODE v33.
This advisory has been updated to reflect MSigDB symbol annotations as of the 7.1 update.

Updates to MSigDB Gene Symbol Mapping Procedures

Update to Ensembl annotations

Beginning in MSigDB 7.0, identifiers for genes are mapped to their HGNC approved Gene Symbol and NCBI Gene ID through annotations extracted from Ensembl's BioMart data service. MSigDB 7.1 incorporates annotation information exported from Ensembl release 99. For any analysis run against MSigDB 7.1 gene sets, the gene symbols in the dataset should match Ensembl release 99/GENCODE release 33. Alternatively, MSigDB 7.1 provides CHIP files designed for use with the GSEA Collapse/Remap dataset feature, which can re-annotate the dataset to matching symbols.

  • Gene annotations supplied in the MSigDB 7.1 release are derived from Ensembl version 99, which corresponds to GENCODE release 33, and reflect the HGNC Gene Symbols as of the GENCODE 33 freeze date of August 2019.
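
To illustrate what the Collapse/Remap step described above does conceptually, the following is a minimal sketch and not GSEA's own implementation. It assumes a CHIP-style tab-delimited table with "Probe Set ID", "Gene Symbol", and "Gene Title" columns, and it assumes that rows mapping to the same symbol are merged by taking the per-sample maximum. All identifiers, symbols, and values are made-up placeholders, and the function names are hypothetical.

```python
import csv
import io


def load_chip(chip_text):
    """Parse CHIP-style tab-delimited text into {source identifier: gene symbol}."""
    reader = csv.DictReader(io.StringIO(chip_text), delimiter="\t")
    return {row["Probe Set ID"]: row["Gene Symbol"] for row in reader}


def collapse_to_symbols(expression, chip):
    """Re-annotate expression rows using the chip mapping.

    When several identifiers map to one symbol, keep the per-sample maximum
    (an assumed merge rule). Identifiers absent from the chip are dropped.
    """
    collapsed = {}
    for identifier, values in expression.items():
        symbol = chip.get(identifier)
        if not symbol:
            continue  # no mapping available for this identifier
        if symbol in collapsed:
            collapsed[symbol] = [max(a, b) for a, b in zip(collapsed[symbol], values)]
        else:
            collapsed[symbol] = list(values)
    return collapsed


# Toy placeholder data, not taken from any real CHIP file.
CHIP_TEXT = (
    "Probe Set ID\tGene Symbol\tGene Title\n"
    "OLD_A\tGENE1\tplaceholder title\n"
    "OLD_B\tGENE1\tplaceholder title\n"
    "OLD_C\tGENE2\tplaceholder title\n"
)
expression = {
    "OLD_A": [1.0, 5.0],
    "OLD_B": [3.0, 2.0],
    "OLD_C": [4.0, 4.0],
    "OLD_D": [9.0, 9.0],  # not present in the chip, so it is dropped
}

chip = load_chip(CHIP_TEXT)
collapsed = collapse_to_symbols(expression, chip)
```

In this toy case, OLD_A and OLD_B both resolve to GENE1 and are merged, while OLD_D has no mapping and is excluded from the collapsed dataset.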

Change to gene orthology mapping procedure for non-human genes

In MSigDB 7.0, the best human orthologue for each non-human gene was selected by a ranking procedure that relied solely on Ensembl orthology table statistics. MSigDB 7.1 replaces this procedure with best match orthology tables exported via the Alliance of Genome Resources orthology API. These tables implement a best match procedure based on consensus best matching, designed in collaboration with Mouse Genome Informatics at The Jackson Laboratory.
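
As a rough illustration of what a consensus-style best match can look like, the sketch below picks, for each non-human gene, the human candidate supported by the largest number of prediction methods. This is only one simple reading of "consensus best matching" and is not the Alliance of Genome Resources or MSigDB procedure. The input layout, the gene and method names, and the rule of skipping tied candidates are all assumptions made for the example.

```python
from collections import defaultdict


def best_human_orthologs(predictions):
    """Pick one human orthologue per non-human gene by method support.

    predictions: iterable of (non_human_gene, human_gene, method_name) tuples.
    Returns {non_human_gene: human_gene} only where a single candidate has
    strictly more supporting methods than every other candidate.
    """
    support = defaultdict(lambda: defaultdict(set))
    for source_gene, human_gene, method in predictions:
        support[source_gene][human_gene].add(method)

    best = {}
    for source_gene, candidates in support.items():
        top_count = max(len(methods) for methods in candidates.values())
        tied = [gene for gene, methods in candidates.items() if len(methods) == top_count]
        if len(tied) == 1:  # assumed rule: ambiguous ties are left unmapped
            best[source_gene] = tied[0]
    return best


# Hypothetical placeholder predictions, not real orthology data.
predictions = [
    ("mouse_gene_1", "HUMAN_A", "method_x"),
    ("mouse_gene_1", "HUMAN_A", "method_y"),
    ("mouse_gene_1", "HUMAN_B", "method_z"),
    ("mouse_gene_2", "HUMAN_C", "method_x"),
    ("mouse_gene_2", "HUMAN_D", "method_y"),
]

orthologs = best_human_orthologs(predictions)
```

Here mouse_gene_1 maps to HUMAN_A because two methods agree on it, while mouse_gene_2 is left unmapped because its two candidates are tied.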

CHIP file updates

All CHIP files previously provided in the standard MSigDB 7.0 release have been updated for MSigDB 7.1 in accordance with previously described procedures.

Updates to Gene Sets by Collection