MSigDB v2022.1.Mm Release Notes

From GeneSetEnrichmentAnalysisWiki

Important Notices

This page describes updates made to the Molecular Signatures Database for release 2022.1. This release introduces several major changes to previous conventions. MSigDB is now split into two major divisions: a series of gene set collections provided in the namespace of human gene symbols, and a series of gene set collections provided in the namespace of mouse gene symbols. Accordingly, the versioning convention of MSigDB has changed to the format Year.Release.Species. This initial release in the new format is versioned 2022.1.Hs for the human collections and 2022.1.Mm for the mouse collections. Likewise, CHIP files have been updated to reflect this convention and to indicate the specific series of collections (i.e., human or mouse) that they target.
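As an illustration of the Year.Release.Species convention described above, the following is a minimal sketch of how a version string such as 2022.1.Mm could be split into its parts. The names `MSigDBVersion` and `parse_msigdb_version` are hypothetical and are not part of any MSigDB or GSEA tool; the sketch also assumes that only the two species codes named in this release (Hs and Mm) are valid.

```python
import re
from typing import NamedTuple


class MSigDBVersion(NamedTuple):
    """Hypothetical container for a Year.Release.Species version."""
    year: int
    release: int
    species: str  # "Hs" (human) or "Mm" (mouse)


# Optional leading "v", four-digit year, release number, species code.
# Assumption: only the Hs and Mm codes named in these notes are accepted.
_VERSION_RE = re.compile(r"^v?(\d{4})\.(\d+)\.(Hs|Mm)$")


def parse_msigdb_version(text: str) -> MSigDBVersion:
    """Split a string like '2022.1.Mm' into year, release, and species."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a Year.Release.Species version: {text!r}")
    year, release, species = match.groups()
    return MSigDBVersion(int(year), int(release), species)


if __name__ == "__main__":
    print(parse_msigdb_version("2022.1.Mm"))
    print(parse_msigdb_version("v2022.1.Hs"))
```

Version strings in the older numbering style, which lack a species suffix, are rejected by this sketch with a `ValueError`.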

Note that accessing the MSigDB mouse collections through the GSEA UI requires the latest version of GSEA (4.3.0).

MSigDB v2022.1 is based on gene annotation data from Ensembl Release 107 (Jul 2022).

Initial Release of Mouse Collections (MSigDB v2022.1.Mm)

The initial release of the MSigDB Mouse Collections contains the following 6 collections, with some collection numbers reserved for future development. Please see the Collection Details Page for collection-specific general information.

MH: mouse-ortholog hallmark gene sets

The MSigDB Hallmarks collection is being made available in an orthology-converted form to aid initial exploratory analysis of mouse datasets. The conversion uses orthology mappings to MGI IDs provided by the Mouse Genome Informatics (MGI) institute at The Jackson Laboratory.

M1: positional gene sets

Ensembl IDs for genes were retrieved from the cytogenetic band annotations provided in the Ensembl 102 release, which corresponds to the GRCm38 assembly, because cytogenetic band annotations for GRCm39 are not presently available.

M2: curated gene sets

M2:CGP

932 gene sets consisting of:

  • 869 miscellaneous gene sets derived from studies originally conducted in mouse models, copied from the human C2:CGP subcollection and included in M2:CGP in their native mouse namespace
  • 21 gene sets describing mouse tumor models and xenografts, curated by Jill Recla of the Mouse Genome Informatics (MGI) institute at The Jackson Laboratory
  • 42 gene sets describing neurogenic fates and cortical patterning from mouse experiments, provided by Robert Hevner

M2:CP

The initial release of the M2:CP collection (the mouse counterpart of C2:CP) contains:

  • 252 gene sets from the BioCarta mouse database
  • 1249 gene sets from the Reactome mouse database
  • 186 gene sets from the WikiPathways mouse database

M3: regulatory target gene sets

  • Transcription factor target gene sets from the Gene Transcription Regulation Database (GTRD), corresponding to mouse ChIP-seq experiments
  • miRNA target gene sets built from computationally predicted mouse gene targets of miRNAs using the MirTarget algorithm. Data were curated from miRDB v6.0 target predictions with MirTarget scores >80 (high-confidence predictions). miRNAs catalogued in miRDB v6.0 are derived from miRBase v22 (March 2018).
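The score cutoff used for the miRNA target gene sets can be sketched as a simple filter. This is an illustration only, not the curation pipeline itself: the `(mirna, gene, score)` record layout, the function name `build_mirna_gene_sets`, and the example identifiers are all hypothetical, and the actual miRDB download format is not described in these notes. Only the strict >80 threshold comes from the text above.

```python
from collections import defaultdict

# From the release notes: predictions with MirTarget scores > 80 are kept.
SCORE_THRESHOLD = 80


def build_mirna_gene_sets(predictions, threshold=SCORE_THRESHOLD):
    """Group predicted target genes by miRNA, keeping scores above threshold.

    `predictions` is an iterable of (mirna, gene, score) tuples. This layout
    is an assumption made for illustration, not the miRDB file format.
    """
    gene_sets = defaultdict(set)
    for mirna, gene, score in predictions:
        if score > threshold:  # strict inequality, matching ">80"
            gene_sets[mirna].add(gene)
    return dict(gene_sets)


if __name__ == "__main__":
    # Made-up records; the identifiers are placeholders, not real predictions.
    example = [
        ("mmu-miR-example-a", "GeneA", 97.2),
        ("mmu-miR-example-a", "GeneB", 80.0),  # not above 80, so dropped
        ("mmu-miR-example-b", "GeneC", 85.5),
        ("mmu-miR-example-b", "GeneD", 42.0),
    ]
    for mirna, genes in build_mirna_gene_sets(example).items():
        print(mirna, sorted(genes))
```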

M5: ontology gene sets

M5:GO, which is divided into three subcollections:

  • BP: GO Biological process (7775 gene sets). Gene sets derived from the Biological Process Ontology.
  • CC: GO Cellular component (1039 gene sets). Gene sets derived from the Cellular Component Ontology.
  • MF: GO Molecular function (1746 gene sets). Gene sets derived from the Molecular Function Ontology.

M5:MPT

92 gene sets mined from the Mammalian Phenotype Ontology database, corresponding to tumor-specific ontology terms.

M8: cell type signature gene sets

Two initial groups of gene sets are being provided in this initial release

CHIP file release

  • MSigDB 2022.1.Mm gene annotation and gene mapping CHIP files are provided using data from Ensembl 107.
  • Gene orthology annotations for mapping human and rat genes to their best-match mouse orthologs are provided using information from the Alliance of Genome Resources orthology database release 5.2.1 (2022-07-15).
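To illustrate what a best-match orthology mapping does to a gene list, here is a minimal sketch that converts human gene symbols to mouse symbols through a lookup table. The function name `map_genes_to_mouse` and the in-memory dictionary are hypothetical; the sketch does not read the actual CHIP file format, and it assumes each source gene has at most one best-match mouse ortholog. The placeholder symbol `HYPOTHETICAL1` stands in for a gene with no mapping.

```python
def map_genes_to_mouse(genes, ortholog_map):
    """Map source-species symbols to mouse symbols via a lookup table.

    Returns (mapped, unmapped): mouse symbols in first-seen order with
    duplicates removed, and the source symbols that had no entry.
    `ortholog_map` is a plain dict standing in for an orthology CHIP file.
    """
    mapped, unmapped, seen = [], [], set()
    for gene in genes:
        mouse = ortholog_map.get(gene)
        if mouse is None:
            unmapped.append(gene)
        elif mouse not in seen:
            # Several source genes may share one mouse ortholog; keep it once.
            seen.add(mouse)
            mapped.append(mouse)
    return mapped, unmapped


if __name__ == "__main__":
    # Tiny illustrative table: human TP53 -> mouse Trp53, human MYC -> Myc.
    table = {"TP53": "Trp53", "MYC": "Myc"}
    mapped, unmapped = map_genes_to_mouse(
        ["TP53", "MYC", "TP53", "HYPOTHETICAL1"], table
    )
    print("mapped:", mapped)
    print("unmapped:", unmapped)
```

Reporting the unmapped symbols separately, rather than silently dropping them, makes it easier to see how much of a gene set survives the conversion.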

Compendia expression profiles

The Investigate Gene Sets tool provides a mouse transcriptomic expression atlas derived from the Mouse Transcriptomic BodyMap compendium, allowing visualization of the expression of gene set genes across 17 mouse tissues.