MSigDB v2.5 Release Notes

From GeneSetEnrichmentAnalysisWiki
Revision as of 09:55, 6 January 2010 by Hkuehn (talk | contribs)

This page describes the changes made to the gene set collections for Release 2.5 of the Molecular Signatures Database (MSigDB).

C1: Positional gene sets

No changes were made.
For a description of this collection, see the <a href="">Browse Collections</a> page.

C2: Curated gene sets (+205)

Gene sets from two sources were added to the C2 collection:

  • CGP: chemical and genetic perturbations (+5 gene sets). Gene sets by Bild et al. (Nature 439, 353 – 357, 2006) based on microarray analysis of expression profiles of key oncogenes in a model system where expression of these oncogenes transformed otherwise quiescent cells.
  • CP: canonical pathways (+200 gene sets). Gene sets from the KEGG PATHWAY database of molecular interaction and reaction networks for metabolism, various cellular processes, and human diseases.

In addition, broken links to GenMAPP have been fixed in the External links section of the gene set page for every gene set derived from GenMAPP.

C3: Motif gene sets

No changes were made.
For a description of this collection, see the <a href="">Browse Collections</a> page.

C4: Computational gene sets (+456)

C4 now contains two subcollections:

  • CGN: cancer gene neighborhoods (+0 gene sets). Gene sets defined by expression neighborhoods centered on 380 cancer-associated genes (Brentani, Caballero et al. 2003). This is the C4 collection from the previous MSigDB release.
  • CM: cancer modules (+456 gene sets). Gene sets identical to the modules described in Segal et al. (Nature Genetics 36, 1090 – 1098, 2004). Gene sets in this subcollection consist of transcriptionally coregulated genes that share a common function and have been found to be significantly deregulated in tumors. Starting with a list of 2,849 gene sets from a variety of resources such as Gene Ontology, KEGG and others, the authors extracted 456 statistically significant regulatory modules from a large collection of published microarray data spanning 22 tumor types. This is an entirely new subcollection.


C5: GO gene sets (+1454)

Gene sets in this new collection are derived from the controlled vocabulary of the Gene Ontology (GO) project (The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nature Genet. (2000) 25: 25-29). The gene sets are named by GO term and contain genes annotated by that term.

This collection is divided into three subcollections:

  • CC: GO Cellular component (+233 gene sets). Gene sets derived from the Cellular Component Ontology.
  • MF: GO Molecular function (+396 gene sets). Gene sets derived from the Molecular Function Ontology.
  • BP: GO Biological process (+825 gene sets). Gene sets derived from the Biological Process Ontology.
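The per-subcollection counts reported on this page can be checked against the collection totals given in the section headings. The following is a minimal sketch of that arithmetic. The counts are copied from these release notes, while the dictionary layout and the function name `collection_totals` are illustrative assumptions and not part of MSigDB or GSEA.

```python
# Counts of added gene sets per subcollection, copied from these release notes.
added = {
    "C2": {"CGP": 5, "CP": 200},
    "C4": {"CGN": 0, "CM": 456},
    "C5": {"CC": 233, "MF": 396, "BP": 825},
}

# Collection totals as stated in the section headings of this page.
stated_totals = {"C2": 205, "C4": 456, "C5": 1454}


def collection_totals(counts):
    """Sum the subcollection counts for each collection (hypothetical helper)."""
    return {name: sum(subs.values()) for name, subs in counts.items()}


totals = collection_totals(added)
for name in sorted(totals):
    status = "matches" if totals[name] == stated_totals[name] else "differs from"
    print(f"{name}: +{totals[name]} {status} the stated total +{stated_totals[name]}")
```

C1 and C3 are omitted because no changes were made to those collections in this release.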

GSEA users: Gene set enrichment analysis identifies gene sets consisting of co-regulated genes; GO gene sets are based on ontologies and do not generally consist of co-regulated genes.


For more information

For complete descriptions of all collections or to download the updated gene sets, go to the <a href="">Browse Collections</a> page.