MSigDB v2.5 Release Notes

Revision as of 08:47, 25 March 2008 by Hkuehn

<a href="">GSEA Home</a> | <a href="">Downloads</a> | <a href="">Molecular Signatures Database</a> | Documentation | <a href="">Contact</a>

This page describes the changes made to the gene set collections for Release 2.5 of the Molecular Signatures Database (MSigDB).

C1: Positional gene sets

No changes were made.

C2: Curated gene sets (+205)

Gene sets from two sources were added to the C2 collection:

  • CGP: chemical and genetic perturbations (+5 gene sets). Gene sets from Bild et al. (Nature 439, 353–357, 2006), based on microarray expression profiling of key oncogenes in a model system in which expression of these oncogenes transformed otherwise quiescent cells.
  • CP: canonical pathways (+200 gene sets). Gene sets from the KEGG PATHWAY database of molecular interaction and reaction networks for metabolism, various cellular processes, and human diseases.

In addition, broken links to GenMAPP have been corrected in the External links section of the gene set cards for gene sets derived from GenMAPP.

C3: Motif gene sets

No changes were made.

C4: Computational gene sets (+456)

C4 now contains two subcollections:

  • CGN: cancer gene neighborhoods (+0 gene sets). Gene sets defined by expression neighborhoods centered on 380 cancer-associated genes (Brentani, Caballero et al. 2003). This is the C4 collection from the previous MSigDB release.
  • CM: cancer modules (+456 gene sets). Gene sets defined by Segal et al. (Nature Genetics 36, 1090–1098, 2004). Briefly, the authors compiled gene sets (‘modules’) from a variety of resources such as KEGG, GO, and others. By mining a large compendium of cancer-related microarray data, they identified 456 such modules as significantly changed in a variety of cancer conditions.


C5: GO gene sets (+1454)

Gene sets in this new collection are derived from the controlled vocabulary of the Gene Ontology (GO) project (The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. Nature Genet. (2000) 25: 25-29). The gene sets are named by GO term and contain genes annotated by that term.

This collection is divided into three subcollections:

  • CC: GO Cellular component (+233 gene sets). Gene sets derived from the Cellular Component Ontology.
  • MF: GO Molecular function (+396 gene sets). Gene sets derived from the Molecular Function Ontology.
  • BP: GO Biological process (+825 gene sets). Gene sets derived from the Biological Process Ontology.

Note for GSEA users: gene set enrichment analysis identifies gene sets consisting of co-regulated genes, whereas GO gene sets are based on ontologies and do not generally consist of co-regulated genes.


For more information

For complete descriptions of all collections, see the <a href="">Molecular Signatures Database</a> page.
To download the updated gene sets, go to the <a href="">Browse Collections</a> page.


Date | Release | Description | Release Notes
March 2008 | 2.5* | C1 (+0); C2 (+205); C3 (+0); C4 (+456); C5 (+1454) |
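
As a quick consistency check, the per-subcollection counts given in these notes can be summed and compared with the collection totals in the release summary. The sketch below is illustrative only: the variable names (`added_by_subcollection`, `reported_totals`) are hypothetical, and the numbers are copied by hand from this page rather than read from any MSigDB file.

```python
# Gene sets added in Release 2.5, per subcollection (numbers copied from this page).
# C1 and C3 had no changes, so they have no subcollection entries.
added_by_subcollection = {
    "C1": {},
    "C2": {"chemical and genetic perturbations": 5, "canonical pathways": 200},
    "C3": {},
    "C4": {"cancer gene neighborhoods": 0, "cancer modules": 456},
    "C5": {"cellular component": 233, "molecular function": 396, "biological process": 825},
}

# Collection totals as reported in the release summary row.
reported_totals = {"C1": 0, "C2": 205, "C3": 0, "C4": 456, "C5": 1454}

# Sum the subcollection counts for each collection.
totals = {
    collection: sum(parts.values())
    for collection, parts in added_by_subcollection.items()
}

# Report whether each computed total matches the reported one.
for collection in sorted(totals):
    status = "ok" if totals[collection] == reported_totals[collection] else "MISMATCH"
    print(f"{collection}: computed +{totals[collection]}, "
          f"reported +{reported_totals[collection]} ({status})")
```

With the figures on this page, every computed total agrees with the reported one (for example, 233 + 396 + 825 = 1454 for C5).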


[Figure: MSigDB growth 0308.gif]