The Molecular Signatures Database (MSigDB) is a collection of gene sets maintained by the GSEA team. The team welcomes contributions to this shared resource and encourages users to submit their gene sets to email@example.com. Our thanks to the contributors. The MSigDB contains five categories of gene sets.
Category C1, Positional (chromosomal location): Contains gene sets corresponding to the genes on each of the 24 human chromosomes, as well as sets corresponding to cytogenetic bands. These gene sets can be helpful in identifying effects related to epigenetic silencing, dosage compensation, copy number polymorphisms, and aneuploidy or other chromosomal deletions/amplifications.
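As a rough illustration of how positional sets of this kind could be assembled, the sketch below groups genes by chromosome and by cytogenetic band. It is not the GSEA team's procedure: the `GENE_LOCATIONS` table, the function names, and the `chr...` set-naming scheme are assumptions made for this example only.

```python
from collections import defaultdict

# Illustrative gene -> cytogenetic band table (an assumption for this sketch;
# a real build would read locations from a genome annotation source).
GENE_LOCATIONS = {
    "TP53": "17p13",
    "BRCA1": "17q21",
    "ERBB2": "17q12",
    "STAT3": "17q21",
    "MYC": "8q24",
}


def chromosome_of(band):
    """Return the chromosome part of a band label, e.g. '17q21' -> '17'."""
    for index, char in enumerate(band):
        if char in "pq":  # the arm letter marks the end of the chromosome name
            return band[:index]
    return band  # no arm letter found: treat the whole label as the chromosome


def positional_gene_sets(locations):
    """Group genes into one set per chromosome and one set per band."""
    by_chromosome = defaultdict(set)
    by_band = defaultdict(set)
    for gene, band in locations.items():
        by_chromosome["chr" + chromosome_of(band)].add(gene)
        by_band["chr" + band].add(gene)
    return dict(by_chromosome), dict(by_band)


chrom_sets, band_sets = positional_gene_sets(GENE_LOCATIONS)
for name in sorted(band_sets):
    print(name, sorted(band_sets[name]))
```

A whole-chromosome set such as `chr17` is the kind of set that would respond to aneuploidy, while a band-level set such as `chr17q21` targets more local amplifications or deletions.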
Category C2, Curated (functional): Contains gene sets of metabolic and signaling pathways gleaned from the following publicly available, manually curated databases:
- BioCarta: http://www.biocarta.com/
- Gene arrays, BioScience corporation: http://www.superarray.com/
- Gene ontology: http://www.geneontology.org/
- KEGG: http://www.kegg.jp/
- Reactome: http://www.reactome.org/
- Signaling pathway database: http://www.grt.kyushu-u.ac.jp/spad/menu.html
- Signaling gateway: http://www.signaling-gateway.org/
- Signal transduction knowledge environment: http://stke.sciencemag.org/
- Sigma-Aldrich pathways: http://www.sigmaaldrich.com/Area_of_Interest/Biochemicals/Enzyme_Explorer/Key_Resources.html
- L2L, John Newman and Alan Weiner, Department of Biochemistry, University of Washington: http://depts.washington.edu/l2l/
In addition, this category contains gene sets representing gene expression signatures of genetic and chemical perturbations, culled from experimental results in the literature.
Several colleagues have contributed to this effort, including:
Jean-Pierre Bourquin (Orkin lab)
Ben Ebert (Golub lab)
Yujin Hoshida (Golub lab)
Jean Junior (Student)
Lauren Kazmierski (MIT, UROP)
John Newman (L2L, University of Washington)
Nikolaos Papanikolaou (Aristotle University of Thessaloniki)
Jessica Robertson (GSEA team)
Leona Saunders (GSEA team)
Kate Stafford (MIT, UROP)
Lisa Sturla (Pomeroy lab)
For the source or contributor of a particular gene set, please see its gene set page, which describes the set.
Category C3, Motif (motif-based): Each gene set in this category contains genes that lie downstream of a motif that is conserved across the human, mouse, rat, and dog genomes. The motifs are catalogued in Xie et al., 2005 and represent known or likely regulatory elements in promoters and 3'-untranslated regions. Please cite Xie et al. if you use this category.
Category C4, Computed (correlational): Gene sets in this category are defined by computationally mining large-scale experimental datasets for co-expressed genes. Each set is built from the neighborhood of a cancer-related gene (the seed).
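The idea of a seed neighborhood can be sketched as follows. This is a minimal example under stated assumptions, not the method used to build the category: the text above does not say which similarity measure or cutoff is used, so Pearson correlation and a fixed neighborhood size are choices made here for illustration, and the gene names and expression values are invented.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def neighborhood(expression, seed, size):
    """Return the seed followed by the `size` genes most correlated with it."""
    seed_profile = expression[seed]
    scored = [
        (pearson(seed_profile, profile), gene)
        for gene, profile in expression.items()
        if gene != seed
    ]
    scored.sort(reverse=True)  # highest correlation first
    return [seed] + [gene for _, gene in scored[:size]]


# Invented expression profiles across four samples (hypothetical data).
expression = {
    "SEED": [1.0, 2.0, 3.0, 4.0],
    "GENE_A": [2.0, 4.0, 6.0, 8.0],  # rises with the seed
    "GENE_B": [1.0, 2.1, 2.9, 4.2],  # rises with the seed, with noise
    "GENE_C": [4.0, 3.0, 2.0, 1.0],  # falls as the seed rises
    "GENE_D": [2.0, 2.0, 2.0, 2.0],  # constant, so uncorrelated
}

gene_set = neighborhood(expression, "SEED", 2)
print(gene_set)
```

With these toy profiles the neighborhood contains the two genes that track the seed, while the anti-correlated and constant genes are left out.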
Category C5, GO: Gene sets in this category are named by GO term (http://www.geneontology.org/) and contain genes annotated by that term.
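One way to picture sets of this kind is as an inversion of gene annotations: each term becomes a set holding every gene annotated with it. The sketch below assumes a hypothetical `ANNOTATIONS` table with placeholder gene names, and the upper-case naming in `set_name` is an illustrative convention rather than the database's documented scheme.

```python
from collections import defaultdict

# Hypothetical gene -> GO term annotations (placeholder gene names).
ANNOTATIONS = {
    "GENE_A": ["apoptotic process", "DNA repair"],
    "GENE_B": ["DNA repair"],
    "GENE_C": ["cell cycle", "DNA repair"],
    "GENE_D": ["apoptotic process"],
}


def set_name(term):
    """Turn a GO term into a set name, e.g. 'DNA repair' -> 'DNA_REPAIR'."""
    return term.upper().replace(" ", "_")


def go_gene_sets(annotations):
    """Invert gene -> terms into set name -> genes annotated with that term."""
    gene_sets = defaultdict(set)
    for gene, terms in annotations.items():
        for term in terms:
            gene_sets[set_name(term)].add(gene)
    return dict(gene_sets)


go_sets = go_gene_sets(ANNOTATIONS)
for name in sorted(go_sets):
    print(name, sorted(go_sets[name]))
```

A gene annotated with several terms, such as `GENE_A` here, appears in several sets, so sets in this category can overlap.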
Gene sets can be explored and annotated using the MSigDB page. Our thanks to GATHER: Gene Annotation Tool to Help Explain Relationships at http://gather.genome.duke.edu/ for inspiring the layout of our page to Investigate Gene Sets.