The Molecular Signatures Database (MSigDB) is a collection of gene sets maintained by the GSEA team. The team welcomes contributions to this shared resource and encourages users to submit their gene sets to firstname.lastname@example.org. Our thanks to the contributors. The MSigDB contains five categories of gene sets.
Category C1, Positional (chromosomal location): Contains 24 gene sets corresponding to the genes on each of the 24 human chromosomes, as well as 301 sets corresponding to cytogenetic bands. These gene sets can be helpful in identifying effects related to epigenetic silencing, dosage compensation, copy number polymorphisms, and aneuploidy or other chromosomal deletions/amplifications.
Category C2, Curated (functional): Contains gene sets of metabolic and signaling pathways gleaned from the following publicly available manually curated databases:
- BioCarta: http://www.biocarta.com/
- Signaling pathway database: http://www.grt.kyushu-u.ac.jp/spad/menu.html
- Signaling gateway: http://www.signaling-gateway.org/
- Signal transduction knowledge environment: http://stke.sciencemag.org/
- Human protein reference database: http://www.hprd.org/
- GenMAPP: http://www.genmapp.org/
- Gene ontology: http://www.geneontology.org/
- Sigma-Aldrich pathways: http://www.sigmaaldrich.com/Area_of_Interest/Biochemicals/Enzyme_Explorer/Key_Resources.html
- Gene arrays, BioScience corporation: http://www.superarray.com/
- Human cancer genome anatomy consortium: http://cgap.nci.nih.gov/
- L2L, John Newman and Alan Weiner, Department of Biochemistry, University of Washington: http://depts.washington.edu/l2l/
In addition, this category contains gene sets that represent gene expression signatures of genetic and chemical perturbations, culled from experimental results in the literature.
Several colleagues have contributed to this effort, including:
Jean-Pierre Bourquin (Orkin lab)
Ben Ebert (Golub lab)
Yujin Hoshida (Golub lab)
Jean Junior (Student)
John Newman (L2L, University of Washington)
Kate Stafford (MIT, UROP)
Lisa Sturla (Pomeroy lab)
For the source or contributor of a particular gene set, please see that gene set's page, which describes the gene set.
Category C3, Motif (motif-based): Each gene set in this category contains genes that lie downstream of a motif that is conserved across the human, mouse, rat, and dog genomes. The motifs are catalogued in <a href="http://www.broad.mit.edu/cgi-bin/cancer/publications/pub_paper.cgi?mode=view&paper_id=116">Xie et al., 2005</a> and represent known or likely regulatory elements in promoters and 3'-untranslated regions. Please cite Xie et al. if you use these gene sets.
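To make the idea of a motif-based gene set concrete, here is a minimal sketch. It is not the procedure of Xie et al.: it assumes, as a simplification, that "conserved" means the motif simply appears in the promoter sequence of a gene in all four species. The function name `motif_gene_set`, the gene names, the sequences, and the example motif string are all hypothetical.

```python
import re

# The four genomes named in the category description.
SPECIES = ("human", "mouse", "rat", "dog")

def motif_gene_set(promoters, motif_pattern):
    """Return genes whose promoter matches the motif in every species.

    `promoters` maps gene -> {species -> promoter sequence}.
    Requiring a match in all four species is a simplifying assumption.
    """
    motif = re.compile(motif_pattern)
    members = []
    for gene, by_species in promoters.items():
        # A missing species counts as "motif absent".
        if all(motif.search(by_species.get(sp, "")) for sp in SPECIES):
            members.append(gene)
    return sorted(members)

# Made-up promoter sequences for two placeholder genes.
promoters = {
    "GENE_A": {"human": "AATGACGTCATT", "mouse": "CCTGACGTCAGG",
               "rat": "TGACGTCAAA", "dog": "GGTGACGTCA"},
    "GENE_B": {"human": "AATGACGTCATT", "mouse": "CCCCCCCC",
               "rat": "TGACGTCAAA", "dog": "GGTGACGTCA"},
}
members = motif_gene_set(promoters, "TGACGTCA")
```

In this toy input only `GENE_A` carries the motif in all four species, so it is the only member of the resulting set.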
Category C4, Computed (correlational): Gene sets in this category are groups of co-expressed genes identified by computationally mining large-scale experimental datasets. The sets are built from the neighborhoods of ~400 cancer-related genes (seeds).
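The notion of a seed neighborhood can be sketched as follows. The description above does not state the similarity measure or the neighborhood size, so this example assumes Pearson correlation and a fixed top-k cutoff. The function names and the toy expression values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant profile; treat as 0
    return cov / sqrt(var_x * var_y)

def neighborhood(expression, seed, size):
    """Return the `size` genes most positively correlated with `seed`."""
    seed_profile = expression[seed]
    scored = [
        (pearson(seed_profile, profile), gene)
        for gene, profile in expression.items()
        if gene != seed
    ]
    # Highest correlation first; ties broken by gene name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [gene for _, gene in scored[:size]]

# Toy expression matrix: gene -> values across four samples (made-up numbers).
expression = {
    "SEED": [1.0, 2.0, 3.0, 4.0],
    "GENE_A": [2.0, 4.0, 6.0, 8.1],
    "GENE_B": [4.0, 3.0, 2.0, 1.0],
    "GENE_C": [1.0, 3.0, 2.0, 4.0],
}
gene_set = neighborhood(expression, "SEED", 2)
```

Here `GENE_B` is anti-correlated with the seed and is left out of the two-gene neighborhood.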
Category C5, GO: Gene sets in this category are named by GO term (http://www.geneontology.org/) and contain genes annotated by that term.
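As an illustration of how term-named gene sets follow from annotations, the sketch below groups (gene, term) pairs into one gene set per term. The input format, the function name `gene_sets_by_term`, and the placeholder gene and term names are assumptions made for this example, not the MSigDB build process.

```python
from collections import defaultdict

def gene_sets_by_term(annotations):
    """Group (gene, term) annotation pairs into one gene set per term."""
    sets = defaultdict(set)
    for gene, term in annotations:
        sets[term].add(gene)
    # Sort members so the output is deterministic.
    return {term: sorted(genes) for term, genes in sets.items()}

# Placeholder annotations; a gene may be annotated with several terms.
annotations = [
    ("GENE_A", "TERM_1"),
    ("GENE_B", "TERM_1"),
    ("GENE_A", "TERM_2"),
    ("GENE_C", "TERM_2"),
]
go_sets = gene_sets_by_term(annotations)
```

Because a gene can carry more than one annotation, the same gene (`GENE_A` here) can belong to several gene sets.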
Gene sets can be explored and annotated using the MSigDB page. The idea for the gene set annotations page comes from GATHER (Gene Annotation Tool to Help Explain Relationships) at <a target="_blank" href="http://gather.genome.duke.edu/">http://gather.genome.duke.edu/</a>; our thanks to its authors.