Difference between revisions of "MSigDB v7.3 Release Notes"

(Initial incomplete notes)
 
Line 24: Line 24:
  <h3>C8: cell type signature gene sets</h3>
+
+ <h3>Redundant Terms Annotations</h3>
+ Gene set sub-collections updated in this release that have undergone redundancy filtering for inclusion in MSigDB now have an additional field, "Redundant Terms", on the gene set page. This field contains the source database IDs of other candidate gene sets that clustered with the selected set and exhibited a Jaccard coefficient >0.85 with it, but were not selected on the basis of tree distance or set size. These database IDs link to the source resource's page for that term, as in the EXTERNAL_DETAILS_URL field.
  <h2>Updates to Existing Gene Sets by Collection</h2>
Line 37: Line 40:
  <h3>C2:CP:WikiPathways</h3>
- WikiPathways gene sets have been updated to reflect the state of WikiPathways Release 20210210 (+XX gene sets).
+ WikiPathways gene sets have been updated to reflect the state of WikiPathways Release 20210310 (+XX gene sets).
  <h3>C3 regulatory target gene sets</h3>

Revision as of 14:57, 10 March 2021


This page describes the changes made to the gene set collections for Release 7.3 of the Molecular Signatures Database (MSigDB). This release includes a reorganization of C7 to accommodate the addition of vaccination response gene sets provided by the Human Immunology Project Consortium, along with other minor updates and additions.

Note: Due to substantial changes introduced in MSigDB 7.0, using GSEA 4.0.0+ is recommended when utilizing MSigDB 7.0+ resources.
Advisory: Users of MSigDB 7.3 are strongly advised to always use the GSEA "Collapse/Remap to gene symbols" feature with the provided Symbol Remapping chip file if their dataset was generated with a transcriptome other than Ensembl v103/GENCODE v37.

New Additions and Changes to Collection Organization

C2:CGP

The following have been added to C2:CGP: gene sets describing the molecular effect of overexpression of S1PR3 in leukemia (PMID33458693), signatures describing the effects of anti-TNF therapy on inflammatory bowel disease (PMID33429950), and gene sets contributed by the following individuals:

  • Jorge Benitez, University of California, San Diego - BENITEZ_GBM_PROTEASOME_INHIBITION_RESPONSE signature (PMID33428749)
  • Martin Fischer, Leibniz Institute on Aging, Fritz Lipmann Institute - RIEGE_DELTANP63_DIRECT_TARGETS_UP signature (PMID33263276)

C7: immunologic signature gene sets

C8: cell type signature gene sets

Redundant Terms Annotations

Gene set sub-collections updated in this release that have undergone redundancy filtering for inclusion in MSigDB now have an additional field, "Redundant Terms", on the gene set page. This field contains the source database IDs of other candidate gene sets that clustered with the selected set and exhibited a Jaccard coefficient >0.85 with it, but were not selected on the basis of tree distance or set size. These database IDs link to the source resource's page for that term, as in the EXTERNAL_DETAILS_URL field.
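
As a rough illustration of the 0.85 threshold, the Jaccard coefficient of two gene sets is the size of their intersection divided by the size of their union. The sketch below is not the MSigDB pipeline: the function names, term IDs, and gene symbols are hypothetical, and the clustering, tree distance, and set size criteria mentioned above are not modeled.

```python
def jaccard(a, b):
    """Jaccard coefficient |A & B| / |A | B| of two gene collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def redundant_terms(selected, candidates, threshold=0.85):
    """Return the IDs of candidate sets whose Jaccard coefficient with the
    selected set is strictly greater than the threshold."""
    return sorted(
        term_id
        for term_id, genes in candidates.items()
        if jaccard(selected, genes) > threshold
    )


# Hypothetical gene sets, for illustration only.
selected = {f"GENE{i}" for i in range(1, 11)}          # GENE1 .. GENE10
candidates = {
    "TERM:0001": {f"GENE{i}" for i in range(1, 10)},   # 9 shared of 10 -> 0.9
    "TERM:0002": {f"GENE{i}" for i in range(1, 6)},    # 5 shared of 10 -> 0.5
}

print(redundant_terms(selected, candidates))
```

Under these assumptions only TERM:0001 would be listed as a redundant term for the selected set.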

Updates to Existing Gene Sets by Collection

C1 (positional gene sets)

C1 has been updated to reflect the primary assembly of the current release of the Human Genome as present in Ensembl 103 and GENCODE 37 (GRCh38) (+XX gene set). Gene annotations for this collection are derived from the Chromosome and Karyotype band tracks from the Ensembl BioMart (version 103) and reflect the gene architecture as represented on the primary assembly.

C2:CP:Reactome

  • Reactome gene sets have been updated to reflect the state of the Reactome pathway architecture as of Reactome v75 (+XX gene sets).
  • As previously described in the Reactome release notes for MSigDB 7.0, in order to limit redundancy between gene sets within the Reactome sub-collection we applied a filtering procedure based on Jaccard coefficients and distance from the top level of the Reactome event hierarchy.

C2:CP:WikiPathways

WikiPathways gene sets have been updated to reflect the state of WikiPathways Release 20210310 (+XX gene sets).

C3 regulatory target gene sets

C3:GTRD has been updated to GTRD v20.06.

C5:GO (Gene Ontology)

Gene sets in these sub-collections are derived from the controlled vocabulary of the Gene Ontology (GO) project: The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology (Nature Genet 2000). The gene sets are named by GO term and contain genes annotated by that term. This collection has been updated to the most recent GO annotations as present in the GO-basic obo file released on 2021-XX-XX and NCBI gene2go annotations downloaded on 2021-MM-DD.

This collection is divided into three sub-collections:

  • BP: GO Biological process (+XX gene sets). Gene sets derived from the Biological Process Ontology.
  • CC: GO Cellular component (+XX gene sets). Gene sets derived from the Cellular Component Ontology.
  • MF: GO Molecular function (+XX gene sets). Gene sets derived from the Molecular Function Ontology.

Gene sets in the GO sub-collections previously shared the universal prefix "GO_"; this prefix is now sub-collection specific. Gene sets in GO:BP now begin with "GOBP_", those in GO:CC with "GOCC_", and those in GO:MF with "GOMF_". This change should make it easier to tell at a glance which GO sub-collection a specific gene set hit came from in analysis pipelines.
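
For pipelines that want to use the new prefixes, a minimal sketch of a lookup might look like the following. The helper name and the example gene set names are illustrative assumptions; only the three prefixes come from the notes above.

```python
# Prefixes introduced in this release, mapped to their sub-collection.
GO_PREFIXES = {
    "GOBP_": "GO:BP",
    "GOCC_": "GO:CC",
    "GOMF_": "GO:MF",
}


def go_subcollection(gene_set_name):
    """Return the GO sub-collection implied by a gene set name's prefix,
    or None if the name carries none of the new prefixes."""
    for prefix, subcollection in GO_PREFIXES.items():
        if gene_set_name.startswith(prefix):
            return subcollection
    return None


# Illustrative names only; the last one uses the old universal prefix.
for name in ["GOBP_EXAMPLE_PROCESS", "GOMF_EXAMPLE_ACTIVITY", "GO_EXAMPLE_TERM"]:
    print(name, "->", go_subcollection(name))
```

Names that still carry the old "GO_" prefix return None here, since that prefix alone does not identify a sub-collection.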

These updates were generated in accordance with the procedure described in the GO release notes for MSigDB 7.0.

C5:HPO (Human Phenotype Ontology)

Gene sets in this sub-collection have been updated to reflect the 2021-02-08 release of the Human Phenotype Ontology database. This sub-collection has been redundancy filtered through a procedure comparable to that used for the GO and Reactome sub-collections.

CHIP file updates

All CHIP files previously provided in the standard MSigDB 7.2 release have been updated for MSigDB 7.3 in accordance with previously described procedures.

Gene orthology annotations for mapping mouse and rat genes to their best match human orthologs have been updated to Alliance of Genome Resources orthology database release 3.2.
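
To give a sense of what remapping with a CHIP file involves, the sketch below maps dataset identifiers to gene symbols and collapses duplicates. It is not GSEA's implementation. It assumes a tab-delimited CHIP layout with "Probe Set ID" and "Gene Symbol" columns, and it collapses by taking the maximum value, which is only one possible choice. All identifiers and values are made up.

```python
import csv
import io


def read_chip(handle):
    """Read a tab-delimited CHIP table into {identifier: gene symbol}.
    Assumes 'Probe Set ID' and 'Gene Symbol' column headers."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["Probe Set ID"]: row["Gene Symbol"] for row in reader}


def collapse_to_symbols(expression, chip):
    """Remap identifiers to symbols, keeping the maximum value when several
    identifiers share a symbol. Unmapped identifiers are dropped."""
    collapsed = {}
    for identifier, value in expression.items():
        symbol = chip.get(identifier)
        if symbol is None:
            continue
        collapsed[symbol] = max(value, collapsed.get(symbol, value))
    return collapsed


# Hypothetical CHIP content and expression values, for illustration only.
chip_text = (
    "Probe Set ID\tGene Symbol\tGene Title\n"
    "OLD_A\tNEW_A\texample gene A\n"
    "ALIAS_A\tNEW_A\texample gene A\n"
    "GENE_B\tGENE_B\texample gene B\n"
)
chip = read_chip(io.StringIO(chip_text))
expression = {"OLD_A": 2.0, "ALIAS_A": 5.0, "GENE_B": 1.5, "UNMAPPED": 9.0}
collapsed = collapse_to_symbols(expression, chip)
print(collapsed)
```

In this toy input, the two identifiers that map to NEW_A are merged into a single entry and the identifier missing from the CHIP table is discarded.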