Difference between revisions of "MSigDB v2023.1.Hs Release Notes"

From GeneSetEnrichmentAnalysisWiki
(Created page with '<span class="plainlinks"> [http://www.broadinstitute.org/gsea/ GSEA Home] | [http://www.broadinstitute.org/gsea/downloads.jsp Downloads] | [http://www.broadinstitute.org/gsea/ms…')
 
 
(5 intermediate revisions by one other user not shown)
Line 11: Line 11:
 
This page describes updates made to the Molecular Signatures Database Human Collections for release 2023.1 (MSigDB 2023.1.Hs).
 
This page describes updates made to the Molecular Signatures Database Human Collections for release 2023.1 (MSigDB 2023.1.Hs).
  
'''In order to access the MSigBD mouse collections through the GSEA UI, the latest version of GSEA (4.3+) is required.'''
+
'''To access the MSigDB mouse collections through the GSEA UI, GSEA 4.3.0 or newer is required.'''
  
 
MSigDB v2023.1 is based on gene annotation data from Ensembl Release 109 (Feb 2023).
 
MSigDB v2023.1 is based on gene annotation data from Ensembl Release 109 (Feb 2023).
  
  
<h1>Updates to Human Collections (MSigDB v2022.1.Hs)</h1>
+
<h1>Updates to Human Collections (MSigDB v2023.1.Hs)</h1>
  
 
<h2>C1: positional gene sets</h2>
 
<h2>C1: positional gene sets</h2>
Updated human gene annotations to Ensembl 107.
+
Updated human gene annotations to Ensembl 109 (+1 gene set).
 
<h2>C2:CGP</h2>
 
<h2>C2:CGP</h2>
  
15 Gene sets contributed by MSigDB users have been added to C2:CGP
+
Six gene sets contributed by MSigDB users have been added to C2:CGP:
 
<ul>
 
<ul>
     <li><span class="plainlinks">[https://gsea-msigdb.org/gsea/msigdb/human/geneset/]</span></li>
+
     <li><span class="plainlinks">[https://gsea-msigdb.org/gsea/msigdb/human/geneset/SAUL_SEN_MAYO SAUL_SEN_MAYO]</span></li>
 +
    <li><span class="plainlinks">[https://gsea-msigdb.org/gsea/msigdb/human/geneset/MA_RAT_AGING_UP MA_RAT_AGING_UP]</span></li>
 +
    <li><span class="plainlinks">[https://gsea-msigdb.org/gsea/msigdb/human/geneset/MA_RAT_AGING_DN MA_RAT_AGING_DN]</span></li>
 +
    <li><span class="plainlinks">[https://gsea-msigdb.org/gsea/msigdb/human/geneset/NOURUZI_NEPC_ASCL1_TARGETS NOURUZI_NEPC_ASCL1_TARGETS]</span></li>
 +
    <li><span class="plainlinks">[https://gsea-msigdb.org/gsea/msigdb/human/geneset/KOHN_EMT_EPITHELIAL KOHN_EMT_EPITHELIAL]</span></li>
 +
    <li><span class="plainlinks">[https://gsea-msigdb.org/gsea/msigdb/human/geneset/KOHN_EMT_MESENCHYMAL KOHN_EMT_MESENCHYMAL]</span></li>
 
</ul>
 
</ul>
 
<br>
 
<br>
Line 31: Line 36:
  
 
<ul>
 
<ul>
     <li>Reactome gene sets have been updated to reflect the state of the Reactome pathway architecture as of '''Reactome v83''' (+XX gene sets).</li>
+
     <li>Reactome gene sets have been updated to reflect the state of the Reactome pathway architecture as of '''Reactome v83''' (+19 gene sets).</li>
 
     <li>As previously described in the [[MSigDB_v7.0_Release_Notes#C2:CP:Reactome_-_Major_overhaul | Reactome release notes for MSigDB 7.0]], in order to limit redundancy between gene sets within the Reactome sub-collection we applied a filtering procedure based on Jaccard coefficients and distance from the top level of the Reactome event hierarchy.</li>
 
     <li>As previously described in the [[MSigDB_v7.0_Release_Notes#C2:CP:Reactome_-_Major_overhaul | Reactome release notes for MSigDB 7.0]], in order to limit redundancy between gene sets within the Reactome sub-collection we applied a filtering procedure based on Jaccard coefficients and distance from the top level of the Reactome event hierarchy.</li>
 
</ul>
 
</ul>
  
 
<h2>C2:CP:WikiPathways</h2>
 
<h2>C2:CP:WikiPathways</h2>
WikiPathways gene sets have been updated to the February 10, 2023 release (+XX gene sets).
+
WikiPathways gene sets have been updated to the February 10, 2023 release (+21 gene sets).
  
 
<h2>C3:TFT:GTRD</h2>
 
<h2>C3:TFT:GTRD</h2>
<p>GTRD data was updated to the 21.12 release.</p>
+
<p>GTRD data was updated to the 21.12 release (-12 gene sets).</p>
  
 
<h2>C5:GO (Gene Ontology)</h2>
 
<h2>C5:GO (Gene Ontology)</h2>
Line 46: Line 51:
 
<p>This collection is divided into three sub-collections:</p>
 
<p>This collection is divided into three sub-collections:</p>
 
<ul>
 
<ul>
     <li><strong>BP</strong>: GO Biological process (+XX gene sets). Gene sets derived from the Biological Process Ontology, and are prefixed with "GOBP_".</li>
+
     <li><strong>BP</strong>: GO Biological process (-12 gene sets). Gene sets derived from the Biological Process Ontology, which are prefixed with "GOBP_".</li>
     <li><strong>CC</strong>: GO Cellular component (+XX gene sets). Gene sets derived from the Cellular Component Ontology, and are prefixed with "GOCC_".</li>
+
     <li><strong>CC</strong>: GO Cellular component (-26 gene sets). Gene sets derived from the Cellular Component Ontology, which are prefixed with "GOCC_".</li>
     <li><strong>MF</strong>: GO Molecular function (+XX gene sets). Gene sets derived from the Molecular Function Ontology, and are prefixed with "GOMF_"..</li>
+
     <li><strong>MF</strong>: GO Molecular function (+9 gene sets). Gene sets derived from the Molecular Function Ontology, which are prefixed with "GOMF_".</li>
 
</ul>
 
</ul>
  
Line 55: Line 60:
 
<h2>C5:HPO (Human Phenotype Ontology)</h2>
 
<h2>C5:HPO (Human Phenotype Ontology)</h2>
  
Gene sets in this sub-collection have been updated to reflect the 2023-01-27 release of the Human Phenotype Ontology database (+XX gene sets). This sub-collection has been redundancy filtered through a procedure comparable to that of the GO and Reactome sub-collections.
+
Gene sets in this sub-collection have been updated to reflect the 2023-01-27 release of the Human Phenotype Ontology database (+263 gene sets). This sub-collection has been redundancy filtered through a procedure comparable to that of the GO and Reactome sub-collections.
  
 
<h2>C8 cell type signature gene sets</h2>
 
<h2>C8 cell type signature gene sets</h2>
<p>Added gene sets describing lung cell type identity signatures from <span class="plainlinks">[https://pubmed.ncbi.nlm.nih.gov/36493756/ He P., Lim K., et al. 2022 A human fetal lung cell atlas uncovers proximal-distal gradients of differentiation and key regulators of epithelial fates.] <span class="plainlinks">[(https://lungcellatlas.org)]</span> (+XX gene sets) </p>
+
<p>Added gene sets describing lung cell type identity signatures from <span class="plainlinks">[https://pubmed.ncbi.nlm.nih.gov/36493756/ He P., Lim K., et al. 2022 A human fetal lung cell atlas uncovers proximal-distal gradients of differentiation and key regulators of epithelial fates.]</span> <span class="plainlinks">(https://lungcellatlas.org)</span> (+126 gene sets)</p>
 
   
 
   
 
<h2>CHIP file updates</h2>
 
<h2>CHIP file updates</h2>
Line 65: Line 70:
 
     <li>Gene orthology annotations for mapping mouse and rat genes to their best match human orthologs have been updated to <span class="plainlinks">[https://www.alliancegenome.org/ Alliance of Genome Resources]</span> orthology database release 5.3.0 (2022-10-28)</li>
 
     <li>Gene orthology annotations for mapping mouse and rat genes to their best match human orthologs have been updated to <span class="plainlinks">[https://www.alliancegenome.org/ Alliance of Genome Resources]</span> orthology database release 5.3.0 (2022-10-28)</li>
 
</ul>
 
</ul>
 +
 +
<h2>SQLite Database</h2>
 +
<p>With this release we have created a new SQLite database for the fully annotated gene sets in both the Human (2023.1.Hs) and the Mouse (2023.1.Ms) resources. Each ships as a single-file database usable with any compliant SQLite client. This new format exposes the MSigDB contents and metadata with the searchability and query power of a full relational database. See our [[MSigDB_SQLite_Database|documentation]] for more details on the contents and usage.</p>
 +
<p>Note that we will continue producing the XML file for now, but it should be considered deprecated, and we intend to remove it entirely in a future release.</p>

Latest revision as of 17:18, 6 April 2023


Important Notices

This page describes updates made to the Molecular Signatures Database Human Collections for release 2023.1 (MSigDB 2023.1.Hs).

To access the MSigDB mouse collections through the GSEA UI, GSEA 4.3.0 or newer is required.

MSigDB v2023.1 is based on gene annotation data from Ensembl Release 109 (Feb 2023).


Updates to Human Collections (MSigDB v2023.1.Hs)

C1: positional gene sets

Updated human gene annotations to Ensembl 109 (+1 gene set).

C2:CGP

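Six gene sets contributed by MSigDB users have been added to C2:CGP:

  • SAUL_SEN_MAYO
  • MA_RAT_AGING_UP
  • MA_RAT_AGING_DN
  • NOURUZI_NEPC_ASCL1_TARGETS
  • KOHN_EMT_EPITHELIAL
  • KOHN_EMT_MESENCHYMAL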

C2:CP:Reactome

  • Reactome gene sets have been updated to reflect the state of the Reactome pathway architecture as of Reactome v83 (+19 gene sets).
  • As previously described in the Reactome release notes for MSigDB 7.0, in order to limit redundancy between gene sets within the Reactome sub-collection we applied a filtering procedure based on Jaccard coefficients and distance from the top level of the Reactome event hierarchy.
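
For readers unfamiliar with the Jaccard coefficient mentioned above, the following is a minimal sketch of how the overlap between two gene sets can be scored. It is not the MSigDB filtering procedure itself, which also takes the distance from the top of the Reactome event hierarchy into account. The function name and the gene identifiers are hypothetical and chosen only for illustration.

```python
def jaccard(a, b):
    """Jaccard coefficient: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets share nothing; treat the coefficient as 0.0 by convention.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical gene sets (placeholder identifiers, not real MSigDB content).
parent_set = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
child_set = {"GENE_A", "GENE_B", "GENE_C"}

# Three shared genes out of four distinct genes overall.
print(jaccard(parent_set, child_set))  # prints 0.75
```

A coefficient close to 1 indicates two nearly identical gene sets, which is the kind of redundancy a filter of this type is meant to detect.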

C2:CP:WikiPathways

WikiPathways gene sets have been updated to the February 10, 2023 release (+21 gene sets).

C3:TFT:GTRD

GTRD data was updated to the 21.12 release (-12 gene sets).

C5:GO (Gene Ontology)

Gene sets in these sub-collections are derived from the controlled vocabulary of the Gene Ontology (GO) project: The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology (Nature Genet 2000). The gene sets are named by GO term and contain genes annotated by that term. This collection has been updated to the most recent GO annotations as present in the GO-basic obo file released on 2023-01-01 and NCBI gene2go annotations downloaded on 2023-02-10.

This collection is divided into three sub-collections:

  • BP: GO Biological process (-12 gene sets). Gene sets derived from the Biological Process Ontology, which are prefixed with "GOBP_".
  • CC: GO Cellular component (-26 gene sets). Gene sets derived from the Cellular Component Ontology, which are prefixed with "GOCC_".
  • MF: GO Molecular function (+9 gene sets). Gene sets derived from the Molecular Function Ontology, which are prefixed with "GOMF_".

These updates were generated in accordance with the procedure described in the GO release notes for MSigDB 7.0.
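
Because each GO sub-collection uses a fixed name prefix, a mixed list of gene set names can be split by sub-collection with simple string matching. The sketch below assumes only the three prefixes stated above. The function name and the example gene set names are hypothetical placeholders.

```python
from collections import defaultdict

# Prefixes as stated in these release notes, mapped to sub-collection labels.
GO_PREFIXES = {"GOBP_": "BP", "GOCC_": "CC", "GOMF_": "MF"}


def split_by_subcollection(names):
    """Group gene set names by GO sub-collection; names without a GO prefix are skipped."""
    groups = defaultdict(list)
    for name in names:
        for prefix, label in GO_PREFIXES.items():
            if name.startswith(prefix):
                groups[label].append(name)
                break
    return dict(groups)


# Hypothetical names used only to exercise the function.
example_names = [
    "GOBP_EXAMPLE_PROCESS",
    "GOCC_EXAMPLE_COMPONENT",
    "GOMF_EXAMPLE_FUNCTION",
    "OTHER_EXAMPLE_SET",
]
print(split_by_subcollection(example_names))
```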

C5:HPO (Human Phenotype Ontology)

Gene sets in this sub-collection have been updated to reflect the 2023-01-27 release of the Human Phenotype Ontology database (+263 gene sets). This sub-collection has been redundancy filtered through a procedure comparable to that of the GO and Reactome sub-collections.

C8 cell type signature gene sets

Added gene sets describing lung cell type identity signatures from He P., Lim K., et al. 2022 A human fetal lung cell atlas uncovers proximal-distal gradients of differentiation and key regulators of epithelial fates. (https://lungcellatlas.org) (+126 gene sets)

CHIP file updates

  • MSigDB 2023.1.Hs gene annotations and gene mapping CHIP files have been updated to data from Ensembl 109.
  • Gene orthology annotations for mapping mouse and rat genes to their best match human orthologs have been updated to Alliance of Genome Resources orthology database release 5.3.0 (2022-10-28).

SQLite Database

With this release we have created a new SQLite database for the fully annotated gene sets in both the Human (2023.1.Hs) and the Mouse (2023.1.Ms) resources. Each ships as a single-file database usable with any compliant SQLite client. This new format exposes the MSigDB contents and metadata with the searchability and query power of a full relational database. See our documentation for more details on the contents and usage.

Note that we will continue producing the XML file for now, but it should be considered deprecated, and we intend to remove it entirely in a future release.
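
As a starting point for exploring the SQLite database, the sketch below lists the tables in a single-file SQLite database using Python's standard sqlite3 module. It makes no assumptions about the MSigDB schema, which is described in the linked documentation. The function name is hypothetical, and the file path is whatever name the downloaded database has on disk.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file, sorted alphabetically."""
    # Note: sqlite3.connect creates an empty file if db_path does not exist,
    # so check the path first when pointing this at a real download.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Example call, assuming the human database was saved under a local name of your choosing:
# print(list_tables("path/to/downloaded_msigdb_file.db"))
```

Once the table names are known, the same connection pattern can be used to run ordinary SQL queries against them.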