Difference between revisions of "MSigDB XML description"

From GeneSetEnrichmentAnalysisWiki
Line 38: Line 38:
 
     <tbody>
 
     <tbody>
 
         <tr>
 
         <tr>
             <th>XML ATTRIBUTE         </th>
+
             <th>XML ATTRIBUTE</th>
             <th>DESCRIPTION                               </th>
+
             <th>DESCRIPTION</th>
 
             <th>&nbsp;</th>
 
             <th>&nbsp;</th>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
 
             <td>STANDARD_NAME </td>
 
             <td>STANDARD_NAME </td>
             <td>Gene set name.                                                                                                                                  </td>
+
             <td>Gene set name</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>SYSTEMATIC_NAME                 </td>
+
             <td>SYSTEMATIC_NAME</td>
             <td>Gene set name for internal indexing purposes.                                                                              </td>
+
             <td>Gene set name for internal indexing purposes</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>HISTORICAL_NAMES                 </td>
+
             <td>HISTORICAL_NAMES</td>
             <td>Comma-separated list of older gene set names, starting from VERSION=&quot;V.2.5&quot;  of MSigDB.</td>
+
             <td>Comma-separated list of older gene set names, starting from VERSION=&quot;V.2.5&quot;  of MSigDB</td>
 
             <td>optional</td>
 
             <td>optional</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>ORGANISM                                 </td>
+
             <td>ORGANISM</td>
             <td>Organism name.                                                                                                                                    </td>
+
             <td>Organism name</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>PMID                                           </td>
+
             <td>PMID</td>
             <td>PubMed ID for the source publication.                                                                                            </td>
+
             <td>PubMed ID for the source publication</td>
 
             <td>optional</td>
 
             <td>optional</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
 
             <td>AUTHORS                                  </td>
 
             <td>AUTHORS                                  </td>
             <td>Authors of the gene set source publication, according to PubMed ID.                                      </td>
+
             <td>Authors of the gene set source publication, according to PubMed ID</td>
 
             <td>optional</td>
 
             <td>optional</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
 
             <td>GEOID</td>
 
             <td>GEOID</td>
             <td>A GEO or ArrayExpress ID for the raw microarray data in GEO or ArrayExpress repository.</td>
+
             <td>A GEO or ArrayExpress ID for the raw microarray data in GEO or ArrayExpress repository</td>
 
             <td>optional</td>
 
             <td>optional</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>GENESET_LISTING_URL         </td>
+
             <td>GENESET_LISTING_URL</td>
             <td>URL of the original source that listed the gene set members.</td>
+
             <td>URL of the original source that listed the gene set members</td>
 
             <td>optional</td>
 
             <td>optional</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>EXTERNAL_DETAILS_URL       </td>
+
             <td>EXTERNAL_DETAILS_URL</td>
             <td>URL of the original source page of the gene set.                                                                            </td>
+
             <td>URL of the original source page of the gene set</td>
 
             <td>optional</td>
 
             <td>optional</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
 
             <td>CHIP                                            </td>
 
             <td>CHIP                                            </td>
             <td>Indicates the type of the original gene set members, equivalent to the CHIP file, e.g., &quot;HG-U133A&quot;</td>
+
             <td>Indicates the type of the original gene set members, equivalent to the CHIP file, e.g., &quot;HG-U133A&quot;</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>CATEGORY_CODE                     </td>
+
             <td>CATEGORY_CODE</td>
             <td>Gene set collection code, e.g., C2.                                                                                                        </td>
+
             <td>Gene set collection code, e.g., C2</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
 
             <td>SUB_CATEGORY_CODE</td>
 
             <td>SUB_CATEGORY_CODE</td>
             <td>Gene set subcategory code, e.g., CGP.</td>
+
             <td>Gene set subcategory code, e.g., CGP</td>
 
             <td>optional</td>
 
             <td>optional</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
 
             <td>CONTRIBUTOR                          </td>
 
             <td>CONTRIBUTOR                          </td>
             <td>Name of the person or institution that contributed the gene set to MSigDB.                                    </td>
+
             <td>Name of the person or institution that contributed the gene set to MSigDB</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
 
             <td>CONTRIBUTOR_ORG                </td>
 
             <td>CONTRIBUTOR_ORG                </td>
             <td>Name of the organization associated with the gene set contributor.                                                  </td>
+
             <td>Name of the organization associated with the gene set contributor</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>DESCRIPTION_BRIEF                 </td>
+
             <td>DESCRIPTION_BRIEF</td>
             <td>Brief description of the gene set.                                                                                                      </td>
+
             <td>Brief description of the gene set</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>DESCRIPTION_FULL                 </td>
+
             <td>DESCRIPTION_FULL</td>
             <td>Full description of the gene set or abstract of the source publication.                                        </td>
+
             <td>Full description of the gene set or abstract of the source publication</td>
 
             <td>optional</td>
 
             <td>optional</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>TAGS                                           </td>
+
             <td>TAGS</td>
             <td>Optional tags to enhance gene set annotations; currently not in use.                                        </td>
+
             <td>Optional tags to enhance gene set annotations; currently not in use</td>
 
             <td>optional</td>
 
             <td>optional</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>MEMBERS                                   </td>
+
             <td>MEMBERS</td>
             <td>Comma-separated list of gene set members as they originally appeared in the source.</td>
+
             <td>Comma-separated list of gene set members as they originally appeared in the source</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>MEMBERS_SYMBOLIZED           </td>
+
             <td>MEMBERS_SYMBOLIZED</td>
             <td>Comma-separated list of gene set members in the form of human gene symbols.                      </td>
+
             <td>Comma-separated list of gene set members in the form of human gene symbols</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
         <tr>
 
         <tr>
             <td>MEMBERS_EZID                         </td>
+
             <td>MEMBERS_EZID</td>
             <td>Comma-separated list of gene set members in the form of human Entrez Gene IDs                   </td>
+
             <td>Comma-separated list of gene set members in the form of human Entrez Gene IDs</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
Line 146: Line 146:
 
         <tr>
 
         <tr>
 
             <td>MEMBERS_MAPPING</td>
 
             <td>MEMBERS_MAPPING</td>
             <td>Pipe-separated list of mappings between gene set members in the form of:  <br />
+
             <td>Pipe-separated list of mappings between gene set members in the form of:  <br>
 
             MEMBERS, MEMBERS_SYMBOLIZED, MEMBERS_EZID</td>
 
             MEMBERS, MEMBERS_SYMBOLIZED, MEMBERS_EZID</td>
 
             <td>required</td>
 
             <td>required</td>
Line 157: Line 157:
 
         <tr>
 
         <tr>
 
             <td>REFINEMENT_DATASETS</td>
 
             <td>REFINEMENT_DATASETS</td>
             <td>Pipe-separated list of GEO or ArrayExpress identifiers of microarray data used to refine hallmark signatures<br />
+
             <td>Pipe-separated list of GEO or ArrayExpress identifiers of microarray data used to refine hallmark signatures<br>
 
             GEO or ArrayExpress ID, comparison details</td>
 
             GEO or ArrayExpress ID, comparison details</td>
 
             <td>applies to hallmarks only</td>
 
             <td>applies to hallmarks only</td>
Line 163: Line 163:
 
       <tr>
 
       <tr>
 
             <td>VALIDATION_DATASETS</td>
 
             <td>VALIDATION_DATASETS</td>
             <td>Pipe-separated list of GEO or ArrayExpress identifiers of microarray data used to validate hallmark signatures<br />
+
             <td>Pipe-separated list of GEO or ArrayExpress identifiers of microarray data used to validate hallmark signatures<br>
 
             GEO or ArrayExpress ID, comparison details |</td>
 
             GEO or ArrayExpress ID, comparison details |</td>
 
             <td>applies to hallmarks only</td>
 
             <td>applies to hallmarks only</td>
Line 169: Line 169:
 
       <tr>
 
       <tr>
 
             <td>STATUS</td>
 
             <td>STATUS</td>
             <td>Indicates gene set status. In the current release, all gene sets have their  STATUS=&quot;public&quot;.              </td>
+
             <td>Indicates gene set status. In the current release, all gene sets have their  STATUS=&quot;public&quot;</td>
 
             <td>required</td>
 
             <td>required</td>
 
         </tr>
 
         </tr>
 
     </tbody>
 
     </tbody>
 
</table>
 
</table>

Revision as of 10:29, 17 March 2015

The MSigDB database in XML format captures both the content (i.e., the member genes) and the annotations of the gene sets in a given release of MSigDB. This page describes the tags and attributes of the XML file.

Attributes of the MSIGDB tag document the whole database in the file.

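<table>
    <tbody>
        <tr>
            <th>XML ATTRIBUTE</th>
            <th>DESCRIPTION</th>
            <th>&nbsp;</th>
        </tr>
        <tr>
            <td>NAME</td>
            <td>Name of the database</td>
            <td>required</td>
        </tr>
        <tr>
            <td>VERSION</td>
            <td>Version of the database</td>
            <td>required</td>
        </tr>
        <tr>
            <td>BUILD_DATE</td>
            <td>Date the XML file was built</td>
            <td>required</td>
        </tr>
    </tbody>
</table>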
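
As a rough illustration of the MSIGDB attributes listed above, the following sketch reads them with Python's standard `xml.etree.ElementTree` module. Only the tag name and the three attribute names come from this page; the sample document, its values, and the helper name `read_database_info` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature file: the attribute values are made up for illustration.
SAMPLE = '<MSIGDB NAME="example_db" VERSION="v0.0" BUILD_DATE="Jan 01, 2000"></MSIGDB>'

# The three attributes this page marks as required on the MSIGDB tag.
REQUIRED_DB_ATTRIBUTES = ("NAME", "VERSION", "BUILD_DATE")


def read_database_info(xml_text):
    """Return the required MSIGDB attributes as a dict, or raise ValueError."""
    root = ET.fromstring(xml_text)
    if root.tag != "MSIGDB":
        raise ValueError(f"unexpected root tag: {root.tag}")
    missing = [key for key in REQUIRED_DB_ATTRIBUTES if key not in root.attrib]
    if missing:
        raise ValueError(f"missing required attributes: {missing}")
    return {key: root.attrib[key] for key in REQUIRED_DB_ATTRIBUTES}


info = read_database_info(SAMPLE)
print(info)
```

For a full release file, `ET.parse(path).getroot()` could replace `ET.fromstring`, assuming the file fits comfortably in memory.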



Attributes of the GENESET tags document individual gene sets in the file.

<table>
    <tbody>
        <tr>
            <th>XML ATTRIBUTE</th>
            <th>DESCRIPTION</th>
            <th>&nbsp;</th>
        </tr>
        <tr>
            <td>STANDARD_NAME</td>
            <td>Gene set name</td>
            <td>required</td>
        </tr>
        <tr>
            <td>SYSTEMATIC_NAME</td>
            <td>Gene set name for internal indexing purposes</td>
            <td>required</td>
        </tr>
        <tr>
            <td>HISTORICAL_NAMES</td>
            <td>Comma-separated list of older gene set names, starting from VERSION=&quot;V.2.5&quot; of MSigDB</td>
            <td>optional</td>
        </tr>
        <tr>
            <td>ORGANISM</td>
            <td>Organism name</td>
            <td>required</td>
        </tr>
        <tr>
            <td>PMID</td>
            <td>PubMed ID for the source publication</td>
            <td>optional</td>
        </tr>
        <tr>
            <td>AUTHORS</td>
            <td>Authors of the gene set source publication, according to PubMed ID</td>
            <td>optional</td>
        </tr>
        <tr>
            <td>GEOID</td>
            <td>A GEO or ArrayExpress ID for the raw microarray data in GEO or ArrayExpress repository</td>
            <td>optional</td>
        </tr>
        <tr>
            <td>GENESET_LISTING_URL</td>
            <td>URL of the original source that listed the gene set members</td>
            <td>optional</td>
        </tr>
        <tr>
            <td>EXTERNAL_DETAILS_URL</td>
            <td>URL of the original source page of the gene set</td>
            <td>optional</td>
        </tr>
        <tr>
            <td>CHIP</td>
            <td>Indicates the type of the original gene set members, equivalent to the CHIP file, e.g., &quot;HG-U133A&quot;</td>
            <td>required</td>
        </tr>
        <tr>
            <td>CATEGORY_CODE</td>
            <td>Gene set collection code, e.g., C2</td>
            <td>required</td>
        </tr>
        <tr>
            <td>SUB_CATEGORY_CODE</td>
            <td>Gene set subcategory code, e.g., CGP</td>
            <td>optional</td>
        </tr>
        <tr>
            <td>CONTRIBUTOR</td>
            <td>Name of the person or institution that contributed the gene set to MSigDB</td>
            <td>required</td>
        </tr>
        <tr>
            <td>CONTRIBUTOR_ORG</td>
            <td>Name of the organization associated with the gene set contributor</td>
            <td>required</td>
        </tr>
        <tr>
            <td>DESCRIPTION_BRIEF</td>
            <td>Brief description of the gene set</td>
            <td>required</td>
        </tr>
        <tr>
            <td>DESCRIPTION_FULL</td>
            <td>Full description of the gene set or abstract of the source publication</td>
            <td>optional</td>
        </tr>
        <tr>
            <td>TAGS</td>
            <td>Optional tags to enhance gene set annotations; currently not in use</td>
            <td>optional</td>
        </tr>
        <tr>
            <td>MEMBERS</td>
            <td>Comma-separated list of gene set members as they originally appeared in the source</td>
            <td>required</td>
        </tr>
        <tr>
            <td>MEMBERS_SYMBOLIZED</td>
            <td>Comma-separated list of gene set members in the form of human gene symbols</td>
            <td>required</td>
        </tr>
        <tr>
            <td>MEMBERS_EZID</td>
            <td>Comma-separated list of gene set members in the form of human Entrez Gene IDs</td>
            <td>required</td>
        </tr>
        <tr>
            <td>MEMBERS_MAPPING</td>
            <td>Pipe-separated list of mappings between gene set members in the form of:<br>
            MEMBERS, MEMBERS_SYMBOLIZED, MEMBERS_EZID</td>
            <td>required</td>
        </tr>
        <tr>
            <td>FOUNDER_NAMES</td>
            <td>Pipe-separated list of v4.0 MSigDB founder gene sets for the hallmark signatures</td>
            <td>applies to hallmarks only</td>
        </tr>
        <tr>
            <td>REFINEMENT_DATASETS</td>
            <td>Pipe-separated list of GEO or ArrayExpress identifiers of microarray data used to refine hallmark signatures<br>
            GEO or ArrayExpress ID, comparison details</td>
            <td>applies to hallmarks only</td>
        </tr>
        <tr>
            <td>VALIDATION_DATASETS</td>
            <td>Pipe-separated list of GEO or ArrayExpress identifiers of microarray data used to validate hallmark signatures<br>
            GEO or ArrayExpress ID, comparison details</td>
            <td>applies to hallmarks only</td>
        </tr>
        <tr>
            <td>STATUS</td>
            <td>Indicates gene set status. In the current release, all gene sets have their STATUS=&quot;public&quot;</td>
            <td>required</td>
        </tr>
    </tbody>
</table>
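
The GENESET attributes described above could be consumed along the following lines. This is a minimal sketch, not the official GSEA parser: the sample gene set, its probe names, symbols, and IDs are placeholders, the helper names are hypothetical, and it assumes each MEMBERS_MAPPING entry is a comma-separated triple in the order given on this page.

```python
import xml.etree.ElementTree as ET

# Hypothetical gene set: names and IDs are placeholders, not real MSigDB content.
SAMPLE = """
<MSIGDB NAME="example_db" VERSION="v0.0" BUILD_DATE="Jan 01, 2000">
  <GENESET STANDARD_NAME="EXAMPLE_SET" SYSTEMATIC_NAME="X0001"
           ORGANISM="Homo sapiens" CHIP="HG-U133A" CATEGORY_CODE="C2"
           CONTRIBUTOR="Example Person" CONTRIBUTOR_ORG="Example Org"
           DESCRIPTION_BRIEF="Placeholder set"
           MEMBERS="probe_1,probe_2"
           MEMBERS_SYMBOLIZED="GENEA,GENEB"
           MEMBERS_EZID="111,222"
           MEMBERS_MAPPING="probe_1,GENEA,111|probe_2,GENEB,222"
           STATUS="public"/>
</MSIGDB>
"""

# Attributes this page marks as "required" on every GENESET tag.
REQUIRED = (
    "STANDARD_NAME", "SYSTEMATIC_NAME", "ORGANISM", "CHIP", "CATEGORY_CODE",
    "CONTRIBUTOR", "CONTRIBUTOR_ORG", "DESCRIPTION_BRIEF", "MEMBERS",
    "MEMBERS_SYMBOLIZED", "MEMBERS_EZID", "MEMBERS_MAPPING", "STATUS",
)


def split_list(value, sep=","):
    """Split a separated attribute value, dropping empty items."""
    return [item.strip() for item in value.split(sep) if item.strip()]


def parse_gene_sets(xml_text):
    """Return one dict per GENESET tag, with list attributes already split."""
    root = ET.fromstring(xml_text.strip())
    gene_sets = []
    for node in root.iter("GENESET"):
        attrs = node.attrib
        missing = [key for key in REQUIRED if key not in attrs]
        if missing:
            raise ValueError(f"GENESET is missing required attributes: {missing}")
        # Assumption: each pipe-separated entry is "member,symbol,entrez_id".
        mapping = [
            tuple(part.strip() for part in entry.split(","))
            for entry in split_list(attrs["MEMBERS_MAPPING"], "|")
        ]
        gene_sets.append({
            "name": attrs["STANDARD_NAME"],
            "members": split_list(attrs["MEMBERS"]),
            "symbols": split_list(attrs["MEMBERS_SYMBOLIZED"]),
            "entrez_ids": split_list(attrs["MEMBERS_EZID"]),
            "mapping": mapping,
            # Optional attribute: absent in the sample, so this becomes [].
            "historical_names": split_list(attrs.get("HISTORICAL_NAMES", "")),
        })
    return gene_sets


gene_sets = parse_gene_sets(SAMPLE)
print(gene_sets[0]["symbols"])
print(gene_sets[0]["mapping"])
```

Optional attributes are read with `attrs.get(...)` so that gene sets lacking them still parse; the hallmark-only attributes (FOUNDER_NAMES, REFINEMENT_DATASETS, VALIDATION_DATASETS) could be split on `|` in the same way.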