Difference between revisions of "MSigDB XML description"
(17 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
− | MSigDB database in XML format captures both the content (i.e., gene members) and annotation about the gene sets in a given release of MSigDB. This page describes the tags and attributes of the XML file. | + | [http://www.broadinstitute.org/gsea/ GSEA Home] | |
− | <br/> | + | [http://www.broadinstitute.org/gsea/downloads.jsp Downloads] | |
− | <br/> | + | [http://www.broadinstitute.org/gsea/msigdb/ Molecular Signatures Database] | |
+ | [http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Main_Page Documentation] | | ||
+ | [http://www.broadinstitute.org/gsea/contact.jsp Contact] | ||
+ | <br> | ||
+ | The MSigDB database in XML format captures both the content (i.e., gene members) and annotation about the gene sets in a given release of MSigDB. This page describes the tags and attributes of the XML file. <br /> | ||
+ | <br /> | ||
<p>Attributes of the <strong>MSIGDB</strong> tag document the whole database in the file.</p> | <p>Attributes of the <strong>MSIGDB</strong> tag document the whole database in the file.</p> | ||
<table width="75%" cellspacing="2" cellpadding="5" border="2"> | <table width="75%" cellspacing="2" cellpadding="5" border="2"> | ||
− | |||
<tr> | <tr> | ||
− | <th>XML ATTRIBUTE | + | <th>XML ATTRIBUTE</th> |
− | <th>DESCRIPTION | + | <th>DESCRIPTION</th> |
<th> </th> | <th> </th> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td> MSIGDB NAME</td> | <td> MSIGDB NAME</td> | ||
− | <td>Name of the database | + | <td>Name of the database</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>VERSION | + | <td>VERSION</td> |
− | <td>Version of the database | + | <td>Version of the database</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>BUILD_DATE | + | <td>BUILD_DATE</td> |
<td>Date the XML file was built</td> | <td>Date the XML file was built</td> | ||
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
− | |||
</table> | </table> | ||
<br /> | <br /> | ||
Line 31: | Line 34: | ||
<p>Attributes of the <strong>GENESET</strong> tags document individual gene sets in the file.</p> | <p>Attributes of the <strong>GENESET</strong> tags document individual gene sets in the file.</p> | ||
<table width="75%" cellspacing="2" cellpadding="5" border="2"> | <table width="75%" cellspacing="2" cellpadding="5" border="2"> | ||
− | |||
<tr> | <tr> | ||
− | <th>XML ATTRIBUTE | + | <th>XML ATTRIBUTE</th> |
− | <th>DESCRIPTION | + | <th>DESCRIPTION</th> |
<th> </th> | <th> </th> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td>STANDARD_NAME </td> | <td>STANDARD_NAME </td> | ||
− | <td>Gene set name | + | <td>Gene set name</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>SYSTEMATIC_NAME | + | <td>SYSTEMATIC_NAME</td> |
− | <td>Gene set name for internal indexing purposes | + | <td>Gene set name for internal indexing purposes</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>HISTORICAL_NAMES | + | <td>HISTORICAL_NAMES</td> |
<td>Comma-separated list of older gene set names, starting from VERSION="V.2.5" of MSigDB</td> | <td>Comma-separated list of older gene set names, starting from VERSION="V.2.5" of MSigDB</td> | ||
<td>optional</td> | <td>optional</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>ORGANISM | + | <td>ORGANISM</td> |
− | <td>Organism name | + | <td>Organism name</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>PMID | + | <td>PMID</td> |
− | <td>PubMed ID for the source publication | + | <td>PubMed ID for the source publication</td> |
<td>optional</td> | <td>optional</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>AUTHORS | + | <td>AUTHORS</td> |
− | <td>Authors of the gene set source publication, according to PubMed ID | + | <td>Authors of the gene set source publication, according to PubMed ID</td> |
<td>optional</td> | <td>optional</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
<td>GEOID</td> | <td>GEOID</td> | ||
− | <td>A GEO or ArrayExpress ID for the raw microarray data in GEO or ArrayExpress repository | + | <td>A GEO or ArrayExpress ID for the raw microarray data in GEO or ArrayExpress repository</td> |
<td>optional</td> | <td>optional</td> | ||
</tr> | </tr> | ||
+ | <tr> | ||
+ | <td>EXACT_SOURCE</td> | ||
+ | <td>Description of the exact source of the set - usually a specific figure or table in the source publication.</td> | ||
+ | <td>optional</td> | ||
+ | </tr> | ||
<tr> | <tr> | ||
− | <td> | + | <td>GENESET_LISTING_URL</td> |
− | <td>URL of the original source that listed the gene set members | + | <td>URL of the original source that listed the gene set members</td> |
<td>optional</td> | <td>optional</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>EXTERNAL_DETAILS_URL | + | <td>EXTERNAL_DETAILS_URL</td> |
− | <td>URL of the original source page of the gene set | + | <td>URL of the original source page of the gene set</td> |
<td>optional</td> | <td>optional</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>CHIP | + | <td>CHIP</td> |
− | <td>Indicates the type of the original gene set members, equivalent to the CHIP file, e.g., "HG-U133A" | + | <td>Indicates the type of the original gene set members, equivalent to the CHIP file, e.g., "HG-U133A"</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>CATEGORY_CODE | + | <td>CATEGORY_CODE</td> |
− | <td>Gene set collection code, e.g., C2 | + | <td>Gene set collection code, e.g., C2</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
Line 98: | Line 105: | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>CONTRIBUTOR | + | <td>CONTRIBUTOR</td> |
− | <td>Name of the person or institution that contributed the gene set to MSigDB | + | <td>Name of the person or institution that contributed the gene set to MSigDB</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>CONTRIBUTOR_ORG | + | <td>CONTRIBUTOR_ORG</td> |
− | <td>Name of the organization associated with the gene set contributor | + | <td>Name of the organization associated with the gene set contributor</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>DESCRIPTION_BRIEF | + | <td>DESCRIPTION_BRIEF</td> |
− | <td>Brief description of the gene set | + | <td>Brief description of the gene set</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>DESCRIPTION_FULL | + | <td>DESCRIPTION_FULL</td> |
− | <td>Full description of the gene set or abstract of the source publication | + | <td>Full description of the gene set or abstract of the source publication</td> |
<td>optional</td> | <td>optional</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>TAGS | + | <td>TAGS</td> |
− | <td>Optional tags to enhance gene set annotations; currently not in use | + | <td>Optional tags to enhance gene set annotations; currently not in use</td> |
<td>optional</td> | <td>optional</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>MEMBERS | + | <td>MEMBERS</td> |
− | <td>Comma-separated list of gene set members as they originally appeared in the source | + | <td>Comma-separated list of gene set members as they originally appeared in the source</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>MEMBERS_SYMBOLIZED | + | <td>MEMBERS_SYMBOLIZED</td> |
− | <td>Comma-separated list of gene set members in the form of human gene symbols | + | <td>Comma-separated list of gene set members in the form of human gene symbols</td> |
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
<tr> | <tr> | ||
− | <td>MEMBERS_EZID | + | <td>MEMBERS_EZID</td> |
− | <td>Comma-separated list of gene set members in the form of human Entrez Gene IDs | + | <td>Comma-separated list of gene set members in the form of human Entrez Gene IDs</td> |
+ | <td>required</td> | ||
+ | </tr> | ||
+ | <tr> | ||
+ | </tr> | ||
+ | <tr> | ||
+ | <td>MEMBERS_MAPPING</td> | ||
+ | <td>Pipe-separated list of mappings between gene set members in the form of: <br> | ||
+ | MEMBERS, MEMBERS_SYMBOLIZED, MEMBERS_EZID</td> | ||
+ | <td>required</td> | ||
+ | </tr> | ||
+ | <tr> | ||
+ | <td>FOUNDER_NAMES</td> | ||
+ | <td>Pipe-separated list of v4.0 MSigDB founder gene sets for the hallmark signatures</td> | ||
+ | <td>applies to hallmarks only</td> | ||
+ | </tr> | ||
+ | <tr> | ||
+ | <td>REFINEMENT_DATASETS</td> | ||
+ | <td>Pipe-separated list of GEO or ArrayExpress identifiers of microarray data used to refine hallmark signatures<br> | ||
+ | GEO or ArrayExpress ID, comparison details</td> | ||
+ | <td>applies to hallmarks only</td> | ||
+ | </tr> | ||
+ | <tr> | ||
+ | <td>VALIDATION_DATASETS</td> | ||
+ | <td>Pipe-separated list of GEO or ArrayExpress identifiers of microarray data used to validate hallmark signatures<br> | ||
+ | GEO or ArrayExpress ID, comparison details</td> | ||
+ | <td>applies to hallmarks only</td> | ||
+ | </tr> | ||
+ | <tr> | ||
+ | <td>STATUS</td> | ||
+ | <td>Indicates gene set status. In the current release, all gene sets have their STATUS="public"</td> | ||
<td>required</td> | <td>required</td> | ||
</tr> | </tr> | ||
− | |||
</table> | </table> |
Latest revision as of 09:49, 2 May 2017
The MSigDB database in XML format captures both the content (i.e., gene members) and annotation about the gene sets in a given release of MSigDB. This page describes the tags and attributes of the XML file.
Attributes of the MSIGDB tag document the whole database in the file.
XML ATTRIBUTE | DESCRIPTION | REQUIREMENT |
---|---|---|
MSIGDB NAME | Name of the database | required |
VERSION | Version of the database | required |
BUILD_DATE | Date the XML file was built | required |
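
As a minimal sketch of how the database-level attributes above might be read, the following uses Python's standard `xml.etree.ElementTree`. The sample document and its values are made up for illustration, and the sketch assumes that the row listed as "MSIGDB NAME" corresponds to a `NAME` attribute on the `MSIGDB` element; check this against an actual release file before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the values are invented for illustration.
SAMPLE = '<MSIGDB NAME="example_db" VERSION="0.0" BUILD_DATE="Jan 1, 2000"></MSIGDB>'

# Attributes the table marks as required. "NAME" is an assumption based on
# the "MSIGDB NAME" row (tag name followed by attribute name).
REQUIRED_DB_ATTRIBUTES = ("NAME", "VERSION", "BUILD_DATE")


def read_database_info(xml_text):
    """Return the required MSIGDB attributes as a dict; raise if any is missing."""
    root = ET.fromstring(xml_text)
    if root.tag != "MSIGDB":
        raise ValueError(f"expected MSIGDB root element, found {root.tag!r}")
    missing = [name for name in REQUIRED_DB_ATTRIBUTES if name not in root.attrib]
    if missing:
        raise ValueError(f"missing required attributes: {missing}")
    return {name: root.attrib[name] for name in REQUIRED_DB_ATTRIBUTES}


info = read_database_info(SAMPLE)
print(info)
```

Checking the required attributes up front makes a truncated or mismatched file fail early rather than partway through processing the gene sets.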
Attributes of the GENESET tags document individual gene sets in the file.
XML ATTRIBUTE | DESCRIPTION | REQUIREMENT |
---|---|---|
STANDARD_NAME | Gene set name | required |
SYSTEMATIC_NAME | Gene set name for internal indexing purposes | required |
HISTORICAL_NAMES | Comma-separated list of older gene set names, starting from VERSION="V.2.5" of MSigDB | optional |
ORGANISM | Organism name | required |
PMID | PubMed ID for the source publication | optional |
AUTHORS | Authors of the gene set source publication, according to PubMed ID | optional |
GEOID | A GEO or ArrayExpress ID for the raw microarray data in GEO or ArrayExpress repository | optional |
EXACT_SOURCE | Description of the exact source of the set - usually a specific figure or table in the source publication. | optional |
GENESET_LISTING_URL | URL of the original source that listed the gene set members | optional |
EXTERNAL_DETAILS_URL | URL of the original source page of the gene set | optional |
CHIP | Indicates the type of the original gene set members, equivalent to the CHIP file, e.g., "HG-U133A" | required |
CATEGORY_CODE | Gene set collection code, e.g., C2 | required |
SUB_CATEGORY_CODE | Gene set subcategory code, e.g., CGP | optional |
CONTRIBUTOR | Name of the person or institution that contributed the gene set to MSigDB | required |
CONTRIBUTOR_ORG | Name of the organization associated with the gene set contributor | required |
DESCRIPTION_BRIEF | Brief description of the gene set | required |
DESCRIPTION_FULL | Full description of the gene set or abstract of the source publication | optional |
TAGS | Optional tags to enhance gene set annotations; currently not in use | optional |
MEMBERS | Comma-separated list of gene set members as they originally appeared in the source | required |
MEMBERS_SYMBOLIZED | Comma-separated list of gene set members in the form of human gene symbols | required |
MEMBERS_EZID | Comma-separated list of gene set members in the form of human Entrez Gene IDs | required |
MEMBERS_MAPPING | Pipe-separated list of mappings between gene set members in the form of: MEMBERS, MEMBERS_SYMBOLIZED, MEMBERS_EZID | required |
FOUNDER_NAMES | Pipe-separated list of v4.0 MSigDB founder gene sets for the hallmark signatures | applies to hallmarks only |
REFINEMENT_DATASETS | Pipe-separated list of GEO or ArrayExpress identifiers of microarray data used to refine hallmark signatures, in the form of: GEO or ArrayExpress ID, comparison details | applies to hallmarks only |
VALIDATION_DATASETS | Pipe-separated list of GEO or ArrayExpress identifiers of microarray data used to validate hallmark signatures, in the form of: GEO or ArrayExpress ID, comparison details | applies to hallmarks only |
STATUS | Indicates gene set status. In the current release, all gene sets have their STATUS="public" | required |
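
The list-valued GENESET attributes described above can be split back into Python lists. The sketch below is illustrative only: the sample gene set, probe names, and IDs are invented, and it assumes that `GENESET` elements sit under the `MSIGDB` root and that each pipe-separated `MEMBERS_MAPPING` entry is a comma-separated triple of original member, gene symbol, and Entrez Gene ID. The helper names `split_list`, `parse_mapping`, and `read_gene_sets` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; names and IDs are invented for illustration.
SAMPLE = """<MSIGDB NAME="example_db" VERSION="0.0" BUILD_DATE="Jan 1, 2000">
  <GENESET STANDARD_NAME="EXAMPLE_SET" SYSTEMATIC_NAME="M0000"
           ORGANISM="Homo sapiens" CATEGORY_CODE="C2"
           SUB_CATEGORY_CODE="CGP" STATUS="public"
           MEMBERS="probe_a,probe_b"
           MEMBERS_SYMBOLIZED="GENE1,GENE2"
           MEMBERS_EZID="101,102"
           MEMBERS_MAPPING="probe_a,GENE1,101|probe_b,GENE2,102"/>
</MSIGDB>"""


def split_list(value, sep=","):
    """Split a separator-delimited attribute value, dropping empty items."""
    return [item.strip() for item in value.split(sep) if item.strip()]


def parse_mapping(value):
    """Parse MEMBERS_MAPPING into (member, symbol, entrez_id) tuples.

    Assumes each pipe-separated entry holds exactly three comma-separated fields.
    """
    triples = []
    for entry in split_list(value, sep="|"):
        parts = [part.strip() for part in entry.split(",")]
        if len(parts) != 3:
            raise ValueError(f"unexpected mapping entry: {entry!r}")
        triples.append(tuple(parts))
    return triples


def read_gene_sets(xml_text):
    """Return a dict keyed by STANDARD_NAME with a few parsed attributes."""
    root = ET.fromstring(xml_text)
    gene_sets = {}
    for element in root.iter("GENESET"):
        name = element.attrib["STANDARD_NAME"]  # required attribute
        gene_sets[name] = {
            "category": element.attrib["CATEGORY_CODE"],  # required
            "subcategory": element.get("SUB_CATEGORY_CODE"),  # optional, may be None
            "symbols": split_list(element.attrib["MEMBERS_SYMBOLIZED"]),
            "mapping": parse_mapping(element.get("MEMBERS_MAPPING", "")),
        }
    return gene_sets


gene_sets = read_gene_sets(SAMPLE)
for name, record in gene_sets.items():
    print(name, record["category"], record["symbols"])
```

Required attributes are read with `element.attrib[...]` so a missing one raises immediately, while optional ones use `element.get(...)`. For a full release file, `ET.iterparse` may be preferable to loading the whole document into memory at once.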