Gsea enhancements

From GeneSetEnrichmentAnalysisWiki
Revision as of 12:29, 24 April 2006 by Hkuehn

Notes

1. By default the hyperlinks from the software will point to the PROD server. To make them connect to the DEV server, use the -Ddebug=true flag.

2. Rejected/Revised bugs removed from this page: see GSEA_enh_history.doc.
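
As an illustration of Note 1, here is a minimal sketch of how a Java program can switch servers on the -Ddebug=true flag. The class name and both host addresses are placeholders invented for this example; the real PROD and DEV addresses and the actual GSEA code are not given in these notes.

```java
// Hypothetical helper, not taken from the GSEA source.
final class ServerSelector {
    // Placeholder hosts for illustration only.
    static final String PROD_BASE = "http://prod.example.org/";
    static final String DEV_BASE = "http://dev.example.org/";

    // Returns the DEV base URL when the JVM was started with -Ddebug=true,
    // and the PROD base URL otherwise.
    static String baseUrl() {
        // Boolean.getBoolean reads the system property named "debug".
        return Boolean.getBoolean("debug") ? DEV_BASE : PROD_BASE;
    }
}
```

With no flag set, the property is absent and the PROD address is returned, which matches the default described in Note 1.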


Beta testing 4/18:

1. Created a tiny dataset with 4 samples, then created a phenotype on the fly with 1 sample in ClassA and 3 in ClassB, and got the error below. If I create the phenotype on the fly with 2 samples in each class, life is good.

Nan hit score for feature: TACC2
---- Stack Trace ----
# of exceptions: 1
------Nan hit score for feature: TACC2------
java.lang.IllegalStateException: Nan hit score for feature: TACC2
    at edu.mit.broad.genome.alg.gsea.KSCore.calculateKSScore_all_modes(KSCore.java:137)
    at edu.mit.broad.genome.alg.gsea.KSCore.calculateKSScore(KSCore.java:46)
    at edu.mit.broad.genome.alg.gsea.KSTests.shuffleTemplate_canned_templates(KSTests.java:377)
    at edu.mit.broad.genome.alg.gsea.KSTests.shuffleTemplate(KSTests.java:292)
    at edu.mit.broad.genome.alg.gsea.KSTests.executeGsea(KSTests.java:156)
    at edu.mit.broad.genome.alg.gsea.KSTests.executeGsea(KSTests.java:130)
    at xtools.gsea.Gsea.execute_one(Gsea.java:152)
    at xtools.gsea.Gsea.execute(Gsea.java:104)
    at edu.mit.broad.xbench.tui.TaskManager$ToolRunnable.run(TaskManager.java:464)
    at java.lang.Thread.run(Unknown Source)

2. From the command line, gene set names are currently case sensitive. They should be case INsensitive. (Tested using xtools.gsea.LeadingEdgeTool, where -gsets is a comma-separated list of gene set names.)
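
One way the case-insensitive matching requested in item 2 could be done is sketched below. This is only an illustration: GeneSetNameResolver and its methods are hypothetical names and are not part of xtools.gsea. It assumes the tool has a list of known gene set names to match against.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: resolves user-typed gene set names without regard to case.
final class GeneSetNameResolver {
    // A TreeMap ordered by CASE_INSENSITIVE_ORDER treats "SET_A" and "set_a" as the same key.
    private final Map<String, String> canonicalByName =
            new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    GeneSetNameResolver(List<String> knownNames) {
        for (String name : knownNames) {
            canonicalByName.put(name, name);
        }
    }

    // Splits a comma-separated -gsets value and returns the names as originally spelled.
    List<String> resolve(String gsetsArg) {
        List<String> resolved = new ArrayList<>();
        for (String token : gsetsArg.split(",")) {
            String requested = token.trim();
            if (requested.isEmpty()) {
                continue; // ignore empty entries such as a trailing comma
            }
            String canonical = canonicalByName.get(requested);
            if (canonical == null) {
                throw new IllegalArgumentException("Unknown gene set: " + requested);
            }
            resolved.add(canonical);
        }
        return resolved;
    }
}
```

A caveat with this approach: two known names that differ only by case would collide in the map, so it only fits if gene set names are unique ignoring case.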

If a class has 2 samples or fewer, signal to noise and tTest will not work. Need to use ratio of means instead.
If you want to trap the error and provide a "standard" error box with a help button, I'm happy to write wiki text for it.
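
To show where the NaN can come from, here is a minimal sketch, not the GSEA implementation. It assumes signal to noise is computed as the difference of class means divided by the sum of class standard deviations, and that the standard deviation uses an n - 1 denominator. With a single sample in a class that denominator is zero, so the score is NaN. The class name, the method names, and the minSamples threshold are all hypothetical.

```java
// Illustrative only; names and the fallback rule are assumptions.
final class RankMetricSketch {
    static double mean(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x;
        }
        return sum / v.length;
    }

    // Sample standard deviation (n - 1 denominator).
    // For a single value this evaluates 0.0 / 0, which is NaN.
    static double sampleStdDev(double[] v) {
        double m = mean(v);
        double sumSquares = 0.0;
        for (double x : v) {
            sumSquares += (x - m) * (x - m);
        }
        return Math.sqrt(sumSquares / (v.length - 1));
    }

    static double signalToNoise(double[] a, double[] b) {
        return (mean(a) - mean(b)) / (sampleStdDev(a) + sampleStdDev(b));
    }

    static double ratioOfMeans(double[] a, double[] b) {
        return mean(a) / mean(b);
    }

    // Hypothetical guard: fall back to ratio of means when either class
    // has fewer than minSamples samples.
    static double score(double[] a, double[] b, int minSamples) {
        if (a.length < minSamples || b.length < minSamples) {
            return ratioOfMeans(a, b);
        }
        return signalToNoise(a, b);
    }
}
```

The threshold is left as a parameter because the notes above differ on whether 2 samples per class is enough.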

Found 4/4:

1. Specify Bhattacharyya with a Continuous phenotype, should get error 1011 (get hardcoded error). Metric for ranking genes parameter.

Bhattacharyya is a continuous metric, so isn't this correct?
When I run Bhattacharyya with a continuous phenotype I get the following error (no error help button):

 Tool execution error
 Message: Template is not biphasic. Name: 100_g_at_profile_in_p53_dataset_hgu95av2.cls#100_g_at # splits= 50
 This metric can only be used with 2 class comparisons

-------------------------------------------------------------
java.lang.RuntimeException: Template is not biphasic. Name: 100_g_at_profile_in_p53_dataset_hgu95av2.cls#100_g_at # splits= 50
 This metric can only be used with 2 class comparisons
    at edu.mit.broad.genome.alg.VectorSplitter._barf_not_biphasic(VectorSplitter.java:71)....

2. Specify Pearson with Categorical, should get error 1010 (get hardcoded error). Metric for ranking genes parameter.

Pearson is allowed for categorical & continuous
When I run Pearson with a Categorical phenotype, I get the following error (no help button); now that I look more closely, it doesn't seem to be related to the Categorical/Continuous thing...

Message: For input string: "MUT"

-------------------------------------------------------------
java.lang.NumberFormatException: For input string: "MUT"
    at sun.misc.FloatingDecimal.readJavaFormatString(Unknown Source)
    at java.lang.Float.parseFloat(Unknown Source)....
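
The trace above shows Float.parseFloat being handed the class label "MUT". Whether Pearson should accept categorical labels at all is left open in these notes. The sketch below only illustrates how the raw NumberFormatException could be turned into a descriptive error that a help button could be attached to. PhenotypeLabelCheck and toContinuous are hypothetical names, not GSEA code.

```java
import java.util.List;

// Hypothetical helper: converts phenotype labels to numbers with a readable error.
final class PhenotypeLabelCheck {
    // Returns the labels as floats, or throws with a message naming the offending label.
    static float[] toContinuous(List<String> labels) {
        float[] values = new float[labels.size()];
        for (int i = 0; i < labels.size(); i++) {
            String label = labels.get(i).trim();
            try {
                values[i] = Float.parseFloat(label);
            } catch (NumberFormatException e) {
                // Keep the original exception as the cause for debugging.
                throw new IllegalArgumentException(
                        "Phenotype label \"" + label + "\" at position " + (i + 1)
                                + " could not be read as a number", e);
            }
        }
        return values;
    }
}
```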

Found/requested 3/22 (with installed Gsea2):

Run GSEA, add a phenotypes file, create phenotypes on the fly (works fine), then click Show Phenotypes from all Sources. It should show labels from both the file you added and the one you just created, but it now shows only one at a time. (This seems to have broken in the 3/21 build; I'm pretty sure it was working in the 3/20 build.)
Open  (4/24)