How do I run ABSOLUTE?

To run ABSOLUTE, you'll need R version 2.12.0 or higher with the numDeriv package installed. You can install it with the following command:

    install.packages("numDeriv")

If you're using HAPSEG output as your input, you will also need to install the HAPSEG package supplied in the release bundle. The subdirectory R_packages contains both source and Windows binary versions. Instructions on how to install these packages can be found at http://cran.r-project.org/doc/manuals/R-admin.html#Installing-packages.

Alternatively, you can supply a tab-delimited segmentation file (e.g. from array CGH or massively parallel sequencing experiments). This file must contain the columns "Chromosome", "Start", "End", "Num_Probes" and "Segment_Mean". It may contain other columns as well, but these five are the minimum. To run in this mode you must also specify the copy_num_type argument as "total".

To run ABSOLUTE, first identify your input file (either a HAPSEG segdat or your segmentation file) and issue the following commands:

    library(ABSOLUTE)
    RunAbsolute(seg.dat.fn, sigma.p, max.sigma.h, min.ploidy, max.ploidy,
                primary.disease, platform, sample.name, results.dir,
                max.as.seg.count, max.non.clonal, max.neg.genome, copy_num_type,
                maf.fn=NULL, min.mut.af=NULL, output.fn.base=NULL, verbose=FALSE)
An explanation of the arguments follows:
There are four optional, named arguments (maf.fn, min.mut.af, output.fn.base and verbose). To use them, specify them by name, e.g. verbose=TRUE.
This will produce two output files. In both cases OFB corresponds to either the output.fn.base (if supplied) or the array name.
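When supplying your own segmentation file, it can be worth checking the required columns before calling RunAbsolute. The following is a minimal sketch in base R, not part of ABSOLUTE: the helper name MissingSegColumns and the toy data are assumptions made for illustration, and only the five column names come from the text above.

```r
# Columns listed above as mandatory for a segmentation file.
required.cols <- c("Chromosome", "Start", "End", "Num_Probes", "Segment_Mean")

# Hypothetical helper (not part of ABSOLUTE): returns the required columns
# that are missing from a tab-delimited segmentation file.
MissingSegColumns <- function(seg.fn) {
  seg <- read.delim(seg.fn, as.is=TRUE, check.names=FALSE)
  setdiff(required.cols, colnames(seg))
}

# Toy file for illustration only; the values are made up.
toy.fn <- tempfile(fileext=".seg.txt")
toy <- data.frame(Chromosome=c(1, 1), Start=c(1000, 50001), End=c(50000, 90000),
                  Num_Probes=c(25, 12), Segment_Mean=c(0.02, -0.41))
write.table(toy, toy.fn, sep="\t", quote=FALSE, row.names=FALSE)

missing.cols <- MissingSegColumns(toy.fn)
if (length(missing.cols) > 0) {
  stop("Segmentation file lacks: ", paste(missing.cols, collapse=", "))
}
```

If the check passes, the file can be handed to RunAbsolute as seg.dat.fn with copy_num_type set to "total", as described above.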
Multiple ABSOLUTE results can be summarized for an analyst to perform the manual review step using the CreateReviewObject function, which is run as follows:
There are four optional, named arguments. To use them, specify them by name, e.g. verbose=TRUE.
This call will produce multiple files as output, where obj is the obj.name argument passed in.
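Before summarizing several runs, it may help to confirm that each per-sample result file actually exists. The sketch below uses only base R. The helper FindAbsoluteResults is hypothetical, and the directory layout (output/<scan>/absolute/<scan>.ABSOLUTE.RData) is an assumption taken from the Example section; adjust it to wherever your results.dir points.

```r
# Hypothetical helper (not part of ABSOLUTE): builds the expected path of each
# sample's result file and returns only the paths that are present on disk.
FindAbsoluteResults <- function(scans, base.dir=file.path(".", "output")) {
  paths <- file.path(base.dir, scans, "absolute",
                     paste(scans, ".ABSOLUTE.RData", sep=""))
  found <- file.exists(paths)
  if (any(!found)) {
    warning("No ABSOLUTE result for: ", paste(scans[!found], collapse=", "))
  }
  paths[found]
}

# Illustration with placeholder sample names in a temporary directory.
demo.dir <- file.path(tempdir(), "output")
dir.create(file.path(demo.dir, "sampleA", "absolute"),
           recursive=TRUE, showWarnings=FALSE)
file.create(file.path(demo.dir, "sampleA", "absolute", "sampleA.ABSOLUTE.RData"))

# sampleB has no result file, so only sampleA's path is returned.
absolute.files <- suppressWarnings(
  FindAbsoluteResults(c("sampleA", "sampleB"), base.dir=demo.dir))
```

The resulting vector can then be passed as the list of result files to CreateReviewObject.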
Using these files, the analyst may optionally override the solutions provided by CreateReviewObject. Tips for manually reviewing ABSOLUTE solutions can be found here. To override the default solutions from ABSOLUTE, prepend a column to the left of the obj_summary.PP-calls_tab.txt file (it can be named anything). In any row where you choose to override the default solution, put the number of the solution you wish to use; rows where you accept the default can be left blank or given the value 1. Once you've annotated the file to your satisfaction, you may trigger the final stage of ABSOLUTE with the ExtractReviewedResults function, which is run as follows:
There is one optional argument, verbose. To use it, specify verbose=TRUE. Running this function will create a directory out.dir.base/reviewed which will contain the following files:
Example

Here is an example invocation of ABSOLUTE on the mixing experiment data in Figure 2d, using input data from a previous HAPSEG run and the bundled data. It is intended to be run in the bundle directory. It can be run on a multicore system by uncommenting the registerDoMC call and specifying the number of cores that you wish to use. Besides ABSOLUTE, this code also requires foreach and, optionally, doMC. The code will create a directory output containing a per-sample output directory as well as a subdirectory named abs_logs, which holds a per-sample text file with the standard output emitted from R.

    DoAbsolute <- function(scan, sif) {
      registerDoSEQ()
      library(ABSOLUTE)

      plate.name <- "DRAWS"
      genome <- "hg18"
      platform <- "SNP_250K_STY"
      primary.disease <- sif[scan, "PRIMARY_DISEASE"]
      sample.name <- sif[scan, "SAMPLE_NAME"]
      sigma.p <- 0
      max.sigma.h <- 0.02
      min.ploidy <- 0.95
      max.ploidy <- 10
      max.as.seg.count <- 1500
      max.non.clonal <- 0
      max.neg.genome <- 0
      copy_num_type <- "allelic"

      ## input from the previous HAPSEG run
      seg.dat.fn <- file.path("output", scan, "hapseg",
                              paste(plate.name, "_", scan, "_segdat.RData", sep=""))
      results.dir <- file.path(".", "output", scan, "absolute")
      print(paste("Starting scan", scan, "at", results.dir))

      log.dir <- file.path(".", "output", "abs_logs")
      if (!file.exists(log.dir)) {
        dir.create(log.dir, recursive=TRUE)
      }
      if (!file.exists(results.dir)) {
        dir.create(results.dir, recursive=TRUE)
      }

      ## redirect standard output to a per-sample log file
      sink(file=file.path(log.dir, paste(scan, ".abs.out.txt", sep="")))
      RunAbsolute(seg.dat.fn, sigma.p, max.sigma.h, min.ploidy, max.ploidy,
                  primary.disease, platform, sample.name, results.dir,
                  max.as.seg.count, max.non.clonal, max.neg.genome,
                  copy_num_type, verbose=TRUE)
      sink()
    }

    arrays.txt <- "./paper_example/mix250K_arrays.txt"
    sif.txt <- "./paper_example/mix_250K_SIF.txt"

    ## read in array names
    scans <- readLines(arrays.txt)[-1]
    sif <- read.delim(sif.txt, as.is=TRUE)

    library(foreach)
    ## library(doMC)
    ## registerDoMC(20)
    foreach (scan=scans, .combine=c) %dopar% {
      DoAbsolute(scan, sif)
    }

    obj.name <- "DRAWS_summary"
    results.dir <- file.path(".", "output", "abs_summary")
    absolute.files <- file.path(".", "output", scans, "absolute",
                                paste(scans, ".ABSOLUTE.RData", sep=""))

    library(ABSOLUTE)
    CreateReviewObject(obj.name, absolute.files, results.dir, "allelic", verbose=TRUE)

    ## At this point you'd perform your manual review and mark up the file
    ## output/abs_summary/DRAWS_summary.PP-calls_tab.txt by prepending a column with
    ## your desired solution calls. After that (or w/o doing that if you choose to accept
    ## the defaults, which is what running this code will do) run the following command:
    calls.path <- file.path("output", "abs_summary", "DRAWS_summary.PP-calls_tab.txt")
    modes.path <- file.path("output", "abs_summary", "DRAWS_summary.PP-modes.data.RData")
    output.path <- file.path("output", "abs_extract")
    ExtractReviewedResults(calls.path, "test", modes.path, output.path, "absolute", "allelic")
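
The manual mark-up step mentioned in the comments above is normally done by hand, for example in a spreadsheet, but it could also be scripted. The following is a rough sketch in base R under stated assumptions: AddOverrideColumn is a hypothetical helper that is not part of ABSOLUTE, the toy file's columns are made up, and the real PP-calls_tab.txt file will have different contents.

```r
# Hypothetical helper (not part of ABSOLUTE): prepends an override column to a
# tab-delimited calls file. `overrides` is a named vector mapping row numbers
# (as names) to the solution number to use; every other row is left blank,
# which keeps the default solution.
AddOverrideColumn <- function(calls.fn, overrides, col.name="override") {
  calls <- read.delim(calls.fn, as.is=TRUE, check.names=FALSE)
  override <- rep("", nrow(calls))
  override[as.integer(names(overrides))] <- as.character(overrides)
  out <- data.frame(override, calls, check.names=FALSE, stringsAsFactors=FALSE)
  colnames(out)[1] <- col.name
  write.table(out, calls.fn, sep="\t", quote=FALSE, row.names=FALSE)
  invisible(out)
}

# Toy calls file for illustration only; the column names are made up.
toy.fn <- tempfile(fileext=".PP-calls_tab.txt")
toy <- data.frame(sample=c("sampleA", "sampleB", "sampleC"),
                  note=c("x", "y", "z"))
write.table(toy, toy.fn, sep="\t", quote=FALSE, row.names=FALSE)

# Use solution 3 for the second row; keep the defaults elsewhere.
marked <- AddOverrideColumn(toy.fn, overrides=c("2"=3))
```

Since the helper overwrites the file in place, keeping a copy of the original calls file before marking it up is a sensible precaution.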