The R code and an example file for running the power and sample size calculator can be downloaded here: powerSampleSizeCalculator

Example usage for estimating sample size from preliminary data

Before you start, install the dependencies:

 

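# Note: biocLite() is the legacy Bioconductor installer; on R >= 3.5 use BiocManager::install() instead.

source("http://bioconductor.org/biocLite.R")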

biocLite(pkgs=c("edgeR", "DESeq", "DESeq2", "sSeq", "EBSeq"))

install.packages("MASS")
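
To confirm that the installation worked before moving on, a quick check along the following lines may help. This is a minimal sketch using base R's requireNamespace(); the variable names (pkgs, installed, missing_pkgs) are illustrative and not part of the calculator.

```r
# Packages the calculator depends on (same list as in the install step above).
pkgs <- c("edgeR", "DESeq", "DESeq2", "sSeq", "EBSeq", "MASS")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report anything that still needs to be installed.
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All dependencies are installed.")
}
```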

 

 

1) Load the source script from the directory containing the downloaded files. It contains all the code required to run the simulation:

 

setwd("<my_directory>")  # replace <my_directory> with the path to the downloaded files

source("rs_simulations.r")

 

 

2) Read a dataset into a matrix and construct a vector indicating the condition of each sample. The Bottomly dataset, downloaded from http://bowtie-bio.sourceforge.net/recount/, is provided as an example:

 

rawdata <- read.table("bottomly_count_table.txt", sep = "\t", header = TRUE)

rownames(rawdata) <- rawdata$gene

rawdata <- as.matrix(rawdata[,-1])

head(rawdata)

 

condition = c(rep("A", 10),rep("B", 11))

print(condition)
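
The condition vector needs one entry per column of the count matrix, in the same order as the columns. A small check like the one below can catch a mismatch early. It is only a sketch: check_design is a hypothetical helper, not part of rs_simulations.r, and the toy matrix stands in for the real data.

```r
# Toy stand-in for the count matrix: 4 genes x 6 samples (made-up numbers).
toy_counts <- matrix(rpois(24, lambda = 20), nrow = 4,
                     dimnames = list(paste0("gene", 1:4), paste0("s", 1:6)))
toy_condition <- c(rep("A", 3), rep("B", 3))

# Hypothetical helper: stop early if the design does not match the matrix.
check_design <- function(counts, condition) {
  stopifnot(is.matrix(counts), is.numeric(counts))
  stopifnot(length(condition) == ncol(counts))
  table(condition)  # number of samples per condition
}

check_design(toy_counts, toy_condition)
```

With the real data, the equivalent call would be check_design(rawdata, condition).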

 

 

3) Estimate dataset parameters using the "estimate_params" function. This takes a long time, so save the parameters for later use as well:

 

params <- estimate_params(rawdata=rawdata, condition=condition, designtype="one factor")

save(params, file="bottomly_params.Rdata")
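
In a later session the saved parameters can be restored with load() instead of being re-estimated. The sketch below shows the round trip with a placeholder object written to a temporary directory. The real structure of params is whatever estimate_params returns and is not assumed here.

```r
# Stand-in for the real parameter object (placeholder content only).
params <- list(example = 1:3)

# Save to a temporary location so the sketch does not touch real files.
param_file <- file.path(tempdir(), "bottomly_params.Rdata")
save(params, file = param_file)

# Later session: load() restores the object under its original name.
rm(params)
restored <- load(param_file)  # returns the names of the restored objects
print(restored)
```

With the file from the step above, load("bottomly_params.Rdata") makes params available again.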

 

 

4) Run simulations with the "RS_simulation" function. This function returns a matrix of results for each simulation in the conditions tested.

 

results = RS_simulation(sims=5, params=params, budget=3000, designtype = "one factor", nmax = 20, nmin = 5, program="DESeq")

 

# Row names hold the number of replicates; convert them to numbers for the x-axis.
plot(as.numeric(rownames(results)), rowMeans(results, na.rm=TRUE), main="DESeq simulations on Bottomly Dataset", xlab = "number of replicates", ylab = "Power")
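
Besides plotting, the averaged results can be used to read off a sample size directly. The sketch below assumes, as the plot call suggests, that rows are named by the number of replicates, that columns are individual simulations, and that values are power. The matrix toy_results and the target value are made up for illustration.

```r
# Made-up results matrix: rows = replicates per condition, columns = simulations.
toy_results <- matrix(c(0.40, 0.45, NA,
                        0.62, 0.60, 0.58,
                        0.81, 0.79, 0.83),
                      nrow = 3, byrow = TRUE,
                      dimnames = list(c("5", "10", "20"), paste0("sim", 1:3)))

# Average power per replicate number, ignoring failed simulations (NA).
mean_power <- rowMeans(toy_results, na.rm = TRUE)

# Smallest replicate number whose mean power reaches the chosen target.
target <- 0.5
reps <- as.numeric(names(mean_power))
best <- min(reps[mean_power >= target])
print(best)
```

If no replicate number reaches the target, min() is applied to an empty vector and returns Inf with a warning, so check for that case when using a stricter target.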