Decompose input contrasts into decoded and residual fractions

Decompose input contrasts (gene expression deltas) into decoded (generic) and residual (unique) components using a contrast encoder-decoder pre-trained on a large corpus of public RNA-seq experiments.

decomposeVar(
  M,
  MD = NULL,
  treatm = NULL,
  cntr = NULL,
  processInput = TRUE,
  organism = c("Human", "Mouse"),
  featureType = c("AUTO", "ENSEMBL_GENE_ID", "GENE_SYMBOL", "ENTREZ_GENE_ID",
    "ARCHS4_ID"),
  pseudocount = 4,
  verbose = TRUE
)

Arguments

M: Matrix of raw gene counts.
MD: Matrix of gene deltas (optional), with the same dimensions as M. If MD is specified, M is assumed to be a raw gene count matrix that provides the context for the contrasts in MD, and treatm and cntr have to be NULL.
treatm, cntr: Vectors indicating column indices in M corresponding to treatments and controls. If treatm and cntr are specified, MD has to be NULL.
processInput: If TRUE (default), the count matrix is preprocessed: library-normalized, log2-transformed after addition of a pseudocount, and with NA values set to 0.
organism: Selects the autoencoder model trained on data from this species. One of "Human" or "Mouse".
featureType: Set to "AUTO" for automatic detection of the feature id type. Alternatively, specify the type of the supplied feature ids. Currently supported types are "ENSEMBL_GENE_ID", "GENE_SYMBOL", "ENTREZ_GENE_ID" and "ARCHS4_ID".
pseudocount: Numerical scalar, added to raw counts in M when processInput = TRUE.
verbose: Logical scalar indicating whether to print messages along the way.
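
As a rough illustration of what the processInput and pseudocount arguments describe, the sketch below applies the three stated steps to a toy count matrix in base R. The helper preprocess_counts() is hypothetical and not part of orthos. The counts-per-million scaling mirrors the manual preprocessing shown in the Examples, and the point at which NA values are zeroed is an assumption, so the package's internal implementation may differ.

```r
# Toy raw count matrix: 4 genes x 3 samples, with one missing value.
counts <- matrix(
  c(10, 0, 30, 60,
    20, 5, NA, 75,
    40, 10, 50, 100),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), c("Ctrl", "TreatA", "TreatB"))
)

# Hypothetical helper (not an orthos function): approximates the steps
# listed for processInput = TRUE.
preprocess_counts <- function(counts, pseudocount = 4) {
  # Library normalization to counts per million; NAs are ignored when
  # computing the library sizes.
  lib_size <- colSums(counts, na.rm = TRUE)
  cpm <- sweep(counts, 2, lib_size, FUN = "/") * 1e6
  # log2 transformation after adding the pseudocount.
  logged <- log2(cpm + pseudocount)
  # Assumption: NA values are set to 0 as the last step.
  logged[is.na(logged)] <- 0
  logged
}

M_log <- preprocess_counts(counts)
```

With the default pseudocount of 4, a gene with zero counts ends up at log2(4) = 2 rather than at negative infinity, which is the practical reason for adding a pseudocount before the log transform.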

Value

A SummarizedExperiment object with the decomposed contrasts in the assays and the decomposed variance as the colData.

Details

When calling decomposeVar(), you may see an ImportError on the console. This most likely has no negative consequences; it indicates that R and Python may not be library-compatible and that an automated fallback approach is being used (for more details, see the testload argument of basiliskStart).

Author

Panagiotis Papasaikas

Examples

MKL1_human <- readRDS(system.file("extdata", "GSE215150_MKL1_Human.rds",
                                  package = "orthos"))

# Specifying M, treatm and cntr:
dec_MKL1_human <- decomposeVar(M = MKL1_human, treatm = c(2, 3), cntr = c(1, 1), 
                              organism = "Human", verbose = FALSE)
#> see ?orthosData and browseVignettes('orthosData') for documentation
#> loading from cache
#> require(“keras”)
#> see ?orthosData and browseVignettes('orthosData') for documentation
#> loading from cache
#> see ?orthosData and browseVignettes('orthosData') for documentation
#> loading from cache
                              
                              
# Alternatively, by specifying M and MD:
pseudocount <- 4
M <- sweep(MKL1_human, 2, colSums(MKL1_human), FUN = "/") * 1e+06
M <- log2(M + pseudocount)
DeltaM <- M[, c("MKL1", "caMKL1")] - M[, "Ctrl"] # Matrix of contrasts
ContextM <- M[, c("Ctrl", "Ctrl")] # Matrix with context for the specified contrasts
colnames(ContextM) <- colnames(DeltaM) # M and MD need identical dimnames
RES <- decomposeVar(M = ContextM, MD = DeltaM, processInput = FALSE)
#> Checking input...
#> demo_decomposed_contrasts_human_rds  already present in cache at: /Users/runner/Library/Caches/org.R-project.R/R/ExperimentHub/human_v212_NDF_c100_DEMOse.rds
#> demo_decomposed_contrasts_human_hdf5  already present in cache at: /Users/runner/Library/Caches/org.R-project.R/R/ExperimentHub/human_v212_NDF_c100_DEMOassays.h5
#> Detecting feature ids-type...
#> Feature ids-type detected: GENE_SYMBOL
#> 18079/59453 provided input features mapped against a total of 20411 model features.
#> 2332 missing features will be set to 0.
#> --> Missing features corresponding to non/lowly expressed genes in your context(s) are of no consequence.
#> --> The model is robust to small fractions (<10%) of missing genes that are expressed in your context(s).
#> --> Increased numbers of missing expressed genes in your input might result in model performance decline.
#> Encoding context...
#> see ?orthosData and browseVignettes('orthosData') for documentation
#> loading from cache
#> Encoding and decoding contrasts...
#> see ?orthosData and browseVignettes('orthosData') for documentation
#> loading from cache
#> see ?orthosData and browseVignettes('orthosData') for documentation
#> loading from cache
#> Preparing output...
#> Done!
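
The manual construction of DeltaM and ContextM in the second example can be generalized to arbitrary treatment and control column indices. The sketch below shows one way to do this in base R on a toy log2-scale matrix. The helper make_context_and_delta() is hypothetical and not part of orthos, and the requirement that treatm and cntr have equal length is an assumption drawn from the examples above.

```r
# Hypothetical helper (not an orthos function): converts column indices
# into the context matrix (M) and delta matrix (MD) used in the M/MD
# calling convention.
make_context_and_delta <- function(M_log, treatm, cntr) {
  # Assumption: each treatment column is paired with one control column.
  stopifnot(length(treatm) == length(cntr))
  context <- M_log[, cntr, drop = FALSE]
  delta <- M_log[, treatm, drop = FALSE] - context
  # Name both matrices after the treatment columns so their dimnames match.
  colnames(context) <- colnames(M_log)[treatm]
  colnames(delta) <- colnames(M_log)[treatm]
  list(M = context, MD = delta)
}

# Toy log2-scale expression matrix: 3 genes x 3 samples.
M_log <- matrix(
  c(5, 2, 8,
    6, 2, 7,
    9, 1, 8),
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("Ctrl", "MKL1", "caMKL1"))
)

inputs <- make_context_and_delta(M_log, treatm = c(2, 3), cntr = c(1, 1))
```

The resulting inputs$M and inputs$MD have identical dimnames and could then be passed as M and MD with processInput = FALSE, in the same way as ContextM and DeltaM above.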