Calculate normalized expression of a gene set — normGenesetExpression • swissknife

Calculate normalized expression for a set of genes in each cell from a SingleCellExperiment, using random sets of similarly expressed genes as background to account for cell quality and sequencing depth.

normGenesetExpression(
  sce,
  genes,
  expr_values = "logcounts",
  subset.row = NULL,
  R = 200,
  nbins = 100,
  BPPARAM = SerialParam()
)

Arguments

sce: SingleCellExperiment object.
genes: character vector with the genes in the set. Must be a subset of rownames(sce).
expr_values: Integer scalar or string indicating which assay of sce contains the expression values.
subset.row: Sample random genes only from these. If NULL (the default), the function will sample from all genes in sce. Alternatively, subset.row can be a logical, integer or character vector indicating the rows (genes) of sce to use for sampling. This allows for example to exclude highly variable genes from the sampling which are likely expressed only in certain cell types.
R: Integer scalar giving the number of random gene sets to sample for normalization.
nbins: Integer scalar, specifying the number of bins to group the average expression levels into before sampling (passed to sampleControlElements). Higher numbers of bins will increase the match to the target distribution(s), but may fail if there are few elements to sample from.
BPPARAM: An optional BiocParallelParam instance determining the parallel back-end to be used during evaluation.

Value

A numeric vector with normalized gene set scores for each cell in sce.

Author

Michael Stadler

Examples

if (require(SingleCellExperiment)) {
    # get sce
    example(SingleCellExperiment, echo=FALSE)
    rownames(sce) <- paste0("g", seq.int(nrow(sce)))
    
    # calculate gene set expression scores
    markers <- c("g1", "g13", "g27")
    scores <- normGenesetExpression(sce, markers, R = 50)
    
    # compare expression of marker genes with scores
    plotdat <- cbind(scores, t(logcounts(sce)[markers, ]))
    cor(plotdat)
    pairs(plotdat)
}