Calculate normalized expression for a set of genes in each cell from a
SingleCellExperiment, using random sets of similarly expressed genes as
background to account for cell quality and sequencing depth.
normGenesetExpression(
sce,
genes,
expr_values = "logcounts",
subset.row = NULL,
R = 200,
nbins = 100,
BPPARAM = SerialParam()
)
sce: A SingleCellExperiment object.
genes: A character vector with the genes in the set. Must be a subset of rownames(sce).
expr_values: Integer scalar or string indicating which assay of sce contains the expression values.
subset.row: Restricts the genes from which random sets are sampled. If NULL
(the default), the function samples from all genes in sce. Alternatively,
subset.row can be a logical, integer or character vector indicating the rows
(genes) of sce to use for sampling. This makes it possible, for example, to
exclude highly variable genes from the sampling, since these are likely to be
expressed only in certain cell types.
R: Integer scalar giving the number of random gene sets to sample for normalization.
nbins: Integer scalar specifying the number of bins into which the average
expression levels are grouped before sampling (passed to
sampleControlElements). A higher number of bins gives a closer match to the
target distribution(s), but sampling may fail if there are few elements to
sample from.
BPPARAM: An optional BiocParallelParam instance determining the parallel back-end to be used during evaluation.
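
The trade-off described for nbins can be illustrated with a small base R sketch. It is not the code of sampleControlElements; the helper sample_matched() and the evenly spaced toy data are assumptions made purely for illustration. Each target gene receives one control gene drawn from the same average-expression bin, and the sampling stops with an error once a bin holds no other gene.

```r
set.seed(1)

# Hypothetical average expression levels for 1000 genes (evenly spaced so
# that the behaviour below does not depend on the random seed).
avg <- seq(0.01, 10, length.out = 1000)
names(avg) <- paste0("g", seq_along(avg))
target <- c("g1", "g13", "g27")

# sample_matched() is a hypothetical helper, not part of any package:
# draw one control gene per target gene from the same expression bin.
sample_matched <- function(avg, target, nbins) {
    # equal-width bins over the range of average expression
    bins <- as.integer(cut(avg, breaks = nbins))
    names(bins) <- names(avg)
    pool <- setdiff(names(avg), target)   # never pick a target gene
    picked <- character(0)
    for (g in target) {
        cand <- setdiff(pool[bins[pool] == bins[g]], picked)
        if (length(cand) == 0L)
            stop("no control gene left in the bin of ", g)
        picked <- c(picked, sample(cand, 1))
    }
    picked
}

# Coarse bins: many candidates per bin, looser expression match.
ctrl <- sample_matched(avg, target, nbins = 20)

# Very fine bins: every gene sits alone in its bin, so sampling fails.
res <- tryCatch(sample_matched(avg, target, nbins = 5000),
                error = function(e) conditionMessage(e))
```

In this toy setting, 20 bins leave about 50 candidate genes per bin, while 5000 bins leave none, which is the failure mode the nbins description warns about.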
A numeric vector with normalized gene set scores for each cell in sce.
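
To make the idea behind these scores more concrete, the following base R sketch shows one plausible way such a score could be computed: the per-cell mean of the gene set minus the per-cell mean of R random, expression-matched background sets. This is an assumption for illustration only and not necessarily how normGenesetExpression is implemented; the simulated matrix, the bg_once() helper and the fallback to the full pool for empty bins are all hypothetical.

```r
set.seed(42)

# Hypothetical log-expression matrix: 500 genes x 20 cells.
expr <- matrix(rnorm(500 * 20, mean = 2), nrow = 500,
               dimnames = list(paste0("g", 1:500), paste0("c", 1:20)))
# Add a per-cell offset that mimics differences in depth or quality.
depth <- rnorm(20)
expr <- sweep(expr, 2, depth, "+")

genes <- c("g1", "g13", "g27")   # gene set of interest
R <- 50                          # number of random background sets
nbins <- 10                      # number of average-expression bins

# Bin all genes by their average expression across cells.
avg <- rowMeans(expr)
bins <- as.integer(cut(avg, breaks = nbins))
names(bins) <- rownames(expr)
pool <- setdiff(rownames(expr), genes)

# bg_once() is a hypothetical helper: per-cell mean expression of one
# random set of control genes matched by bin to the genes in the set.
bg_once <- function() {
    ctrl <- vapply(genes, function(g) {
        cand <- pool[bins[pool] == bins[g]]
        # simplification: fall back to the whole pool if the bin is empty
        if (length(cand) == 0L) cand <- pool
        sample(cand, 1)
    }, character(1))
    colMeans(expr[ctrl, , drop = FALSE])
}

# Average the background over R random sets, then subtract it per cell.
bg <- rowMeans(replicate(R, bg_once()))
scores <- colMeans(expr[genes, , drop = FALSE]) - bg
```

Because the simulated offset in depth is added to every gene of a cell, it appears in both the gene set mean and the background mean and cancels in the difference, which is the intuition behind using matched background sets.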
if (require(SingleCellExperiment)) {
    # get sce
    example(SingleCellExperiment, echo = FALSE)
    rownames(sce) <- paste0("g", seq.int(nrow(sce)))

    # calculate gene set expression scores
    markers <- c("g1", "g13", "g27")
    scores <- normGenesetExpression(sce, markers, R = 50)

    # compare expression of marker genes with scores
    plotdat <- cbind(scores, t(logcounts(sce)[markers, ]))
    cor(plotdat)
    pairs(plotdat)
}