Aggregate different rows assigned to the same ID by calculating a weighted mean

First, row means are calculated within each group of replicates, as identified by groupCol in the colData. Row means that are assigned to the same feature ID, given by idCol in the rowData, are then summarized by a weighted mean in which each row mean is weighted by itself: the sum of the squared row means divided by the sum of the row means. If all row means are 0, the result is 0.
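The weighting rule described above can be illustrated with a minimal base R sketch. The helper name weightedMean is hypothetical and is not claimed to be the package's internal implementation. Dropping NA values when only some inputs are missing is an assumption, since only the all-NA and all-zero cases are documented here.

```r
# Hypothetical helper illustrating the documented rule; not the package's code.
weightedMean <- function(x) {
    # Documented case: all inputs NA gives NA
    if (all(is.na(x))) {
        return(NA_real_)
    }
    # Assumption: partially missing inputs are dropped before aggregating
    x <- x[!is.na(x)]
    # Documented case: all zeros stay zero (avoids dividing 0 by 0)
    if (sum(x) == 0) {
        return(0)
    }
    # Each value is weighted by itself: sum(x * x) / sum(x)
    sum(x^2) / sum(x)
}

# Two row means mapped to the same ID: (10^2 + 30^2) / (10 + 30) = 25
wmExample <- weightedMean(c(10, 30))
wmExample
```

Because each value acts as its own weight, rows with larger means dominate the aggregate: the result for 10 and 30 is 25 rather than the plain mean of 20.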

weightedMeanByID(
  SE,
  assay,
  idCol = "GENEID",
  groupCol = "group",
  log2Transformed = TRUE
)

Arguments

SE: a SummarizedExperiment object that contains an assay with values to be aggregated, a colData column that assigns samples to their group and a rowData column with IDs to indicate which rows to combine.
assay: the name of the assay in the SummarizedExperiment object that should be aggregated.
idCol: the column name in the rowData of the SummarizedExperiment indicating the feature ID.
groupCol: the column name in the colData of the SummarizedExperiment indicating which columns belong to the same group and should be averaged as replicates, before the weighted mean is calculated across rows.
log2Transformed: a logical indicating whether the values in the assay are log2 transformed. If log2Transformed is TRUE, the values are converted back to the linear scale (2^x) before aggregation, and the result is log2 transformed again afterwards.
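As a rough illustration of the log2 round trip described for log2Transformed, the sketch below back-transforms two log2-scale values, aggregates them on the linear scale, and transforms the result again. The variable names are hypothetical, and the exact point at which the function applies the back-transformation relative to the replicate averaging is an assumption.

```r
# Hypothetical values on the log2 scale for two rows sharing one ID
logVals <- c(3, 5)

# Back-transform to the linear scale: 2^3 = 8, 2^5 = 32
linVals <- 2^logVals

# Weighted mean on the linear scale: (8^2 + 32^2) / (8 + 32) = 27.2
aggLinear <- sum(linVals^2) / sum(linVals)

# Return to the log2 scale (about 4.77)
aggLog2 <- log2(aggLinear)
aggLog2
```

Aggregating on the linear scale keeps the weighting tied to the untransformed magnitudes; applying the same formula directly to the log2 values would give a different result.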

Value

The output is a data.frame with one column for each unique value in groupCol and one row for each unique ID in idCol; the row and column names are these unique values. Each entry is the weighted mean for the corresponding feature ID and group. If all input values are NA, the aggregated value is NA; if all input values are zero, the aggregated value is zero. If log2Transformed is TRUE, the output is log2 transformed again.

Author

Fiona Ross

Examples

set.seed(123)
# Draw one mean per row, then simulate 5 samples per row around that mean
meansRows <- sample(1:100, 10, replace = TRUE)
dat <- unlist(lapply(meansRows, function(m) {
    rnorm(n = 5, mean = m, sd = 0.1*m)
}))
ma <- matrix(dat, nrow = 10, ncol = 5, byrow = TRUE)
# Assign each row to one of four feature IDs and each column to a group
IDs <- data.frame(ID = sample(c("A", "B", "C", "D"), size = 10, replace = TRUE))
Groups <- data.frame(group = c("Y","Y", "Z", "Z", "Z"))
mockSE <- SummarizedExperiment::SummarizedExperiment(
    assays = list(counts = ma),
    rowData = IDs,
    colData = Groups)
weightedMeanByID(mockSE, "counts", idCol = "ID", log2Transformed = FALSE)
#>          Y        Z
#> A 41.41792 41.55580
#> B 62.67139 64.04632
#> D 37.82352 35.57733
#> C 47.53044 52.55207