This function takes read names or indices and subsets the read-level
assays of a RangedSummarizedExperiment to those reads. Subsetting
samples (the columns of the SummarizedExperiment object) is easy
(e.g. se[, 1]), but in read-level assays the reads are grouped by
sample. This function provides a more convenient way to subset these
nested reads.
Arguments
- se
  A RangedSummarizedExperiment, typically generated by readModBam or
  readModkitExtract.
- reads
  Defines reads to retain (or remove, if invert=TRUE). Either a
  character vector of read identifiers, or a named list in which names
  are samples from colnames(se) and the elements are index vectors
  (character, integer or logical) defining the reads for each sample.
- prune
  A logical scalar. If TRUE (the default), samples for which the
  subsetting retains none of the reads will be completely removed from
  the returned SummarizedExperiment (also from colData and from assays
  that do not store read-level data). If FALSE, such samples are
  retained (in the assays with read-level data as a zero-column
  SparseMatrix).
- invert
  A logical scalar. If FALSE (the default), only the reads defined by
  reads are retained. If invert=TRUE, all reads except the ones in
  reads are retained.
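The interplay of reads, prune and invert can be illustrated with a small base R model. This is only a sketch of the semantics described above, not the footprintR implementation: the helper toySubsetReads, the object toy and the read names in it are hypothetical, plain matrices stand in for the SparseMatrix objects, and the helper assumes that a list passed as reads has one entry per sample.

```r
# Toy stand-in for a read-level assay: one matrix per sample,
# one column per read (hypothetical read names).
toy <- list(
    s1 = matrix(0, nrow = 2, ncol = 3,
                dimnames = list(NULL, c("s1-a", "s1-b", "s1-c"))),
    s2 = matrix(0, nrow = 2, ncol = 2,
                dimnames = list(NULL, c("s2-a", "s2-b")))
)

# Hypothetical helper mimicking the documented argument semantics.
toySubsetReads <- function(x, reads, prune = TRUE, invert = FALSE) {
    if (is.character(reads)) {
        # a character vector of read names is matched within every sample
        reads <- lapply(x, function(m) intersect(reads, colnames(m)))
    }
    # assumption of this sketch: one list entry per sample
    stopifnot(is.list(reads), setequal(names(reads), names(x)))
    out <- lapply(names(x), function(nm) {
        m <- x[[nm]]
        idx <- reads[[nm]]
        if (is.character(idx)) idx <- intersect(idx, colnames(m))
        # turn character, integer or logical indices into a logical mask
        keep <- setNames(rep(FALSE, ncol(m)), colnames(m))
        keep[idx] <- TRUE
        if (invert) keep <- !keep
        m[, unname(keep), drop = FALSE]
    })
    names(out) <- names(x)
    # prune = TRUE drops samples that retain no reads
    if (prune) out <- out[vapply(out, ncol, integer(1)) > 0]
    out
}

pruned   <- toySubsetReads(toy, "s1-a")
kept     <- toySubsetReads(toy, "s1-a", prune = FALSE)
inverted <- toySubsetReads(toy, "s1-a", invert = TRUE)
byIndex  <- toySubsetReads(toy, list(s1 = c(1, 3), s2 = c(TRUE, FALSE)))

names(pruned)                    # only "s1" remains
vapply(kept, ncol, integer(1))   # s2 is kept with zero columns
lapply(inverted, colnames)
lapply(byIndex, colnames)
```

In this model a character vector of read names is looked up in every sample, which is why selecting a single read from s1 leaves s2 empty and lets prune decide whether s2 survives.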
Value
A subset RangedSummarizedExperiment object.
Examples
library(SummarizedExperiment)
library(footprintR)
modbamfiles <- system.file("extdata",
                           c("6mA_1_10reads.bam", "6mA_2_10reads.bam"),
                           package = "footprintR")
se <- readModBam(modbamfiles, "chr1:6940000-6955000", "a",
                 BPPARAM = BiocParallel::SerialParam())
lapply(assay(se, "mod_prob"), colnames)
#> $s1
#> [1] "s1-233e48a7-f379-4dcf-9270-958231125563"
#> [2] "s1-d52a5f6a-a60a-4f85-913e-eada84bfbfb9"
#> [3] "s1-92e906ae-cddb-4347-a114-bf9137761a8d"
#>
#> $s2
#> [1] "s2-034b625e-6230-4f8d-a713-3a32cd96c298"
#> [2] "s2-d03efe3b-a45b-430b-9cb6-7e5882e4faf8"
#>
# subset by read identifiers
seSub <- subsetReads(se, c("s1-233e48a7-f379-4dcf-9270-958231125563",
                           "s2-034b625e-6230-4f8d-a713-3a32cd96c298"))
lapply(assay(seSub, "mod_prob"), colnames)
#> $s1
#> [1] "s1-233e48a7-f379-4dcf-9270-958231125563"
#>
#> $s2
#> [1] "s2-034b625e-6230-4f8d-a713-3a32cd96c298"
#>
# subset by a list of indices
seSub <- subsetReads(se, list(s1 = c(1, 3),
                              s2 = c(TRUE, FALSE)))
lapply(assay(seSub, "mod_prob"), colnames)
#> $s1
#> [1] "s1-233e48a7-f379-4dcf-9270-958231125563"
#> [2] "s1-92e906ae-cddb-4347-a114-bf9137761a8d"
#>
#> $s2
#> [1] "s2-034b625e-6230-4f8d-a713-3a32cd96c298"
#>
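The examples above do not exercise invert or prune. The following sketch shows how they might be used, relying only on the argument descriptions on this page. It repeats the setup so that it can be run on its own, and it shows no output because the results have not been verified here. The variable names seInv, sePruned and seKept are illustrative.

```r
library(SummarizedExperiment)
library(footprintR)

modbamfiles <- system.file("extdata",
                           c("6mA_1_10reads.bam", "6mA_2_10reads.bam"),
                           package = "footprintR")
se <- readModBam(modbamfiles, "chr1:6940000-6955000", "a",
                 BPPARAM = BiocParallel::SerialParam())

oneRead <- "s1-233e48a7-f379-4dcf-9270-958231125563"

# invert = TRUE: keep all reads except the one named in `reads`
seInv <- subsetReads(se, oneRead, invert = TRUE)
lapply(assay(seInv, "mod_prob"), colnames)

# prune = TRUE (default): according to the argument description,
# sample s2 retains no reads and is removed from the returned object
sePruned <- subsetReads(se, oneRead)
colnames(sePruned)

# prune = FALSE: according to the argument description, s2 is
# retained, with a zero-column matrix in the read-level assays
seKept <- subsetReads(se, oneRead, prune = FALSE)
colnames(seKept)
```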