Filter positions based on any combination of sequence context,
coverage, repetition of the same position, and presence of non-NA values.
Filters are applied in the order they are specified to the filters
argument. Any filter type can be repeated an arbitrary number of times.
Usage
filterPositions(
se,
filters = c("sequenceContext", "coverage", "all.na"),
sequenceContext = NULL,
assayNameCov = "Nvalid",
minCov = 1,
minNbrSamples = NULL,
assayNameAmbig = "Nvalid",
assayNameNA = "mod_prob"
)
Arguments
- se
A
SummarizedExperiment
object.- filters
A character vector. All values must be one of
"sequenceContext"
,"coverage"
,"repeated.positions"
and"all.na"
. Filters are applied in the order specified by this vector.- sequenceContext
A character vector with sequence contexts to retain. To apply this filter, the
"sequenceContext"
column must be present inrowData(se)
(seeaddSeqContext
).- assayNameCov
A character scalar indicating the assay to use to define the coverage. If this is a read-level assay, coverage is first calculated using
flattenReadLevelAssay(..., statistics = "Nvalid")
.- minCov
A numeric scalar indicating the lowest acceptable coverage in order to keep a position.
- minNbrSamples
A numeric scalar, or
NULL
. IfNULL
(default), the row sum ofassayNameCov
(i.e., the total coverage across all samples) is used for the coverage filtering. If notNULL
, a position is required to have at leastminCov
coverage in at leastminNbrSamples
to be retained.- assayNameAmbig
A character scalar indicating the assay to use to decide which row to retain if multiple rows represent the same genomic position (on different strands). The row with the largest row sum in this assay is retained.
- assayNameNA
A character scalar indicating the assay to use as the basis for filtering out positions with NA values across all reads. This should be a read level assay.
Examples
modbamfiles <- system.file("extdata", c("6mA_1_10reads.bam", "6mA_2_10reads.bam"),
package = "footprintR")
reffile <- system.file("extdata", "reference.fa.gz", package = "footprintR")
se <- readModBam(bamfiles = modbamfiles, regions = "chr1:6920000-6940000",
modbase = "a", verbose = FALSE,
BPPARAM = BiocParallel::SerialParam())
se <- flattenReadLevelAssay(se)
se <- addSeqContext(se, sequenceContextWidth = 3, sequenceReference = reffile)
sefilt <- filterPositions(se, c("sequenceContext", "coverage", "all.na"),
minCov = 5, sequenceContext = "TAG")