Given scores or statistical estimates for sequential windows,
process them according to the scoreAction
argument.
Typical actions are to smooth the scores and possibly identify
regions of interest by fusing consistent neighboring windows
along the genome.
Usage
processWindowScores(
x,
scoreCol = "dirNegLog10PValue",
scoreAction = c("smoothFuse", "smooth", "pass", "select"),
minperiod = 3,
thresh = 3,
maxGap = 50,
verbose = FALSE
)
Arguments
- x
GRanges
object with window-based scores or estimates. Ranges correspond to windows, and columns inmcols(x)
to scores.- scoreCol
Character scalar giving the column name in
mcols(x)
to use for the analysis. ForscanForHighScoringRegions
, it can be a vector, in which caseprocessWindowScores
will be run for each of them and the results will be returned as a list.- scoreAction
A character scalar indicating how to process the scores. Currently supported values are:
- pass
: Do not modify window scores (pass-through). Note that if
mcols(x)
has multiple columns, they will all be passed through (i.e.,scoreCol
is ignored). To return onlyscoreCol
, useselect
.- select
: Pass through only
scoreCol
.- smooth
: Smooth window scores (
minperiod
argument).- smoothFuse (default)
: Smooth window scores, identify high scoring elements (
thresh
argument) and fuse nearby high-scoring elements (maxGap
argument).
- minperiod
Numeric scalar that defines the low-pass filter parameter used to smooth the scores for segmentation.
minperiod
gives the minimal period (in number of windows) for the critical frequency in the score signal that should be retained. The default value is suitable to retain signals that occur in three neighboring windows. The filtering is performed usingfiltfilt
and thus requires thesignal
package to be installed.- thresh
A numeric scalar giving the minimal absolute window score (after smoothing, see
minperiod
argument) defining a region of interest. Higher values make the region detection more stringent.- maxGap
Numeric scalar giving the maximal gap between neighboring windows, in base pairs, from the end of the first to the start of the next, to be fused into a single region of interest.
- verbose
Logical scalar. If
TRUE
, report on progress.
Value
A GRanges
object. For scoreAction
equal to "pass"
or "smooth"
, the returned object has the same
dimensions as the input object x
. For scoreAction="smoothFuse"
,
the returned object will only contain thresholded, fused regions.
Examples
modbamfiles <- system.file("extdata",
c("6mA_1_10reads.bam", "6mA_1_10reads.bam",
"6mA_2_10reads.bam", "6mA_2_10reads.bam"),
package = "footprintR")
se <- quantifyWindowsInRegion(bamfiles = modbamfiles,
region = "chr1:6940000-6955000", modbase = "a",
BPPARAM = BiocParallel::SerialParam())
se$group <- c("group1", "group1", "group2", "group2")
gr <- getDifferentiallyModifiedWindows(se, groupCol = "group")
gr
#> GRanges object with 136 ranges and 6 metadata columns:
#> seqnames ranges strand | logFC logCPM LR
#> <Rle> <IRanges> <Rle> | <numeric> <numeric> <numeric>
#> [1] chr1 6940000-6940023 * | 6.696961 12.9861 22.480673
#> [2] chr1 6940012-6940035 * | 7.018526 13.4242 29.825436
#> [3] chr1 6940024-6940047 * | 5.675384 13.7867 13.472295
#> [4] chr1 6940036-6940059 * | 0.585622 13.7867 0.166427
#> [5] chr1 6940048-6940071 * | 1.214138 13.6466 1.284355
#> ... ... ... ... . ... ... ...
#> [132] chr1 6941572-6941595 * | 5.67334 11.9387 2.48245e-07
#> [133] chr1 6941584-6941607 * | 5.45565 11.8392 2.08965e-07
#> [134] chr1 6941596-6941619 * | 5.86244 12.0319 2.87441e-07
#> [135] chr1 6941608-6941631 * | 5.67334 11.9387 2.48245e-07
#> [136] chr1 6941620-6941643 * | 4.48815 11.4913 2.93492e-07
#> PValue FDR dirNegLog10PValue
#> <numeric> <numeric> <numeric>
#> [1] 2.12269e-06 1.44343e-04 5.673114
#> [2] 4.72749e-08 6.42938e-06 7.325370
#> [3] 2.42112e-04 8.23181e-03 3.615984
#> [4] 6.83307e-01 1.00000e+00 0.165384
#> [5] 2.57091e-01 1.00000e+00 0.589913
#> ... ... ... ...
#> [132] 0.999602 1 0.000172683
#> [133] 0.999635 1 0.000158431
#> [134] 0.999572 1 0.000185820
#> [135] 0.999602 1 0.000172683
#> [136] 0.999568 1 0.000187766
#> -------
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
grFused <- processWindowScores(x = gr, scoreCol = "logFC", thresh = 5.0)
grFused
#> GRanges object with 5 ranges and 5 metadata columns:
#> seqnames ranges strand | logFCThresh numWindowsThresh direction
#> <Rle> <IRanges> <Rle> | <numeric> <integer> <factor>
#> [1] chr1 6940000-6940035 * | 6.88595 2 positive
#> [2] chr1 6940468-6940539 * | 6.24705 5 positive
#> [3] chr1 6940816-6940863 * | 5.87596 3 positive
#> [4] chr1 6940984-6941079 * | 5.76636 7 positive
#> [5] chr1 6941152-6941631 * | 5.52749 33 positive
#> logFC numWindows
#> <numeric> <integer>
#> [1] 6.88595 2
#> [2] 6.24705 5
#> [3] 5.87596 3
#> [4] 5.76636 7
#> [5] 5.38673 39
#> -------
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths