Given scores or statistical estimates for windows, identify regions of interest by fusing consistent neighboring windows along the genome.
Usage
fuseWindows(
x,
scoreCol = "dirNegLog10PValue",
thresh = 3,
minperiod = 3,
maxGap = 50,
verbose = FALSE
)
Arguments
- x
GRanges
object with window-based scores or estimates. Ranges correspond to windows, and columns inmcols(x)
to scores.- scoreCol
Character scalar giving the column name in
mcols(x)
to use for the analysis.- thresh
A numeric scalar giving the minimal absolute window score (after smoothing, see
minperiod
argument) defining a region of interest. Higher values make the region detection more stringent.- minperiod
Numeric scalar that defines the low-pass filter parameter used to smooth the scores for segmentation.
minperiod
gives the minimal period (in number of windows) for the critical frequency in the score signal that should be retained. The default value is suitable to retain signals that occur in three neighboring windows. The filtering is performed usingfiltfilt
and thus requires thesignal
package to be installed.- maxGap
Numeric scalar giving the maximal gap between neighboring windows, in base pairs, from the end of the first to the start of the next, to be fused into a single region of interest.
- verbose
Logical scalar. If
TRUE
, report on progress.
Value
A GRanges
object with identified
regions of interest.
Examples
modbamfiles <- system.file("extdata",
c("6mA_1_10reads.bam", "6mA_1_10reads.bam",
"6mA_2_10reads.bam", "6mA_2_10reads.bam"),
package = "footprintR")
se <- quantifyWindowsInRegion(bamfiles = modbamfiles,
region = "chr1:6940000-6955000", modbase = "a",
BPPARAM = BiocParallel::SerialParam())
se$group <- c("group1", "group1", "group2", "group2")
gr <- getDifferentiallyModifiedWindows(se, groupCol = "group")
gr
#> GRanges object with 134 ranges and 6 metadata columns:
#> seqnames ranges strand | logFC logCPM LR
#> <Rle> <IRanges> <Rle> | <numeric> <numeric> <numeric>
#> [1] chr1 6940001-6940024 * | 6.922263 13.0368 25.338333
#> [2] chr1 6940013-6940036 * | 7.025231 13.4291 29.825436
#> [3] chr1 6940025-6940048 * | 5.122833 13.7647 9.165793
#> [4] chr1 6940037-6940060 * | 0.586798 13.7916 0.166427
#> [5] chr1 6940049-6940072 * | 1.214905 13.6515 1.284355
#> ... ... ... ... . ... ... ...
#> [130] chr1 6941549-6941572 * | 4.49461 11.4962 2.96406e-07
#> [131] chr1 6941561-6941584 * | 5.20575 11.7372 -1.76178e-07
#> [132] chr1 6941573-6941596 * | 5.67996 11.9437 2.50715e-07
#> [133] chr1 6941585-6941608 * | 5.46225 11.8441 2.11052e-07
#> [134] chr1 6941597-6941620 * | 6.03626 12.1242 -1.03612e-07
#> PValue FDR dirNegLog10PValue
#> <numeric> <numeric> <numeric>
#> [1] 4.81053e-07 3.22306e-05 6.317807
#> [2] 4.72749e-08 6.33483e-06 7.325370
#> [3] 2.46581e-03 6.60836e-02 2.608041
#> [4] 6.83307e-01 1.00000e+00 0.165384
#> [5] 2.57091e-01 1.00000e+00 0.589913
#> ... ... ... ...
#> [130] 0.999566 1 0.000188696
#> [131] 1.000000 1 0.000000000
#> [132] 0.999600 1 0.000173541
#> [133] 0.999633 1 0.000159220
#> [134] 1.000000 1 0.000000000
#> -------
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths
grFused <- fuseWindows(x = gr, scoreCol = "logFC", thresh = 5.0)
grFused
#> GRanges object with 5 ranges and 5 metadata columns:
#> seqnames ranges strand | logFCThresh numWindowsThresh direction
#> <Rle> <IRanges> <Rle> | <numeric> <integer> <factor>
#> [1] chr1 6940001-6940036 * | 6.90561 2 up
#> [2] chr1 6940469-6940528 * | 6.49495 4 up
#> [3] chr1 6940817-6940864 * | 5.91258 3 up
#> [4] chr1 6940925-6941080 * | 5.82304 7 up
#> [5] chr1 6941153-6941620 * | 5.52120 33 up
#> logFC numWindows
#> <numeric> <integer>
#> [1] 6.90561 2
#> [2] 6.49495 4
#> [3] 5.91258 3
#> [4] 5.04266 12
#> [5] 5.40114 38
#> -------
#> seqinfo: 1 sequence from an unspecified genome; no seqlengths