Skip to contents

Given scores or statistical estimates for windows, identify regions of interest by fusing consistent neighboring windows along the genome.

Usage

fuseWindows(
  x,
  scoreCol = "dirNegLog10PValue",
  thresh = 3,
  minperiod = 3,
  maxGap = 50,
  verbose = FALSE
)

Arguments

x

GRanges object with window-based scores or estimates. Ranges correspond to windows, and columns in mcols(x) to scores.

scoreCol

Character scalar giving the column name in mcols(x) to use for the analysis.

thresh

A numeric scalar giving the minimal absolute window score (after smoothing, see minperiod argument) defining a region of interest. Higher values make the region detection more stringent.

minperiod

Numeric scalar that defines the low-pass filter parameter used to smooth the scores for segmentation. minperiod gives the minimal period (in number of windows) for the critical frequency in the score signal that should be retained. The default value is suitable to retain signals that occur in three neighboring windows. The filtering is performed using filtfilt and thus requires the signal package to be installed.

maxGap

Numeric scalar giving the maximal gap between neighboring windows, in base pairs, from the end of the first to the start of the next, to be fused into a single region of interest.

verbose

Logical scalar. If TRUE, report on progress.

Value

A GRanges object with identified regions of interest.

Author

Sebastien Smallwood, Charlotte Soneson, Michael Stadler

Examples

modbamfiles <- system.file("extdata",
                           c("6mA_1_10reads.bam", "6mA_1_10reads.bam",
                             "6mA_2_10reads.bam", "6mA_2_10reads.bam"),
                           package = "footprintR")
se <- quantifyWindowsInRegion(bamfiles = modbamfiles,
                              region = "chr1:6940000-6955000", modbase = "a",
                              BPPARAM = BiocParallel::SerialParam())
se$group <- c("group1", "group1", "group2", "group2")
gr <- getDifferentiallyModifiedWindows(se, groupCol = "group")
gr
#> GRanges object with 134 ranges and 6 metadata columns:
#>         seqnames          ranges strand |     logFC    logCPM           LR
#>            <Rle>       <IRanges>  <Rle> | <numeric> <numeric>    <numeric>
#>     [1]     chr1 6940001-6940024      * |  6.922263   13.0368    25.338333
#>     [2]     chr1 6940013-6940036      * |  7.025231   13.4291    29.825436
#>     [3]     chr1 6940025-6940048      * |  5.122833   13.7647     9.165793
#>     [4]     chr1 6940037-6940060      * |  0.586798   13.7916     0.166427
#>     [5]     chr1 6940049-6940072      * |  1.214905   13.6515     1.284355
#>     ...      ...             ...    ... .       ...       ...          ...
#>   [130]     chr1 6941549-6941572      * |   4.49461   11.4962  2.96406e-07
#>   [131]     chr1 6941561-6941584      * |   5.20575   11.7372 -1.76178e-07
#>   [132]     chr1 6941573-6941596      * |   5.67996   11.9437  2.50715e-07
#>   [133]     chr1 6941585-6941608      * |   5.46225   11.8441  2.11052e-07
#>   [134]     chr1 6941597-6941620      * |   6.03626   12.1242 -1.03612e-07
#>              PValue         FDR dirNegLog10PValue
#>           <numeric>   <numeric>         <numeric>
#>     [1] 4.81053e-07 3.22306e-05          6.317807
#>     [2] 4.72749e-08 6.33483e-06          7.325370
#>     [3] 2.46581e-03 6.60836e-02          2.608041
#>     [4] 6.83307e-01 1.00000e+00          0.165384
#>     [5] 2.57091e-01 1.00000e+00          0.589913
#>     ...         ...         ...               ...
#>   [130]    0.999566           1       0.000188696
#>   [131]    1.000000           1       0.000000000
#>   [132]    0.999600           1       0.000173541
#>   [133]    0.999633           1       0.000159220
#>   [134]    1.000000           1       0.000000000
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths

grFused <- fuseWindows(x = gr, scoreCol = "logFC", thresh = 5.0)
grFused
#> GRanges object with 5 ranges and 5 metadata columns:
#>       seqnames          ranges strand | logFCThresh numWindowsThresh direction
#>          <Rle>       <IRanges>  <Rle> |   <numeric>        <integer>  <factor>
#>   [1]     chr1 6940001-6940036      * |     6.90561                2        up
#>   [2]     chr1 6940469-6940528      * |     6.49495                4        up
#>   [3]     chr1 6940817-6940864      * |     5.91258                3        up
#>   [4]     chr1 6940925-6941080      * |     5.82304                7        up
#>   [5]     chr1 6941153-6941620      * |     5.52120               33        up
#>           logFC numWindows
#>       <numeric>  <integer>
#>   [1]   6.90561          2
#>   [2]   6.49495          4
#>   [3]   5.91258          3
#>   [4]   5.04266         12
#>   [5]   5.40114         38
#>   -------
#>   seqinfo: 1 sequence from an unspecified genome; no seqlengths