R/getInsertSizeDistFromBam.R
getInsertSizeDistFromBam.Rd
Read and tabulate the insert sizes from paired-end alignments
contained in one or several bam files. By default, all properly aligned
read pairs are included. Optionally, alignments can be restricted to
those in a specific genomic region (regions
argument) or the number
of alignments read can be limited (nmax
argument).
getInsertSizeDistFromBam(
fname,
regions = NULL,
nmax = NA_integer_,
isizemax = 800,
exclude = c("chrM", "chrY", "chrX")
)
character
vector with paths to one or several bam files.
If multiple files are given, insert sizes from all will be pooled and
tabulated together.
GRanges
object. Only alignments falling into these
regions will be used. If NULL
(the default), all alignments are
used.
numeric(1)
specifying the maximal number of alignments to
read. If NA
(the default), the alignments in regions
(if
regions
are not NULL
) or in the bam file will be used.
numeric(1)
specifying the maximal insert size to
report. Larger insert sizes will be set to isizemax
with on their
number will be reported.
character
vector with chromosome names to be excluded.
Alignments on these chromosomes will be excluded. exclude
will
be ignored if regions
is not NULL
.
integer
vector with the number of insert sizes. The element at
position i
gives the observed number of alignment pairs with an
insert size of i
. The number of insert sizes greater than
isizemax
that were set to isizemax
are reported in the
attribute "ncapped"
.
scanBam
used to read alignments.
if (requireNamespace("Rsamtools", quietly = TRUE)) {
bamf <- system.file("extdata", "getInsertSizeDistFromBam", "atac_mm10.bam",
package = "swissknife")
isize <- getInsertSizeDistFromBam(bamf)
attr(isize, "ncapped")
plot(isize, type = "l",
xlab = "Insert size (bp)", ylab = "Number of fragments")
}