Create a fixed-step wig file from the alignments in the genomic bam files of the ‘QuasR’ project.
qExportWig(
proj,
file = NULL,
collapseBySample = TRUE,
binsize = 100L,
shift = 0L,
strand = c("*", "+", "-"),
scaling = TRUE,
tracknames = NULL,
log2p1 = FALSE,
colors = c("#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E", "#E6AB02", "#A6761D",
"#666666"),
includeSecondary = TRUE,
mapqMin = 0L,
mapqMax = 255L,
absIsizeMin = NULL,
absIsizeMax = NULL,
createBigWig = FALSE,
useRead = c("any", "first", "last"),
pairedAsSingle = FALSE,
clObj = NULL
)
A qProject object as returned by qAlign.
A character vector with the name(s) for the wig or bigWig file(s) to be
generated. Either NULL or a vector of the same length as the number of bam
files (for collapseBySample=FALSE) or the number of unique sample names
(for collapseBySample=TRUE) in proj. If NULL, the wig or bigWig file names
are generated from the names of the genomic bam files or unique sample
names with an added “.wig.gz” or “.bw” extension.
If TRUE, genomic bam files with identical sample name will be combined
(summed) into a single track.
A numerical value defining the bin and step size for the wig or bigWig
file(s). binsize will be coerced to integer().
Either a vector or a scalar value defining the read shift (e.g. half of
fragment length, see ‘Details’). If length(shift)>1, the length must match
the number of bam files in ‘proj’, and the i-th sample will be converted to
wig or bigWig using the value in shift[i]. shift will be coerced to
integer(). For paired-end alignments, shift will be ignored, and a warning
will be issued if it is set to a non-zero value (see ‘Details’).
Only count alignments of strand. The default (“*”) will count all
alignments.
If TRUE or a numerical value, the output values in the wig or bigWig file(s) will be linearly scaled by the total number of aligned reads per sample to improve comparability (see ‘Details’).
A character vector with the names of the tracks to appear in the track
header. If NULL, the sample names in proj will be used.
If TRUE, the number of alignments x per bin will be transformed using the
formula log2(x+1).
A character vector with R color names to be used for the tracks.
If TRUE (the default), include alignments with the secondary bit (0x0100)
set in the FLAG.
Minimal mapping quality of alignments to be included (mapping quality must
be greater than or equal to mapqMin). Valid values are between 0 and 255.
The default (0) will include all alignments.
Maximal mapping quality of alignments to be included (mapping quality must
be less than or equal to mapqMax). Valid values are between 0 and 255. The
default (255) will include all alignments.
For paired-end experiments, minimal absolute insert size (TLEN field in SAM
Spec v1.4) of alignments to be included. Valid values are greater than 0 or
NULL (default), which will not apply any minimum insert size filtering.
For paired-end experiments, maximal absolute insert size (TLEN field in SAM
Spec v1.4) of alignments to be included. Valid values are greater than 0 or
NULL (default), which will not apply any maximum insert size filtering.
If TRUE, first a temporary wig file will be created and then converted to
BigWig format (file extension “.bw”) using the wigToBigWig function from
package rtracklayer.
For paired-end experiments, selects the read mate whose alignments should be counted, one of:
any (default): count all alignments
first: count only alignments from the first read
last: count only alignments from the last read
For single-read alignments, this argument will be ignored. For paired-end
alignments, setting this argument to a value different from the default
(any) will cause qExportWig not to automatically use the midpoint of
fragments, but to treat the selected read as if it came from a single-read
experiment (see ‘Details’).
If TRUE, treat paired-end data as single-read data, which means that
instead of calculating fragment mid-points for each read pair, the 5-prime
ends of the reads are used. This is useful, for example, when analyzing
paired-end DNAse-seq or ATAC-seq data, in which the read starts are
informative for chromatin accessibility.
A cluster object to be used for parallel processing of multiple samples.
(invisible) The file name of the generated wig or bigWig file(s).
qExportWig() uses the genome bam files in proj as input to create wig or
bigWig files with the number of alignments (pairs) per window of binsize
nucleotides. By default (collapseBySample=TRUE), one file per unique sample
will be created. If collapseBySample=FALSE, one file per genomic bam file
will be created. See http://genome.ucsc.edu/goldenPath/help/wiggle.html
for the definition of the wig format, and
http://genome.ucsc.edu/goldenPath/help/bigWig.html for the definition
of the bigWig format.
The genome is tiled with sequential windows of length binsize, and
alignments in the bam file are assigned to these windows: Single-read
alignments are assigned according to their 5'-end coordinate shifted by
shift towards the 3'-end (assuming that the 5'-end is the leftmost
coordinate for plus-strand alignments, and the rightmost coordinate for
minus-strand alignments). Paired-end alignments are assigned according to
the base in the middle between the leftmost and rightmost coordinates of
the aligned pair of reads, unless pairedAsSingle = TRUE is used. Each pair
of reads is only counted once, and alignments that are not properly paired
are ignored. If useRead is set to select only the first or last read in a
paired-end experiment, the selected reads will be treated as if they came
from a single-read experiment. Secondary alignments can be excluded by
setting includeSecondary=FALSE. In paired-end experiments, absIsizeMin and
absIsizeMax can be used to select alignments based on their insert size
(TLEN field in SAM Spec v1.4).
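The window assignment described above can be illustrated with a minimal base-R sketch. The helpers bin_single_reads() and bin_fragments() are hypothetical and not part of QuasR; the 1-based window indexing and the rounding of the fragment midpoint are assumptions made for illustration and may differ from the package's internal implementation.

```r
# Hypothetical helper (not a QuasR function): assign single-read alignments
# to windows of length binsize.
# start, end: 1-based leftmost/rightmost aligned coordinates
# strand:     "+" or "-"
bin_single_reads <- function(start, end, strand, binsize = 100L, shift = 0L) {
  # the 5'-end is the leftmost coordinate on "+", the rightmost on "-"
  five_prime <- ifelse(strand == "+", start, end)
  # shifting towards the 3'-end adds on "+" and subtracts on "-"
  pos <- ifelse(strand == "+", five_prime + shift, five_prime - shift)
  # 1-based index of the window that contains pos (assumed convention)
  (pos - 1L) %/% binsize + 1L
}

# Hypothetical helper: paired-end fragments are assigned by the base in the
# middle between the leftmost and rightmost coordinate (rounding is assumed)
bin_fragments <- function(left, right, binsize = 100L) {
  mid <- (left + right) %/% 2L
  (mid - 1L) %/% binsize + 1L
}

# three made-up single-read alignments, shifted by 10 nt
bins <- bin_single_reads(start  = c(1L, 95L, 250L),
                         end    = c(50L, 144L, 299L),
                         strand = c("+", "+", "-"),
                         binsize = 100L, shift = 10L)
# alignments per window
counts <- table(factor(bins, levels = 1:3))
```

In this made-up example, the second plus-strand read starts in the first window but is counted in the second one after shifting, which is the intended effect of shift for single-read ChIP-seq data.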
For scaling=TRUE, the number of alignments per bin \(n\) for sample \(i\)
is linearly scaled to the mean total number of alignments over all samples
in proj according to:
\(n_s = n / N[i] \cdot \textnormal{mean}(N)\), where \(n_s\) is the scaled
number of alignments in the bin and \(N\) is a vector with the total number
of alignments for each sample. Alternatively, if scaling is set to a
positive numerical value \(s\), this value is used instead of
\(\textnormal{mean}(N)\), and values are scaled according
to: \(n_s = n / N[i] \cdot s\).
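As a worked example of the two scaling formulas, the following base-R sketch applies them to made-up numbers. The totals in N and the bin counts in n are hypothetical and are not taken from the example data; the code only reproduces the arithmetic, not the way qExportWig() collects its mapping statistics.

```r
# Hypothetical total numbers of alignments per sample (N) and raw counts
# per bin (n) for two samples
N <- c(Sample1 = 2000, Sample2 = 4000)
n <- list(Sample1 = c(10, 0, 4), Sample2 = c(20, 8, 0))

# scaling = TRUE: n_s = n / N[i] * mean(N), here mean(N) = 3000
scaled_mean <- Map(function(counts, total) counts / total * mean(N), n, N)

# scaling = s (a positive number): n_s = n / N[i] * s
s <- 1e6
scaled_fixed <- Map(function(counts, total) counts / total * s, n, N)
```

With these numbers, the first bin of both samples scales to the same value (15) under scaling=TRUE, even though the raw counts differ by a factor of two, because Sample2 has twice as many alignments in total.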
mapqMin and mapqMax allow selecting alignments based on their mapping
qualities. Both take integer values between 0 and 255. Mapping qualities
are equal to
\(-10 \log_{10} \Pr(\textnormal{mapping position is wrong})\), rounded to
the nearest integer. A value of 255 indicates that the mapping quality is
not available.
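The relation between mapping quality and error probability, and the inclusive filter defined by mapqMin and mapqMax, can be sketched in base R as follows. The helper names and the mapping qualities in mapq are hypothetical and only serve as an illustration.

```r
# Hypothetical helpers (not QuasR functions): convert between the probability
# that the mapping position is wrong and the Phred-scaled mapping quality
mapq_from_prob <- function(p_wrong) round(-10 * log10(p_wrong))
prob_from_mapq <- function(mapq) 10^(-mapq / 10)

q <- mapq_from_prob(c(0.1, 0.01, 0.001))  # 10 20 30
p <- prob_from_mapq(20)                   # 0.01

# inclusive filter: keep alignments with mapqMin <= mapq <= mapqMax
mapq <- c(0L, 12L, 30L, 42L, 255L)        # made-up mapping qualities
mapqMin <- 10L
mapqMax <- 254L                           # excludes 255 (quality not available)
keep <- mapq >= mapqMin & mapq <= mapqMax
```

Setting mapqMax below 255, as in this sketch, is one way to exclude alignments for which no mapping quality is available.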
If createBigWig=FALSE and file ends with ‘.gz’, the resulting wig file will
be compressed using gzip and is suitable for uploading as a custom track to
your favorite genome browser (e.g. UCSC or Ensembl).
# copy example data to current working directory
file.copy(system.file(package="QuasR", "extdata"), ".", recursive=TRUE)
#> [1] TRUE
# create alignments
sampleFile <- "extdata/samples_chip_single.txt"
genomeFile <- "extdata/hg19sub.fa"
proj <- qAlign(sampleFile, genomeFile)
#> alignment files missing - need to:
#> create 2 genomic alignment(s)
#> Testing the compute nodes...
#> OK
#> Loading QuasR on the compute nodes...
#> preparing to run on 1 nodes...
#> done
#> Available cores:
#> Mac-1740133007481.local: 1
#> Performing genomic alignments for 2 samples. See progress in the log file:
#> /private/var/folders/2s/h6hvv9ps03xgz_krkkstvq_r0000gn/T/RtmpHtA1jk/file5ddc16c1b3df/reference/QuasR_log_5ddc69f34384.txt
#> Genomic alignments have been created successfully
#>
# export wiggle file
qExportWig(proj, binsize=100L, shift=0L, scaling=TRUE)
#> collecting mapping statistics for scaling...
#> done
#> start creating wig files...
#> Sample1.wig.gz (Sample1)
#> Sample2.wig.gz (Sample2)
#> done