Call modkit extract to extract read-level base modifications from modBAM files
Source: R/modkitExtract.R
This function is a wrapper around the modkit extract sub-command to extract read-level base modification information from modBAM files into tab-separated values table(s).
For more information on available modkit extract arguments and output tables specification see https://nanoporetech.github.io/modkit/intro_extract.html
Usage
modkitExtract(
  modkit_bin = NULL,
  bamfile,
  regions = NULL,
  num_reads = NULL,
  out_extract_table = NULL,
  out_read_calls = NULL,
  out_log_file = NULL,
  modkit_args = NULL,
  tempdir_base = tempdir(),
  verbose = TRUE
)
Arguments
- modkit_bin
Character scalar specifying the path to the modkit binary. If NULL, modkit will be searched on the path using Sys.which.
- bamfile
Character scalar specifying the path to a modBAM file. An indexed BAM file can significantly speed up certain extract operations.
- regions
A GRanges object specifying which genomic regions to extract the reads from. Note that the reads are not trimmed to the boundaries of the specified ranges. As a result, returned positions will typically extend beyond the specified regions.
- num_reads
Number of reads to extract per specified genomic region. When N genomic ranges are specified, the total number of extracted reads will be at most num_reads * N.
- out_extract_table
Character scalar specifying the path for the extract table output. Can be NULL if only a read-calls table is needed.
- out_read_calls
Character scalar specifying the path for the read-calls table output. Can be NULL if only an extract table is needed.
- out_log_file
Character scalar specifying the path for the command run log output. Can be NULL if no log file is needed (not recommended).
- modkit_args
Character vector with additional modkit extract arguments. Please refer to the modkit extract documentation for a complete list of possible arguments.
- tempdir_base
Character scalar specifying the path in which to create the modkit_temp directory. This temporary directory is only created when multiple genomic regions are specified.
- verbose
A logical scalar. If TRUE, report on progress.
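The lookup described for modkit_bin can be pictured roughly as follows. This is a minimal sketch in base R, not the package's actual code: the helper name find_modkit and its error message are hypothetical, and only Sys.which is taken from the description above.

```r
# Hypothetical helper illustrating how a NULL `modkit_bin` could be resolved.
find_modkit <- function(modkit_bin = NULL) {
  if (is.null(modkit_bin)) {
    # Sys.which() returns "" when the program is not found on the PATH
    modkit_bin <- unname(Sys.which("modkit"))
  }
  if (!nzchar(modkit_bin) || !file.exists(modkit_bin)) {
    stop("modkit binary not found; please provide 'modkit_bin'")
  }
  modkit_bin
}
```

Passing an explicit path skips the PATH search entirely, which may be preferable when several modkit versions are installed.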
Value
A character vector with the elements "extract-table", "read-calls" and "run-log", specifying the paths to the generated table, call and log files, respectively. An NA value indicates that a given output was not generated.
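As an illustration of working with such a return value, the sketch below uses a mock vector in place of a real call. The file names are made up, and the example assumes the elements are accessible by name as described above.

```r
# Mock of the kind of vector described above (paths are illustrative only);
# here the read-calls output was not requested, so its entry is NA.
res <- c("extract-table" = "test.etbl",
         "read-calls"    = NA,
         "run-log"       = "test.log")

# Keep only the outputs that were actually generated
generated <- res[!is.na(res)]

# Check for a specific output before trying to read it
has_read_calls <- !is.na(res[["read-calls"]])
```

Checking for NA before reading avoids passing a missing path to a file reader.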
See also
modkit
software,
modkit extract
documentation and tabulated output formats specification,
GRanges
for the object used to specify
genomic regions.
Examples
if (FALSE) { # \dontrun{
# modkit_bin_PATH and BAMF are placeholders for the paths to the modkit
# binary and to a modBAM file, respectively
GR <- as(c("chr1:12340678-12345678", "chr2:12340678-12345678"), "GRanges")
further_args <- c(
  '-t 8', '--force',
  '--mapped-only', '--edge-filter 100',
  '-p 0.2',
  '--mod-threshold m:0.8',
  '--mod-threshold a:0.5'
)
# Produce both `extract table` and `read calls` files for multiple GRanges
modkitExtract(modkit_bin = modkit_bin_PATH, bamfile = BAMF, num_reads = 10,
              regions = GR[1:2], out_extract_table = "test.etbl",
              out_read_calls = "test.rdcl", modkit_args = further_args)
# Produce only `extract table` file
modkitExtract(modkit_bin = modkit_bin_PATH, bamfile = BAMF, num_reads = 10,
              regions = GR[1], out_extract_table = "test.etbl",
              out_read_calls = NULL, modkit_args = further_args)
# Produce only `read calls` file
modkitExtract(modkit_bin = modkit_bin_PATH, bamfile = BAMF, num_reads = 10,
              regions = GR[1], out_extract_table = NULL,
              out_read_calls = "test.rdcl", modkit_args = further_args)
} # }
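Because reads are not trimmed to the requested regions, a downstream step may want to drop positions that fall outside them. The following is a minimal base R sketch on a toy table. The column names chrom and ref_position, and the assumption that ref_position is 0-based, are not taken from this page and should be checked against the modkit extract output specification.

```r
# Toy stand-in for a few rows of an extract table (values are made up).
# Assumption: columns `chrom` and `ref_position` exist under these names.
tbl <- data.frame(
  read_id      = c("r1", "r1", "r2", "r2"),
  chrom        = c("chr1", "chr1", "chr1", "chr2"),
  ref_position = c(12340600, 12341000, 12345700, 12341000)
)

# Region of interest as plain values (1-based, inclusive, as in a GRanges)
reg_chr   <- "chr1"
reg_start <- 12340678
reg_end   <- 12345678

# Assumption: table positions are 0-based, so shift by one before comparing
pos1   <- tbl$ref_position + 1
inside <- tbl$chrom == reg_chr & pos1 >= reg_start & pos1 <= reg_end

# Keep only the positions that lie within the region
tbl_in <- tbl[inside, ]
```

With several regions, the same comparison could be repeated per region or replaced by an overlap query on GRanges objects.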