
This function wraps the modkit extract sub-command to extract read-level base modification information from modBAM files into one or more tab-separated tables. For the available modkit extract arguments and the specification of the output tables, see https://nanoporetech.github.io/modkit/intro_extract.html

Usage

modkitExtract(
  modkit_bin = NULL,
  bamfile,
  regions = NULL,
  num_reads = NULL,
  out_extract_table = NULL,
  out_read_calls = NULL,
  out_log_file = NULL,
  modkit_args = NULL,
  tempdir_base = tempdir(),
  verbose = TRUE
)

Arguments

modkit_bin

Character scalar specifying the path to the modkit binary. If NULL, modkit is looked up on the system PATH using Sys.which.
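As a minimal sketch of the lookup described above, the following checks whether a modkit executable is visible on the PATH before calling the wrapper. The variable name modkit_path is illustrative only, and the function's internal lookup may differ in detail.

```r
# Sys.which() returns "" for an executable that cannot be found on the PATH.
modkit_path <- unname(Sys.which("modkit"))

if (!is.na(modkit_path) && nzchar(modkit_path)) {
  message("Found modkit at: ", modkit_path)
} else {
  message("modkit was not found on the PATH; pass its location via modkit_bin.")
}
```

If the binary is not found, supplying an explicit path through modkit_bin avoids relying on the environment of the R session.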

bamfile

Character scalar specifying the path to a modBAM file. An indexed BAM file can significantly speed up certain extract operations.

regions

A GRanges object specifying the genomic regions from which to extract reads. Note that reads are not trimmed to the boundaries of the specified ranges. As a result, returned positions will typically extend beyond the specified regions.
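Because positions can fall outside the requested ranges, a post-filtering step may be useful. The sketch below uses a toy data frame as a stand-in for an extract table. The column names chrom and ref_position are assumptions made for illustration, so check them (and the coordinate convention) against the modkit output specification before applying this to real output.

```r
# Toy stand-in for an extract table (column names are assumed, not guaranteed).
calls <- data.frame(
  chrom = c("chr1", "chr1", "chr1", "chr2"),
  ref_position = c(12340000, 12341000, 12346000, 12341000)
)

# Hypothetical region of interest, matching the first range used in the Examples.
region <- list(chrom = "chr1", start = 12340678, end = 12345678)

# Keep only calls whose position lies inside the region.
in_region <- calls$chrom == region$chrom &
  calls$ref_position >= region$start &
  calls$ref_position <= region$end
calls_in_region <- calls[in_region, , drop = FALSE]
```

Here only the second row is retained, since the first and third positions lie outside the region and the fourth is on a different chromosome.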

num_reads

Number of reads to extract per specified genomic region. When N genomic ranges are specified, the total number of extracted reads will be at most num_reads * N.
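To make the upper bound concrete, this small sketch computes it for the two regions used in the Examples section. The region strings stand in for a GRanges object, and max_total_reads is an illustrative name.

```r
# Two regions, written as strings here instead of a GRanges object.
regions <- c("chr1:12340678-12345678", "chr2:12340678-12345678")
num_reads <- 10

# At most num_reads reads are extracted per region.
max_total_reads <- num_reads * length(regions)
max_total_reads
```

The actual number can be lower, for example when a region is covered by fewer than num_reads reads.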

out_extract_table

Character scalar specifying the path for the extract table output. Can be NULL if only a read-calls table is needed.

out_read_calls

Character scalar specifying the path for the read-calls table output. Can be NULL if only an extract table is needed.

out_log_file

Character scalar specifying the path for the command run log output. Can be NULL if no log file is needed (not recommended).

modkit_args

Character vector with additional modkit extract arguments. Please refer to the modkit extract documentation for a complete list of possible arguments.

tempdir_base

Character scalar specifying the directory in which the modkit_temp directory is created. This temporary directory is only created when multiple genomic regions are specified.

verbose

A logical scalar. If TRUE, report on progress.

Value

A character vector with the elements "extract-table", "read-calls" and "run-log", specifying the paths to the generated table, call and log files, respectively. An NA value indicates that a given output was not generated.
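The sketch below shows one way to work with a return value of this shape. The vector res is built by hand to mimic a call in which no read-calls table was requested, and the file names are hypothetical.

```r
# Hand-made stand-in for the return value; paths are hypothetical.
res <- c(
  "extract-table" = "test.etbl",
  "read-calls"    = NA_character_,
  "run-log"       = "test.log"
)

# Keep only the outputs that were actually generated.
generated <- res[!is.na(res)]
names(generated)
```

Checking for NA before reading a file avoids passing a missing path to downstream import functions.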

Author

Panagiotis Papapasaikas

Examples

if (FALSE) { # \dontrun{
GR <- as(c("chr1:12340678-12345678", "chr2:12340678-12345678"), "GRanges")

further_args <- c(
  '-t 8', '--force',
  '--mapped-only', '--edge-filter 100',
  '-p 0.2',
  '--mod-threshold m:0.8',
  '--mod-threshold a:0.5'
)

# `modkit_bin_PATH` and `BAMF` are placeholders for the path to the modkit
# binary and to a modBAM file; define them before running these calls.

# Produce both `extract table` and `read calls` files for multiple GRanges
modkitExtract(modkit_bin = modkit_bin_PATH, bamfile = BAMF, num_reads = 10,
              regions = GR[1:2], out_extract_table = "test.etbl",
              out_read_calls = "test.rdcl", modkit_args = further_args)
# Produce only `extract table` file
modkitExtract(modkit_bin = modkit_bin_PATH, bamfile = BAMF, num_reads = 10,
              regions = GR[1], out_extract_table = "test.etbl",
              out_read_calls = NULL, modkit_args = further_args)
# Produce only `read calls` file
modkitExtract(modkit_bin = modkit_bin_PATH, bamfile = BAMF, num_reads = 10,
              regions = GR[1], out_extract_table = NULL,
              out_read_calls = "test.rdcl", modkit_args = further_args)
} # }