This function is a wrapper around the modkit extract sub-command to extract read-level base modification information from modBAM files into tab-separated values table(s). For more information on available modkit extract arguments and output tables specification see


modkitExtract(
  modkit_bin = NULL,
  bamfile,
  regions = NULL,
  num_reads = NULL,
  out_extract_table = NULL,
  out_read_calls = NULL,
  out_log_file = NULL,
  modkit_args = NULL,
  tempdir_base = tempdir(),
  verbose = TRUE
)



Character scalar specifying the path to the modkit binary. If NULL, modkit will be searched on the path using Sys.which.
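The lookup described above can be sketched in base R. This is a minimal illustration, not the package's actual code: the helper name `find_modkit` is hypothetical, and it relies only on the documented behaviour that `Sys.which()` typically returns an empty string when a program is not found on the path.

```r
# Hypothetical helper (not part of the package): resolve the modkit binary.
find_modkit <- function(modkit_bin = NULL) {
  if (is.null(modkit_bin)) {
    # Sys.which() returns a named character vector; "" means "not found"
    modkit_bin <- unname(Sys.which("modkit"))
  }
  # Treat a missing, empty, or non-existent path as "not available"
  if (is.na(modkit_bin) || !nzchar(modkit_bin) || !file.exists(modkit_bin)) {
    return(NA_character_)
  }
  modkit_bin
}

# NA_character_ if modkit is not installed, otherwise its full path
find_modkit()
```

Checking the result with `is.na()` before calling the wrapper gives an early, readable failure when modkit is not installed.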


Character scalar specifying the path to a modBAM file. An indexed BAM file can significantly speed up certain extract operations.


A GRanges object specifying the genomic regions from which to extract reads. Note that the reads are not trimmed to the boundaries of the specified ranges. As a result, returned positions will typically extend beyond the specified regions.
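Because reads are not trimmed, a downstream filter may be needed if only positions inside a region are of interest. The following is a minimal base R sketch on a mock table. The column names `read_id`, `chrom` and `ref_position` are assumptions made for illustration, and the sketch ignores any 0-based versus 1-based offset between the table coordinates and the ranges.

```r
# Mock read-level table; values and column names are illustrative assumptions
tbl <- data.frame(
  read_id      = c("r1", "r1", "r2", "r2"),
  chrom        = "chr1",
  ref_position = c(12340600, 12341000, 12345000, 12345700)
)

# Region of interest (hypothetical, matching "chr1:12340678-12345678")
reg_chrom <- "chr1"
reg_start <- 12340678
reg_end   <- 12345678

# Keep only the rows whose position falls inside the region
inside <- tbl$chrom == reg_chrom &
  tbl$ref_position >= reg_start &
  tbl$ref_position <= reg_end
tbl_in_region <- tbl[inside, ]
```

Here the first and last rows fall outside the range and are dropped, although they belong to reads that overlap it.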


Number of reads to extract per specified genomic region. When N genomic ranges are specified, the total number of extracted reads will be at most num_reads * N.
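As a small worked example of this upper bound (the values are arbitrary and chosen only for illustration):

```r
num_reads <- 10  # reads requested per region
n_regions <- 2   # e.g. length(regions) for a GRanges with two ranges

# Upper bound on the total number of extracted reads
max_total_reads <- num_reads * n_regions
max_total_reads
```

The actual total can be lower, for example when a region is covered by fewer than `num_reads` reads.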


Character scalar specifying the path for the extract table output. Can be NULL if only a read-calls table is needed.


Character scalar specifying the path for the read-calls table output. Can be NULL if only an extract table is needed.


Character scalar specifying the path for the command run log output. Can be NULL if no log file is needed (not recommended).


Character vector with additional modkit extract arguments. Please refer to the modkit extract documentation for a complete list of possible arguments.


Character scalar specifying the path under which the modkit_temp directory is created. This temporary directory is only created when multiple genomic regions are specified.


A logical scalar. If TRUE, report on progress.


A character vector with the elements "extract-table", "read-calls" and "run-log", specifying the paths to the generated table, call and log files, respectively. An NA value indicates that a given output was not generated.
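A minimal sketch of inspecting such a return value follows. It uses a mock vector rather than a real call, and assumes the three elements are accessible by name; the file paths are made up.

```r
# Mock return value for illustration only (no modkit call is made)
res <- c(
  "extract-table" = "test.etbl",
  "read-calls"    = NA_character_,
  "run-log"       = "test.log"
)

# Outputs that were actually generated are the non-NA elements
generated <- res[!is.na(res)]
names(generated)
```

In this mock case only the extract table and the run log were produced, so a read-calls path should not be passed on to downstream code.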

Panagiotis Papapasaikas


if (FALSE) { # \dontrun{
GR <- as(c("chr1:12340678-12345678", "chr2:12340678-12345678"), "GRanges")

further_args <- c(
   '-t 8', '--force',
   '--mapped-only', '--edge-filter 100',
   '-p 0.2',
   '--mod-threshold m:0.8',
   '--mod-threshold a:0.5'
)

# Produce both `extract table` and `read calls` files for multiple GRanges
modkitExtract(modkit_bin = modkit_bin_PATH, bamfile = BAMF, num_reads = 10,
              regions = GR[1:2], out_extract_table = "test.etbl",
              out_read_calls = "test.rdcl", modkit_args = further_args)
# Produce only `extract table` file
modkitExtract(modkit_bin = modkit_bin_PATH, bamfile = BAMF, num_reads = 10,
              regions = GR[1], out_extract_table = "test.etbl",
              out_read_calls = NULL, modkit_args = further_args)
# Produce only `read calls` file
modkitExtract(modkit_bin = modkit_bin_PATH, bamfile = BAMF, num_reads = 10,
              regions = GR[1], out_extract_table = NULL,
              out_read_calls = "test.rdcl", modkit_args = further_args)
} # }