
Read collapsed single-molecule footprinting data (reads combined per genomic position) from one or more bedMethyl files.

Usage

readBedMethyl(
  fnames,
  modbase,
  nrows = Inf,
  seqinfo = NULL,
  sequenceContextWidth = 0,
  sequenceReference = NULL,
  BPPARAM = bpparam(),
  verbose = FALSE
)

Arguments

fnames

Character vector with one or more paths to bedMethyl files, such as those generated by modkit pileup. Each file is read separately and becomes one column in the returned SummarizedExperiment object. If fnames is a named vector, the names are used as column names in the returned object. Otherwise, the column names will be s1, ..., sN, where N is the length of fnames. If several elements of fnames have identical names, the data from the corresponding files are summed into a single column in the returned object.

modbase

Character vector defining the modified base for each sample. If modbase is a named vector, the names should correspond to the names of fnames. Otherwise, it will be assumed that the elements are in the same order as the files in fnames. If modbase has length 1, the same modified base will be used for all samples.

nrows

Maximum number of rows to read from the input file. The default (Inf) reads all rows.

seqinfo

NULL or a Seqinfo object containing information about the set of genomic sequences (chromosomes). Alternatively, a named numeric vector with genomic sequence names and lengths. Useful to set the sorting order of sequence names.

sequenceContextWidth, sequenceReference

Define the sequence context to be extracted around modified bases. By default (sequenceContextWidth = 0), no sequence context is extracted. Otherwise, it is returned in rowData(x)$sequenceContext. See addSeqContext for details.

BPPARAM

A BiocParallelParam object that controls the number of parallel CPU threads used for some of the steps in readBedMethyl(). The default value (bpparam()) uses the default parallel backend registered with register(), or otherwise selects a backend appropriate for the current environment.

verbose

If TRUE, report on progress.

Value

A SummarizedExperiment object with genomic positions in rows and samples (the unique names of fnames) in the columns. If sequenceContextWidth != 0, rowData(x)$sequenceContext will be a DNAStringSet object with the extracted sequences.
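
As a minimal sketch of how the returned object might be inspected, the following uses standard SummarizedExperiment accessors. It assumes that the Nmod and Nvalid assays hold the number of modified and the number of valid reads per position, as their names suggest; the variable fracMod is introduced here for illustration only.

```r
library(footprintR)
library(SummarizedExperiment)

bmfile <- system.file("extdata", "modkit_pileup_1.bed.gz", package = "footprintR")
se <- readBedMethyl(bmfile, modbase = "m")

# genomic positions (rows) and sample annotation (columns)
rowRanges(se)
colData(se)

# assays stored in the object
assayNames(se)

# fraction of modified reads per position, assuming Nmod / Nvalid is meaningful;
# positions without valid reads yield NaN
fracMod <- as.matrix(assay(se, "Nmod")) / as.matrix(assay(se, "Nvalid"))
summary(as.vector(fracMod))
```

Converting to ordinary matrices before dividing keeps the sketch independent of how the assays are stored internally.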

See also

modkit software, bedMethyl format description, SummarizedExperiment for the returned object type, fread for the function used to read the input files, addSeqContext used to add the sequence context.

Author

Michael Stadler, Charlotte Soneson

Examples

bmfile <- system.file("extdata", "modkit_pileup_1.bed.gz", package = "footprintR")
readBedMethyl(bmfile, modbase = "m")
#> class: RangedSummarizedExperiment 
#> dim: 10000 1 
#> metadata(1): readLevelData
#> assays(2): Nmod Nvalid
#> rownames: NULL
#> rowData names(0):
#> colnames(1): s1
#> colData names(2): sample modbase
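
The column naming rules described for fnames can be sketched as follows. This is an illustrative example rather than part of the package's own examples: the names sample1 and a are arbitrary, and the same file is passed twice only to show how entries with identical names are combined.

```r
library(footprintR)
library(SummarizedExperiment)

bmfile <- system.file("extdata", "modkit_pileup_1.bed.gz", package = "footprintR")

# a named fnames vector sets the column name; nrows limits how much is read
se_named <- readBedMethyl(c(sample1 = bmfile), modbase = "m", nrows = 100)
colnames(se_named)

# two entries sharing one name are documented to be summed into one column
# (the same file is used twice purely for illustration)
se_sum <- readBedMethyl(c(a = bmfile, a = bmfile), modbase = "m", nrows = 100)
dim(se_sum)
```

Because modbase has length 1 here, the same modified base is applied to every file.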