Read collapsed single-molecule footprinting data from a bedMethyl file

Read collapsed single-molecule footprinting data, in which reads have been combined per genomic position, from one or more bedMethyl files.

Usage

readBedMethyl(
  fnames,
  modbase,
  nrows = Inf,
  sampleAnnot = NULL,
  seqinfo = NULL,
  sequenceContextWidth = 0,
  sequenceReference = NULL,
  BPPARAM = MulticoreParam(4L, RNGseed = 42L),
  verbose = FALSE
)

Arguments

fnames: Character vector with one or several paths to bedMethyl files, such as those generated by modkit pileup. Each file will be read separately and become one of the columns in the returned SummarizedExperiment object. If fnames is a named vector, the names are used as column names in the returned object. Otherwise, the column names will be s1, ..., sN, where N is the length of fnames. If several elements of fnames have identical names, the data from the corresponding files are summed into a single column in the returned object.
modbase: Character vector defining the modified base for each sample. If modbase is a named vector, the names should correspond to the names of fnames. Otherwise, it will be assumed that the elements are in the same order as the files in fnames. If modbase has length 1, the same modified base will be used for all samples.
nrows: Maximum number of rows to read from the input file. The default (Inf) reads all rows.
sampleAnnot: A data.frame (or NULL) providing annotations for the samples. It must contain at least one column, named "sample", which must contain all the values of names(fnames). The provided annotations will be propagated to the returned SummarizedExperiment object.
seqinfo: NULL or a Seqinfo object containing information about the set of genomic sequences (chromosomes). Alternatively, a named numeric vector with genomic sequence names and lengths. This is useful for setting the sorting order of sequence names.
sequenceContextWidth, sequenceReference: Define the sequence context to be extracted around modified bases. By default (sequenceContextWidth = 0), no sequence context will be extracted. Otherwise, it will be returned in rowData(x)$sequenceContext. See addSeqContext for details.
BPPARAM: A BiocParallelParam object that controls the number of parallel CPU threads to use for some of the steps in readBedMethyl(). The default is MulticoreParam(4L, RNGseed = 42L).
verbose: If TRUE, report on progress.

Value

A SummarizedExperiment object with genomic positions in rows and samples (the unique names of fnames) in the columns. If sequenceContextWidth != 0, rowData(x)$sequenceContext will be a DNAStringSet object with the extracted sequences.

Author

Michael Stadler, Charlotte Soneson

Examples

bmfile <- system.file("extdata", "modkit_pileup_1.bed.gz",
                      package = "footprintR")
readBedMethyl(bmfile, modbase = "m",
              BPPARAM = BiocParallel::SerialParam())
#> class: RangedSummarizedExperiment 
#> dim: 10000 1 
#> metadata(1): readLevelData
#> assays(2): Nmod Nvalid
#> rownames: NULL
#> rowData names(0):
#> colnames(1): s1
#> colData names(2): sample modbase
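
A further sketch, illustrating named fnames together with sampleAnnot. The sample names sampleA and sampleB and the "condition" column are made up for illustration, and the same example file is passed twice only so that no additional data is needed. The call is guarded so that the snippet still runs when footprintR is not installed.

```r
bmfile <- system.file("extdata", "modkit_pileup_1.bed.gz",
                      package = "footprintR")

# Hypothetical sample names; because the names differ, each file should
# become its own column (identical names would be summed instead).
fnames <- c(sampleA = bmfile, sampleB = bmfile)

# One row per sample. The "sample" column must contain all values of
# names(fnames); "condition" is an invented annotation column.
annot <- data.frame(sample = c("sampleA", "sampleB"),
                    condition = c("ctrl", "treat"))

if (requireNamespace("footprintR", quietly = TRUE)) {
    se <- footprintR::readBedMethyl(fnames, modbase = "m",
                                    sampleAnnot = annot,
                                    BPPARAM = BiocParallel::SerialParam())
    # Column names are taken from names(fnames)
    print(colnames(se))
    # Sample annotations are propagated to the column data
    print(SummarizedExperiment::colData(se))
    # Names of the count assays stored in the object
    print(SummarizedExperiment::assayNames(se))
}
```

Naming the elements of fnames keeps the link between files, modbase and sampleAnnot explicit, rather than relying on the order of the vectors.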
