Read collapsed single-molecule footprinting data from a bedMethyl file
Source: R/readBedMethyl.R
This function reads collapsed single-molecule footprinting data
(reads combined per genomic position) from a bedMethyl file.
Usage
readBedMethyl(
  fnames,
  modbase,
  nrows = Inf,
  sampleAnnot = NULL,
  seqinfo = NULL,
  sequenceContextWidth = 0,
  sequenceReference = NULL,
  BPPARAM = MulticoreParam(4L, RNGseed = 42L),
  verbose = FALSE
)
Arguments
- fnames
Character vector with one or several paths of bedMethyl files, such as generated by modkit pileup. Each file will be read separately and become one of the columns in the returned SummarizedExperiment object. If fnames is a named vector, the names are used as column names in the returned object. Otherwise, the column names will be s1, ..., sN, where N is the length of fnames. If several elements of fnames have identical names, the data from the corresponding files are summed into a single column in the returned object.
- modbase
Character vector defining the modified base for each sample. If modbase is a named vector, the names should correspond to the names of fnames. Otherwise, it will be assumed that the elements are in the same order as the files in fnames. If modbase has length 1, the same modified base will be used for all samples.
- nrows
Only read nrows rows of the input file.
- sampleAnnot
A data.frame (or NULL) providing annotations for the samples. It must contain at least one column, named "sample", which must contain all the values of names(fnames). The provided annotations will be propagated to the returned SummarizedExperiment object.
- seqinfo
NULL or a Seqinfo object containing information about the set of genomic sequences (chromosomes). Alternatively, a named numeric vector with genomic sequence names and lengths. Useful to set the sorting order of sequence names.
- sequenceContextWidth, sequenceReference
Define the sequence context to be extracted around modified bases. By default (sequenceContextWidth = 0), no sequence context will be extracted, otherwise it will be returned in rowData(x)$sequenceContext. See addSeqContext for details.
- BPPARAM
A BiocParallelParam object that controls the number of parallel CPU threads to use for some of the steps in readBedMethyl(). The default value is MulticoreParam(4L, RNGseed = 42L).
- verbose
If TRUE, report on progress.
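The naming rules for fnames, modbase and sampleAnnot described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an example from the package: the sample names (ctrl, treat) and the condition column are hypothetical, the example file shipped with footprintR is reused twice purely for demonstration, and the call to readBedMethyl() only runs if footprintR is installed.

```r
# Example file shipped with footprintR ("" if the package is not installed)
bmfile <- system.file("extdata", "modkit_pileup_1.bed.gz",
                      package = "footprintR")

# Named vector: the names become column names of the returned object.
# "ctrl" and "treat" are hypothetical sample names; the same file is
# used twice only for illustration.
fnames <- c(ctrl = bmfile, treat = bmfile)

# Named modbase: the names correspond to names(fnames)
modbase <- c(ctrl = "m", treat = "m")

# Sample annotations: the "sample" column must contain all names(fnames);
# "condition" is a made-up annotation column
sampleAnnot <- data.frame(sample = c("ctrl", "treat"),
                          condition = c("control", "treated"))

# Check the documented requirements before reading any data
stopifnot(all(names(fnames) %in% sampleAnnot$sample),
          setequal(names(modbase), names(fnames)))

# Only attempt to read if the example file was found
if (nzchar(bmfile)) {
    se <- footprintR::readBedMethyl(fnames, modbase = modbase,
                                    sampleAnnot = sampleAnnot,
                                    BPPARAM = BiocParallel::SerialParam())
    print(colnames(se))
}
```

Giving both elements of fnames the same name instead (for example c(s = bmfile, s = bmfile)) would, according to the description of fnames, sum the two files into a single column.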
Value
A SummarizedExperiment object
with genomic positions in rows and samples (the unique names of
fnames) in the columns. If sequenceContextWidth != 0,
rowData(x)$sequenceContext will be a DNAStringSet
object with the extracted sequences.
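A common next step is to turn the two count assays into a fraction of modified observations per position. The sketch below assumes that the assays are named Nmod and Nvalid (as in the example output on this page) and that they can be coerced to ordinary matrices. The helper fraction_modified() is hypothetical and not part of footprintR; it is first applied to toy matrices and then, only if footprintR is installed, to a real object.

```r
# Hypothetical helper (not part of footprintR): fraction of modified
# observations per position and sample; positions without any valid
# observation are set to NA
fraction_modified <- function(nmod, nvalid) {
    nmod <- as.matrix(nmod)
    nvalid <- as.matrix(nvalid)
    stopifnot(identical(dim(nmod), dim(nvalid)))
    frac <- nmod / nvalid
    frac[nvalid == 0] <- NA_real_
    frac
}

# Toy count matrices standing in for the Nmod and Nvalid assays
nmod <- matrix(c(2, 0, 5), ncol = 1, dimnames = list(NULL, "s1"))
nvalid <- matrix(c(4, 0, 10), ncol = 1, dimnames = list(NULL, "s1"))
toy_frac <- fraction_modified(nmod, nvalid)

# With a real object (only run if footprintR and its example file exist)
bmfile <- system.file("extdata", "modkit_pileup_1.bed.gz",
                      package = "footprintR")
if (nzchar(bmfile)) {
    se <- footprintR::readBedMethyl(bmfile, modbase = "m",
                                    BPPARAM = BiocParallel::SerialParam())
    frac <- fraction_modified(SummarizedExperiment::assay(se, "Nmod"),
                              SummarizedExperiment::assay(se, "Nvalid"))
    print(summary(as.vector(frac)))
}
```

Setting positions with zero valid observations to NA, rather than 0, keeps uncovered positions distinguishable from covered but unmodified ones.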
See also
modkit software,
bedMethyl format description,
SummarizedExperiment for the returned object type,
fread for the function used to read the input files,
addSeqContext used to add the sequence context.
Examples
bmfile <- system.file("extdata", "modkit_pileup_1.bed.gz",
                      package = "footprintR")
readBedMethyl(bmfile, modbase = "m",
              BPPARAM = BiocParallel::SerialParam())
#> class: RangedSummarizedExperiment
#> dim: 10000 1
#> metadata(1): readLevelData
#> assays(2): Nmod Nvalid
#> rownames: NULL
#> rowData names(0):
#> colnames(1): s1
#> colData names(2): sample modbase