R/motif_enrichment_HOMER.R
prepareHomer.Rd
For each bin, write genomic coordinates for foreground and background regions into files for HOMER motif enrichment analysis.
prepareHomer(
  gr,
  b,
  genomedir,
  outdir,
  motifFile,
  homerfile = findHomer(),
  regionsize = "given",
  Ncpu = 2L,
  verbose = FALSE
)
gr: A GRanges object (or an object that can be coerced to one) with the genomic regions to analyze.
b: A vector of the same length as gr that groups its elements into bins (typically a factor).
genomedir: Directory containing sequence files in FASTA format (one per chromosome).
outdir: A path specifying the folder into which the output files (two files per unique value of b) will be written.
motifFile: A file with HOMER-formatted PWMs to be used in the enrichment analysis.
homerfile: Path and file name of the findMotifsGenome.pl HOMER script.
regionsize: The peak size to use in HOMER ("given" keeps each region as provided; an integer value keeps only that many bases around the region center).
Ncpu: Number of parallel threads that HOMER can use.
verbose: A logical scalar. If TRUE, print progress messages.
The path and name of the script file to run the HOMER motif enrichment analysis.
For each bin (unique value of b) this function creates two files in outdir (outdir/bin_N_foreground.tab and outdir/bin_N_background.tab, where N is the zero-padded number of the bin, for example 001, and foreground/background correspond to the ranges that are/are not within the current bin). The files are in the HOMER peak file format (see http://homer.ucsd.edu/homer/ngs/peakMotifs.html for details).
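The foreground/background split described above can be sketched in base R as follows. This is a minimal illustration and not the package's implementation: the regions data frame, the region IDs and the writePeakFile() helper are hypothetical, and the column layout (ID, chromosome, start, end, strand; tab-separated, no header) follows the HOMER peak file format.

```r
# Hypothetical regions; in practice these would come from a GRanges object
regions <- data.frame(
    chr    = "chr1",
    start  = c(1L, 5L, 9L, 13L),
    end    = c(4L, 8L, 12L, 16L),
    strand = "+",
    stringsAsFactors = FALSE
)
# Grouping of the regions into two bins (assumed for illustration)
b <- factor(c("low", "low", "high", "high"), levels = c("low", "high"))

# Illustrative helper (not part of the package): write one HOMER peak file
# with columns ID, chromosome, start, end, strand (no header, tab-separated)
writePeakFile <- function(df, ids, file) {
    out <- data.frame(id = ids, df[, c("chr", "start", "end", "strand")])
    write.table(out, file = file, sep = "\t", quote = FALSE,
                row.names = FALSE, col.names = FALSE)
    invisible(file)
}

outdir <- tempfile()
dir.create(outdir)
ids <- paste0("region_", seq_len(nrow(regions)))

for (i in seq_along(levels(b))) {
    inBin <- b == levels(b)[i]
    # foreground: regions in the current bin; background: all other regions
    writePeakFile(regions[inBin, ], ids[inBin],
                  file.path(outdir, sprintf("bin_%03d_foreground.tab", i)))
    writePeakFile(regions[!inBin, ], ids[!inBin],
                  file.path(outdir, sprintf("bin_%03d_background.tab", i)))
}

# File names written to outdir, two per bin
written <- sort(list.files(outdir))
```

Each region appears in exactly one foreground file and in the background file of every other bin, so the enrichment for a bin is always measured relative to all remaining regions.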
In addition, a shell script file (named run.sh in the example below) is created in outdir. It contains the shell commands to run the HOMER motif enrichment analysis.
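As a rough sketch of what such a script might contain, the following builds one findMotifsGenome.pl call per bin and writes the calls to a shell script. The exact command line that prepareHomer() generates is not shown on this page, so the option set here is an assumption based on HOMER's documented options (-bg, -mknown, -nomotif, -size, -p); all paths, the per-bin output folder name and the homerCommand() helper are hypothetical.

```r
# Hypothetical inputs; the paths are placeholders for illustration only
outdir    <- tempfile()
dir.create(outdir)
homerfile <- "/path/to/findMotifsGenome.pl"
genomedir <- "/path/to/genome"
motifFile <- "/path/to/motifs.motif"
nbins     <- 2L

# Illustrative helper (not part of the package): compose the HOMER call for
# bin i, using the foreground file as input and the background file via -bg
homerCommand <- function(i, regionsize = "given", Ncpu = 2L) {
    bin <- sprintf("bin_%03d", i)
    paste(shQuote(homerfile),
          shQuote(file.path(outdir, paste0(bin, "_foreground.tab"))),
          shQuote(genomedir),
          # assumed name for the per-bin HOMER result folder
          shQuote(file.path(outdir, paste0(bin, "_output"))),
          "-bg", shQuote(file.path(outdir, paste0(bin, "_background.tab"))),
          "-mknown", shQuote(motifFile),  # test only the supplied motifs
          "-nomotif",                     # skip de novo motif discovery
          "-size", regionsize,
          "-p", Ncpu)
}

# One command per bin, written to a script that can be run later
cmds   <- vapply(seq_len(nbins), homerCommand, character(1))
script <- file.path(outdir, "run.sh")
writeLines(c("#!/bin/sh", cmds), script)
```

On a Unix-like system with a working HOMER installation, a script of this kind could then be run from R with system2("sh", shQuote(script)), or from a terminal.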
# prepare genome directory (here: one dummy chromosome)
genomedir <- tempfile()
dir.create(genomedir)
writeLines(c(">chr1", "ATGCATGCATCGATCGATCGATCGTACGTA"),
           file.path(genomedir, "chr1.fa"))
# prepare motif file, regions and bins
motiffile <- tempfile()
dumpJaspar(filename = motiffile, pkg = "JASPAR2020",
           opts = list(ID = c("MA0006.1")))
#> [1] TRUE
gr <- GenomicRanges::GRanges("chr1", IRanges::IRanges(1:4, width = 4))
b <- bin(1:4, nElements = 2)
# create dummy file (in real use, this should point to a local HOMER installation)
homerfile <- file.path(tempdir(), "findMotifsGenome.pl")
writeLines("dummy", homerfile)
# run prepareHomer
outdir <- tempfile()
prepareHomer(gr = gr, b = b, genomedir = genomedir,
             outdir = outdir, motifFile = motiffile,
             homerfile = homerfile, verbose = TRUE)
#> creating foreground/background region files for HOMER
#> bin [1,2.5]
#> bin (2.5,4]
#> [1] "/var/folders/h1/8hndypj13nsbj5pn4xsnv1tm0000gn/T//Rtmpds1P4s/file531f69e75e3d/run.sh"
list.files(outdir)
#> [1] "bin_001_background.tab" "bin_001_foreground.tab" "bin_002_background.tab"
#> [4] "bin_002_foreground.tab" "run.sh"
# clean up example
# recursive = TRUE is needed because genomedir and outdir are directories
unlink(c(genomedir, motiffile, homerfile, outdir), recursive = TRUE)