R/getRegionsFromTxDb.R
getRegionsFromTxDb.Rd
From a transcript database package (TxDb), extract exonic and gene body
ranges for use with EISA. These regions can be used to quantify RNA-seq
alignments in exons and gene bodies, respectively. Intronic counts can then
be obtained as the difference between gene body and exonic region counts.
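As a minimal sketch of the subtraction described above, the following uses two made-up count matrices (genes in rows, samples in columns). The names `cntGB`, `cntEx` and `cntIn` and all values are hypothetical and only serve the illustration; in practice the counts would come from quantifying alignments in the regions returned by this function.

```r
# Hypothetical gene-by-sample count matrices; all values are made up.
genes <- c("geneA", "geneB")
samples <- c("s1", "s2")

# Counts of alignments falling into gene bodies (exons plus introns)
cntGB <- matrix(c(120, 80, 300, 150), nrow = 2,
                dimnames = list(genes, samples))

# Counts of alignments falling into the (extended) exonic regions
cntEx <- matrix(c(100, 50, 260, 90), nrow = 2,
                dimnames = list(genes, samples))

# Intronic counts are the difference between gene body and exonic counts
cntIn <- cntGB - cntEx
cntIn
```

Both matrices must have identical row and column order for the element-wise difference to be meaningful.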
getRegionsFromTxDb(txdb, exonExt = 10L, strandedData = TRUE)
txdb: a TxDb or an EnsDb object with the transcript annotations.
exonExt: numeric (default = 10L). Exonic ranges will be extended on either
side by this many nucleotides, in order to avoid "bleed-over" of exonic
alignments into adjacent intronic regions.
strandedData: logical(1). If TRUE, the RNA-seq data is assumed to be
strand-specific, and therefore only overlapping genes that are on the same
strand will be filtered out. If FALSE, genes overlapping on opposite strands
will also be filtered out.
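The effect of strandedness on overlap detection can be illustrated with two hypothetical genes on opposite strands. This is a sketch using `countOverlaps()` from the GenomicRanges package, not the code used inside the function; the gene names and coordinates are made up.

```r
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
    suppressPackageStartupMessages(library(GenomicRanges))

    # Two hypothetical genes that overlap in coordinates,
    # but lie on opposite strands
    genes <- GRanges("chr1",
                     IRanges(start = c(100, 400), end = c(500, 900)),
                     strand = c("+", "-"))
    names(genes) <- c("geneA", "geneB")

    # Each gene always overlaps itself, so a count > 1 indicates
    # an overlap with another gene.
    # Strand-aware comparison (analogous to strandedData = TRUE)
    ovStranded <- countOverlaps(genes, genes, ignore.strand = FALSE) > 1
    # Strand-agnostic comparison (analogous to strandedData = FALSE)
    ovUnstranded <- countOverlaps(genes, genes, ignore.strand = TRUE) > 1
}
```

In this toy case, the genes are flagged as overlapping only when strand is ignored.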
a list with elements "exons" and "genebodies", containing named
GenomicRanges objects with ranges for exons and gene bodies, respectively.
The exonic regions are generated as follows:
extract exons by gene from the txdb
extend each exon by exonExt
combine overlapping exons within each gene
create gene body ranges from the most extreme exonic coordinates
filter out genes that have only a single exon (no intron), have exons on more than a single chromosome or on both strands, or that overlap other genes
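The steps above can be sketched on toy data with GenomicRanges. This is an illustration under stated assumptions and not the eisaR source code: the exon coordinates, the gene names `g1` to `g4` and all variable names are hypothetical, and the real function may differ in details.

```r
# Illustrative sketch only; this is not the eisaR implementation.
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
    suppressPackageStartupMessages(library(GenomicRanges))

    exonExt <- 10L
    strandedData <- TRUE

    # Toy exons (made-up coordinates) for four hypothetical genes
    exons <- GRanges(
        seqnames = "chr1",
        ranges = IRanges(
            start = c(100, 150, 400, 1000, 2000, 2300, 2350, 2600),
            end   = c(200, 250, 500, 1200, 2100, 2400, 2450, 2700)
        ),
        strand = "+",
        gene = c("g1", "g1", "g1", "g2", "g3", "g3", "g4", "g4")
    )

    # Extend each exon by exonExt on either side
    start(exons) <- start(exons) - exonExt
    end(exons) <- end(exons) + exonExt

    # Group exons by gene and combine overlapping exons within each gene
    exByGene <- reduce(split(exons, exons$gene))

    # Gene bodies span the most extreme exonic coordinates; genes with
    # exons on several chromosomes or strands yield more than one range
    rng <- range(exByGene)
    genebodies <- unlist(rng[lengths(rng) == 1])

    # Filter: keep multi-exon genes that do not overlap any other gene
    multiExon <- lengths(exByGene)[names(genebodies)] > 1
    noOverlap <- countOverlaps(genebodies, genebodies,
                               ignore.strand = !strandedData) == 1
    keep <- names(genebodies)[multiExon & noOverlap]

    result <- list(exons = unlist(exByGene[keep]),
                   genebodies = genebodies[keep])
}
```

In this toy example, `g2` is dropped because it has a single exon, and `g3` and `g4` are dropped because their gene bodies overlap, leaving only `g1`.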
TxDb for details on TxDb objects and the txdbmaker package for how to
create them, e.g. from .gtf files.
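if (requireNamespace("AnnotationDbi", quietly = TRUE)) {
    # load the small example TxDb shipped with eisaR
    txdb <- AnnotationDbi::loadDb(
        system.file("extdata", "hg19sub.sqlite", package = "eisaR")
    )
    # extract filtered exon and gene body regions
    regL <- getRegionsFromTxDb(txdb)
    # number of ranges per element
    lengths(regL)
}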
#> Loading required package: GenomicFeatures
#> Loading required package: AnnotationDbi
#> Loading required package: Biobase
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
#> extracting exon coordinates
#> total number of genes/exons: 12/32
#> removing overlapping/single-exon/ambiguous genes (8)
#> creating filtered regions for 4 genes (33.3%) with 20 exons (62.5%)
#>      exons genebodies
#>         20          4