Get exonic/gene body regions from a transcript database.

From a transcript database package (TxDb), extract exonic and gene body ranges for use with EISA. These regions can be used to quantify RNA-seq alignments in exons and gene bodies, respectively. Intronic counts can then be obtained as the difference between the gene body counts and the exonic counts of each gene.

getRegionsFromTxDb(txdb, exonExt = 10L, strandedData = TRUE)

Arguments

txdb: a TxDb or an EnsDb object with the transcript annotations.
exonExt: numeric (default = 10L). Exonic ranges will be extended on either side by this many nucleotides, in order to avoid "bleed-over" of exonic alignments into adjacent intronic regions.
strandedData: logical(1) (default = TRUE). If TRUE, the RNA-seq data is assumed to be strand-specific, and therefore only genes that overlap another gene on the same strand will be filtered out. If FALSE, genes that overlap another gene on the opposite strand will also be filtered out.

Value

a list with elements "exons" and "genebodies", containing named GenomicRanges objects with ranges for exons and gene bodies, respectively.

Details

The exonic and gene body regions are generated as follows:

1. extract exons by gene from the txdb
2. extend each exon by exonExt
3. combine overlapping exons within each gene
4. create gene body ranges from the most extreme exonic coordinates
5. filter out genes that have only a single exon (no intron), have exons on more than a single chromosome or on both strands, or that overlap other genes
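
The steps above can be sketched with GenomicRanges on a toy set of exons. This is an illustration only, not the package's implementation: sketchRegions and the four toy genes are hypothetical, the toy GRangesList stands in for what exonsBy(txdb, by = "gene") would return for a TxDb, and the order in which the filters are applied is an assumption.

```r
library(GenomicRanges)

# Toy exons for four hypothetical genes (stand-in for exons extracted by gene)
exons <- GRanges(
  seqnames = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2", "chr1", "chr1"),
  ranges = IRanges(start = c(100, 300, 395, 1000, 50, 400, 450, 700),
                   end   = c(200, 400, 500, 1200, 150, 600, 550, 800)),
  strand = c("+", "+", "+", "-", "+", "+", "-", "-")
)
geneIds <- c("geneA", "geneA", "geneA", "geneB",
             "geneC", "geneC", "geneD", "geneD")
exonsByGene <- split(exons, geneIds)

# Hypothetical helper that mirrors the documented steps
sketchRegions <- function(exonsByGene, exonExt = 10L, strandedData = TRUE) {
  # extend each exon by exonExt on either side
  flat <- unlist(exonsByGene, use.names = FALSE)
  extended <- relist(flat + exonExt, exonsByGene)
  # combine overlapping exons within each gene
  merged <- reduce(extended)
  # gene bodies span the most extreme exonic coordinates
  bodies <- range(merged)
  # keep genes with more than one exon, on one chromosome and one strand
  keep <- lengths(merged) > 1L & lengths(bodies) == 1L
  merged <- merged[keep]
  geneBodies <- unlist(bodies[keep])
  # drop genes whose body overlaps another gene (every range overlaps itself once)
  nOverlaps <- countOverlaps(geneBodies, geneBodies,
                             ignore.strand = !strandedData)
  isolated <- unname(nOverlaps == 1L)
  list(exons = unlist(merged[isolated]), genebodies = geneBodies[isolated])
}

stranded <- sketchRegions(exonsByGene, strandedData = TRUE)
unstranded <- sketchRegions(exonsByGene, strandedData = FALSE)
names(stranded$genebodies)    # geneA, geneC and geneD are kept
names(unstranded$genebodies)  # only geneC is kept
```

In this toy data, geneB is dropped because it has a single exon, and geneA and geneD overlap only on opposite strands, so they are removed only when strandedData = FALSE.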

Author

Michael Stadler

Examples
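
A minimal usage sketch, assuming that the eisaR package and the annotation package TxDb.Hsapiens.UCSC.hg19.knownGene are installed (any other TxDb or EnsDb object could be used instead). The count vectors at the end are made-up toy numbers that only illustrate how intronic counts follow from gene body and exonic counts.

```r
regions <- NULL

# Assumption: both packages are available; otherwise this part is skipped
if (requireNamespace("eisaR", quietly = TRUE) &&
    requireNamespace("TxDb.Hsapiens.UCSC.hg19.knownGene", quietly = TRUE)) {
  txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene

  # extract exonic and gene body ranges, extending exons by 10 nucleotides
  regions <- eisaR::getRegionsFromTxDb(txdb, exonExt = 10L, strandedData = TRUE)

  # number of ranges in the "exons" and "genebodies" elements
  print(lengths(regions))
}

# Toy counts (hypothetical numbers) for two genes, e.g. obtained by
# quantifying alignments in the exonic and gene body regions
exonCounts <- c(geneA = 120, geneC = 45)
bodyCounts <- c(geneA = 150, geneC = 60)

# intronic counts are the difference between gene body and exonic counts
intronCounts <- bodyCounts - exonCounts
intronCounts
```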