prepareFeatureCollections.Rd
Prepare feature collections for testing with `limma::camera`. The
function maps the feature IDs in the collections (complexes, GO terms
or pathways) to the values in the specified `idCol` column of
`rowData(sce)`, and subsequently replaces them with the corresponding
row names of the `SummarizedExperiment` object. Feature sets with
too few features after the matching (fewer than `minSizeToKeep`) are
removed.
Complexes are obtained from the database provided via `complexDbPath`.
GO terms and pathways (BIOCARTA, KEGG, PID, REACTOME and WIKIPATHWAYS) are
retrieved from `MSigDB` via the `msigdbr` package.
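The matching and filtering described above can be illustrated with a small base R sketch. It is not the package's implementation: `rowAnnot` and `featureSets` are hypothetical toy stand-ins for `rowData(sce)` and a feature collection, and the size filter assumes that sets with at least `minSizeToKeep` matched features are retained.

```r
# Toy stand-in (hypothetical) for rowData(sce): the row names are the
# feature names of the object, einprotGene plays the role of idCol.
rowAnnot <- data.frame(
    einprotGene = c("Hdac1", "Hdac2", "Chd4", "Actb"),
    row.names = c("Hdac1", "Hdac2", "Chd4.Q6PDQ2", "Actb"),
    stringsAsFactors = FALSE
)

# Toy feature collection (hypothetical), expressed as gene symbols.
featureSets <- list(
    setA = c("Hdac1", "Hdac2", "Chd4", "Rbbp4"),
    setB = c("Actb", "Ncl")
)
minSizeToKeep <- 2

# Replace each feature ID by the row name of the matching row;
# IDs without a match in the idCol column are dropped.
mapped <- lapply(featureSets, function(ids) {
    rownames(rowAnnot)[rowAnnot$einprotGene %in% ids]
})

# Keep only the sets that still contain at least minSizeToKeep features.
kept <- mapped[lengths(mapped) >= minSizeToKeep]
kept
```

In this toy example `setB` retains a single matched feature and is therefore removed, while `setA` is kept and expressed in terms of row names.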
prepareFeatureCollections(
  sce,
  idCol,
  includeFeatureCollections,
  complexDbPath,
  speciesInfo,
  complexSpecies,
  customComplexes = list(),
  minSizeToKeep = 2
)
sce
A `SummarizedExperiment` object (or a derivative).
idCol
Character scalar, indicating which column in `rowData(sce)` contains
IDs matching those in the feature collections (gene symbols).
includeFeatureCollections
Character vector indicating the types of feature collections to
prepare. Should be a subset of `c("complexes", "GO", "pathways")`
or `NULL`.
complexDbPath
Character scalar providing the path to the database of complexes,
generated using `makeComplexDB()` and serialized to a .rds file.
If `NULL`, the complex database provided with einprot will be used.
speciesInfo
List with at least two entries (`species` and `speciesCommon`),
providing the species information. Typically generated using
`getSpeciesInfo()`.
complexSpecies
Character scalar, either `"all"` or `"current"`, indicating whether
all complexes should be tested, or only those defined for the
current species.
customComplexes
Named list, for providing any custom complexes that are not already
included in the database provided via `complexDbPath`.
minSizeToKeep
Numeric scalar, indicating the minimum size of a feature collection
to be retained.
A list of `CharacterList`s (one for each feature collection).
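Since the collections are prepared for `limma::camera`, the sketch below shows one way sets of row names might be passed to it. It assumes the limma package is installed and uses simulated toy data; the matrix `y`, the `design` and the two gene sets are hypothetical and are not taken from einprot. With real output, something like `as.list(fc$complexes)` would presumably take the place of `geneSets`.

```r
library(limma)

set.seed(1)

# Toy abundance matrix (hypothetical): 100 features, 6 samples.
y <- matrix(rnorm(100 * 6), nrow = 100, ncol = 6)
rownames(y) <- paste0("g", seq_len(100))

# Two groups of three samples each; the second coefficient is the
# group difference.
design <- cbind(Intercept = 1, Group = c(0, 0, 0, 1, 1, 1))

# Feature sets expressed in terms of the row names of y (hypothetical).
geneSets <- list(
    setA = paste0("g", 1:10),
    setB = paste0("g", 11:30)
)

# Convert the sets of row names to row indices, then run camera on
# the group coefficient.
index <- ids2indices(geneSets, identifiers = rownames(y))
res <- camera(y, index = index, design = design, contrast = 2)
res
```

Because the sets are already expressed as row names of the data matrix, no further ID conversion is needed before calling `ids2indices()`.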
sce <- readRDS(system.file("extdata", "mq_example", "1356_sce.rds",
                           package = "einprot"))
fc <- prepareFeatureCollections(sce, idCol = "einprotGene",
                                includeFeatureCollections = "complexes",
                                complexDbPath = NULL,
                                speciesInfo = getSpeciesInfo("mouse"),
                                complexSpecies = "all")
## List of complexes, expressed in terms of the row names of sce
fc
#> $complexes
#> CharacterList of length 269
#> [["mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID)"]] Hnrnph1 ...
#> [["mouse: Drosha complex"]] Hnrnph1 Ddx5 Hnrnpm
#> [["mouse: Gata1-Fog1-MeCP1 complex"]] Mbd3.D3YTR4 Mta1.E9PX23 ... Mbd2
#> [["mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex"]] Kdm1a Hdac1
#> [["mouse: Ikaros complex"]] Chd4 Hdac1 Hdac2 Rbbp4
#> [["mouse: Ikaros-NuRD complex"]] Chd4 Hdac1 Hdac2
#> [["mouse: Metallothionein-3 complex"]] Hsp90ab1 Actb
#> [["mouse: Nkx3.2-SMAD1-SMAD4-HDAC-Sin3A complex"]] Hdac1 Rbbp4 Rbbp7
#> [["mouse: Nucleolar remodeling complex (NoRC complex) (+2 alt. IDs)"]] Baz2a ...
#> [["mouse: Parvulin-associated pre-rRNP complex"]] Ncl Rpl7a Rpl7 ... Nop56 Rpl4
#> ...
#> <259 more elements>
#>
## Metadata for the complexes
S4Vectors::mcols(fc$complexes)
#> DataFrame with 269 rows and 8 columns
#> Species.common
#> <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) mouse
#> mouse: Drosha complex mouse
#> mouse: Gata1-Fog1-MeCP1 complex mouse
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex mouse
#> mouse: Ikaros complex mouse
#> ... ...
#> S.pombe: Rpd3L complex Schizosacc...
#> S.pombe: Rpd3L-Expanded complex Schizosacc...
#> S.pombe: Rpd3S complex Schizosacc...
#> S.pombe: small-subunit processome Schizosacc...
#> S.pombe: transcription factor TFIIIC complex Schizosacc...
#> Source
#> <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) CORUM
#> mouse: Drosha complex CORUM
#> mouse: Gata1-Fog1-MeCP1 complex CORUM
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex CORUM
#> mouse: Ikaros complex CORUM
#> ... ...
#> S.pombe: Rpd3L complex pombase
#> S.pombe: Rpd3L-Expanded complex pombase
#> S.pombe: Rpd3S complex pombase
#> S.pombe: small-subunit processome pombase
#> S.pombe: transcription factor TFIIIC complex pombase
#> PMID
#> <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) 11003644;1...
#> mouse: Drosha complex 17435748
#> mouse: Gata1-Fog1-MeCP1 complex 15920471
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex 26487680
#> mouse: Ikaros complex 10204490
#> ... ...
#> S.pombe: Rpd3L complex 17450151,1...
#> S.pombe: Rpd3L-Expanded complex 19040720;G...
#> S.pombe: Rpd3S complex 12773392,1...
#> S.pombe: small-subunit processome 36423630;G...
#> S.pombe: transcription factor TFIIIC complex 10906331;2...
#> All.names
#> <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) mouse: DCS...
#> mouse: Drosha complex mouse: Dro...
#> mouse: Gata1-Fog1-MeCP1 complex mouse: Gat...
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex mouse: Hda...
#> mouse: Ikaros complex mouse: Ika...
#> ... ...
#> S.pombe: Rpd3L complex S.pombe: R...
#> S.pombe: Rpd3L-Expanded complex S.pombe: R...
#> S.pombe: Rpd3S complex S.pombe: R...
#> S.pombe: small-subunit processome S.pombe: s...
#> S.pombe: transcription factor TFIIIC complex S.pombe: t...
#> genes
#> <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) Hnrnph1;Pt...
#> mouse: Drosha complex Dhx15;Hnrn...
#> mouse: Gata1-Fog1-MeCP1 complex Hdac1;Zfpm...
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex Hdac1;Phb2...
#> mouse: Ikaros complex Ikzf3;Hdac...
#> ... ...
#> S.pombe: Rpd3L complex Brms1;Brms...
#> S.pombe: Rpd3L-Expanded complex Brms1;Brms...
#> S.pombe: Rpd3S complex Hdac2;Morf...
#> S.pombe: small-subunit processome Dcaf13;Ddx...
#> S.pombe: transcription factor TFIIIC complex Gtf3c2;Gtf...
#> nGenes
#> <integer>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) 4
#> mouse: Drosha complex 8
#> mouse: Gata1-Fog1-MeCP1 complex 13
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex 6
#> mouse: Ikaros complex 12
#> ... ...
#> S.pombe: Rpd3L complex 9
#> S.pombe: Rpd3L-Expanded complex 12
#> S.pombe: Rpd3S complex 5
#> S.pombe: small-subunit processome 40
#> S.pombe: transcription factor TFIIIC complex 4
#> sharedGenes
#> <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) Hnrnph1;Hn...
#> mouse: Drosha complex Hnrnph1;Dd...
#> mouse: Gata1-Fog1-MeCP1 complex Mbd3.D3YTR...
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex Kdm1a;Hdac...
#> mouse: Ikaros complex Chd4;Hdac1...
#> ... ...
#> S.pombe: Rpd3L complex Hdac2;Rbbp...
#> S.pombe: Rpd3L-Expanded complex Hdac2;Rbbp...
#> S.pombe: Rpd3S complex Hdac2;Rbbp...
#> S.pombe: small-subunit processome Rps2;Fbl;N...
#> S.pombe: transcription factor TFIIIC complex Gtf3c4;Gtf...
#> nSharedGenes
#> <integer>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) 2
#> mouse: Drosha complex 3
#> mouse: Gata1-Fog1-MeCP1 complex 14
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex 2
#> mouse: Ikaros complex 4
#> ... ...
#> S.pombe: Rpd3L complex 3
#> S.pombe: Rpd3L-Expanded complex 3
#> S.pombe: Rpd3S complex 3
#> S.pombe: small-subunit processome 5
#> S.pombe: transcription factor TFIIIC complex 2