Prepare feature collections for testing with limma::camera. The function matches the feature IDs in the collections (complexes, GO terms or pathways) against the values in the `idCol` column of rowData(sce) and replaces them with the corresponding row names of the SummarizedExperiment object. Feature sets that contain fewer than `minSizeToKeep` features after matching are removed. Complexes are obtained from the database provided via `complexDbPath`. GO terms and pathways (BIOCARTA, KEGG, PID, REACTOME and WIKIPATHWAYS) are retrieved from `MSigDB` via the `msigdbr` package.
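
The matching and filtering steps described above can be illustrated with a minimal base R sketch. This is not einprot's implementation, only an approximation of the idea under stated assumptions: the toy vector `rowIds` stands in for `rowData(sce)[[idCol]]` (with its names playing the role of the row names of `sce`), and `featureSets` and the helper `mapToRowNames()` are hypothetical names introduced purely for illustration.

```r
## Toy stand-in for rowData(sce)[[idCol]] (assumption for illustration);
## the names play the role of rownames(sce)
rowIds <- c(row1 = "Hdac1", row2 = "Hdac2", row3 = "Chd4", row4 = "Actb")

## Toy feature sets, expressed in terms of the IDs found in idCol
featureSets <- list(
    setA = c("Hdac1", "Hdac2", "Rbbp4"),
    setB = c("Actb", "Ncl"),
    setC = c("Chd4", "Hdac1")
)

## Hypothetical helper: replace the IDs in each set by the names of the
## matching rows, then drop sets with fewer than minSizeToKeep matches
mapToRowNames <- function(sets, ids, minSizeToKeep = 2) {
    mapped <- lapply(sets, function(s) unname(names(ids)[ids %in% s]))
    mapped[lengths(mapped) >= minSizeToKeep]
}

mapped <- mapToRowNames(featureSets, rowIds, minSizeToKeep = 2)
names(mapped)
```

In this toy input, `setB` matches only a single row and is therefore dropped, while `setA` and `setC` are kept and expressed in terms of row names rather than the original IDs.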

prepareFeatureCollections(
  sce,
  idCol,
  includeFeatureCollections,
  complexDbPath,
  speciesInfo,
  complexSpecies,
  customComplexes = list(),
  minSizeToKeep = 2
)

Arguments

sce

A SummarizedExperiment object (or a derivative).

idCol

Character scalar, indicating which column in rowData(sce) contains IDs matching those in the feature collections (gene symbols).

includeFeatureCollections

Character vector indicating the types of feature collections to prepare. Should be a subset of c("complexes", "GO", "pathways") or NULL.

complexDbPath

Character scalar providing the path to the database of complexes, generated using makeComplexDB() and serialized to a .rds file. If `NULL`, the complex database provided with einprot will be used.

speciesInfo

List with at least two entries (species and speciesCommon), providing the species information. Typically generated using getSpeciesInfo().

complexSpecies

Character scalar, either "all" or "current", indicating whether all complexes should be tested, or only those defined for the current species.

customComplexes

Named list, for providing any custom complexes that are not already included in the database provided via complexDbPath.

minSizeToKeep

Numeric scalar, indicating the minimum number of features (after matching) that a feature set must contain in order to be retained.

Value

A list of CharacterLists (one for each feature collection).

Author

Charlotte Soneson

Examples

sce <- readRDS(system.file("extdata", "mq_example", "1356_sce.rds",
                           package = "einprot"))
fc <- prepareFeatureCollections(sce, idCol = "einprotGene",
                                includeFeatureCollections = "complexes",
                                complexDbPath = NULL,
                                speciesInfo = getSpeciesInfo("mouse"),
                                complexSpecies = "all")

## List of complexes, expressed in terms of the row names of sce
fc
#> $complexes
#> CharacterList of length 269
#> [["mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID)"]] Hnrnph1 ...
#> [["mouse: Drosha complex"]] Hnrnph1 Ddx5 Hnrnpm
#> [["mouse: Gata1-Fog1-MeCP1 complex"]] Mbd3.D3YTR4 Mta1.E9PX23 ... Mbd2
#> [["mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex"]] Kdm1a Hdac1
#> [["mouse: Ikaros complex"]] Chd4 Hdac1 Hdac2 Rbbp4
#> [["mouse: Ikaros-NuRD complex"]] Chd4 Hdac1 Hdac2
#> [["mouse: Metallothionein-3 complex"]] Hsp90ab1 Actb
#> [["mouse: Nkx3.2-SMAD1-SMAD4-HDAC-Sin3A complex"]] Hdac1 Rbbp4 Rbbp7
#> [["mouse: Nucleolar remodeling complex (NoRC complex) (+2 alt. IDs)"]] Baz2a ...
#> [["mouse: Parvulin-associated pre-rRNP complex"]] Ncl Rpl7a Rpl7 ... Nop56 Rpl4
#> ...
#> <259 more elements>
#> 

## Metadata for the complexes
S4Vectors::mcols(fc$complexes)
#> DataFrame with 269 rows and 8 columns
#>                                                               Species.common
#>                                                                  <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID)          mouse
#> mouse: Drosha complex                                                  mouse
#> mouse: Gata1-Fog1-MeCP1 complex                                        mouse
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex                        mouse
#> mouse: Ikaros complex                                                  mouse
#> ...                                                                      ...
#> S.pombe: Rpd3L complex                                         Schizosacc...
#> S.pombe: Rpd3L-Expanded complex                                Schizosacc...
#> S.pombe: Rpd3S complex                                         Schizosacc...
#> S.pombe: small-subunit processome                              Schizosacc...
#> S.pombe: transcription factor TFIIIC complex                   Schizosacc...
#>                                                                    Source
#>                                                               <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID)       CORUM
#> mouse: Drosha complex                                               CORUM
#> mouse: Gata1-Fog1-MeCP1 complex                                     CORUM
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex                     CORUM
#> mouse: Ikaros complex                                               CORUM
#> ...                                                                   ...
#> S.pombe: Rpd3L complex                                            pombase
#> S.pombe: Rpd3L-Expanded complex                                   pombase
#> S.pombe: Rpd3S complex                                            pombase
#> S.pombe: small-subunit processome                                 pombase
#> S.pombe: transcription factor TFIIIC complex                      pombase
#>                                                                        PMID
#>                                                                 <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) 11003644;1...
#> mouse: Drosha complex                                              17435748
#> mouse: Gata1-Fog1-MeCP1 complex                                    15920471
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex                    26487680
#> mouse: Ikaros complex                                              10204490
#> ...                                                                     ...
#> S.pombe: Rpd3L complex                                        17450151,1...
#> S.pombe: Rpd3L-Expanded complex                               19040720;G...
#> S.pombe: Rpd3S complex                                        12773392,1...
#> S.pombe: small-subunit processome                             36423630;G...
#> S.pombe: transcription factor TFIIIC complex                  10906331;2...
#>                                                                   All.names
#>                                                                 <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) mouse: DCS...
#> mouse: Drosha complex                                         mouse: Dro...
#> mouse: Gata1-Fog1-MeCP1 complex                               mouse: Gat...
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex               mouse: Hda...
#> mouse: Ikaros complex                                         mouse: Ika...
#> ...                                                                     ...
#> S.pombe: Rpd3L complex                                        S.pombe: R...
#> S.pombe: Rpd3L-Expanded complex                               S.pombe: R...
#> S.pombe: Rpd3S complex                                        S.pombe: R...
#> S.pombe: small-subunit processome                             S.pombe: s...
#> S.pombe: transcription factor TFIIIC complex                  S.pombe: t...
#>                                                                       genes
#>                                                                 <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) Hnrnph1;Pt...
#> mouse: Drosha complex                                         Dhx15;Hnrn...
#> mouse: Gata1-Fog1-MeCP1 complex                               Hdac1;Zfpm...
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex               Hdac1;Phb2...
#> mouse: Ikaros complex                                         Ikzf3;Hdac...
#> ...                                                                     ...
#> S.pombe: Rpd3L complex                                        Brms1;Brms...
#> S.pombe: Rpd3L-Expanded complex                               Brms1;Brms...
#> S.pombe: Rpd3S complex                                        Hdac2;Morf...
#> S.pombe: small-subunit processome                             Dcaf13;Ddx...
#> S.pombe: transcription factor TFIIIC complex                  Gtf3c2;Gtf...
#>                                                                  nGenes
#>                                                               <integer>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID)         4
#> mouse: Drosha complex                                                 8
#> mouse: Gata1-Fog1-MeCP1 complex                                      13
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex                       6
#> mouse: Ikaros complex                                                12
#> ...                                                                 ...
#> S.pombe: Rpd3L complex                                                9
#> S.pombe: Rpd3L-Expanded complex                                      12
#> S.pombe: Rpd3S complex                                                5
#> S.pombe: small-subunit processome                                    40
#> S.pombe: transcription factor TFIIIC complex                          4
#>                                                                 sharedGenes
#>                                                                 <character>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID) Hnrnph1;Hn...
#> mouse: Drosha complex                                         Hnrnph1;Dd...
#> mouse: Gata1-Fog1-MeCP1 complex                               Mbd3.D3YTR...
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex               Kdm1a;Hdac...
#> mouse: Ikaros complex                                         Chd4;Hdac1...
#> ...                                                                     ...
#> S.pombe: Rpd3L complex                                        Hdac2;Rbbp...
#> S.pombe: Rpd3L-Expanded complex                               Hdac2;Rbbp...
#> S.pombe: Rpd3S complex                                        Hdac2;Rbbp...
#> S.pombe: small-subunit processome                             Rps2;Fbl;N...
#> S.pombe: transcription factor TFIIIC complex                  Gtf3c4;Gtf...
#>                                                               nSharedGenes
#>                                                                  <integer>
#> mouse: DCS complex (Ptbp1, Ptbp2, Hnrph1, Hnrpf) (+1 alt. ID)            2
#> mouse: Drosha complex                                                    3
#> mouse: Gata1-Fog1-MeCP1 complex                                         14
#> mouse: Hdac1-Ino80-Kdm1a-Phb2-Rbp1-Taf5 complex                          2
#> mouse: Ikaros complex                                                    4
#> ...                                                                    ...
#> S.pombe: Rpd3L complex                                                   3
#> S.pombe: Rpd3L-Expanded complex                                          3
#> S.pombe: Rpd3S complex                                                   3
#> S.pombe: small-subunit processome                                        5
#> S.pombe: transcription factor TFIIIC complex                             2