Run statistical test — runTest • einprot

Perform pairwise comparisons. If colData(sce) contains a column named 'batch', this will be included as a covariate if testType is 'limma' or 'proDA'. If it contains a column named 'sampleweights', these will be used as sample weights if testType is 'limma'.
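
To illustrate what including 'batch' as a covariate means for the model, the following base R sketch builds a design matrix with an additive batch term. The sample annotation is invented, and the formula `~ batch + fc` is an assumption made for illustration; the exact formula that runTest constructs internally may differ.

```r
# Toy sample annotation (hypothetical values, not taken from the package data).
# 'fc' plays the role of the group factor, 'batch' the optional batch column.
sampleData <- data.frame(
    fc = factor(rep(c("Adnp", "RBC_ctrl"), each = 4)),
    batch = factor(rep(c("b1", "b2"), times = 4))
)

# Additive model: the batch effect is estimated alongside the group effect,
# so the group coefficient is adjusted for batch differences.
design <- model.matrix(~ batch + fc, data = sampleData)
colnames(design)
```

With treatment contrasts (the R default), the first level of each factor is absorbed into the intercept, so the design has one batch column and one group column in addition to the intercept.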

runTest(
  sce,
  comparisons,
  groupComposition = NULL,
  testType,
  assayForTests,
  assayImputation = NULL,
  minNbrValidValues = 2,
  minlFC = 0,
  featureCollections = list(),
  complexFDRThr = 0.1,
  volcanoAdjPvalThr = 0.05,
  volcanoLog2FCThr = 1,
  baseFileName = NULL,
  seed = 123,
  samSignificance = TRUE,
  nperm = 250,
  volcanoS0 = 0.1,
  addAbundanceValues = FALSE,
  aName = NULL,
  singleFit = TRUE,
  subtractBaseline = FALSE,
  baselineGroup = "",
  extraColumns = NULL
)
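
As a rough sketch of how the comparisons and groupComposition arguments might fit together, the base R snippet below builds both objects. It assumes that groupComposition is a named list mapping each group name used in comparisons to the underlying values of the group column; this layout and the merged group name "pooled_ctrl" are assumptions for illustration, not confirmed by this page.

```r
# Each comparison is a character vector of length 2 naming the two groups.
comparisons <- list(
    c("RBC_ctrl", "Adnp"),
    c("pooled_ctrl", "Adnp")   # "pooled_ctrl" is a hypothetical merged group
)

# Assumed layout: names are the groups used in 'comparisons', values are the
# original entries of the group column that make up each group.
groupComposition <- list(
    Adnp = "Adnp",
    RBC_ctrl = "RBC_ctrl",
    pooled_ctrl = c("RBC_ctrl", "Chd4BF")
)

# Sanity checks: every comparison has length 2, and every group that is
# compared has a composition entry.
allLength2 <- all(lengths(comparisons) == 2)
allCovered <- all(unlist(comparisons) %in% names(groupComposition))
```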

Arguments

sce: A SummarizedExperiment object (or a derivative).
comparisons: A list of character vectors of length 2, each giving the two groups to be compared.
groupComposition: A list providing the composition of each group used in any of the comparisons. If NULL, assumes that each group used in comparisons consists of a single group in the group column of colData(sce).
testType: Character scalar, either "limma", "ttest" or "proDA".
assayForTests: Character scalar, the name of an assay of the SummarizedExperiment object with values that will be used to perform the test.
assayImputation: Character scalar, the name of an assay of sce with logical values indicating whether an entry was imputed or not.
minNbrValidValues: Numeric scalar, the minimum number of valid (non-imputed) values that must be present for a feature to be included in the result table.
minlFC: Non-negative numeric scalar, the logFC threshold to use for limma-treat. If minlFC = 0, limma::eBayes is used instead.
featureCollections: List of CharacterLists with feature collections.
complexFDRThr: Numeric scalar giving the significance (FDR) threshold below which a complex will be considered significant.
volcanoAdjPvalThr: Numeric scalar giving the FDR threshold for significance (for later use in volcano plots).
volcanoLog2FCThr: Numeric scalar giving the logFC threshold for significance (for later use in volcano plots).
baseFileName: Character scalar or NULL, the base file name of the output text files. If NULL, no result files are generated.
seed: Numeric scalar, the random seed to use for permutation (only used if testType is "ttest").
samSignificance: Logical scalar, indicating whether the SAM statistic should be used to determine significance (similar to the approach used by Perseus). Only used if testType = "ttest". If FALSE, the p-values are adjusted using the Benjamini-Hochberg approach and used to determine significance.
nperm: Numeric scalar, the number of permutations (only used if testType is "ttest").
volcanoS0: Numeric scalar, the S0 value to use for creating significance curves (only used if testType is "ttest").
addAbundanceValues: Logical scalar, whether to extract abundance and add to the result table.
aName: Character vector, the names of the assays in the SummarizedExperiment object to get abundance values from (only required if addAbundanceValues is TRUE).
singleFit: Logical scalar, whether to fit a single model to the full data set and extract relevant results using contrasts. If FALSE, the data set will be subset for each comparison to only the relevant samples. Setting singleFit to TRUE is only supported for testType = "limma" or "proDA".
subtractBaseline: Logical scalar, whether to subtract the background/reference value for each feature in each batch before fitting the model. If TRUE, requires that a 'batch' column is available.
baselineGroup: Character scalar representing the reference group. Only used if subtractBaseline is TRUE, in which case the abundance values for a given sample will be adjusted by subtracting the average value across all samples in the baselineGroup from the same batch as the original sample.
extraColumns: Character vector (or NULL) indicating columns of rowData(sce) to include in the result table.
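
The baseline adjustment described for subtractBaseline and baselineGroup can be sketched in base R as follows. This is a minimal illustration on an invented matrix, following the description above (subtract, per feature and batch, the average over the baselineGroup samples of that batch); it is not the package's own implementation, and all object names are hypothetical.

```r
# Toy log-abundance matrix: 2 features x 6 samples (values invented).
x <- matrix(c(10, 11, 14, 20, 22, 25,
               5,  7,  9,  6,  6, 10),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("f1", "f2"), paste0("s", 1:6)))
group <- c("ctrl", "ctrl", "trt", "ctrl", "ctrl", "trt")
batch <- c("b1", "b1", "b1", "b2", "b2", "b2")
baselineGroup <- "ctrl"

adjusted <- x
for (b in unique(batch)) {
    inBatch <- batch == b
    isBaseline <- inBatch & group == baselineGroup
    # Per-feature average across the baseline samples of this batch
    baseline <- rowMeans(x[, isBaseline, drop = FALSE])
    # Subtract the per-feature baseline from every sample in the batch
    # (the length-nrow vector is recycled down each column)
    adjusted[, inBatch] <- x[, inBatch, drop = FALSE] - baseline
}
adjusted
```

After the adjustment, the baseline samples of each batch average to zero for every feature, so the remaining values can be read as differences from the batch-specific reference.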

Value

A list with the following components:

tests: a list with test results
plotnotes: the prior df used by limma
plottitles: indicating the type of test
plotsubtitles: indicating the significance thresholds
messages: any messages for the user
design: information about the experimental design
featureCollections: list of feature sets, expanded with results from camera
topsets: a list with the significant feature sets
curveparams: information required to create Perseus-like significance curves

In addition, if baseFileName is not NULL, text files with test results (including only features and feature sets passing the imposed significance thresholds) are saved.
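
As an illustration of what passing the significance thresholds could look like, the sketch below applies the default volcanoAdjPvalThr and volcanoLog2FCThr values to the logFC and adj.P.Val columns of the six rows printed in the Examples section. The filtering rule used here (adjusted p-value at or below the threshold and absolute logFC at or above the threshold) is an assumption; the criteria that runTest actually applies may include further conditions.

```r
# logFC and adj.P.Val copied from the example output on this page
res <- data.frame(
    pid = c("Dhx9", "Zmynd8", "Zmym4", "Rlf", "Zfp600", "Rpl32"),
    logFC = c(9.022469, 2.109792, 9.839518, 7.174602, 1.057193, 3.435481),
    adj.P.Val = c(6.103411e-05, 2.187701e-01, 4.633177e-03,
                  2.993391e-02, 2.911999e-01, 9.148082e-02)
)

volcanoAdjPvalThr <- 0.05
volcanoLog2FCThr <- 1

# Assumed rule: both thresholds must be satisfied
res$passes <- res$adj.P.Val <= volcanoAdjPvalThr &
    abs(res$logFC) >= volcanoLog2FCThr
res$pid[res$passes]
# "Dhx9" "Zmym4" "Rlf"
```

For these six rows the assumed rule selects the same features that have showInVolcano equal to TRUE in the example output, although that column may depend on additional criteria.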

Author

Charlotte Soneson

Examples

sce <- readRDS(system.file("extdata", "mq_example", "1356_sce.rds",
                           package = "einprot"))
tres <- runTest(sce, comparisons = list(c("RBC_ctrl", "Adnp")),
                testType = "limma", assayForTests = "log2_LFQ.intensity",
                assayImputation = "imputed_LFQ.intensity")
head(tres$tests$Adnp_vs_RBC_ctrl)
#>      pid    logFC       CI.L      CI.R  AveExpr         t      P.Value
#> 1   Dhx9 9.022469  7.3178397 10.727099 21.20317 12.095374 1.271544e-06
#> 2 Zmynd8 2.109792 -0.7438610  4.963444 18.97072  1.689516 1.276159e-01
#> 3  Zmym4 9.839518  5.5090078 14.170027 22.76805  5.192287 6.998028e-04
#> 4    Rlf 7.174602  2.3601311 11.989073 20.78440  3.405439 8.574818e-03
#> 5 Zfp600 1.057193 -0.6296855  2.744072 23.16597  1.432169 1.880666e-01
#> 6  Rpl32 3.435481  0.1777595  6.693203 20.46518  2.409892 4.097579e-02
#>      adj.P.Val           B  s2.prior     sigma  se.logFC df.total   mlog10p
#> 1 6.103411e-05  5.33424636 1.9903569 0.6037027 0.7459438 8.441054 5.8956686
#> 2 2.187701e-01 -4.99296345 3.3021534 1.3954445 1.2487549 8.441054 0.8940952
#> 3 4.633177e-03 -0.00816105 0.9482483 2.6818702 1.8950257 8.441054 3.1550243
#> 4 2.993391e-02 -2.43895360 2.2768805 2.9052289 2.1068064 8.441054 2.0667751
#> 5 2.911999e-01 -5.32702279 0.7532645 0.9183849 0.7381760 8.441054 0.7256883
#> 6 9.148082e-02 -3.94846663 2.4844680 1.8104766 1.4255749 8.441054 1.3874727
#>   einprotGene                                    einprotProtein einprotLabel
#> 1        Dhx9 A0A087WPL5;E9QNN1;O70133;O70133-2;O70133-3;Q3UR42         Dhx9
#> 2      Zmynd8  Q3UH28;Q3U1M7;A2A483;E9Q8D1;A2A482;A2A484;A2A485       Zmynd8
#> 3       Zmym4                            A2A791;A2A791-2;F6VYE2        Zmym4
#> 4         Rlf                                     A2A7F4;E9Q532          Rlf
#> 5      Zfp600                                            A2A7V0       Zfp600
#> 6       Rpl32                                     A2AD25;P62911        Rpl32
#>   showInVolcano IDsForSTRING
#> 1          TRUE         <NA>
#> 2         FALSE         <NA>
#> 3          TRUE         <NA>
#> 4          TRUE         <NA>
#> 5         FALSE         <NA>
#> 6         FALSE         <NA>
tres$design
#> $design
#>               (Intercept) fcChd4BF fcRBC_ctrl
#> Adnp_IP04               1        0          0
#> Adnp_IP05               1        0          0
#> Adnp_IP06               1        0          0
#> Chd4BF_IP07             1        1          0
#> Chd4BF_IP08             1        1          0
#> Chd4BF_IP09             1        1          0
#> RBC_ctrl_IP01           1        0          1
#> RBC_ctrl_IP02           1        0          1
#> RBC_ctrl_IP03           1        0          1
#> attr(,"assign")
#> [1] 0 1 1
#> attr(,"contrasts")
#> attr(,"contrasts")$fc
#> [1] "contr.treatment"
#> 
#> 
#> $sampleData
#>                     fc
#> Adnp_IP04         Adnp
#> Adnp_IP05         Adnp
#> Adnp_IP06         Adnp
#> Chd4BF_IP07     Chd4BF
#> Chd4BF_IP08     Chd4BF
#> Chd4BF_IP09     Chd4BF
#> RBC_ctrl_IP01 RBC_ctrl
#> RBC_ctrl_IP02 RBC_ctrl
#> RBC_ctrl_IP03 RBC_ctrl
#> 
#> $contrasts
#> $contrasts$Adnp_vs_RBC_ctrl
#> [1]  0  0 -1
#> 
#> 
#> $sampleWeights
#> NULL
#>
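
The contrast vector shown for Adnp_vs_RBC_ctrl can be interpreted with a small base R sketch. With the design above, Adnp is the reference level, so the coefficient fcRBC_ctrl estimates RBC_ctrl minus Adnp, and the contrast c(0, 0, -1) flips its sign to give Adnp minus RBC_ctrl. The abundance values below are invented for one hypothetical feature, and an ordinary least-squares fit stands in for the moderated fit performed by limma.

```r
# Same group layout as in the design shown above
fc <- factor(rep(c("Adnp", "Chd4BF", "RBC_ctrl"), each = 3))
design <- model.matrix(~ fc)
contrast <- c(0, 0, -1)

# Toy abundance values for a single feature (invented for illustration)
y <- c(12, 13, 14, 8, 9, 10, 5, 6, 7)

# Ordinary least-squares fit, then apply the contrast to the coefficients
fit <- lm.fit(design, y)
estimate <- sum(contrast * fit$coefficients)

# The contrast estimate equals the difference in group means
meanDiff <- mean(y[fc == "Adnp"]) - mean(y[fc == "RBC_ctrl"])
all.equal(estimate, meanDiff)
```

This matches the naming of the result table, where a positive logFC in Adnp_vs_RBC_ctrl indicates higher abundance in Adnp than in RBC_ctrl.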