Launch an analysis workflow to detect post-translational modifications based on outputs from separate protein- and peptide-level analyses run with runPDTMTAnalysis().

runPDTMTptmAnalysis(
  templateRmd = system.file("extdata/process_PD_TMT_PTM_template.Rmd", package = "einprot"),
  outputDir = ".",
  outputBaseName = "PDTMTptmAnalysis",
  reportTitle = "PD/PTM data processing",
  reportAuthor = "",
  forceOverwrite = FALSE,
  experimentInfo = list(),
  species,
  sceProteins,
  scePeptides,
  assayForTests,
  assayImputation,
  idCol,
  labelCol,
  proteinIdColProteins = function(df) einprot::getFirstId(df, "einprotProtein", ";"),
  proteinIdColPeptides = function(df) einprot::getFirstId(df, "einprotProtein", ";"),
  comparisons = list(),
  ctrlGroup = "",
  allPairwiseComparisons = TRUE,
  singleFit = TRUE,
  subtractBaseline = FALSE,
  baselineGroup = "",
  testType = "interaction",
  minNbrValidValues = 2,
  minlFC = 0,
  volcanoAdjPvalThr = 0.05,
  volcanoLog2FCThr = 1,
  volcanoMaxFeatures = 25,
  volcanoLabelSign = "both",
  volcanoFeaturesToLabel = "",
  addInteractiveVolcanos = FALSE,
  interactiveDisplayColumns = NULL,
  interactiveGroupColumn = NULL,
  seed = 42,
  linkTableColumns = c(),
  customYml = NULL,
  doRender = TRUE
)

Arguments

templateRmd

Path to the template R Markdown file. Typically does not need to be modified.

outputDir

Path to a directory where all output files will be written. Will be created if it doesn't exist.

outputBaseName

Character string providing the 'base name' of the output files. All output files will start with this prefix.

reportTitle, reportAuthor

Character scalars, giving the title and author for the result report.

forceOverwrite

Logical, whether to force overwrite an existing Rmd file with the same outputBaseName in the outputDir.

experimentInfo

Named list with information about the experiment. Each entry of the list must be a scalar value.

species

Character scalar providing the species. Must be one of the supported species (see getSupportedSpecies()). Either the common or the scientific name can be used.

sceProteins, scePeptides

Character strings pointing to .rds files with SingleCellExperiment objects containing proteins and peptides, respectively, as generated by runPDTMTAnalysis(). File paths will be expressed in canonical form (using normalizePath()) before they are processed.

assayForTests

Character string giving the name of the assay to use for testing.

assayImputation

Character string giving the name of the assay containing information about the imputation status of each observation.

idCol, labelCol

Arguments defining the feature identifiers (row names, should be unique) and feature labels (for plots, can be anything). Each of these arguments can be either a character vector of column names in the input file (after application of make.names), in which case the corresponding feature ID is generated by simply concatenating the values in these columns, or a function with one input argument (a data.frame, corresponding to the annotation columns of the input file), returning a character vector corresponding to the desired feature IDs.
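The two accepted forms can be illustrated on a toy annotation table. This is a minimal base R sketch: the column names are hypothetical, and the "." separator used to emulate the concatenation is an assumption, since the description above does not state which separator is applied.

```r
# Toy annotation table; column names are hypothetical, not from a real export.
annot <- data.frame(
    Gene.Symbol = c("Akt1", "Mapk1", "Akt1"),
    Site = c("S473", "T185", "T308"),
    stringsAsFactors = FALSE
)

# Form 1: a character vector of column names (as they appear after make.names()).
idColNames <- c("Gene.Symbol", "Site")
# Emulate the concatenation; the "." separator is an assumption of this sketch.
idsFromNames <- do.call(paste, c(annot[, idColNames, drop = FALSE], sep = "."))

# Form 2: a function taking the annotation data.frame, returning one ID per row.
idColFun <- function(df) paste0(df$Gene.Symbol, "_", df$Site)
idsFromFun <- idColFun(annot)

# Feature IDs become row names, so they should be unique.
stopifnot(!anyDuplicated(idsFromNames), !anyDuplicated(idsFromFun))
```

The function form is the more flexible of the two, since it allows any combination or transformation of the annotation columns.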

proteinIdColProteins, proteinIdColPeptides

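Specify how protein identifiers are obtained from the rowData of the respective objects; these identifiers are used to match the two objects. Each argument can be a character string naming a rowData column or, as in the defaults shown in the usage, a function that takes a data.frame and returns a character vector of protein identifiers.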
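How the matching might work can be sketched in base R. The helper firstId() below is a hypothetical stand-in for the default, written under the assumption that einprot::getFirstId() keeps the first entry of a separated field, as its name and arguments suggest. The accessions are toy values.

```r
# Toy rowData-like tables; 'einprotProtein' is the column named in the defaults.
rdProteins <- data.frame(einprotProtein = c("P31750;Q9Z1H2", "P63085"),
                         stringsAsFactors = FALSE)
rdPeptides <- data.frame(einprotProtein = c("P63085", "P31750;Q9Z1H2", "Q00000"),
                         stringsAsFactors = FALSE)

# Hypothetical stand-in: keep the first entry of a ";"-separated field.
firstId <- function(df, colName, sep = ";") {
    vapply(strsplit(df[[colName]], sep, fixed = TRUE),
           function(x) x[1], character(1))
}

# Match each peptide to its row in the protein table (NA if no partner exists).
idx <- match(firstId(rdPeptides, "einprotProtein"),
             firstId(rdProteins, "einprotProtein"))
```

In this toy case the third peptide has no matching protein, so its index is NA.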

comparisons

List of character vectors defining comparisons to perform. The first element of each vector represents the denominator of the comparison. If not empty, ctrlGroup and allPairwiseComparisons are ignored.
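The denominator convention can be made concrete with a small sketch. The group names are hypothetical, and each comparison is assumed here to be a vector of two group names. On the log scale, the denominator group is the one that is subtracted.

```r
# Hypothetical group names; the first element of each vector is the denominator.
comparisons <- list(c("Control", "Treated"), c("Control", "Mutant"))

# Toy group means on the log2 scale for a single feature.
groupMeans <- c(Control = 10.0, Treated = 11.5, Mutant = 9.0)

# Log fold change = second group minus first (denominator) group.
logFC <- vapply(comparisons, function(cmp) {
    unname(groupMeans[cmp[2]] - groupMeans[cmp[1]])
}, numeric(1))
names(logFC) <- vapply(comparisons, function(cmp) {
    paste(cmp[2], "vs", cmp[1])
}, character(1))
```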

ctrlGroup

Character vector defining the sample group(s) to use as control group in comparisons.

allPairwiseComparisons

Logical, should all pairwise comparisons be performed?

singleFit

Logical scalar indicating whether a single model fit should be used (and results for pairwise comparisons extracted via contrasts). If FALSE, the data set will be subset to the relevant samples for each comparison. Only applicable if testType is "interaction".

subtractBaseline

Logical scalar, whether to subtract the background/reference value for each feature in each batch before fitting the model. If TRUE, requires that a 'batch' column is available.

baselineGroup

Character scalar representing the reference group. Only used if subtractBaseline is TRUE, in which case the abundance values for a given sample will be adjusted by subtracting the average value across all samples in the baselineGroup from the same batch as the original sample.
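The adjustment described above can be sketched in base R on a toy matrix. The sample, batch and group names are hypothetical, and this only illustrates the arithmetic, not the package's internal code.

```r
# Toy log-abundances: 2 features x 6 samples.
x <- matrix(c(10, 11, 12, 20, 21, 22,
               5,  6,  7,  9, 10, 11),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("pep1", "pep2"), paste0("S", 1:6)))
batch <- c("b1", "b1", "b1", "b2", "b2", "b2")
group <- c("Ref", "A", "B", "Ref", "A", "B")
baselineGroup <- "Ref"

adjusted <- x
for (b in unique(batch)) {
    inBatch <- batch == b
    # Per-feature average over the baseline samples from this batch.
    baseline <- rowMeans(x[, inBatch & group == baselineGroup, drop = FALSE])
    # Subtract the batch-specific baseline from every sample in the batch.
    adjusted[, inBatch] <- x[, inBatch] - baseline
}
```

After the adjustment, the baseline samples are zero in every batch and the remaining values are expressed relative to the baseline of their own batch.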

testType

The testing approach to use, either "interaction" or "welch" (similar to the approach used by MSstatsPTM).
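For orientation, the general idea behind an MSstatsPTM-style adjustment can be sketched as follows: the peptide-level log fold change is corrected by the protein-level log fold change, and the uncertainties of the two estimates are combined. This is a sketch under stated assumptions (independent estimates, Welch-Satterthwaite degrees of freedom) with made-up numbers. It is not taken from the einprot implementation, which may differ in detail.

```r
# Made-up summary statistics for one modified peptide and its parent protein.
lfcPeptide <- 1.8; sePeptide <- 0.30; dfPeptide <- 4
lfcProtein <- 0.5; seProtein <- 0.20; dfProtein <- 6

# Adjusted effect: peptide change beyond what the protein change explains.
lfcAdj <- lfcPeptide - lfcProtein
# Standard errors combine in quadrature (assuming independent estimates).
seAdj <- sqrt(sePeptide^2 + seProtein^2)
# Welch-Satterthwaite approximation of the degrees of freedom.
dfAdj <- (sePeptide^2 + seProtein^2)^2 /
    (sePeptide^4 / dfPeptide + seProtein^4 / dfProtein)

# Two-sided t-test on the adjusted log fold change.
tStat <- lfcAdj / seAdj
pValue <- 2 * pt(abs(tStat), df = dfAdj, lower.tail = FALSE)
```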

minNbrValidValues

Numeric, the minimum number of valid values (must be met in both the protein and peptide objects) for a peptide to be used for statistical testing.
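The requirement that the threshold is met in both objects can be sketched as a simple filter. The sketch assumes that the imputation assay is a logical matrix in which TRUE marks an imputed value, that peptides are already aligned with their parent proteins row by row, and that valid values are counted across all samples. How the package counts them (for example per comparison) is not stated above.

```r
minNbrValidValues <- 2

# TRUE marks an imputed value; rows are peptides, columns are samples.
imputedPeptides <- matrix(c(FALSE, FALSE, TRUE,  TRUE,
                            FALSE, TRUE,  TRUE,  TRUE,
                            FALSE, FALSE, FALSE, FALSE),
                          nrow = 3, byrow = TRUE,
                          dimnames = list(paste0("pep", 1:3), paste0("S", 1:4)))
# Imputation status of each peptide's parent protein, in the same row order.
imputedProteins <- matrix(c(FALSE, FALSE, FALSE, FALSE,
                            FALSE, FALSE, FALSE, FALSE,
                            TRUE,  TRUE,  TRUE,  FALSE),
                          nrow = 3, byrow = TRUE,
                          dimnames = dimnames(imputedPeptides))

# Count observed (non-imputed) values per row in each object.
validPeptides <- rowSums(!imputedPeptides)
validProteins <- rowSums(!imputedProteins)

# A peptide is kept only if the threshold is met in both objects.
keep <- validPeptides >= minNbrValidValues & validProteins >= minNbrValidValues
```

Here only pep1 is retained: pep2 has too few valid peptide-level values and pep3 too few valid protein-level values.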

minlFC

Numeric, minimum log fold change to test against (only used if testType = "interaction").

volcanoAdjPvalThr

Numeric, adjusted p-value threshold to determine which features to highlight in the volcano plots.

volcanoLog2FCThr

Numeric, log2-fold change threshold to determine which features to highlight in the volcano plots.

volcanoMaxFeatures

Numeric, maximum number of significant features to label in the volcano plots.

volcanoLabelSign

Character scalar, either 'both', 'pos', or 'neg', indicating whether to label the most significant features regardless of sign, or only those with positive/negative log-fold changes.

volcanoFeaturesToLabel

Character vector with features to always label in the volcano plots (regardless of significance).
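How the volcano arguments might interact can be sketched on a toy result table. The column names, the inclusive threshold comparisons, and the ranking of label candidates by adjusted p-value are assumptions of this sketch, not documented behaviour.

```r
# Toy test results; column names are hypothetical.
res <- data.frame(
    feature = paste0("pep", 1:6),
    logFC = c(2.1, -1.6, 0.4, 1.2, -3.0, 1.1),
    adjP = c(0.001, 0.010, 0.0005, 0.200, 0.030, 0.040),
    stringsAsFactors = FALSE
)
volcanoAdjPvalThr <- 0.05
volcanoLog2FCThr <- 1
volcanoMaxFeatures <- 2
volcanoLabelSign <- "pos"
volcanoFeaturesToLabel <- "pep3"

# Highlight features passing both thresholds (inclusive bounds are assumed).
res$highlight <- res$adjP <= volcanoAdjPvalThr &
    abs(res$logFC) >= volcanoLog2FCThr

# Restrict label candidates by sign, then keep the most significant ones.
signOk <- switch(volcanoLabelSign,
                 both = rep(TRUE, nrow(res)),
                 pos = res$logFC > 0,
                 neg = res$logFC < 0)
cand <- res[res$highlight & signOk, ]
cand <- cand[order(cand$adjP), ]

# Features requested explicitly are labelled regardless of significance.
toLabel <- union(head(cand$feature, volcanoMaxFeatures), volcanoFeaturesToLabel)
```

In this example pep3 is labelled although it fails the fold change threshold, because it is listed in volcanoFeaturesToLabel.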

addInteractiveVolcanos

Logical scalar indicating whether to add interactive volcano plots to the html report. For experiments with many quantified features or many comparisons, setting this to TRUE can make the html report very large and difficult to interact with.

interactiveDisplayColumns

Character vector (or NULL) indicating which columns to include in the tooltip for the interactive volcano plots. The default shows the feature ID.

interactiveGroupColumn

Character scalar (or NULL, default) indicating the column to group points by in the interactive volcano plot. Hovering over a point will highlight all other points with the same value of this column.

seed

Numeric, random seed to use for any non-deterministic calculations.

linkTableColumns

Character vector with regular expressions that will be matched against the column names of the rowData of the generated SingleCellExperiment object. Matching columns are included in the link table at the end of the report.
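The effect of the patterns can be previewed with base R before running the workflow. The column names below are hypothetical, and the sketch assumes that a column is selected if it matches at least one pattern.

```r
# Hypothetical rowData column names.
rowDataCols <- c("einprotId", "einprotProtein", "Gene.Symbol",
                 "Treated_vs_Control.logFC", "Treated_vs_Control.adj.P.Val")
linkTableColumns <- c("^einprot", "\\.logFC$")

# A column is selected if it matches at least one of the patterns.
matched <- vapply(rowDataCols, function(cn) {
    any(vapply(linkTableColumns, function(p) grepl(p, cn), logical(1)))
}, logical(1))
selected <- rowDataCols[matched]
```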

customYml

Character string providing the path to a custom YAML file that can be used to overwrite default settings in the report. If set to NULL (default), no alterations are made.

doRender

Logical scalar. If FALSE, the Rmd file will be generated (and any parameters injected), but not rendered.

Value

Invisibly, the path to the compiled html report.
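A call might look like the following sketch. All file paths, assay names, rowData column names and group names are assumptions that must be adapted to the objects produced by runPDTMTAnalysis(). The call is guarded so that it only runs if einprot is installed and the input files exist.

```r
# Hypothetical inputs: paths, assay names, columns and groups must be adapted.
args <- list(
    outputDir = file.path(tempdir(), "ptm_report"),
    outputBaseName = "PDTMTptmAnalysis",
    species = "mouse",
    sceProteins = "proteins_sce.rds",        # hypothetical path
    scePeptides = "peptides_sce.rds",        # hypothetical path
    assayForTests = "log2_Abundance_norm",   # assumed assay name
    assayImputation = "imputed_Abundance",   # assumed assay name
    idCol = function(df) df$einprotId,       # assumed column name
    labelCol = function(df) df$einprotLabel, # assumed column name
    comparisons = list(c("Control", "Treated")),
    doRender = FALSE                         # generate the Rmd without rendering
)

# Only attempt the call if the package and both input files are available.
canRun <- requireNamespace("einprot", quietly = TRUE) &&
    file.exists(args$sceProteins) && file.exists(args$scePeptides)
if (canRun) {
    out <- do.call(einprot::runPDTMTptmAnalysis, args)
    print(out)
}
```

Setting doRender = FALSE is convenient for a first pass, since the generated Rmd file can be inspected before the report is compiled.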

Author

Charlotte Soneson