Combine values from multiple columns of a data.frame into a single vector, typically used as an identifier to represent or label features.

combineIds(
  df,
  combineCols,
  combineWhen = "nonunique",
  splitSeparator = ";",
  joinSeparator = ".",
  makeUnique = TRUE
)

Arguments

df

A data.frame.

combineCols

Character vector giving the names of the columns of df that should be combined.

combineWhen

Character scalar indicating when to combine columns. Must be one of "always" (always combine the columns), "nonunique" (combine the columns only when necessary to obtain unique names), or "missing" (use a subsequent column only where all previous columns have missing values in that position).

splitSeparator

Character scalar, character vector of length equal to the length of combineCols, or NULL. If not NULL, indicates the separator by which to split the entries in the corresponding column before combining columns.

joinSeparator

Character scalar giving the separator to use when combining columns.

makeUnique

Logical scalar, indicating whether or not the feature IDs should be guaranteed to be unique.
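
As a rough illustration of what splitSeparator and makeUnique refer to, the sketch below uses base R only. It is not the package's implementation: firstPiece is a hypothetical helper, and keeping just the first piece after splitting is an assumption based on the examples on this page (where "A;B" contributes "A"). The "A", "A.1" pattern in those examples is consistent with base R's make.unique(), but this page does not state which mechanism is used.

```r
# Hypothetical helper (not part of the package): split each entry on `sep`
# and keep only the first piece; NA entries stay NA.
firstPiece <- function(v, sep = ";") {
    pieces <- strsplit(as.character(v), sep, fixed = TRUE)
    vapply(pieces, function(p) p[1], character(1))
}

x <- c("A;B", NA, "A", "D", NA)
xSplit <- firstPiece(x)
print(xSplit)
#> [1] "A" NA  "A" "D" NA

# Base R's make.unique() appends ".1", ".2", ... to repeated values.
ids <- make.unique(c("A", "b", "A", "D", "5"))
print(ids)
#> [1] "A"   "b"   "A.1" "D"   "5"
```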

Value

A vector (character, in the examples below) with one value per row of df, obtained by combining the indicated columns.

Author

Charlotte Soneson

Examples

combineIds(data.frame(x = c("A;B", NA, "A", "D", NA),
                      y = c(1, NA, 3, 4, 5),
                      z = c("a", "b", "c;d", "e", "f")),
           combineCols = c("x", "y", "z"),
           combineWhen = "nonunique",
           splitSeparator = ";")
#> [1] "A.1" "NA"  "A.3" "D"   "5"  

combineIds(data.frame(x = c("A;B", NA, "A", "D", NA),
                      y = c(1, NA, 3, 4, 5),
                      z = c("a", "b", "c;d", "e", "f")),
           combineCols = c("x", "y", "z"),
           combineWhen = "missing",
           splitSeparator = ";")
#> [1] "A"   "b"   "A.1" "D"   "5"  

combineIds(data.frame(x = c("A;B", NA, "A", "D", NA),
                      y = c(1, NA, 3, 4, 5),
                      z = c("a", "b", "c;d", "e", "f")),
           combineCols = c("x", "y", "z"),
           combineWhen = "always",
           splitSeparator = ";")
#> [1] "A.1.a"   "NA.NA.b" "A.3.c"   "D.4.e"   "NA.5.f"
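
To make the "missing" and "always" modes more concrete, the following base R sketch reproduces the outputs of the last two examples without calling combineIds. It is an approximation under stated assumptions, not the function's actual code: only the text before the first ";" is kept, the "missing" mode is modelled as taking the first non-missing value per row followed by make.unique(), and the "always" mode is modelled as a plain paste() with "." as separator. The "nonunique" mode is not sketched here.

```r
df <- data.frame(x = c("A;B", NA, "A", "D", NA),
                 y = c(1, NA, 3, 4, 5),
                 z = c("a", "b", "c;d", "e", "f"))

# Assumption: keep only the text before the first ";" in each entry.
cols <- lapply(df[c("x", "y", "z")],
               function(v) sub(";.*", "", as.character(v)))

# "missing"-like behaviour: per row, take the first non-NA value across
# the columns (in order), then make the result unique.
coalesced <- Reduce(function(a, b) ifelse(is.na(a), b, a), cols)
idsMissing <- make.unique(coalesced)
print(idsMissing)
#> [1] "A"   "b"   "A.1" "D"   "5"

# "always"-like behaviour: paste all columns together; NA becomes "NA".
idsAlways <- do.call(paste, c(unname(cols), sep = "."))
print(idsAlways)
#> [1] "A.1.a"   "NA.NA.b" "A.3.c"   "D.4.e"   "NA.5.f"
```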