Perform geometric sketching with the geosketch Python package.
geosketch(
mat,
N,
replace = FALSE,
k = "auto",
alpha = 0.1,
seed = NULL,
max_iter = 200,
one_indexed = TRUE,
verbose = FALSE
)
m x n matrix. Samples (the dimension along which to subsample) should be in the rows, features in the columns.
Numeric scalar, the number of samples to retain.
Logical scalar, whether to sample with replacement.
Numeric scalar or "auto", specifying the number of covering boxes. If k = "auto" (the default), it is set to sqrt(nrow(mat)) for replace = TRUE and to N for replace = FALSE.
Numeric scalar defining the acceptable interval around k. Binary search halts when it obtains between k * (1 - alpha) and k * (1 + alpha) covering boxes.
Numeric scalar or NULL (default). If not NULL, it will be converted to an integer and passed to NumPy to seed the random number generator.
Numeric scalar giving the maximum number of iterations after which to terminate the binary search in rare cases of non-monotonicity of covering boxes.
Logical scalar, whether to return one-indexed indices.
Logical scalar, whether to print logging output while running.
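The documented defaults for k and the stopping interval controlled by alpha can be worked through in base R. This is only an arithmetic illustration of the descriptions above, not the package's internal code; `n_samples` and the helper `accept_range()` are hypothetical names introduced for this sketch.

```r
# Illustration only: reproduces the documented defaults in base R.
n_samples <- 10000   # hypothetical stand-in for nrow(mat)
N <- 500             # number of samples to retain
alpha <- 0.1         # default tolerance around k

# Default k as documented for k = "auto"
k_with_replacement <- sqrt(n_samples)   # used when replace = TRUE
k_without_replacement <- N              # used when replace = FALSE

# Hypothetical helper: the range of covering-box counts at which
# the binary search is documented to halt.
accept_range <- function(k, alpha) {
  c(lower = k * (1 - alpha), upper = k * (1 + alpha))
}

accept_range(k_with_replacement, alpha)
accept_range(k_without_replacement, alpha)
```

A larger alpha widens the accepted range, so the search can stop earlier at the cost of a box count further from k.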
A numeric vector with indices to retain.
The first time this function is run, it will create a conda environment containing the geosketch package. This is done via the basilisk R/Bioconductor package; see the documentation for that package for troubleshooting.
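A minimal usage sketch follows, assuming the function is exported by the sketchR package (the package name is not stated above, so treat it as an assumption). The matrix is simulated example data. Because one_indexed defaults to TRUE, the returned indices can be used directly to subset rows in R.

```r
# Hypothetical example data: 1000 samples (rows) x 20 features (columns)
set.seed(42)
mat <- matrix(rnorm(1000 * 20), nrow = 1000, ncol = 20)

# Assumption: geosketch() is provided by the sketchR package.
# The call is skipped if that package is not installed.
if (requireNamespace("sketchR", quietly = TRUE)) {
  # Retain 100 samples; the seed is passed on for reproducibility
  idx <- sketchR::geosketch(mat, N = 100, seed = 123)

  # Subset the rows (samples) of the original matrix
  sketch <- mat[idx, , drop = FALSE]
  dim(sketch)
}
```

Note that the first call may take noticeably longer than later ones, since the conda environment is set up at that point.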
Hie et al. (2019): Geometric sketching compactly summarizes the single-cell transcriptomic landscape. Cell Systems 8, 483–493.