Given an R data.frame or matrix with missing values, clusters the samples on their pattern of missingness and returns the cluster labels plus a silhouette score.
Usage
cluster_on_missing(
data,
cols_ignore = NULL,
n_clusters = NULL,
seed = 42,
k_neighbors = NULL,
leiden_resolution = 0.25,
leiden_objective = "CPM",
use_snn = TRUE
)
Arguments
- data
A data.frame or matrix (samples × features); may contain NA.
- cols_ignore
Character vector of column names to ignore when clustering.
- n_clusters
Integer; if provided, runs KMeans with this many clusters. If NULL, uses Leiden.
- seed
Integer; random seed for KMeans (or reproducibility in Leiden).
- k_neighbors
Integer; minimum cluster size for Leiden. If NULL, defaults to nrow(data) %/% 25.
- leiden_resolution
Numeric; resolution parameter for Leiden clustering.
- leiden_objective
Character; objective function for Leiden clustering. Defaults to "CPM".
- use_snn
Logical; whether to use a shared nearest neighbor (SNN) graph for Leiden clustering. Defaults to TRUE.
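
The idea behind clustering on missingness can be sketched in base R. This is a minimal illustration under stated assumptions, not the function's actual implementation: the toy data frame `df`, the indicator matrix `miss`, and the direct call to `stats::kmeans` are all hypothetical stand-ins for what happens when `n_clusters` is supplied. The last line reproduces the documented default for `k_neighbors`.

```r
# Toy data (hypothetical): two groups of samples with different missingness patterns.
set.seed(42)
df <- data.frame(
  id = 1:100,
  a  = rnorm(100),
  b  = rnorm(100),
  c  = rnorm(100)
)
df$a[1:50]   <- NA  # first half of the samples is missing `a`
df$c[51:100] <- NA  # second half of the samples is missing `c`

# Drop ignored columns, then encode missingness as a numeric 0/1 indicator matrix.
cols_ignore <- "id"
keep <- setdiff(colnames(df), cols_ignore)
miss <- is.na(df[, keep, drop = FALSE]) * 1

# Assumption: k-means on the indicator rows, mirroring the n_clusters path.
km <- kmeans(miss, centers = 2, nstart = 5)

# Cross-tabulate the cluster labels against the known missingness groups.
print(table(km$cluster, rep(c("missing a", "missing c"), each = 50)))

# Documented default for k_neighbors when it is NULL:
# integer division of the number of rows by 25 (100 %/% 25 is 4).
k_neighbors_default <- nrow(df) %/% 25
```

Only the NA positions drive the grouping here, so the observed values in `a`, `b`, and `c` have no effect on the labels. Columns listed in `cols_ignore` are removed before the indicator matrix is built.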