
Produces a summary table stratified by cluster using gtsummary. Cluster assignments are supplied as a separate vector rather than as a column of data. All additional arguments (...) are passed directly to gtsummary::tbl_summary(), so users can specify all_continuous() / all_categorical() selectors and custom statistics.

Usage

cluster_summary(
  data,
  clusters,
  add_options = list(add_overall = FALSE, add_n = TRUE, add_p = FALSE),
  return_as = c("gtsummary", "gt"),
  include = NULL,
  ...
)

Arguments

data

A data.frame or tibble of features to summarize.

clusters

A vector (factor, character, or numeric) of cluster labels with length equal to nrow(data).
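As an illustration of where such a vector might come from, the sketch below derives cluster labels with stats::kmeans() and checks that they align row-for-row with the data. The choice of k-means, the simulated columns, and the seed are assumptions made for this example only; any clustering method that returns one label per row would serve.

```r
# Illustrative only: k-means is one of many possible sources of cluster labels.
set.seed(1)  # arbitrary seed so the sketch is reproducible

df <- data.frame(
  age = rnorm(100, 60, 10),
  bmi = rnorm(100, 28, 5)
)

# Cluster on standardized features so neither variable dominates the distance
km <- kmeans(scale(df), centers = 3)
cl <- km$cluster

# The labels must line up with the rows of `data`, one label per row
stopifnot(length(cl) == nrow(df))

# Cluster sizes, for a quick sanity check before summarizing
table(cl)
```

If rows were dropped before clustering (for example, because of missing values), the label vector would be shorter than the data and would need to be realigned first.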

add_options

List of post-processing options:

  • add_overall (default FALSE): add an overall column across all clusters

  • add_n (default TRUE): add an N column

  • add_p (default FALSE): add p-values comparing clusters
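The options above presumably correspond to the gtsummary post-processing functions of the same names. The following is a minimal sketch of that mapping, not the package's actual source: the helper name cluster_summary_sketch and the temporary column name cluster are hypothetical, and the order in which the steps are applied is an assumption.

```r
library(gtsummary)

# Hypothetical helper showing one way the three options could map onto
# gtsummary calls. This is a sketch, not the implementation of cluster_summary().
cluster_summary_sketch <- function(data, clusters,
                                   add_overall = FALSE,
                                   add_n = TRUE,
                                   add_p = FALSE,
                                   ...) {
  stopifnot(length(clusters) == nrow(data))

  # Attach the labels as a column (assumes `data` has no column named "cluster")
  data$cluster <- factor(clusters)

  # Stratify by cluster; extra arguments are forwarded to tbl_summary()
  tbl <- tbl_summary(data, by = "cluster", ...)

  # Apply each optional post-processing step only when requested
  if (add_overall) tbl <- add_overall(tbl)
  if (add_n)       tbl <- add_n(tbl)
  if (add_p)       tbl <- add_p(tbl)

  tbl
}

# Small usage example with made-up data
toy <- data.frame(
  score = c(2.1, 3.4, 1.8, 4.0, 3.3, 2.7, 3.9, 1.5, 2.2, 3.1, 4.4, 2.9),
  group = rep(c("a", "b"), 6)
)
tbl <- cluster_summary_sketch(toy, clusters = rep(1:3, each = 4))
```

The result can be converted for rendering with gtsummary::as_gt(tbl), which is the conversion the return_as = "gt" option is documented to perform.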

return_as

"gtsummary" (default) or "gt". When "gt", the function calls gtsummary::as_gt() for rendering.

include

Optional character vector of variables to include. Defaults to all columns in data.

...

Passed to gtsummary::tbl_summary() (e.g., statistic=, type=, digits=, missing=, label=).

Value

A gtsummary tbl_summary object (default), or a gt::gt_tbl object if return_as = "gt".

Examples

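# Run the example only when gtsummary is installed
if (requireNamespace("gtsummary", quietly = TRUE)) {
  # Simulated features; no seed is set, so values differ between runs
  df <- data.frame(
    age = rnorm(100, 60, 10),
    bmi = rnorm(100, 28, 5),
    sex = sample(c("F", "M"), 100, TRUE)
  )
  # Random cluster labels, one per row of df
  cl <- sample(1:3, 100, TRUE)

  # statistic= and missing= are forwarded to gtsummary::tbl_summary()
  cluster_summary(
    data = df,
    clusters = cl,
    statistic = list(
      gtsummary::all_continuous()  ~ "{mean} ({sd})",
      gtsummary::all_categorical() ~ "{n} / {N} ({p}%)"
    ),
    missing = "always"
  )
}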
Characteristic   N     1, N = 31¹      2, N = 34¹      3, N = 35¹
age              100   61 (10)         62 (11)         56 (9)
    Unknown            0               0               0
bmi              100   26.9 (4.4)      28.7 (4.5)      28.3 (5.1)
    Unknown            0               0               0
sex              100
    F                  14 / 31 (45%)   14 / 34 (41%)   18 / 35 (51%)
    M                  17 / 31 (55%)   20 / 34 (59%)   17 / 35 (49%)
    Unknown            0               0               0

¹ Mean (SD); n / N (%)