The same simulated biomarker measurements as in df_missing, but with no missing values.
Useful as a ground truth for evaluating imputation methods.
Format
A tibble with 8,000 rows and 31 variables containing the full simulated data:
- index
Integer. Row identifier imported from data_raw/df_complete.csv.
- Age, Salary, ZipCode10001-ZipCode30003
Demographic columns.
- Y11, ..., Y55
Simulated biomarker columns.
Examples
data(df_complete)
head(df_complete)
#> # A tibble: 6 × 31
#> index Age Salary ZipCode10001 ZipCode20002 ZipCode30003 Y11 Y12 Y13
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 0 11.0 6.37 0 1 0 -4.05 -27.4 -19.1
#> 2 1 9.73 5.91 1 0 0 0.546 -19.6 -12.2
#> 3 2 11.4 6.64 0 1 0 -6.25 -28.3 -20.4
#> 4 3 13.6 5.90 0 0 1 -10.6 -31.8 -24.7
#> 5 4 9.54 6.13 1 0 0 0.358 -16.5 -11.3
#> 6 5 9.54 6.39 1 0 0 4.76 -19.0 -12.3
#> # ℹ 22 more variables: Y14 <dbl>, Y15 <dbl>, Y21 <dbl>, Y22 <dbl>, Y23 <dbl>,
#> # Y24 <dbl>, Y25 <dbl>, Y31 <dbl>, Y32 <dbl>, Y33 <dbl>, Y34 <dbl>,
#> # Y35 <dbl>, Y41 <dbl>, Y42 <dbl>, Y43 <dbl>, Y44 <dbl>, Y45 <dbl>,
#> # Y51 <dbl>, Y52 <dbl>, Y53 <dbl>, Y54 <dbl>, Y55 <dbl>
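
The description mentions using the complete data as a ground truth for evaluating imputation methods. A minimal sketch of that idea follows. It uses a small toy data frame in place of df_complete and df_missing, so the names truth, observed, mask, and imputed are hypothetical, and column-mean imputation is only a placeholder for whatever method is being evaluated.

```r
# Toy stand-ins (assumption): `truth` plays the role of df_complete,
# `observed` plays the role of df_missing.
set.seed(1)
truth <- data.frame(Y11 = rnorm(100), Y12 = rnorm(100))

# Knock out roughly 20% of the cells at random to mimic missingness.
mask <- matrix(runif(nrow(truth) * ncol(truth)) < 0.2, nrow = nrow(truth))
observed <- truth
observed[mask] <- NA

# Placeholder imputation method: fill each column with its observed mean.
imputed <- observed
for (col in names(imputed)) {
  missing_rows <- is.na(imputed[[col]])
  imputed[[col]][missing_rows] <- mean(imputed[[col]], na.rm = TRUE)
}

# Score the imputation on the masked cells only, against the ground truth.
rmse <- sqrt(mean((as.matrix(imputed)[mask] - as.matrix(truth)[mask])^2))
rmse
```

With the packaged data, truth would be the biomarker columns of df_complete and observed the matching columns of df_missing, assuming both datasets share the same row order and column names.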