An "all-in-one" function that takes a single dependent variable and a vector of explanatory variable names (continuous or categorical) and produces a final table for publication, including summary statistics and univariable and multivariable regression results. The appropriate model is selected on the basis of the dependent variable and whether a random effect is specified.

finalfit.lm method (not called directly)

finalfit.glm method (not called directly)

finalfit.coxph method (not called directly)

finalfit(
  .data,
  dependent = NULL,
  explanatory = NULL,
  explanatory_multi = NULL,
  random_effect = NULL,
  formula = NULL,
  model_args = list(),
  weights = NULL,
  cont_cut = 5,
  column = NULL,
  keep_models = FALSE,
  metrics = FALSE,
  add_dependent_label = TRUE,
  dependent_label_prefix = "Dependent: ",
  dependent_label_suffix = "",
  keep_fit_id = FALSE,
  ...
)

finalfit.lm(
  .data,
  dependent,
  explanatory,
  explanatory_multi = NULL,
  random_effect = NULL,
  model_args = NULL,
  weights = NULL,
  cont_cut = 5,
  column = FALSE,
  keep_models = FALSE,
  metrics = FALSE,
  add_dependent_label = TRUE,
  dependent_label_prefix = "Dependent: ",
  dependent_label_suffix = "",
  keep_fit_id = FALSE,
  ...
)

finalfit.glm(
  .data,
  dependent,
  explanatory,
  explanatory_multi = NULL,
  random_effect = NULL,
  model_args = NULL,
  weights = NULL,
  cont_cut = 5,
  column = FALSE,
  keep_models = FALSE,
  metrics = FALSE,
  add_dependent_label = TRUE,
  dependent_label_prefix = "Dependent: ",
  dependent_label_suffix = "",
  keep_fit_id = FALSE,
  ...
)

finalfit.coxph(
  .data,
  dependent,
  explanatory,
  explanatory_multi = NULL,
  random_effect = NULL,
  model_args = NULL,
  column = TRUE,
  cont_cut = 5,
  keep_models = FALSE,
  metrics = FALSE,
  add_dependent_label = TRUE,
  dependent_label_prefix = "Dependent: ",
  dependent_label_suffix = "",
  keep_fit_id = FALSE,
  ...
)

Arguments

.data

Data frame or tibble.

dependent

Character vector of length 1: quoted name of dependent variable. Can be continuous, a binary factor, or a survival object of form Surv(time, status).

explanatory

Character vector of any length: quoted name(s) of explanatory variables.

explanatory_multi

Character vector of any length: quoted name(s) of a subset of explanatory variables to generate reduced multivariable model (must only contain variables contained in explanatory).
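
The subset requirement can be checked before calling finalfit(). The following is a minimal base R sketch; the variable names are taken from the colon_s examples on this page, and the check itself is an illustration rather than part of the package.

```r
# Variables used for the univariable analyses
explanatory <- c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")

# Reduced set for the multivariable model
explanatory_multi <- c("age.factor", "obstruct.factor")

# Any name listed here is missing from `explanatory`
not_in_explanatory <- setdiff(explanatory_multi, explanatory)

# Stop early if the reduced set is not a subset of the full set
stopifnot(length(not_in_explanatory) == 0)
```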

random_effect

Character vector of length 1, either: (1) the name of the random intercept variable, e.g. "var1" (automatically converted to "(1 | var1)"); or (2) the full lme4 specification, e.g. "(var1 | var2)". Note that parentheses MUST be included in (2) but NOT in (1).
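
To make the two accepted forms concrete, here is a small base R sketch of the conversion described above. The helper as_random_term() is hypothetical and is not a finalfit function; it only illustrates the rule and may differ from how the package does this internally.

```r
# Hypothetical helper (not part of finalfit): wrap a bare grouping-variable
# name as a random-intercept term and leave a full lme4 term untouched.
as_random_term <- function(random_effect) {
  stopifnot(is.character(random_effect), length(random_effect) == 1)
  if (grepl("|", random_effect, fixed = TRUE)) {
    # Form (2): already a full specification, e.g. "(var1 | var2)"
    random_effect
  } else {
    # Form (1): bare variable name, e.g. "var1"
    paste0("(1 | ", random_effect, ")")
  }
}

as_random_term("hospital")
as_random_term("(age | hospital)")
```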

formula

An object of class "formula" (or one that can be coerced to that class). An optional alternative to the standard dependent/explanatory format. Do not supply it together with dependent/explanatory.
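
The relationship between a formula and the dependent/explanatory format can be seen with base R alone. This is a sketch of the correspondence only, using variable names from the colon_s examples; it is not necessarily how finalfit parses a formula internally.

```r
# A formula equivalent to one dependent and three explanatory variables
f <- mort_5yr ~ age.factor + sex.factor + obstruct.factor

# Left-hand side as a quoted name, as expected by `dependent`
dependent <- deparse(f[[2]])

# Right-hand-side terms as a character vector, as expected by `explanatory`
explanatory <- attr(terms(f), "term.labels")

# Going the other way: rebuild a formula from the two character inputs
f_rebuilt <- reformulate(explanatory, response = dependent)
```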

model_args

List. A list of arguments to pass to lm, glm, coxph.

weights

Character vector of length 1: quoted name of weights variable. Passed to summary_factorlist, lm, and glm to provide a weighted summary table and regression (e.g. IPTW). For a weighted regression with a non-weighted summary table, pass the weights argument within model_args instead. Not available with a survival dependent variable.

cont_cut

Numeric: number of unique values in continuous variable at which to consider it a factor.

column

Logical: Compute margins by column rather than row.

keep_models

Logical: include full multivariable model in output when working with reduced multivariable model (explanatory_multi) and/or mixed effect models (random_effect).

metrics

Logical: include useful model metrics in output in publication format.

add_dependent_label

Logical: add the dependent variable label to the top left of the table.

dependent_label_prefix

Add text before dependent label.

dependent_label_suffix

Add text after dependent label.

keep_fit_id

Keep original model output coefficient label (internal).

...

Other arguments to pass to fit2df: estimate_name, digits, confint_type, confint_level, confint_sep.

Value

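Returns a data frame with the final model table. With metrics = TRUE, the examples below show a list instead: the model table first and the model metrics second.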
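
When metrics = TRUE, the printed output in the examples shows two elements, [[1]] and [[2]]. A minimal sketch of pulling them apart follows, assuming the finalfit package and its colon_s data are installed; the object names out, fit_table and fit_metrics are illustrative only.

```r
# Only runs if finalfit is available
if (requireNamespace("finalfit", quietly = TRUE)) {
  library(finalfit)
  data(colon_s, package = "finalfit")

  explanatory <- c("age.factor", "obstruct.factor")
  dependent <- "mort_5yr"

  # With metrics = TRUE the result holds the table and the metrics together
  out <- finalfit(colon_s, dependent, explanatory, metrics = TRUE)

  fit_table <- out[[1]]    # the final model table
  fit_metrics <- out[[2]]  # the model metrics in publication format
}
```

Keeping the two elements in separate objects makes it easier to send the table and the metrics to different places in a report.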

Examples

library(finalfit)
library(dplyr)

# Summary, univariable and multivariable analyses of the form:
# glm(dependent ~ explanatory, family="binomial")
# lmuni(), lmmulti(), lmmixed(), glmuni(), glmmulti(), glmmixed(), glmmultiboot(),
#   coxphuni(), coxphmulti()

data(colon_s) # Modified from survival::colon
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
dependent = 'mort_5yr'
colon_s %>%
  finalfit(dependent, explanatory)
#> Note: dependent includes missing data. These are dropped.
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#>  Dependent: Mortality 5 year                  Alive       Died
#>                          Age   <40 years  31 (46.3)  36 (53.7)
#>                              40-59 years 208 (61.4) 131 (38.6)
#>                                60+ years 272 (53.4) 237 (46.6)
#>                          Sex      Female 243 (55.6) 194 (44.4)
#>                                     Male 268 (56.1) 210 (43.9)
#>                  Obstruction          No 408 (56.7) 312 (43.3)
#>                                      Yes  89 (51.1)  85 (48.9)
#>                  Perforation          No 497 (56.0) 391 (44.0)
#>                                      Yes  14 (51.9)  13 (48.1)
#>           OR (univariable)        OR (multivariable)
#>                          -                         -
#>  0.54 (0.32-0.92, p=0.023) 0.57 (0.34-0.98, p=0.041)
#>  0.75 (0.45-1.25, p=0.270) 0.81 (0.48-1.36, p=0.426)
#>                          -                         -
#>  0.98 (0.76-1.27, p=0.889) 0.98 (0.75-1.28, p=0.902)
#>                          -                         -
#>  1.25 (0.90-1.74, p=0.189) 1.25 (0.90-1.76, p=0.186)
#>                          -                         -
#>  1.18 (0.54-2.55, p=0.672) 1.12 (0.51-2.44, p=0.770)

# Multivariable analysis with subset of explanatory
#   variable set used in univariable analysis
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
explanatory_multi = c("age.factor", "obstruct.factor")
dependent = "mort_5yr"
colon_s %>%
  finalfit(dependent, explanatory, explanatory_multi)
#> Note: dependent includes missing data. These are dropped.
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#>  Dependent: Mortality 5 year                  Alive       Died
#>                          Age   <40 years  31 (46.3)  36 (53.7)
#>                              40-59 years 208 (61.4) 131 (38.6)
#>                                60+ years 272 (53.4) 237 (46.6)
#>                          Sex      Female 243 (55.6) 194 (44.4)
#>                                     Male 268 (56.1) 210 (43.9)
#>                  Obstruction          No 408 (56.7) 312 (43.3)
#>                                      Yes  89 (51.1)  85 (48.9)
#>                  Perforation          No 497 (56.0) 391 (44.0)
#>                                      Yes  14 (51.9)  13 (48.1)
#>           OR (univariable)        OR (multivariable)
#>                          -                         -
#>  0.54 (0.32-0.92, p=0.023) 0.57 (0.34-0.98, p=0.041)
#>  0.75 (0.45-1.25, p=0.270) 0.81 (0.48-1.36, p=0.424)
#>                          -                         -
#>  0.98 (0.76-1.27, p=0.889)                         -
#>                          -                         -
#>  1.25 (0.90-1.74, p=0.189) 1.26 (0.90-1.76, p=0.176)
#>                          -                         -
#>  1.18 (0.54-2.55, p=0.672)                         -

# Summary, univariable and multivariable analyses of the form:
# lme4::glmer(dependent ~ explanatory + (1 | random_effect), family="binomial")

explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
explanatory_multi = c("age.factor", "obstruct.factor")
random_effect = "hospital"
dependent = "mort_5yr"
# colon_s %>%
#   finalfit(dependent, explanatory, explanatory_multi, random_effect)

# Include model metrics:
colon_s %>%
  finalfit(dependent, explanatory, explanatory_multi,  metrics=TRUE)
#> Note: dependent includes missing data. These are dropped.
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Setting levels: control = 0, case = 1
#> Setting direction: controls < cases
#> [[1]]
#>  Dependent: Mortality 5 year                  Alive       Died
#>                          Age   <40 years  31 (46.3)  36 (53.7)
#>                              40-59 years 208 (61.4) 131 (38.6)
#>                                60+ years 272 (53.4) 237 (46.6)
#>                          Sex      Female 243 (55.6) 194 (44.4)
#>                                     Male 268 (56.1) 210 (43.9)
#>                  Obstruction          No 408 (56.7) 312 (43.3)
#>                                      Yes  89 (51.1)  85 (48.9)
#>                  Perforation          No 497 (56.0) 391 (44.0)
#>                                      Yes  14 (51.9)  13 (48.1)
#>           OR (univariable)        OR (multivariable)
#>                          -                         -
#>  0.54 (0.32-0.92, p=0.023) 0.57 (0.34-0.98, p=0.041)
#>  0.75 (0.45-1.25, p=0.270) 0.81 (0.48-1.36, p=0.424)
#>                          -                         -
#>  0.98 (0.76-1.27, p=0.889)                         -
#>                          -                         -
#>  1.25 (0.90-1.74, p=0.189) 1.26 (0.90-1.76, p=0.176)
#>                          -                         -
#>  1.18 (0.54-2.55, p=0.672)                         -
#> 
#> [[2]]
#>                                                                                                                                   
#>  Number in dataframe = 929, Number in model = 894, Missing = 35, AIC = 1226.8, C-statistic = 0.555, H&L = Chi-sq(8) 0.06 (p=1.000)
#> 

# Summary, univariable and multivariable analyses of the form:
# survival::coxph(dependent ~ explanatory)

explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
dependent = "Surv(time, status)"

colon_s %>%
  finalfit(dependent, explanatory)
#>  Dependent: Surv(time, status)                    all          HR (univariable)
#>                            Age   <40 years   70 (7.5)                         -
#>                                40-59 years 344 (37.0) 0.76 (0.53-1.09, p=0.132)
#>                                  60+ years 515 (55.4) 0.93 (0.66-1.31, p=0.668)
#>                            Sex      Female 445 (47.9)                         -
#>                                       Male 484 (52.1) 1.01 (0.84-1.22, p=0.888)
#>                    Obstruction          No 732 (80.6)                         -
#>                                        Yes 176 (19.4) 1.29 (1.03-1.62, p=0.028)
#>                    Perforation          No 902 (97.1)                         -
#>                                        Yes   27 (2.9) 1.17 (0.70-1.95, p=0.556)
#>         HR (multivariable)
#>                          -
#>  0.79 (0.55-1.13, p=0.196)
#>  0.98 (0.69-1.40, p=0.926)
#>                          -
#>  1.02 (0.85-1.23, p=0.812)
#>                          -
#>  1.30 (1.03-1.64, p=0.026)
#>                          -
#>  1.08 (0.64-1.81, p=0.785)

# Rather than going all-in-one, any number of subset models can
# be manually added on to a summary_factorlist() table using finalfit_merge().
# This is particularly useful when models take a long time to run or are complicated.

# Note requirement for fit_id=TRUE.
# `fit2df` is a subfunction extracting most common models to a dataframe.

explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
dependent = 'mort_5yr'
colon_s %>%
  finalfit(dependent, explanatory, metrics=TRUE)
#> Note: dependent includes missing data. These are dropped.
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Setting levels: control = 0, case = 1
#> Setting direction: controls < cases
#> [[1]]
#>  Dependent: Mortality 5 year                  Alive       Died
#>                          Age   <40 years  31 (46.3)  36 (53.7)
#>                              40-59 years 208 (61.4) 131 (38.6)
#>                                60+ years 272 (53.4) 237 (46.6)
#>                          Sex      Female 243 (55.6) 194 (44.4)
#>                                     Male 268 (56.1) 210 (43.9)
#>                  Obstruction          No 408 (56.7) 312 (43.3)
#>                                      Yes  89 (51.1)  85 (48.9)
#>                  Perforation          No 497 (56.0) 391 (44.0)
#>                                      Yes  14 (51.9)  13 (48.1)
#>           OR (univariable)        OR (multivariable)
#>                          -                         -
#>  0.54 (0.32-0.92, p=0.023) 0.57 (0.34-0.98, p=0.041)
#>  0.75 (0.45-1.25, p=0.270) 0.81 (0.48-1.36, p=0.426)
#>                          -                         -
#>  0.98 (0.76-1.27, p=0.889) 0.98 (0.75-1.28, p=0.902)
#>                          -                         -
#>  1.25 (0.90-1.74, p=0.189) 1.25 (0.90-1.76, p=0.186)
#>                          -                         -
#>  1.18 (0.54-2.55, p=0.672) 1.12 (0.51-2.44, p=0.770)
#> 
#> [[2]]
#>                                                                                                                                  
#>  Number in dataframe = 929, Number in model = 894, Missing = 35, AIC = 1230.7, C-statistic = 0.56, H&L = Chi-sq(8) 5.69 (p=0.682)
#> 

explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
explanatory_multi = c("age.factor", "obstruct.factor")
random_effect = "hospital"
dependent = 'mort_5yr'

# Separate tables
colon_s %>%
  summary_factorlist(dependent, explanatory, fit_id=TRUE) -> example.summary
#> Note: dependent includes missing data. These are dropped.

colon_s %>%
  glmuni(dependent, explanatory) %>%
  fit2df(estimate_suffix=" (univariable)") -> example.univariable
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...

colon_s %>%
  glmmulti(dependent, explanatory) %>%
  fit2df(estimate_suffix=" (multivariable)") -> example.multivariable
#> Waiting for profiling to be done...

# Commented out because these are slow to run on CRAN
# colon_s %>%
#   glmmixed(dependent, explanatory, random_effect) %>%
#   fit2df(estimate_suffix=" (multilevel)") -> example.multilevel

# Pipe together
example.summary %>%
  finalfit_merge(example.univariable) %>%
  finalfit_merge(example.multivariable, last_merge = TRUE)
#>         label      levels      Alive       Died          OR (univariable)
#> 3         Age   <40 years   31 (6.1)   36 (8.9)                         -
#> 1             40-59 years 208 (40.7) 131 (32.4) 0.54 (0.32-0.92, p=0.023)
#> 2               60+ years 272 (53.2) 237 (58.7) 0.75 (0.45-1.25, p=0.270)
#> 8         Sex      Female 243 (47.6) 194 (48.0)                         -
#> 9                    Male 268 (52.4) 210 (52.0) 0.98 (0.76-1.27, p=0.889)
#> 4 Obstruction          No 408 (82.1) 312 (78.6)                         -
#> 5                     Yes  89 (17.9)  85 (21.4) 1.25 (0.90-1.74, p=0.189)
#> 6 Perforation          No 497 (97.3) 391 (96.8)                         -
#> 7                     Yes   14 (2.7)   13 (3.2) 1.18 (0.54-2.55, p=0.672)
#>          OR (multivariable)
#> 3                         -
#> 1 0.57 (0.34-0.98, p=0.041)
#> 2 0.81 (0.48-1.36, p=0.426)
#> 8                         -
#> 9 0.98 (0.75-1.28, p=0.902)
#> 4                         -
#> 5 1.25 (0.90-1.76, p=0.186)
#> 6                         -
#> 7 1.12 (0.51-2.44, p=0.770)
# finalfit_merge(example.multilevel)