Final output tables for common regression models

An "all-in-one" function that takes a single dependent variable and a vector of explanatory variable names (continuous or categorical) and produces a final table for publication, including summary statistics. The appropriate model is selected based on the type of the dependent variable and whether a random effect is specified.
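The selection rule described above can be sketched in base R. This is only an illustration under assumptions: `guess_model()` and the `toy` data frame are hypothetical and not part of finalfit, and the type checks the package actually performs may differ. The mapping to `lm`, `glm`, `coxph` and their lme4 mixed-effect counterparts is inferred from the argument descriptions and examples on this page.

```r
# Hypothetical helper (not a finalfit function): names the kind of model
# implied by the dependent variable and the presence of a random effect.
guess_model <- function(.data, dependent, random_effect = NULL) {
  mixed <- !is.null(random_effect)

  # A survival outcome is passed as a quoted string, e.g. "Surv(time, status)".
  # Random effects for survival models are not covered in this sketch.
  if (grepl("^Surv\\(", dependent)) {
    return("coxph")
  }

  y <- .data[[dependent]]
  if (is.numeric(y)) {
    # Continuous outcome: linear model, or linear mixed model
    return(if (mixed) "lmer" else "lm")
  }
  if (is.factor(y) && nlevels(y) == 2) {
    # Binary factor outcome: logistic model, or mixed logistic model
    return(if (mixed) "glmer" else "glm")
  }
  stop("Unsupported dependent variable: ", dependent)
}

# Toy data invented for this illustration only
toy <- data.frame(
  bmi  = c(21.5, 30.1, 27.3, 24.8),
  died = factor(c("No", "Yes", "No", "Yes")),
  site = c("A", "A", "B", "B")
)

guess_model(toy, "bmi")
guess_model(toy, "died")
guess_model(toy, "died", random_effect = "site")
guess_model(toy, "Surv(time, status)")
```

In practice none of this needs to be written by hand: `finalfit()` is called with the quoted dependent variable name and makes the choice itself.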

finalfit.lm method (not called directly)

finalfit.glm method (not called directly)

finalfit.coxph method (not called directly)

finalfit(
  .data,
  dependent = NULL,
  explanatory = NULL,
  explanatory_multi = NULL,
  random_effect = NULL,
  formula = NULL,
  model_args = list(),
  weights = NULL,
  cont_cut = 5,
  column = NULL,
  keep_models = FALSE,
  metrics = FALSE,
  add_dependent_label = TRUE,
  dependent_label_prefix = "Dependent: ",
  dependent_label_suffix = "",
  keep_fit_id = FALSE,
  ...
)

finalfit.lm(
  .data,
  dependent,
  explanatory,
  explanatory_multi = NULL,
  random_effect = NULL,
  model_args = NULL,
  weights = NULL,
  cont_cut = 5,
  column = FALSE,
  keep_models = FALSE,
  metrics = FALSE,
  add_dependent_label = TRUE,
  dependent_label_prefix = "Dependent: ",
  dependent_label_suffix = "",
  keep_fit_id = FALSE,
  ...
)

finalfit.glm(
  .data,
  dependent,
  explanatory,
  explanatory_multi = NULL,
  random_effect = NULL,
  model_args = NULL,
  weights = NULL,
  cont_cut = 5,
  column = FALSE,
  keep_models = FALSE,
  metrics = FALSE,
  add_dependent_label = TRUE,
  dependent_label_prefix = "Dependent: ",
  dependent_label_suffix = "",
  keep_fit_id = FALSE,
  ...
)

finalfit.coxph(
  .data,
  dependent,
  explanatory,
  explanatory_multi = NULL,
  random_effect = NULL,
  model_args = NULL,
  column = TRUE,
  cont_cut = 5,
  keep_models = FALSE,
  metrics = FALSE,
  add_dependent_label = TRUE,
  dependent_label_prefix = "Dependent: ",
  dependent_label_suffix = "",
  keep_fit_id = FALSE,
  ...
)

Arguments

.data: Data frame or tibble.
dependent: Character vector of length 1: quoted name of dependent variable. Can be continuous, a binary factor, or a survival object of form Surv(time, status).
explanatory: Character vector of any length: quoted name(s) of explanatory variables.
explanatory_multi: Character vector of any length: quoted name(s) of a subset of explanatory variables to generate reduced multivariable model (must only contain variables contained in explanatory).
random_effect: Character vector of length 1, either (1) the name of a random intercept variable, e.g. "var1" (automatically converted to "(1 | var1)"); or (2) the full lme4 specification, e.g. "(var1 | var2)". Note that parentheses MUST be included in (2) but NOT in (1).
formula: An object of class "formula" (or one that can be coerced to that class). An optional alternative to the standard dependent/explanatory format. Do not include if using dependent/explanatory.
model_args: List. A list of arguments to pass to lm, glm, coxph.
weights: Character vector of length 1: quoted name of weights variable. Passed to summary_factorlist, lm, and glm to provide a weighted summary table and regression (e.g. IPTW). For a weighted regression with a non-weighted summary table, pass the weights argument within model_args instead. Not available with a survival dependent variable.
cont_cut: Numeric: number of unique values in continuous variable at which to consider it a factor.
column: Logical: Compute margins by column rather than row.
keep_models: Logical: include full multivariable model in output when working with reduced multivariable model (explanatory_multi) and/or mixed effect models (random_effect).
metrics: Logical: include useful model metrics in output in publication format.
add_dependent_label: Logical: add the dependent variable label to the top left of the table.
dependent_label_prefix: Add text before dependent label.
dependent_label_suffix: Add text after dependent label.
keep_fit_id: Keep original model output coefficient label (internal).
...: Other arguments to pass to fit2df: estimate_name, digits, confint_type, confint_level, confint_sep.
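The two accepted forms of `random_effect`, and how the dependent/explanatory vectors relate to a model formula, can be sketched in base R as follows. The helpers `as_random_term()` and `build_formula()` are hypothetical names used for illustration only; they are not finalfit functions, and the package's internal handling may differ.

```r
# Hypothetical helpers (not finalfit functions), for illustration only.

# Turn a random_effect argument into an lme4-style term.
as_random_term <- function(random_effect) {
  # Form (2): a full lme4 term already contains "|" and is used as given
  if (grepl("|", random_effect, fixed = TRUE)) {
    return(random_effect)
  }
  # Form (1): a bare variable name becomes a random intercept term
  paste0("(1 | ", random_effect, ")")
}

# Assemble a formula from quoted variable names.
build_formula <- function(dependent, explanatory, random_effect = NULL) {
  rhs <- explanatory
  if (!is.null(random_effect)) {
    rhs <- c(rhs, as_random_term(random_effect))
  }
  as.formula(paste(dependent, "~", paste(rhs, collapse = " + ")))
}

as_random_term("hospital")          # bare name, form (1)
as_random_term("(age | hospital)")  # full specification, form (2)

f <- build_formula("mort_5yr", c("age.factor", "obstruct.factor"), "hospital")
f
```

The resulting formula has the same shape as the `dependent ~ explanatory + (1 | random_effect)` form quoted for `lme4::glmer` in the examples.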

Value

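Returns a data frame with the final model table. When metrics = TRUE, a list is returned instead: the first element is the final model table and the second holds the model metrics.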

Examples

library(finalfit)
library(dplyr)

# Summary, univariable and multivariable analyses of the form:
# glm(dependent ~ explanatory, family="binomial")
# lmuni(), lmmulti(), lmmixed(), glmuni(), glmmulti(), glmmixed(), glmmultiboot(),
#   coxphuni(), coxphmulti()

data(colon_s) # Modified from survival::colon
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
dependent = 'mort_5yr'
colon_s %>%
  finalfit(dependent, explanatory)
#> Note: dependent includes missing data. These are dropped.
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#>  Dependent: Mortality 5 year                  Alive       Died
#>                          Age   <40 years  31 (46.3)  36 (53.7)
#>                              40-59 years 208 (61.4) 131 (38.6)
#>                                60+ years 272 (53.4) 237 (46.6)
#>                          Sex      Female 243 (55.6) 194 (44.4)
#>                                     Male 268 (56.1) 210 (43.9)
#>                  Obstruction          No 408 (56.7) 312 (43.3)
#>                                      Yes  89 (51.1)  85 (48.9)
#>                  Perforation          No 497 (56.0) 391 (44.0)
#>                                      Yes  14 (51.9)  13 (48.1)
#>           OR (univariable)        OR (multivariable)
#>                          -                         -
#>  0.54 (0.32-0.92, p=0.023) 0.57 (0.34-0.98, p=0.041)
#>  0.75 (0.45-1.25, p=0.270) 0.81 (0.48-1.36, p=0.426)
#>                          -                         -
#>  0.98 (0.76-1.27, p=0.889) 0.98 (0.75-1.28, p=0.902)
#>                          -                         -
#>  1.25 (0.90-1.74, p=0.189) 1.25 (0.90-1.76, p=0.186)
#>                          -                         -
#>  1.18 (0.54-2.55, p=0.672) 1.12 (0.51-2.44, p=0.770)

# Multivariable analysis with subset of explanatory
#   variable set used in univariable analysis
explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
explanatory_multi = c("age.factor", "obstruct.factor")
dependent = "mort_5yr"
colon_s %>%
  finalfit(dependent, explanatory, explanatory_multi)
#> Note: dependent includes missing data. These are dropped.
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#>  Dependent: Mortality 5 year                  Alive       Died
#>                          Age   <40 years  31 (46.3)  36 (53.7)
#>                              40-59 years 208 (61.4) 131 (38.6)
#>                                60+ years 272 (53.4) 237 (46.6)
#>                          Sex      Female 243 (55.6) 194 (44.4)
#>                                     Male 268 (56.1) 210 (43.9)
#>                  Obstruction          No 408 (56.7) 312 (43.3)
#>                                      Yes  89 (51.1)  85 (48.9)
#>                  Perforation          No 497 (56.0) 391 (44.0)
#>                                      Yes  14 (51.9)  13 (48.1)
#>           OR (univariable)        OR (multivariable)
#>                          -                         -
#>  0.54 (0.32-0.92, p=0.023) 0.57 (0.34-0.98, p=0.041)
#>  0.75 (0.45-1.25, p=0.270) 0.81 (0.48-1.36, p=0.424)
#>                          -                         -
#>  0.98 (0.76-1.27, p=0.889)                         -
#>                          -                         -
#>  1.25 (0.90-1.74, p=0.189) 1.26 (0.90-1.76, p=0.176)
#>                          -                         -
#>  1.18 (0.54-2.55, p=0.672)                         -

# Summary, univariable and multivariable analyses of the form:
# lme4::glmer(dependent ~ explanatory + (1 | random_effect), family="binomial")

explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
explanatory_multi = c("age.factor", "obstruct.factor")
random_effect = "hospital"
dependent = "mort_5yr"
# colon_s %>%
#   finalfit(dependent, explanatory, explanatory_multi, random_effect)

# Include model metrics:
colon_s %>%
  finalfit(dependent, explanatory, explanatory_multi, metrics=TRUE)
#> Note: dependent includes missing data. These are dropped.
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Setting levels: control = 0, case = 1
#> Setting direction: controls < cases
#> [[1]]
#>  Dependent: Mortality 5 year                  Alive       Died
#>                          Age   <40 years  31 (46.3)  36 (53.7)
#>                              40-59 years 208 (61.4) 131 (38.6)
#>                                60+ years 272 (53.4) 237 (46.6)
#>                          Sex      Female 243 (55.6) 194 (44.4)
#>                                     Male 268 (56.1) 210 (43.9)
#>                  Obstruction          No 408 (56.7) 312 (43.3)
#>                                      Yes  89 (51.1)  85 (48.9)
#>                  Perforation          No 497 (56.0) 391 (44.0)
#>                                      Yes  14 (51.9)  13 (48.1)
#>           OR (univariable)        OR (multivariable)
#>                          -                         -
#>  0.54 (0.32-0.92, p=0.023) 0.57 (0.34-0.98, p=0.041)
#>  0.75 (0.45-1.25, p=0.270) 0.81 (0.48-1.36, p=0.424)
#>                          -                         -
#>  0.98 (0.76-1.27, p=0.889)                         -
#>                          -                         -
#>  1.25 (0.90-1.74, p=0.189) 1.26 (0.90-1.76, p=0.176)
#>                          -                         -
#>  1.18 (0.54-2.55, p=0.672)                         -
#> 
#> [[2]]
#>                                                                                                                                   
#>  Number in dataframe = 929, Number in model = 894, Missing = 35, AIC = 1226.8, C-statistic = 0.555, H&L = Chi-sq(8) 0.06 (p=1.000)
#> 

# Summary, univariable and multivariable analyses of the form:
# survival::coxph(dependent ~ explanatory)

explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
dependent = "Surv(time, status)"

colon_s %>%
  finalfit(dependent, explanatory)
#>  Dependent: Surv(time, status)                    all          HR (univariable)
#>                            Age   <40 years   70 (7.5)                         -
#>                                40-59 years 344 (37.0) 0.76 (0.53-1.09, p=0.132)
#>                                  60+ years 515 (55.4) 0.93 (0.66-1.31, p=0.668)
#>                            Sex      Female 445 (47.9)                         -
#>                                       Male 484 (52.1) 1.01 (0.84-1.22, p=0.888)
#>                    Obstruction          No 732 (80.6)                         -
#>                                        Yes 176 (19.4) 1.29 (1.03-1.62, p=0.028)
#>                    Perforation          No 902 (97.1)                         -
#>                                        Yes   27 (2.9) 1.17 (0.70-1.95, p=0.556)
#>         HR (multivariable)
#>                          -
#>  0.79 (0.55-1.13, p=0.196)
#>  0.98 (0.69-1.40, p=0.926)
#>                          -
#>  1.02 (0.85-1.23, p=0.812)
#>                          -
#>  1.30 (1.03-1.64, p=0.026)
#>                          -
#>  1.08 (0.64-1.81, p=0.785)

# Rather than going all-in-one, any number of subset models can
# be manually added on to a summary_factorlist() table using finalfit_merge().
# This is particularly useful when models take a long time to run or are complicated.

# Note requirement for fit_id=TRUE.
# `fit2df` is a subfunction extracting most common models to a dataframe.

explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
dependent = 'mort_5yr'
colon_s %>%
  finalfit(dependent, explanatory, metrics=TRUE)
#> Note: dependent includes missing data. These are dropped.
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Setting levels: control = 0, case = 1
#> Setting direction: controls < cases
#> [[1]]
#>  Dependent: Mortality 5 year                  Alive       Died
#>                          Age   <40 years  31 (46.3)  36 (53.7)
#>                              40-59 years 208 (61.4) 131 (38.6)
#>                                60+ years 272 (53.4) 237 (46.6)
#>                          Sex      Female 243 (55.6) 194 (44.4)
#>                                     Male 268 (56.1) 210 (43.9)
#>                  Obstruction          No 408 (56.7) 312 (43.3)
#>                                      Yes  89 (51.1)  85 (48.9)
#>                  Perforation          No 497 (56.0) 391 (44.0)
#>                                      Yes  14 (51.9)  13 (48.1)
#>           OR (univariable)        OR (multivariable)
#>                          -                         -
#>  0.54 (0.32-0.92, p=0.023) 0.57 (0.34-0.98, p=0.041)
#>  0.75 (0.45-1.25, p=0.270) 0.81 (0.48-1.36, p=0.426)
#>                          -                         -
#>  0.98 (0.76-1.27, p=0.889) 0.98 (0.75-1.28, p=0.902)
#>                          -                         -
#>  1.25 (0.90-1.74, p=0.189) 1.25 (0.90-1.76, p=0.186)
#>                          -                         -
#>  1.18 (0.54-2.55, p=0.672) 1.12 (0.51-2.44, p=0.770)
#> 
#> [[2]]
#>                                                                                                                                  
#>  Number in dataframe = 929, Number in model = 894, Missing = 35, AIC = 1230.7, C-statistic = 0.56, H&L = Chi-sq(8) 5.69 (p=0.682)
#> 

explanatory = c("age.factor", "sex.factor", "obstruct.factor", "perfor.factor")
explanatory_multi = c("age.factor", "obstruct.factor")
random_effect = "hospital"
dependent = 'mort_5yr'

# Separate tables
colon_s %>%
  summary_factorlist(dependent, explanatory, fit_id=TRUE) -> example.summary
#> Note: dependent includes missing data. These are dropped.

colon_s %>%
  glmuni(dependent, explanatory) %>%
  fit2df(estimate_suffix=" (univariable)") -> example.univariable
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...

colon_s %>%
  glmmulti(dependent, explanatory) %>%
  fit2df(estimate_suffix=" (multivariable)") -> example.multivariable
#> Waiting for profiling to be done...

# Commented out because these are slow to run on CRAN
# colon_s %>%
#   glmmixed(dependent, explanatory, random_effect) %>%
#   fit2df(estimate_suffix=" (multilevel)") -> example.multilevel

# Pipe together
example.summary %>%
  finalfit_merge(example.univariable) %>%
  finalfit_merge(example.multivariable, last_merge = TRUE)
#>         label      levels      Alive       Died          OR (univariable)
#> 3         Age   <40 years   31 (6.1)   36 (8.9)                         -
#> 1             40-59 years 208 (40.7) 131 (32.4) 0.54 (0.32-0.92, p=0.023)
#> 2               60+ years 272 (53.2) 237 (58.7) 0.75 (0.45-1.25, p=0.270)
#> 8         Sex      Female 243 (47.6) 194 (48.0)                         -
#> 9                    Male 268 (52.4) 210 (52.0) 0.98 (0.76-1.27, p=0.889)
#> 4 Obstruction          No 408 (82.1) 312 (78.6)                         -
#> 5                     Yes  89 (17.9)  85 (21.4) 1.25 (0.90-1.74, p=0.189)
#> 6 Perforation          No 497 (97.3) 391 (96.8)                         -
#> 7                     Yes   14 (2.7)   13 (3.2) 1.18 (0.54-2.55, p=0.672)
#>          OR (multivariable)
#> 3                         -
#> 1 0.57 (0.34-0.98, p=0.041)
#> 2 0.81 (0.48-1.36, p=0.426)
#> 8                         -
#> 9 0.98 (0.75-1.28, p=0.902)
#> 4                         -
#> 5 1.25 (0.90-1.76, p=0.186)
#> 6                         -
#> 7 1.12 (0.51-2.44, p=0.770)
# finalfit_merge(example.multilevel)