Title: | Estimate Risk Ratios and Risk Differences using Regression |
---|---|
Description: | Risk ratios and risk differences are estimated using regression models that allow for binary, categorical, and continuous exposures and confounders. Implemented are marginal standardization after fitting logistic models (g-computation) with delta-method and bootstrap standard errors, Miettinen's case-duplication approach (Schouten et al. 1993, <doi:10.1002/sim.4780121808>), log-binomial (Poisson) models with empirical variance (Zou 2004, <doi:10.1093/aje/kwh090>), binomial models with starting values from Poisson models (Spiegelman and Hertzmark 2005, <doi:10.1093/aje/kwi188>), and others. |
Authors: | Konrad Stopsack [aut, cre] |
Maintainer: | Konrad Stopsack <[email protected]> |
License: | GPL-3 |
Version: | 0.4.2 |
Built: | 2025-03-12 04:02:16 UTC |
Source: | https://github.com/stopsack/risks |
A cohort of women with breast cancer and complete follow-up, as used by Spiegelman and Hertzmark (Am J Epidemiol 2005) and Greenland (Am J Epidemiol 2004).
breastcancer
breastcancer
breastcancer
A tibble with 192 rows and 3 columns:
Death, binary: 0, 1
Cancer stage, 3 categories
Hormone receptor status, binary: "High", "Low"
...
Newman SC. Biostatistical methods in epidemiology. New York, NY: Wiley, 2001, table 5.3
Estimate confidence intervals for the case duplication model with robust/sandwich/empirical covariance structure.
## S3 method for class 'duplicate' confint(object, parm = NULL, level = 0.95, ...)
## S3 method for class 'duplicate' confint(object, parm = NULL, level = 0.95, ...)
object |
Fitted model |
parm |
Not used |
level |
Confidence level, defaults to 0.95 |
... |
Additional arguments, not used |
Matrix: First column, lower bound; second column, upper bound.
Confidence intervals for models fit using marginal standardization based on parametric bootstrapping.
## S3 method for class 'margstd_boot' confint( object, parm = NULL, level = 0.95, bootrepeats = 1000, bootci = c("bca", "normal", "nonpar"), jacksd = FALSE, ... )
## S3 method for class 'margstd_boot' confint( object, parm = NULL, level = 0.95, bootrepeats = 1000, bootci = c("bca", "normal", "nonpar"), jacksd = FALSE, ... )
object |
Model fitted through marginal standardization |
parm |
Not used, for compatibility |
level |
Confidence level, defaults to 0.95. |
bootrepeats |
Bootstrap repeats. Defaults to 1000. Consider increasing. |
bootci |
Type of bootstrap confidence interval:
|
jacksd |
Return jackknife Monte-Carlo error for the confidence limits?
Only functional with BCa confidence intervals. Defaults to |
... |
Not used |
Matrix: First column, lower bound; second column, upper bound.
Confidence intervals for models fit using marginal standardization based on the melta method.
## S3 method for class 'margstd_delta' confint(object, parm = NULL, level = 0.95, ...)
## S3 method for class 'margstd_delta' confint(object, parm = NULL, level = 0.95, ...)
object |
Model fitted through marginal standardization, delta method |
parm |
Not used, for compatibility |
level |
Confidence level, defaults to |
... |
Not used |
Matrix: First column, lower bound; second column, upper bound.
Estimate confidence intervals for the Poisson model with robust/sandwich/empirical covariance structure.
## S3 method for class 'robpoisson' confint(object, parm = NULL, level = 0.95, ...)
## S3 method for class 'robpoisson' confint(object, parm = NULL, level = 0.95, ...)
object |
Fitted model |
parm |
Not used |
level |
Confidence level, defaults to 0.95 |
... |
Additional arguments, not used |
Matrix: First column, lower bound; second column, upper bound.
Print fitted risks model. The only change, compared to print.glm()
,
is the addition of the main type of model: relative risk or risk difference.
If multiple models were fitted via approach = "all"
, then the
first converged model will be printed.
## S3 method for class 'risks' print(x, ...)
## S3 method for class 'risks' print(x, ...)
x |
Fitted model |
... |
Passed to print.glm() |
No return value, called for printing
Print summaries for "risks" models. The printout is the same as
for regular summaries of generalized linear models fit via
stats::glm()
, except that the type of "risks" model
is printed first (e.g., "Poisson model with robust covariance")
and confidence intervals for model parameters are printed at the end.
## S3 method for class 'summary.risks' print(x, ...)
## S3 method for class 'summary.risks' print(x, ...)
x |
Model |
... |
Passed on |
No return value, called for printing
riskratio
and riskdiff
provide a flexible interface to fitting
risk ratio and risk difference models.
In cohort studies with a binary outcome, risk ratios and risk differences are typically more appropriate to report than odds ratios from logistic regression, yet such models have historically been difficult to implement in standard software.
The risks package selects an efficient way to fit risk ratio or risk difference models successfully, which will converge whenever logistic models converge. Optionally, a specific approach to model fitting can also be requested. Implemented are Poisson models with robust covariance, binomial models, logistic models with case duplication, binomial models aided in convergence by starting values obtained through Poisson models or logistic models with case duplication, binomial models fitted via combinatorial expectation maximization (optionally also with Poisson starting values), and estimates obtained via marginal standardization after logistic regression with bootstrapped or delta method for confidence intervals.
Adjusting for covariates (e.g., confounders) in the model specification
(formula =
) is possible.
riskratio( formula, data, approach = c("auto", "all", "robpoisson", "duplicate", "glm", "glm_startp", "glm_startd", "glm_cem", "glm_cem_startp", "margstd_boot", "margstd_delta", "logistic", "legacy"), variable = NULL, at = NULL, ... ) riskdiff( formula, data, approach = c("auto", "all", "robpoisson", "glm", "glm_startp", "glm_cem", "glm_cem_startp", "margstd_boot", "margstd_delta", "legacy"), variable = NULL, at = NULL, ... )
riskratio( formula, data, approach = c("auto", "all", "robpoisson", "duplicate", "glm", "glm_startp", "glm_startd", "glm_cem", "glm_cem_startp", "margstd_boot", "margstd_delta", "logistic", "legacy"), variable = NULL, at = NULL, ... ) riskdiff( formula, data, approach = c("auto", "all", "robpoisson", "glm", "glm_startp", "glm_cem", "glm_cem_startp", "margstd_boot", "margstd_delta", "legacy"), variable = NULL, at = NULL, ... )
formula |
A formula object of the form |
data |
A |
approach |
Optional: Method for model fitting.
The other options allow for directly selecting a fitting approach, some of which may not converge or yield out-of-range predicted probabilities. See full documentation for details.
|
variable |
Optional: exposure variable to use for marginal
standardization. If |
at |
Optional: Levels of exposure variable |
... |
Optional: Further arguments passed to fitting functions
( |
Fitted model. This object can be passed on to post-processing functions:
summary.risks
: an overview of results
(risks-specific S3 methods: summary.robpoisson
,
summary.margstd_boot
,
summary.margstd_delta
).
tidy.risks
: a tibble of coefficients and confidence
intervals.
Standard post-processing functions can also be used:
coef
: a vector of coefficients.
confint
: a matrix of confidence intervals
(risks-specific S3 methods: confint.robpoisson
,
confint.margstd_boot
,
confint.margstd_delta
).
predict.glm(type = "response")
: fitted values
(predictions).
residuals
: residuals.
If model fitting using all possible approaches was requested via
approach = "all"
, then their results can be retrieved from the
list all_models
in the returned object (e.g.,
fit$all_models[[1]]
, fit$all_models[[2]]
, etc.).
riskratio()
: Fit risk ratio models
riskdiff()
: Fit risk difference models
Wacholder S. Binomial regression in GLIM: Estimating risk ratios
and risk differences. Am J Epidemiol 1986;123:174-184.
(Binomial regression models; approach = "glm"
)
Spiegelman D, Hertzmark E. Easy SAS Calculations for Risk or
Prevalence Ratios and Differences. Am J Epidemiol 2005;162:199-200.
(Binomial models fitted used starting values from Poisson models;
approach = "glm_start"
)
Zou G. A modified Poisson regression approach to prospective
studies with binary data. Am J Epidemiol 2004;159:702-706.
(Poisson model with robust/sandwich standard errors;
approach = "robpoisson"
)
Schouten EG, Dekker JM, Kok FJ, Le Cessie S, Van Houwelingen HC,
Pool J, Vandenbroucke JP. Risk ratio and rate ratio estimation in
case-cohort designs: hypertension and cardiovascular mortality.
Stat Med 1993;12:1733–45; (Logistic model with case duplication and
cluster-robust standard errors, approach = "duplicate"
).
Donoghoe MW, Marschner IC. logbin: An R Package for
Relative Risk Regression Using the Log-Binomial Model.
J Stat Softw 2018;86(9). (Log-binomial models fitted via combinatorial
expectation maximization; riskratio(approach = "glm_cem")
Donoghoe MW, Marschner IC. Stable computational methods
for additive binomial models with application to adjusted risk differences.
Comput Stat Data Anal 2014;80:184-96. (Additive binomial models
fitted via combinatorial expectation maximization;
riskdiff(approach = "glm_cem")
)
Localio AR, Margolis DJ, Berlin JA.
Relative risks and confidence intervals were easily computed
indirectly from multivariable logistic regression.
J Clin Epidemiol 2007;60(9):874-82. (Marginal standardization after fitting
a logistic model; approach = "margstd_boot"
)
data(breastcancer) # Cohort study with binary outcome # See for details: help(breastcancer) # Risk ratio model fit_rr <- riskratio(formula = death ~ stage + receptor, data = breastcancer) fit_rr summary(fit_rr) # Risk difference model fit_rd <- riskdiff(formula = death ~ stage + receptor, data = breastcancer) fit_rd summary(fit_rd)
data(breastcancer) # Cohort study with binary outcome # See for details: help(breastcancer) # Risk ratio model fit_rr <- riskratio(formula = death ~ stage + receptor, data = breastcancer) fit_rr summary(fit_rr) # Risk difference model fit_rd <- riskdiff(formula = death ~ stage + receptor, data = breastcancer) fit_rd summary(fit_rd)
The risks package allows for fitting risk ratio and risk difference models using regression.
riskratio
: Fit risk ratio models.
riskdiff
: Fit risk difference models.
summary.risks
: Summarize fitted model.
tidy.risks
: Tibble (data frame) of parameters,
coefficients, standard errors, confidence intervals.
confint.robpoisson
,
confint.duplicate
,
confint.margstd_boot
,
confint.margstd_delta
: Confidence intervals. (Standard
confidence intervals for generalized linear models are used for other
models.)
https://github.com/stopsack/risks
This function implements the Mantel-Haenszel estimators for risk ratio and risk differences for a binary or categorical exposure and one or more categorical confounder(s). Compare to estimates from regression models.
rr_rd_mantel_haenszel( data, exposure, outcome, confounders, estimand = c("rr", "rd"), conf.level = 0.95 )
rr_rd_mantel_haenszel( data, exposure, outcome, confounders, estimand = c("rr", "rd"), conf.level = 0.95 )
data |
Data set. |
exposure |
Exposure variable. Must be binary or categorical. The first level is treated as unexposed. |
outcome |
Outcome variable. Must be binary. |
confounders |
Optional. Binary or categorical variable(s) to perform
stratification over. Supply more than one variable using
|
estimand |
Optional. |
conf.level |
Optional. Confidence level. Defaults to |
Tibble in tidy
format with
term
the (non-reference) exposure levels
estimate
Risk ratio (on log scale) or risk difference
std.error
, conf.low
, and conf.high
Square-root of M-H
variance estimate, and the corresponding confidence limits (on log scale
for RR)
model
always "mh"
estimand
"rr"
or "rd"
Greenland S, Rothman KJ. Introduction to Stratified Analysis. In: Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. 3rd edn. Lippincott Williams & Wilkins: Philadelphia, PA 2008. Page 275. Risk ratios: formulae 15-18, -20, -22. Risk differences: formulae 15-18, -19, -21.
# Newman SC. Biostatistical methods in epidemiology. New York, NY: # Wiley, 2001, table 5.3 library(tibble) # used to set up example data dat <- tibble( death = c(rep(1, 54), rep(0, 138)), stage = c(rep("Stage I", 7), rep("Stage II", 26), rep("Stage III", 21), rep("Stage I", 60), rep("Stage II", 70), rep("Stage III", 8)), receptor = c(rep("Low", 2), rep("High", 5), rep("Low", 9), rep("High", 17), rep("Low", 12), rep("High", 9), rep("Low", 10), rep("High", 50), rep("Low", 13), rep("High", 57), rep("Low", 2), rep("High", 6))) # Risk difference rr_rd_mantel_haenszel( data = dat, exposure = stage, outcome = death, confounders = receptor, estimand = "rd") # Risk ratio, log scale: result <- rr_rd_mantel_haenszel( data = dat, exposure = stage, outcome = death, confounders = receptor, estimand = "rr") result # Risk ratio, exponentiated: result %>% dplyr::mutate(dplyr::across(.cols = c(estimate, conf.low, conf.high), .fns = exp))
# Newman SC. Biostatistical methods in epidemiology. New York, NY: # Wiley, 2001, table 5.3 library(tibble) # used to set up example data dat <- tibble( death = c(rep(1, 54), rep(0, 138)), stage = c(rep("Stage I", 7), rep("Stage II", 26), rep("Stage III", 21), rep("Stage I", 60), rep("Stage II", 70), rep("Stage III", 8)), receptor = c(rep("Low", 2), rep("High", 5), rep("Low", 9), rep("High", 17), rep("Low", 12), rep("High", 9), rep("Low", 10), rep("High", 50), rep("Low", 13), rep("High", 57), rep("Low", 2), rep("High", 6))) # Risk difference rr_rd_mantel_haenszel( data = dat, exposure = stage, outcome = death, confounders = receptor, estimand = "rd") # Risk ratio, log scale: result <- rr_rd_mantel_haenszel( data = dat, exposure = stage, outcome = death, confounders = receptor, estimand = "rr") result # Risk ratio, exponentiated: result %>% dplyr::mutate(dplyr::across(.cols = c(estimate, conf.low, conf.high), .fns = exp))
Summarize results from fitting a logistic model with case duplication and
cluster-robust covariance.
The output is the same as for a regular summary(glm(...))
,
except for using cluster-robust standard errors.
## S3 method for class 'duplicate' summary( object, dispersion = NULL, correlation = FALSE, symbolic.cor = FALSE, ... )
## S3 method for class 'duplicate' summary( object, dispersion = NULL, correlation = FALSE, symbolic.cor = FALSE, ... )
object |
Model |
dispersion |
Not used |
correlation |
Not used |
symbolic.cor |
Not used |
... |
Other arguments, not used |
Model summary (list)
Summary for models using marginal standardization
## S3 method for class 'margstd_boot' summary( object, dispersion = NULL, correlation = FALSE, symbolic.cor = FALSE, level = 0.95, bootrepeats = 1000, bootci = c("bca", "normal", "nonpar"), ... )
## S3 method for class 'margstd_boot' summary( object, dispersion = NULL, correlation = FALSE, symbolic.cor = FALSE, level = 0.95, bootrepeats = 1000, bootci = c("bca", "normal", "nonpar"), ... )
object |
Model |
dispersion |
Not used |
correlation |
Not used |
symbolic.cor |
Not used |
level |
Confidence level, defaults to |
bootrepeats |
Bootstrap repeats for standard errors. Defaults to 1000. Consider increasing. |
bootci |
Type of bootstrap confidence interval:
|
... |
Not used |
Model summary (list)
Summary for models using marginal standardization with delta method SEs
## S3 method for class 'margstd_delta' summary( object, dispersion = NULL, correlation = FALSE, symbolic.cor = FALSE, level = 0.95, ... )
## S3 method for class 'margstd_delta' summary( object, dispersion = NULL, correlation = FALSE, symbolic.cor = FALSE, level = 0.95, ... )
object |
Model |
dispersion |
Not used |
correlation |
Not used |
symbolic.cor |
Not used |
level |
Confidence level, defaults to |
... |
Not used |
Model summary (list)
Determine type of risks model fitted and generate appropriate summary.
## S3 method for class 'risks' summary(object, conf.int = TRUE, default = TRUE, ...)
## S3 method for class 'risks' summary(object, conf.int = TRUE, default = TRUE, ...)
object |
Fitted model |
conf.int |
Add confidence intervals to printout? Defaults to TRUE. |
default |
Normal confidence intervals via confint.default()?
Default to TRUE. By setting |
... |
Passed on |
If multiple models were fitted (approach = "all"
), then
the first converged model is displayed. Other models can be accessed
via the returned list $all_models
.
Model summary (list)
Summarize results from fitting a Poisson model with
robust/empirical/sandwich covariance.
The output is the same as for a regular summary(glm(...))
,
except for using robust standard errors.
## S3 method for class 'robpoisson' summary( object, dispersion = NULL, correlation = FALSE, symbolic.cor = FALSE, ... )
## S3 method for class 'robpoisson' summary( object, dispersion = NULL, correlation = FALSE, symbolic.cor = FALSE, ... )
object |
Model |
dispersion |
Not used |
correlation |
Not used |
symbolic.cor |
Not used |
... |
Other arguments, not used |
Model summary (list)
Obtain a tibble (data frame) with parameters, coefficients, standard errors, confidence limits, and p-values. A column with the type of model fitted is added.
## S3 method for class 'risks' tidy( x, conf.int = TRUE, conf.level = 0.95, bootrepeats = 1000, bootci = c("bca", "normal", "nonpar"), bootverbose = FALSE, exponentiate = FALSE, default = TRUE, ... )
## S3 method for class 'risks' tidy( x, conf.int = TRUE, conf.level = 0.95, bootrepeats = 1000, bootci = c("bca", "normal", "nonpar"), bootverbose = FALSE, exponentiate = FALSE, default = TRUE, ... )
x |
Model |
conf.int |
Show confidence intervals? |
conf.level |
Optional. Confidence level. Defaults to |
bootrepeats |
Optional. Number of bootstrap repeats.
Applicable to models fitted via marginal standardization and bootstrapping
( |
bootci |
Optional and applicable for
|
bootverbose |
Optional. Add values of |
exponentiate |
Optional. Exponentiate coefficients and confidence
limits? Defaults to FALSE. Setting |
default |
Use default, normality-based confidence intervals?
Defaults to TRUE. With |
... |
Passed on |
If multiple types of models are fitted, tidy()
can be used
to parameters for all models at once, in one tibble. The last
column of the tibble includes the name of the model. See examples.
tibble
# Define example data library(broom) # provides tidy() function dat <- tibble::tibble( death = c(rep(1, 54), rep(0, 138)), stage = c(rep("Stage I", 7), rep("Stage II", 26), rep("Stage III", 21), rep("Stage I", 60), rep("Stage II", 70), rep("Stage III", 8)), receptor = c(rep("Low", 2), rep("High", 5), rep("Low", 9), rep("High", 17), rep("Low", 12), rep("High", 9), rep("Low", 10), rep("High", 50), rep("Low", 13), rep("High", 57), rep("Low", 2), rep("High", 6))) # Fit and tidy the model fit_rr <- riskratio(formula = death ~ stage + receptor, data = dat) tidy(fit_rr) # Marginal standardization, # increase number of bootstrap repeats: fit_rr <- riskratio( formula = death ~ stage + receptor, data = dat, approach = "margstd_boot") tidy(fit_rr, bootrepeats = 2000) # Multiple types of models fitted: fit_rr <- riskratio(formula = death ~ stage + receptor, data = dat, approach = "all") tidy(fit_rr)
# Define example data library(broom) # provides tidy() function dat <- tibble::tibble( death = c(rep(1, 54), rep(0, 138)), stage = c(rep("Stage I", 7), rep("Stage II", 26), rep("Stage III", 21), rep("Stage I", 60), rep("Stage II", 70), rep("Stage III", 8)), receptor = c(rep("Low", 2), rep("High", 5), rep("Low", 9), rep("High", 17), rep("Low", 12), rep("High", 9), rep("Low", 10), rep("High", 50), rep("Low", 13), rep("High", 57), rep("Low", 2), rep("High", 6))) # Fit and tidy the model fit_rr <- riskratio(formula = death ~ stage + receptor, data = dat) tidy(fit_rr) # Marginal standardization, # increase number of bootstrap repeats: fit_rr <- riskratio( formula = death ~ stage + receptor, data = dat, approach = "margstd_boot") tidy(fit_rr, bootrepeats = 2000) # Multiple types of models fitted: fit_rr <- riskratio(formula = death ~ stage + receptor, data = dat, approach = "all") tidy(fit_rr)