This takes a dataset with an identified outcome and treatment column, along
with any number of covariates, and appends three columns to the dataset: an
estimate of the conditional expectation of treatment (.pi_hat) and estimates of
the conditional expectations of the control and treatment potential outcome
surfaces (.mu0_hat and .mu1_hat, respectively).
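To make the three appended columns concrete, here is a minimal base R sketch on simulated data. It is not the package's implementation: plain glm() and lm() fits stand in for whatever nuisance models are configured, no sample splitting is done, and the data frame d and its columns x, treatment and outcome are hypothetical names introduced only for illustration.

```r
# Illustrative only: simple parametric fits stand in for the configured
# nuisance models, and there is no cross-fitting across splits.
set.seed(1)
n <- 500
x <- rnorm(n)
treatment <- rbinom(n, 1, plogis(0.5 * x))         # treatment depends on x
outcome <- 1 + 2 * x + 1.5 * treatment + rnorm(n)  # true effect is 1.5
d <- data.frame(x, treatment, outcome)

# Conditional expectation of treatment given covariates (propensity score)
pi_fit <- glm(treatment ~ x, family = binomial(), data = d)
d$.pi_hat <- predict(pi_fit, newdata = d, type = "response")

# Outcome surfaces: each model is fit on one arm, then predicted for all units
mu0_fit <- lm(outcome ~ x, data = d[d$treatment == 0, ])
mu1_fit <- lm(outcome ~ x, data = d[d$treatment == 1, ])
d$.mu0_hat <- predict(mu0_fit, newdata = d)
d$.mu1_hat <- predict(mu1_fit, newdata = d)

head(d)
```

Every row ends up with both a predicted control outcome and a predicted treated outcome, regardless of which arm the unit was actually in, which is what lets later steps compare the two surfaces unit by unit.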
produce_plugin_estimates(data, outcome, treatment, ..., .weights = NULL)
data: dataframe (already prepared with attach_config and make_splits).
outcome: Unquoted name of the outcome variable.
treatment: Unquoted name of the treatment variable.
...: Unquoted names of covariates to include in the models of the nuisance functions.
.weights: Unquoted name of the weights column. If NULL, the analysis assumes all weights are equal to one and returns sample-based quantities.
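As a small illustration of the weights description above, the following base R sketch shows that averaging with unit weights reproduces the plain sample average, while non-uniform weights shift the target. The vectors est, w_unit and w_survey are hypothetical and are not produced by the package.

```r
set.seed(2)
est <- rnorm(6)                    # hypothetical per-unit estimates
w_unit <- rep(1, length(est))      # the assumption described for .weights = NULL
w_survey <- c(1, 1, 2, 2, 4, 4)    # hypothetical sampling weights

sample_avg <- weighted.mean(est, w_unit)      # sample-based quantity
weighted_avg <- weighted.mean(est, w_survey)  # weighted (population-style) quantity

# With unit weights the weighted mean is the ordinary sample mean
all.equal(sample_avg, mean(est))
```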
To see an example analysis, read vignette("experimental_analysis") in the
context of an experiment, vignette("observational_analysis") for an
observational study, or vignette("methodological_details") for a deeper dive
under the hood.
library("dplyr")
if (require("palmerpenguins")) {
  data("penguins", package = "palmerpenguins")
  # Add a unit identifier and a known 50/50 treatment assignment
  penguins$unitid <- seq_len(nrow(penguins))
  penguins$propensity <- rep(0.5, nrow(penguins))
  penguins$treatment <- rbinom(nrow(penguins), 1, penguins$propensity)
  # Use the known propensity score and add SL.glm.interaction as an outcome model
  cfg <- basic_config() %>%
    add_known_propensity_score("propensity") %>%
    add_outcome_model("SL.glm.interaction") %>%
    remove_vimp()
  # Attach the config, split by unit, then estimate nuisance functions and QoIs
  attach_config(penguins, cfg) %>%
    make_splits(unitid, .num_splits = 4) %>%
    produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>%
    construct_pseudo_outcomes(body_mass_g, treatment) %>%
    estimate_QoI(species, sex)
}
#> Dropped 11 of 344 rows (3.2%) through listwise deletion.
#>
#> estimating nuisance models [-----------------------------------] splits: 0 / 4
#>
#> estimating nuisance models [========>--------------------------] splits: 1 / 4
#>
#> estimating nuisance models [=================>-----------------] splits: 2 / 4
#>
#> estimating nuisance models [=========================>---------] splits: 3 / 4
#>
#> estimating nuisance models [===================================] splits: 4 / 4
#>
#>
#> Dropped 11 of 344 rows (3.2%) through listwise deletion.
#> Skipping diagnostic on .pseudo_outcome due to lack of model.
#> # A tibble: 11 × 5
#>    estimand       term                   level              estimate std_error
#>    <chr>          <chr>                  <chr>                 <dbl>     <dbl>
#>  1 MSE            body_mass_g            Control Response    1.08e+5   1.02e+4
#>  2 MSE            body_mass_g            Treatment Response  9.41e+4   1.11e+4
#>  3 SL risk        SL.glm.interaction_All Control Response    1.11e+5   1.98e+3
#>  4 SL risk        SL.glm_All             Control Response    1.21e+5   3.56e+3
#>  5 SL risk        SL.glm.interaction_All Treatment Response  9.72e+4   4.66e+3
#>  6 SL risk        SL.glm_All             Treatment Response  9.21e+4   1.69e+3
#>  7 SL coefficient SL.glm.interaction_All Control Response    9.54e-1   3.90e-2
#>  8 SL coefficient SL.glm_All             Control Response    4.61e-2   3.90e-2
#>  9 SL coefficient SL.glm.interaction_All Treatment Response  2.04e-1   2.04e-1
#> 10 SL coefficient SL.glm_All             Treatment Response  7.96e-1   2.04e-1
#> 11 SATE           NA                     NA                  2.55e+1   3.42e+1