This takes a dataset with an identified outcome column and treatment column, along with any number of covariates, and appends three columns to it. The first is an estimate of the conditional expectation of treatment (.pi_hat). The other two are estimates of the conditional expectations of the control and treatment potential outcome surfaces (.mu0_hat and .mu1_hat, respectively).

produce_plugin_estimates(data, outcome, treatment, ..., .weights = NULL)

Arguments

data

Dataframe, already prepared with attach_config and make_splits.

outcome

Unquoted name of the outcome variable.

treatment

Unquoted name of the treatment variable.

...

Unquoted names of covariates to include in the models of the nuisance functions.

.weights

Unquoted name of the weights column. If NULL, every weight is assumed to equal one and sample-based quantities are returned.
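To make the distinction between sample-based and weighted quantities concrete, here is a minimal base R sketch. It is not taken from the package: the vectors `y` and `w` are made-up values, and how the weights enter the package's estimators is not shown here. It only illustrates that unit weights reproduce the plain sample average, while non-unit weights shift the estimate toward the more heavily weighted units.

```r
# Hypothetical outcome values and weights, for illustration only
y <- c(10, 20, 30)
w <- c(1, 1, 2)

# With all weights equal to one, the weighted mean is the sample mean
sample_mean <- weighted.mean(y, rep(1, length(y)))

# With non-unit weights, the third unit counts twice as much
weighted_avg <- weighted.mean(y, w)

sample_mean
weighted_avg
```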

Details

To see an example analysis, read vignette("experimental_analysis") in the context of an experiment, vignette("observational_analysis") for an observational study, or vignette("methodological_details") for a deeper dive under the hood.
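As a rough illustration of what the three appended columns represent, the following base R sketch fits stand-in nuisance models on simulated data and fills .pi_hat, .mu0_hat and .mu1_hat with out-of-fold predictions. It is a sketch under stated assumptions, not the package's implementation: the data frame `df`, its columns `x`, `a`, `y` and `split`, and the use of `glm` and `lm` in place of the configured learners are all hypothetical choices made for this example.

```r
# Illustrative sketch only: simulated data and base R models stand in
# for the configured nuisance learners.
set.seed(1)
n <- 200
df <- data.frame(x = rnorm(n))
df$a <- rbinom(n, 1, plogis(0.5 * df$x))      # treatment indicator
df$y <- 1 + 2 * df$a + df$x + rnorm(n)        # outcome
df$split <- sample(rep(1:4, length.out = n))  # hypothetical split id

df$.pi_hat  <- NA_real_
df$.mu0_hat <- NA_real_
df$.mu1_hat <- NA_real_

for (k in sort(unique(df$split))) {
  train <- df[df$split != k, ]
  held  <- df$split == k

  # Model of treatment given covariates
  pi_fit <- glm(a ~ x, family = binomial(), data = train)
  # Separate outcome regressions for control and treated units
  mu0_fit <- lm(y ~ x, data = train[train$a == 0, ])
  mu1_fit <- lm(y ~ x, data = train[train$a == 1, ])

  # Predict only for the held-out split
  df$.pi_hat[held]  <- predict(pi_fit, newdata = df[held, ], type = "response")
  df$.mu0_hat[held] <- predict(mu0_fit, newdata = df[held, ])
  df$.mu1_hat[held] <- predict(mu1_fit, newdata = df[held, ])
}

head(df[, c(".pi_hat", ".mu0_hat", ".mu1_hat")])
```

Each row receives predictions from models that never saw that row during fitting, which is the usual motivation for splitting the data before estimating nuisance functions.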

Examples

library("dplyr")
if (require("palmerpenguins")) {
  # Load the penguins data frame explicitly from palmerpenguins
  data("penguins", package = "palmerpenguins")
  # Unit identifier passed to make_splits
  penguins$unitid <- seq_len(nrow(penguins))
  # Simulate a randomized treatment with a known propensity of 0.5
  penguins$propensity <- rep(0.5, nrow(penguins))
  penguins$treatment <- rbinom(nrow(penguins), 1, penguins$propensity)
  cfg <- basic_config() %>%
    add_known_propensity_score("propensity") %>%
    add_outcome_model("SL.glm.interaction") %>%
    remove_vimp()
  attach_config(penguins, cfg) %>%
    make_splits(unitid, .num_splits = 4) %>%
    produce_plugin_estimates(outcome = body_mass_g, treatment = treatment, species, sex) %>%
    construct_pseudo_outcomes(body_mass_g, treatment) %>%
    estimate_QoI(species, sex)
}
#> Dropped 11 of 344 rows (3.2%) through listwise deletion.
#> estimating nuisance models [===================================] splits: 4 / 4
#> Dropped 11 of 344 rows (3.2%) through listwise deletion.
#> Skipping diagnostic on .pseudo_outcome due to lack of model.
#> # A tibble: 11 × 5
#>    estimand       term                   level                estimate std_error
#>    <chr>          <chr>                  <chr>                   <dbl>     <dbl>
#>  1 MSE            body_mass_g            Control Response   109706.      1.04e+4
#>  2 MSE            body_mass_g            Treatment Response  88478.      9.18e+3
#>  3 SL risk        SL.glm.interaction_All Control Response   110073.      3.94e+3
#>  4 SL risk        SL.glm_All             Control Response   113294.      2.10e+3
#>  5 SL risk        SL.glm.interaction_All Treatment Response  89945.      2.05e+3
#>  6 SL risk        SL.glm_All             Treatment Response  92592.      2.19e+3
#>  7 SL coefficient SL.glm.interaction_All Control Response        0.666   9.75e-2
#>  8 SL coefficient SL.glm_All             Control Response        0.334   9.75e-2
#>  9 SL coefficient SL.glm.interaction_All Treatment Response      0.740   8.59e-2
#> 10 SL coefficient SL.glm_All             Treatment Response      0.260   8.59e-2
#> 11 SATE           NA                     NA                    -75.9     3.42e+1