Propensity-score weighting is based on a sort of Horvitz-Thompson estimator.
Dividing by the probability of sampling means that we give more weight to units with low inclusion probabilities.
In our case, we can imagine having a sample of units (each with \(Y_0\) and \(Y_1\)). We then randomly assign them to treatment.
This is equivalent to randomly sampling potential outcomes.
So if we believe that treatment (i.e., sampling) probabilities are assigned according to some covariates, then we just need to know what those probabilities are.
Call the propensity score \(e(X)\). Then \(e(X)\) tells us the probability of sampling \(Y_1\) (treating our sample as the population, because we're interested in a SATE).
This suggests that we can estimate \(E[Y_1]\) with \(\frac{1}{N} \sum_{i:D_i=1} \frac{Y_i}{e(X_i)}\), summing over the \(n_1\) treated units and dividing by the total sample size \(N\).
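To make this concrete, here is a minimal sketch in R with simulated data (the variable names and data-generating process are illustrative, not from the original notes), using the true propensity score to weight the treated outcomes:

```r
set.seed(1)
n  <- 1000
x  <- rnorm(n)
e  <- plogis(0.5 * x)            # true propensity score
d  <- rbinom(n, 1, e)            # treatment assignment depends on x
y1 <- 1 + x + rnorm(n)           # potential outcome under treatment
y  <- ifelse(d == 1, y1, NA)     # we only observe Y1 for treated units

# Horvitz-Thompson / inverse-probability estimate of E[Y1]
sum(y[d == 1] / e[d == 1]) / n

# Compare with the raw treated-group mean, which is biased upward
# because treated units tend to have larger x
mean(y[d == 1])
```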
This is a fancy and very effective algorithm developed by Jas Sekhon.
The basic logic is as follows:
Start with the Mahalanobis distance matching solution.
Evaluate balance (by default, with paired t-tests and KS tests on the covariates).
Tweak the covariance matrix.
Compute a new matching solution.
Check whether balance improved.
Iterate.
It uses a genetic algorithm to tweak the covariance matrix.
It is NOT fast, and you should use a large value of pop.size, which will make it even slower (10 is WAY too low; the default is 100, and even that is too low). Also, use the available wrapper functions via MatchIt (or just those in the Matching package); a short sketch follows the package-loading output below.
require(Matching)
## Loading required package: Matching
## ##
## ## Matching (Version 4.8-3.4, Build Date: 2013/10/28)
## ## See http://sekhon.berkeley.edu/matching for additional documentation.
## ## Please cite software as:
## ## Jasjeet S. Sekhon. 2011. ``Multivariate and Propensity Score Matching
## ## Software with Automated Balance Optimization: The Matching package for R.''
## ## Journal of Statistical Software, 42(7): 1-52.
## ##
require(rgenoud)
## Loading required package: rgenoud
## Loading required package: parallel
## ## rgenoud (Version 5.7-12, Build Date: 2013-06-28)
## ## See http://sekhon.berkeley.edu/rgenoud for additional documentation.
## ## Please cite software as:
## ## Walter Mebane, Jr. and Jasjeet S. Sekhon. 2011.
## ## ``Genetic Optimization Using Derivatives: The rgenoud package for R.''
## ## Journal of Statistical Software, 42(11): 1-26.
## ##
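Here is a minimal sketch of genetic matching using the lalonde data shipped with the Matching package; the covariate set and the pop.size value are illustrative choices, not recommendations from the original notes:

```r
data(lalonde)
X  <- cbind(lalonde$age, lalonde$educ, lalonde$black, lalonde$hisp,
            lalonde$married, lalonde$nodegr, lalonde$re74, lalonde$re75)
Tr <- lalonde$treat

# Genetic search over the weight (covariance) matrix; use a much larger
# pop.size in real applications
genout <- GenMatch(Tr = Tr, X = X, BalanceMatrix = X, estimand = "ATT",
                   pop.size = 200)

# Match on the evolved weight matrix and estimate the ATT
mout <- Match(Y = lalonde$re78, Tr = Tr, X = X, estimand = "ATT",
              Weight.matrix = genout)
summary(mout)

# Check covariate balance before and after matching
MatchBalance(treat ~ age + educ + black + hisp + married + nodegr +
               re74 + re75, data = lalonde, match.out = mout, nboots = 500)
```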
What if we framed preprocessing explicitly as an optimization problem?
We want to minimize the difference between the empirical moments of the treated and control groups by varying the weights given to individual observations in our dataset.
All while keeping the weights relatively stable.
This is "entropy balancing," developed by Jens Hainmueller.
We optimize the following problem:
\[
\min_{\mathbf{w},\,\lambda_0,\,\boldsymbol{\lambda}} L^p = \sum_{i:D_i=0} w_i \log\!\left(\frac{w_i}{q_i}\right) + \sum_{r=1}^R \lambda_r \left(\sum_{i:D_i=0} w_i c_{ri}(X_i) - m_r\right) + (\lambda_0 - 1)\left(\sum_{i:D_i=0} w_i - 1\right),
\]
where the \(w_i\) are the control-unit weights being solved for, the \(q_i\) are base weights, the \(c_{ri}(X_i)\) are the moment functions in the balance constraints, and the \(m_r\) are the target moments computed in the treated group.
require(ebal, quietly =TRUE)
## ##
## ## ebal Package: Implements Entropy Balancing.
##
## ## See http://www.stanford.edu/~jhain/ for additional information.
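Here is a minimal sketch of entropy balancing with simulated data (the names and data-generating process are illustrative); ebalance() reweights the control group so that its covariate means match the treated group's:

```r
set.seed(2)
n <- 1000
X <- cbind(rnorm(n), rbinom(n, 1, 0.5))
d <- rbinom(n, 1, plogis(0.5 * X[, 1] + 0.5 * X[, 2]))

# Solve for control-unit weights that balance the first moments of X
eb <- ebalance(Treatment = d, X = X)

# Treated means, raw control means, and entropy-balanced control means
colMeans(X[d == 1, ])
colMeans(X[d == 0, ])
apply(X[d == 0, ], 2, weighted.mean, w = eb$w)
```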