# diagnostics

Diagnostic utilities for assessing balance and weight quality.

## balance_report(X, A, weights)

Generate comprehensive balance report.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `X` | Array, shape (n_samples, n_features) | Covariates | required |
| `A` | Array, shape (n_samples, 1) or (n_samples,) | Treatments | required |
| `weights` | Array, shape (n_samples,) | Sample weights | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `report` | dict | Comprehensive balance report with: `smd` (array of SMD per covariate), `max_smd` (maximum absolute SMD across covariates), `mean_smd` (mean absolute SMD across covariates), `ess` (effective sample size), `ess_ratio` (ESS / n_samples), `weight_stats` (dictionary of weight distribution statistics), `n_samples` (number of samples), `n_features` (number of features), `treatment_type` ('binary' or 'continuous') |
Notes
This function provides a complete overview of balance quality after weighting, useful for reporting and model diagnostics.
Source code in src/stochpw/diagnostics/advanced.py
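The report's headline numbers follow directly from their definitions. The sketch below recomputes the core fields for a binary treatment, using NumPy in place of JAX and an assumed pooled-standard-deviation convention for the SMD, so it may differ from the library's implementation in detail:

```python
import numpy as np

def balance_report_sketch(X, A, weights):
    """Illustrative recomputation of the report's core fields (binary treatment)."""
    A = np.asarray(A).reshape(-1)
    w = np.asarray(weights, dtype=float).reshape(-1)
    t, c = A == 1, A == 0
    # Weighted group means per covariate
    mu_t = np.average(X[t], weights=w[t], axis=0)
    mu_c = np.average(X[c], weights=w[c], axis=0)
    # Pooled (unweighted) standard deviation per covariate -- one common convention
    pooled_sd = np.sqrt((X[t].var(axis=0, ddof=1) + X[c].var(axis=0, ddof=1)) / 2)
    smd = (mu_t - mu_c) / pooled_sd
    # ESS = (sum w)^2 / sum(w^2), as documented under effective_sample_size
    ess = w.sum() ** 2 / (w ** 2).sum()
    return {
        "smd": smd,
        "max_smd": np.abs(smd).max(),
        "mean_smd": np.abs(smd).mean(),
        "ess": ess,
        "ess_ratio": ess / len(w),
        "n_samples": X.shape[0],
        "n_features": X.shape[1],
    }
```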
## calibration_curve(discriminator_probs, true_labels, num_bins=10)
Compute calibration curve for discriminator predictions.
A well-calibrated discriminator should have predicted probabilities that match the true frequencies of the labels.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `discriminator_probs` | Array, shape (n_samples,) | Predicted probabilities from discriminator (values between 0 and 1) | required |
| `true_labels` | Array, shape (n_samples,) | True binary labels (0 or 1) | required |
| `num_bins` | int | Number of bins to divide probability range into | 10 |

Returns:

| Name | Type | Description |
|---|---|---|
| `bin_centers` | Array, shape (num_bins,) | Center of each probability bin |
| `true_frequencies` | Array, shape (num_bins,) | Actual frequency of positive class in each bin |
| `counts` | Array, shape (num_bins,) | Number of samples in each bin |
Notes
Perfect calibration means true_frequencies == bin_centers for all bins.
Source code in src/stochpw/diagnostics/advanced.py
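The binning logic can be sketched in a few lines of NumPy. The library's exact bin-assignment and empty-bin conventions are assumptions here:

```python
import numpy as np

def calibration_curve_sketch(probs, labels, num_bins=10):
    """Bin predicted probabilities and compare each bin's observed frequency to its center."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    bin_centers = (edges[:-1] + edges[1:]) / 2
    # Assign each probability to a bin; clip so p == 1.0 lands in the last bin
    idx = np.clip(np.digitize(probs, edges) - 1, 0, num_bins - 1)
    counts = np.bincount(idx, minlength=num_bins)
    positives = np.bincount(idx, weights=labels, minlength=num_bins)
    # Empty bins get NaN rather than a spurious frequency of 0
    true_frequencies = np.where(counts > 0, positives / np.maximum(counts, 1), np.nan)
    return bin_centers, true_frequencies, counts
```

Plotting `true_frequencies` against `bin_centers` should hug the diagonal for a well-calibrated discriminator.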
## effective_sample_size(weights)
Compute effective sample size (ESS).
ESS = (sum w)^2 / sum(w^2)
Lower values indicate more extreme weights (fewer "effective" samples). ESS = n means uniform weights.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `weights` | Array, shape (n_samples,) | Sample weights | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `ess` | Array (scalar) | Effective sample size |
Source code in src/stochpw/diagnostics/weights.py
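The formula is easy to sanity-check directly (plain NumPy for illustration):

```python
import numpy as np

def ess(weights):
    # ESS = (sum w)^2 / sum(w^2)
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

ess(np.ones(100))                      # uniform weights: ESS equals n
ess(np.array([10.0, 0.1, 0.1, 0.1]))  # one dominant weight: ESS collapses toward 1
```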
## maximum_mean_discrepancy(X, A, weights, sigma=None)
Compute Maximum Mean Discrepancy (MMD) between weighted treatment groups.
MMD measures distributional distance between groups using a kernel-based approach. Unlike SMD which compares means feature-by-feature, MMD captures higher-order moments and interactions between features.
For binary treatment, computes MMD between weighted treatment and control groups. For continuous treatment, this function is not applicable and returns NaN.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `X` | Array, shape (n_samples, n_features) | Covariates | required |
| `A` | Array, shape (n_samples, 1) or (n_samples,) | Treatments (must be binary) | required |
| `weights` | Array, shape (n_samples,) | Sample weights | required |
| `sigma` | float | Bandwidth parameter for RBF kernel. If None, uses median heuristic: sigma = median(pairwise distances) / sqrt(2) | None |

Returns:

| Name | Type | Description |
|---|---|---|
| `mmd` | float | MMD statistic (non-negative, 0 means identical distributions) |
Notes
The MMD is defined as

$$\mathrm{MMD}^2 = E[k(X_1, X_1')] - 2\,E[k(X_1, X_0)] + E[k(X_0, X_0')]$$

where X_1, X_1' are from the treated group and X_0, X_0' are from control. This implementation uses weighted expectations based on the provided weights.
References
Gretton, A., Borgwardt, K. M., Rasch, M. J., Schölkopf, B., & Smola, A. (2012). A kernel two-sample test. Journal of Machine Learning Research, 13(1), 723-773.
Examples:

```python
>>> import jax.numpy as jnp
>>> from stochpw import maximum_mean_discrepancy
>>> X = jnp.array([[1, 2], [2, 3], [3, 4], [4, 5]])
>>> A = jnp.array([0, 0, 1, 1])
>>> weights = jnp.ones(4)
>>> mmd = maximum_mean_discrepancy(X, A, weights)
```
Source code in src/stochpw/diagnostics/balance.py
## roc_curve(weights, true_labels, max_points=100)
Compute ROC curve from weights for discriminator performance.
Given weights w(x,a), infers eta(x,a) = w(x,a) / (1 + w(x,a)) and computes the ROC curve for discriminating between observed and permuted data.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `weights` | Array, shape (n_samples,) | Sample weights from permutation weighting | required |
| `true_labels` | Array, shape (n_samples,) | True binary labels (0=observed, 1=permuted) | required |
| `max_points` | int | Maximum number of points in the ROC curve (for computational efficiency) | 100 |

Returns:

| Name | Type | Description |
|---|---|---|
| `fpr` | Array | False positive rates at each threshold |
| `tpr` | Array | True positive rates at each threshold |
| `thresholds` | Array | Thresholds used to compute fpr and tpr |
Notes
The ROC curve is the most important diagnostic for discriminator quality. A good discriminator will have high AUC (area under curve), indicating it can successfully distinguish between observed and permuted data.
The discriminator probability eta is inferred from weights as: eta(x,a) = w(x,a) / (1 + w(x,a))
Examples:

```python
>>> # After fitting a weighter
>>> weights = weighter.predict(X, A)
>>> # Create permuted data and labels
>>> weights_perm = weighter.predict(X, A_permuted)
>>> all_weights = jnp.concatenate([weights, weights_perm])
>>> labels = jnp.concatenate([jnp.zeros(len(weights)), jnp.ones(len(weights_perm))])
>>> fpr, tpr, thresholds = roc_curve(all_weights, labels)
>>> auc = jnp.trapezoid(tpr, fpr)
```
Source code in src/stochpw/diagnostics/advanced.py
## standardized_mean_difference(X, A, weights)
Compute weighted standardized mean difference for each covariate.
For binary treatment, computes SMD between weighted treatment groups. For continuous treatment, computes weighted correlation with covariates.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `X` | Array, shape (n_samples, n_features) | Covariates | required |
| `A` | Array, shape (n_samples, 1) or (n_samples,) | Treatments | required |
| `weights` | Array, shape (n_samples,) | Sample weights | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `smd` | Array, shape (n_features,) | SMD or correlation for each covariate |
Source code in src/stochpw/diagnostics/balance.py
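For the continuous-treatment branch, the "weighted correlation" can be sketched as a weighted Pearson correlation between the treatment and each covariate; the exact estimator used internally is an assumption here (NumPy for illustration):

```python
import numpy as np

def weighted_correlation_sketch(X, A, weights):
    """Weighted Pearson correlation between a continuous treatment and each covariate."""
    A = np.asarray(A, dtype=float).reshape(-1)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize weights to sum to 1
    # Weighted means
    mu_x = (w[:, None] * X).sum(axis=0)
    mu_a = (w * A).sum()
    # Weighted covariance and standard deviations
    cov = (w[:, None] * (X - mu_x) * (A - mu_a)[:, None]).sum(axis=0)
    sd_x = np.sqrt((w[:, None] * (X - mu_x) ** 2).sum(axis=0))
    sd_a = np.sqrt((w * (A - mu_a) ** 2).sum())
    return cov / (sd_x * sd_a)
```

Good balance corresponds to correlations near zero for every covariate.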
## standardized_mean_difference_se(X, A, weights)
Compute standard errors for standardized mean differences.
Uses the bootstrap-style approximation for weighted SMD standard errors.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `X` | Array, shape (n_samples, n_features) | Covariates | required |
| `A` | Array, shape (n_samples, 1) or (n_samples,) | Treatments | required |
| `weights` | Array, shape (n_samples,) | Sample weights | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `se` | Array, shape (n_features,) | Standard error for each covariate's SMD |
Source code in src/stochpw/diagnostics/balance.py
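The closed-form approximation lives in the source; as a reference point, a standard error for any per-covariate SMD function can also be estimated by a plain nonparametric bootstrap. This is a sketch, not the library's method, and `smd_fn` is a stand-in for a function like `standardized_mean_difference`:

```python
import numpy as np

def bootstrap_smd_se(X, A, weights, smd_fn, n_boot=200, seed=0):
    """Nonparametric bootstrap SE of per-covariate SMD; smd_fn(X, A, w) -> (n_features,)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    reps = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        reps.append(smd_fn(X[idx], A[idx], weights[idx]))
    # Standard deviation across bootstrap replicates estimates the SE
    return np.std(reps, axis=0, ddof=1)
```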
## weight_statistics(weights)
Compute comprehensive statistics about weight distribution.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `weights` | Array, shape (n_samples,) | Sample weights | required |

Returns:

| Name | Type | Description |
|---|---|---|
| `stats` | dict | Dictionary with weight statistics: `mean` (mean weight), `std` (standard deviation of weights), `min` (minimum weight), `max` (maximum weight), `cv` (coefficient of variation, std/mean), `entropy` (entropy of normalized weights), `max_ratio` (ratio of max to min weight), `n_extreme` (number of weights > 10× the mean) |
Notes
Useful for diagnosing weight quality and potential issues with extreme or highly variable weights.
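Each documented field can be recomputed from its definition. A minimal NumPy sketch, assuming strictly positive weights (the library may handle zeros differently):

```python
import numpy as np

def weight_statistics_sketch(weights):
    """Recompute the documented summary statistics from their definitions."""
    w = np.asarray(weights, dtype=float)
    p = w / w.sum()  # normalize for the entropy term; assumes w > 0 everywhere
    return {
        "mean": w.mean(),
        "std": w.std(),
        "min": w.min(),
        "max": w.max(),
        "cv": w.std() / w.mean(),
        "entropy": -(p * np.log(p)).sum(),
        "max_ratio": w.max() / w.min(),
        "n_extreme": int((w > 10 * w.mean()).sum()),
    }
```

Uniform weights give `cv == 0`, `max_ratio == 1`, and the maximum possible entropy `log(n)`; deviations from those values flag increasingly extreme weights.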