The print and summary methods provide descriptions of the results obtained with the function dml.
The coef function extracts the coefficients.
The se function extracts the standard errors.
The confint function extracts confidence intervals.
# S3 method for class 'dml'
summary(object, combine.method = "median", ...)
# S3 method for class 'dml'
coef(object, combine.method = "median", ...)
se(object, ...)
# S3 method for class 'dml'
se(object, combine.method = "median", ...)
# S3 method for class 'dml'
confint(object, parm = NULL, level = 0.95, combine.method = "median", ...)
# S3 method for class 'summary_dml'
print(x, digits = max(3L, getOption("digits") - 3L), interpret = TRUE, ...)
# S3 method for class 'dml'
print(
x,
digits = max(3L, getOption("digits") - 3L),
combine.method = "median",
...
)

object: an object of class dml.
combine.method: method to combine the results of each repetition of the DML fit. Options are "mean" and "median". Default is "median".
...: arguments passed to other methods.
parm: a character vector with the names of parameters.
level: the confidence level. Default is 0.95.
x: an object of class dml.
digits: minimal number of significant digits.
interpret: logical. Should a verbal interpretation of the DML procedure be printed? Default is TRUE.
For summary: an object of class summary_dml. For coef: a named numeric vector of coefficients. For se: a named numeric vector of standard errors. For confint: a matrix with confidence intervals. For print: the input object, invisibly.
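The combine.method argument determines how the point estimates and standard errors from each repetition of the cross-fitting procedure are pooled into a single result. The following is a minimal sketch of the median/mean aggregation described in Chernozhukov et al. (2018), using hypothetical per-repetition values; the internal implementation in dml may differ.

# hypothetical point estimates and standard errors, one per cross-fitting rep
theta <- c(8100, 8350, 8220)
ses   <- c(1120, 1180, 1140)

# "median" combination: median estimate, with the dispersion across reps
# folded into the reported variance
theta.med <- median(theta)
se.med    <- sqrt(median(ses^2 + (theta - theta.med)^2))

# "mean" combination: analogous, using means
theta.mean <- mean(theta)
se.mean    <- sqrt(mean(ses^2 + (theta - theta.mean)^2))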
# loads package
library(dml.sensemakr)
## loads data
data("pension")
# set the outcome
y <- pension$net_tfa # net total financial assets
# set the treatment
d <- pension$e401 # 401K eligibility
# set the covariates (a matrix)
x <- model.matrix(~ -1 + age + inc + educ + fsize + marr + twoearn + pira + hown, data = pension)
## compute income quartiles for group ATE.
g1 <- cut(x[,"inc"], quantile(x[,"inc"], c(0, 0.25,.5,.75,1), na.rm = TRUE),
labels = c("q1", "q2", "q3", "q4"), include.lowest = TRUE)
# run DML (nonparametric model)
## 2 folds (change as needed)
## 1 repetition (change as needed)
dml.401k <- dml(y, d, x, model = "npm", groups = g1, cf.folds = 2, cf.reps = 1)
#> Debiased Machine Learning
#>
#> Model: Nonparametric
#> Target: ate
#> Cross-Fitting: 2 folds, 1 reps
#> ML Method: outcome (yreg0:ranger, yreg1:ranger), treatment (ranger)
#> Tuning: dirty
#>
#>
#> ====================================
#> Tuning parameters using all the data
#> ====================================
#>
#> - Tuning Model for D.
#> -- Best Tune:
#> mtry min.node.size splitrule
#> 1 2 5 variance
#>
#> - Tuning Model for Y (non-parametric).
#> -- Best Tune:
#> mtry min.node.size splitrule
#> 1 2 5 variance
#> mtry min.node.size splitrule
#> 1 2 5 variance
#>
#>
#> ======================================
#> Repeating 2-fold cross-fitting 1 times
#> ======================================
#>
#> -- Rep 1 -- Folds: 1 2
#>
summary(dml.401k)
#>
#> Debiased Machine Learning
#>
#> Model: Nonparametric
#> Cross-Fitting: 2 folds, 1 reps
#> ML Method: outcome (yreg0:ranger, yreg1:ranger, R2 = 26.155%), treatment (ranger, R2 = 11.83%)
#> Tuning: dirty
#>
#> Average Treatment Effect:
#>
#> Estimate Std. Error t value P(>|t|)
#> ate.all 8231 1152 7.143 9.12e-13 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Group Average Treatment Effect:
#>
#> Estimate Std. Error t value P(>|t|)
#> gate.q1 4454.5 887.5 5.019 5.19e-07 ***
#> gate.q2 2709.5 1315.0 2.060 0.0394 *
#> gate.q3 7161.3 1836.2 3.900 9.62e-05 ***
#> gate.q4 18596.7 3910.7 4.755 1.98e-06 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Note: DML estimates combined using the median method.
#>
#> Verbal interpretation of DML procedure:
#>
#> -- Average treatment effects were estimated using DML with 2-fold cross-fitting. In order to reduce the variance that stems from sample splitting, we repeated the procedure 1 times. Estimates are combined using the median as the final estimate, incorporating variation across experiments into the standard error as described in Chernozhukov et al. (2018). The outcome regression uses Random Forest from the R package ranger; the treatment regression uses Random Forest from the R package ranger.
summary(dml.401k, combine.method = "mean")
#>
#> Debiased Machine Learning
#>
#> Model: Nonparametric
#> Cross-Fitting: 2 folds, 1 reps
#> ML Method: outcome (yreg0:ranger, yreg1:ranger, R2 = 26.155%), treatment (ranger, R2 = 11.83%)
#> Tuning: dirty
#>
#> Average Treatment Effect:
#>
#> Estimate Std. Error t value P(>|t|)
#> ate.all 8231 1152 7.143 9.12e-13 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Group Average Treatment Effect:
#>
#> Estimate Std. Error t value P(>|t|)
#> gate.q1 4454.5 887.5 5.019 5.19e-07 ***
#> gate.q2 2709.5 1315.0 2.060 0.0394 *
#> gate.q3 7161.3 1836.2 3.900 9.62e-05 ***
#> gate.q4 18596.7 3910.7 4.755 1.98e-06 ***
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#>
#> Note: DML estimates combined using the mean method.
#>
#> Verbal interpretation of DML procedure:
#>
#> -- Average treatment effects were estimated using DML with 2-fold cross-fitting. In order to reduce the variance that stems from sample splitting, we repeated the procedure 1 times. Estimates are combined using the mean as the final estimate, incorporating variation across experiments into the standard error as described in Chernozhukov et al. (2018). The outcome regression uses Random Forest from the R package ranger; the treatment regression uses Random Forest from the R package ranger.
coef(dml.401k)
#> ate.all gate.q1 gate.q2 gate.q3 gate.q4
#> 8230.988 4454.540 2709.535 7161.307 18596.731
coef(dml.401k, combine.method = "mean")
#> ate.all gate.q1 gate.q2 gate.q3 gate.q4
#> 8230.988 4454.540 2709.535 7161.307 18596.731
se(dml.401k)
#> ate.all gate.q1 gate.q2 gate.q3 gate.q4
#> 1152.2759 887.4944 1315.0380 1836.2201 3910.6600
confint(dml.401k, combine.method = "mean")
#> 2.5 % 97.5 %
#> ate.all 5972.5687 10489.407
#> gate.q1 2715.0826 6193.997
#> gate.q2 132.1082 5286.962
#> gate.q3 3562.3813 10760.232
#> gate.q4 10931.9777 26261.483
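As a usage note, the extractors can be combined to assemble a compact results table from a fitted dml object. A minimal sketch, continuing the example above (column names are illustrative):

# point estimates, standard errors, and confidence intervals in one data frame
res <- data.frame(
  estimate  = coef(dml.401k),
  std.error = se(dml.401k),
  confint(dml.401k),
  check.names = FALSE
)
res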