The performance of estimators based on the propensity score

doi:10.1016/j.jeconom.2012.11.006

Journal of Econometrics

Volume 175, Issue 1, July 2013, Pages 1-21

https://doi.org/10.1016/j.jeconom.2012.11.006

Abstract

We investigate the finite sample properties of a large number of estimators for the average treatment effect on the treated that are suitable when adjustment for observed covariates is required, like inverse probability weighting, kernel and other variants of matching, as well as different parametric models. The simulation design used is based on real data usually employed for the evaluation of labour market programmes in Germany. We vary several dimensions of the design that are of practical importance, like sample size, the type of the outcome variable, and aspects of the selection process. We find that trimming individual observations with too much weight as well as the choice of tuning parameters are important for all estimators. A conclusion from our simulations is that a particular radius matching estimator combined with regression performs best overall, in particular when robustness to misspecifications of the propensity score and different types of outcome variables is considered an important property.

Introduction

Semiparametric estimators using the propensity score to adjust in one way or another for covariate differences are now well established. They are used for estimating causal effects in a selection-on-observables framework with discrete treatments, or for simply purging the means of an outcome variable in two or more subsamples from differences due to observed variables.¹ Compared to (non-saturated) parametric regressions, they have the advantage of including the covariates in a more flexible way without incurring a curse-of-dimensionality problem, and of allowing for effect heterogeneity. The curse-of-dimensionality problem is highly relevant due to the large number of covariates that should usually be adjusted for. It is tackled by collapsing the covariate information into a single parametric function. This function, the so-called propensity score, is defined as the probability of being observed in one of two subsamples conditional on the covariates. The difference from parametric regression is that this parametric function is not directly related to the outcome (as it would be in regression), and thus additional robustness to misspecification can be expected.² These methods originate from the pioneering work of Rosenbaum and Rubin (1983), who show that balancing two samples on the propensity score is sufficient to equalize their covariate distributions.
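In symbols, the two statements above can be written as follows. The notation is an assumption made here for illustration: $D$ is the indicator of the subsample (treatment), $X$ the covariates, and $p(x)$ the propensity score; the second expression is the balancing property attributed to Rosenbaum and Rubin (1983).

$$
p(x) = P(D = 1 \mid X = x), \qquad X \perp D \mid p(X).
$$

The second relation says that, among units with the same value of $p(X)$, the distribution of $X$ does not differ between the two subsamples, which is why adjusting for the one-dimensional score can replace adjusting for all covariates.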

Although many of these propensity-score-based methods are not asymptotically efficient (see for example Heckman et al., 1998a, 1998b; Hahn, 1998),³ they are the work-horses in the literature on programme evaluation and are now rapidly spreading to other fields. They are usually implemented as semiparametric estimators: the propensity score is based on a parametric model, but the relationship between the outcome variables and the propensity score is non-parametric. However, despite the popularity of propensity-score-based methods, the question of which of the many estimators suggested in the literature should be used in a particular application remains unresolved, even after the recent advances made in the important Monte Carlo studies by Frölich (2004) and Busso et al. (forthcoming, 2009). In this paper we address this question and add further insights to it. Broadly speaking, the popular estimators can be subdivided into four classes: parametric estimators (like OLS or probit or their so-called double-robust relatives, see Robins et al., 1992), inverse (selection) probability weighting estimators (similar to Horvitz and Thompson, 1952, or the recently introduced tilting version of Graham et al., 2011, 2012), direct matching estimators (Rubin, 1974; Rosenbaum and Rubin, 1983), and kernel matching estimators (Heckman et al., 1998a, 1998b).⁴ However, many variants of the estimators exist within each class and several methods combine the principles underlying these main classes.

There are two strands of the literature that are relevant for our research question: First, the literature on the asymptotic properties of a subset of estimators provides some guidance on their small sample properties. In Section 3 we review this literature and discuss the various estimators. Unfortunately, asymptotic properties have not (yet?) been derived for all estimators used in practice, nor is it obvious how well they approximate small sample behaviour. Furthermore, these results are usually not informative for the important choice of tuning parameters on which many estimators critically depend (e.g., number of matched neighbours, bandwidth selection in kernel matching).

The second strand of the literature provides Monte Carlo evidence on the properties of the estimators of the effects.⁵ As one of the first papers investigating estimators from several classes simultaneously, Frölich (2004) found that a particular version of kernel matching based on local regressions with finite sample adjustments (local ridge regression) performs best. In contrast, Busso et al. (forthcoming, 2009) conclude that inverse probability weighting (IPW) has the best properties (when using normalized weights for estimation). They explain the differences from Frölich (2004) by claiming that he (i) considers unrealistic data generating processes and (ii) does not use an IPW estimator with normalized weights. In other words, they point to the design dependence of the Monte Carlo results as well as to the requirement of using optimized variants of the estimators. Below, we argue that their work may be subject to the same criticism. This provides a major motivation for our study.
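To make the role of normalized weights concrete, the following is a minimal sketch of an IPW estimator of the effect on the treated. It is an illustration under stated assumptions, not the implementation used in any of the studies cited: the function name `ipw_atet` and its arguments are hypothetical, the propensity scores `p` are taken as given, and the control weights are the usual odds $p/(1-p)$ rescaled to sum to one.

```python
def ipw_atet(y, d, p):
    """IPW estimate of the effect on the treated with normalized control weights.

    y: outcomes, d: treatment indicators (0/1), p: propensity scores in (0, 1).
    All three are plain lists of equal length (illustrative interface).
    """
    treated = [yi for yi, di in zip(y, d) if di == 1]
    # Odds weights p / (1 - p) reweight the controls so that their
    # covariate distribution resembles that of the treated.
    raw = [(pi / (1.0 - pi), yi) for yi, di, pi in zip(y, d, p) if di == 0]
    if not treated or not raw:
        raise ValueError("need at least one treated and one control unit")
    total = sum(w for w, _ in raw)
    # Normalization: divide by the sum of the weights, not by the number of controls.
    weighted_control_mean = sum(w * yi for w, yi in raw) / total
    return sum(treated) / len(treated) - weighted_control_mean
```

Dividing by the sum of the weights rather than by the sample size is what "normalized" refers to here; it guarantees that the control weights add up to one in every sample.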

We contribute to the literature on the properties of estimators based on adjusting for covariate differences in the following ways. Firstly, we suggest a different approach to conducting simulations. This approach is based on ‘real’ data. Therefore, we call our particular implementation of this idea an ‘Empirical Monte Carlo Study’.⁶ The basic idea is to use the empirical data to simulate realistic ‘placebo treatments’ among the non-treated. The various estimators then use the remaining non-treated in different ways to estimate the (known) non-treatment outcome of the ‘placebo-treated’. Selection into treatment, which is potentially of key importance for the performance of the various estimators, is based on a selection process directly obtained from the data. Moreover, we exploit the actual dependence of the outcome of interest on the covariates on which selection is based in the data rather than making assumptions on this relation when specifying the data generating process. Thus, this approach is less prone to the standard critique of simulation studies that the chosen data generating processes are irrelevant for real applications. Since our model for the propensity score mirrors specifications used in past applied work, it depends on many more covariates compared to the studies mentioned above. Although this makes the simulation results particularly plausible in our context of labour market programme evaluation in Europe, this may also be seen as a limitation concerning its applications to other fields. Therefore, to help generalize the results outside our specific data situation, we modify many features of the data generating process, like the type of the outcome variable as well as various aspects of the selection process.
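The placebo idea can be sketched in a few lines. The code below is a simplified illustration, not the design used in the paper: all names are hypothetical, the population consists of non-treated units stored as `(outcome, selection_probability)` pairs, and the selection probability is treated as known rather than re-estimated in every replication. Because nobody in the population is actually treated, the true effect is zero and any non-zero estimate is estimation error.

```python
import math
import random


def difference_in_means(y, d, p):
    """Naive benchmark estimator that ignores the selection probabilities p."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    controls = [yi for yi, di in zip(y, d) if di == 0]
    return sum(treated) / len(treated) - sum(controls) / len(controls)


def draw_placebo_sample(population, n, rng):
    """Draw n non-treated units with replacement and assign placebo treatments."""
    sample = [rng.choice(population) for _ in range(n)]
    y = [yi for yi, _ in sample]
    p = [pi for _, pi in sample]
    # Placebo treatment: a Bernoulli draw with the unit's selection probability.
    d = [1 if rng.random() < pi else 0 for pi in p]
    return y, d, p


def empirical_monte_carlo(population, estimator, n, replications, seed=0):
    """Return (bias, rmse) of an estimator when the true effect is zero."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        y, d, p = draw_placebo_sample(population, n, rng)
        if 0 < sum(d) < n:  # skip samples without both groups
            estimates.append(estimator(y, d, p))
    if not estimates:
        raise ValueError("no replication contained both groups")
    bias = sum(estimates) / len(estimates)
    rmse = math.sqrt(sum(e * e for e in estimates) / len(estimates))
    return bias, rmse
```

Any estimator with the signature `estimator(y, d, p)` can be passed in, so different estimators can be compared on identical placebo samples by reusing the same seed.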

Secondly, we consider standard estimators as well as their modified (optimized?) versions based on different tuning parameters such as bandwidth or radius choice. This leads to a large number of estimators to evaluate, but it also provides us with more information on important choices regarding the parameters on which the various estimators depend. Such estimators may also consist of combinations of estimators, like combining matching with weighted regression, which have not been considered in any simulation so far. Finally, we reemphasize the relevance of trimming to improve the finite sample properties of all estimators. The rule we propose is (i) a data driven trimming rule, (ii) easy to implement, (iii) identical for all estimators, and (iv) avoids asymptotic bias. We show that for almost all estimators considered, including the parametric ones, trimming based on this rule effectively improves their performance.
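A weight-based trimming step of the kind described above might look as follows. This is only a sketch under assumptions: the exact rule is not spelled out in this excerpt, so the function name `trim_weights`, the threshold `max_share`, and its default value are hypothetical choices made for illustration. The sketch drops every control whose share of the total weight exceeds the threshold and renormalizes the remaining weights.

```python
def trim_weights(weights, max_share=0.04):
    """Zero out control weights with too large a share, then renormalize.

    weights: non-negative weights of the non-treated observations.
    max_share: largest admissible share of the total weight (assumed value).
    Returns weights that sum to one.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    shares = [w / total for w in weights]
    # Observations with 'too large' a weight are removed from the comparison group.
    kept = [s if s <= max_share else 0.0 for s in shares]
    kept_total = sum(kept)
    if kept_total == 0:
        raise ValueError("all observations were trimmed; raise max_share")
    return [k / kept_total for k in kept]
```

Because the function only needs the weights an estimator implicitly assigns to the non-treated, the same step can in principle be applied to every estimator that can be written in weighted form.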

Overall, we find that (i) trimming observations that have ‘too large’ a weight is important for many estimators; (ii) the choices of the various tuning parameters play an important role; (iii) simple matching estimators are inefficient and have considerable small sample bias; (iv) no estimator is superior in all designs and for all outcomes; (v) particular bias-adjusted radius (or calliper) matching estimators perform best on average, but may have fat tails if the number of controls is not large enough; and finally, (vi) flexible, but simple parametric approaches do almost as well in the smaller samples, because their gain in precision frequently compensates (in part) for their larger bias which, however, dominates when samples become larger. Strictly speaking these properties relate to our particular data generating process (DGP) only. However, at least such a DGP is typical for an important application of matching methods, namely labour market evaluations.
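For orientation, a plain radius (calliper) matching estimator on the propensity score can be sketched as below. This is a deliberately simplified illustration and not the preferred estimator of the paper: it omits the regression-based bias adjustment, the function name, the default `radius`, and the nearest-neighbour fallback for treated units without a match inside the radius are all assumptions.

```python
def radius_matching_atet(y, d, p, radius=0.05):
    """Radius matching on the propensity score, without bias adjustment.

    For each treated unit, the comparison outcome is the mean outcome of all
    controls whose propensity score lies within `radius`; if there is none,
    the single nearest control is used instead (assumed fallback).
    """
    treated = [(yi, pi) for yi, di, pi in zip(y, d, p) if di == 1]
    controls = [(yi, pi) for yi, di, pi in zip(y, d, p) if di == 0]
    if not treated or not controls:
        raise ValueError("need at least one treated and one control unit")
    gaps = []
    for y_t, p_t in treated:
        matches = [y_c for y_c, p_c in controls if abs(p_c - p_t) <= radius]
        if not matches:
            nearest = min(controls, key=lambda c: abs(c[1] - p_t))
            matches = [nearest[0]]
        gaps.append(y_t - sum(matches) / len(matches))
    # Average the treated-minus-matched-control differences over the treated.
    return sum(gaps) / len(gaps)
```

The radius is the tuning parameter in this sketch: a larger radius averages over more controls per treated unit, while a smaller one restricts the comparison to closer matches.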

The paper proceeds as follows: in the next section we describe our Monte Carlo design, relegating many details as well as descriptive statistics to online Appendices B and C, where the latter contains a description of the support features of our data. In Section 3 we discuss the basic setup of each of the relevant estimators and their properties, as well as the issue of trimming, while relegating the technical details of the estimators to the Appendix. The main results are presented in Section 4, while the full set of results is given in online Appendix D. Section 5 concludes and online Appendix E contains further sensitivity checks. The website of this paper (www.sew.unisg.ch/lechner/matching) will contain additional material that has been removed from the paper for the sake of brevity, in particular Appendices B, C, D, and E as well as the Gauss, Stata, and R codes for the preferred estimators.

Section snippets

Basic idea

A typical Monte Carlo study specifies the data generation process of all relevant random variables and then conducts estimation and inference from samples that are generated by independent draws from those random variables based on pseudo random number generators. The advantage of such a design is that all dimensions of the true data generating process (DGP) are known and can be used for a thorough comparison with the estimates obtained from the simulations. However, the disadvantage is that

Notation and targets for the estimation

The outcome variable, $Y$, denotes earnings or employment. The group of treated units (treatment indicator $D = 1$) are the participants in training in our empirical example. We are interested in comparing the mean value of $Y$ in the group of treated $(D = 1)$ with the mean value of $Y$ in the group of non-treated $(D = 0)$, the non-participants, free of any mean differences in outcomes that are due to differences in the observed covariates $X$ across the groups.²⁰
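One way to write this target formally is given below. The symbol $\theta$ is introduced here for illustration only; $Y$, $D$ and $X$ are as defined above.

$$
\theta = E[Y \mid D = 1] - E\big[\, E[Y \mid X, D = 0] \,\big|\, D = 1 \big].
$$

The second term is the mean outcome of the non-treated after their covariate distribution has been adjusted to that of the treated, so $\theta$ is free of differences that are due to $X$. Under a selection-on-observables assumption, $\theta$ can be interpreted as the average treatment effect on the treated.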

Trimming

From Eq. (1) we see that all estimators can be written as the mean outcome of the treated minus the weighted outcome of the non-treated observations. By the nature of this estimation principle, the weights of the non-treated are not uniform (except in the case of random assignment in which they should be very similar even in the smallest sample). They depend on the covariates via the propensity score. If particular values of $p(x)$ are rare among the controls and common among the treated, such

Results

In this section, we first discuss several issues concerning the implementation of the various estimators (5.1). After that, the results are discussed, beginning with issues that concern all estimators simultaneously, like the impact of different features of the data generating process, the specification of the propensity score and the trimming (5.2). Then, we analyze implementational issues that are specific to the particular classes of estimators considered (5.3). Finally, we compare the best

Conclusion

This paper investigates the finite sample properties of all major classes of propensity-score-based estimators of the average treatment effect on the treated (ATET) that are used in applications. Moreover, within each class of estimators we investigate the performance of the estimators for a variety of possible versions and various values of the tuning parameters. Both features make this study the most comprehensive one in the field so far.

We propose a way to overcome one of the main criticisms

Acknowledgments

Michael Lechner is a Research Fellow of CEPR and PSI, London, CES-Ifo, Munich, IAB, Nuremberg, IZA, Bonn, and ZEW, Mannheim. Conny Wunsch is a Research Fellow of CES-Ifo, Munich, and IZA, Bonn. This project received financial support from the Institut für Arbeitsmarkt und Berufsforschung, IAB, Nuremberg (contract 8104). We would like to thank Patrycja Scioch (IAB), Benjamin Schünemann and Darjusch Tafreschi (both SEW, St. Gallen) for their help in the early stages of data preparation. An

References (101)

X. Chen
Large sample sieve estimation of semi-nonparametric models
R.H. Dehejia
Practical propensity score estimation: a reply to Smith and Todd
Journal of Econometrics
(2005)
M. Frölich
Nonparametric IV estimation of local average treatment effects with covariates
Journal of Econometrics
(2007)
R. Hujer et al.
How do employment effects of job creation schemes differ with respect to the foregoing unemployment duration?
Labour Economics
(2010)
R. Hujer et al.
New evidence on the effects of job creation schemes in Germany—a matching approach with threefold heterogeneity
Research in Economics
(2004)
M. Lechner
Long-run labour market and health effects of individual sports activities
The Journal of Health Economics
(2009)
W.K. Newey
A method of moments interpretation of sequential estimators
Economics Letters
(1984)
J.M. Wooldridge
Inverse probability weighted estimation for general missing data problems
Journal of Econometrics
(2007)
A. Abadie
Semiparametric difference-in-difference estimators
Review of Economic Studies
(2005)
Abadie, A., Imbens, G.W., 2002. Simple and bias-corrected matching estimators for average treatment effects, NBER...

A. Abadie et al.

Large sample properties of matching estimators for average treatment effects

Econometrica

(2006)

A. Abadie et al.

On the failure of the bootstrap for matching estimators

Econometrica

(2008)

Abadie, A., Imbens, G.W., 2009. Matching on the estimated propensity score, NBER Working Paper...

J.D. Angrist

Estimating the labor market effects of voluntary military service using social security data on military applicants

Econometrica

(1998)

J.D. Angrist et al.

When to control for covariates? panel-asymptotic results for estimates of treatment effects

Review of Economics and Statistics

(2004)

J.D. Angrist et al.

Mostly Harmless Econometrics: An Empiricist’s Companion

(2009)

B. Augurzky et al.

Assessing the performance of matching algorithms when selection into treatment is strong

Journal of Applied Econometrics

(2007)

H. Bang et al.

Doubly robust estimation in missing data and causal inference models

Biometrics

(2005)

S. Behncke et al.

Unemployed and their case workers: should they be friends or foes?

The Journal of the Royal Statistical Society - Series A

(2010)

S. Behncke et al.

A caseworker like me - does the similarity between unemployed and caseworker increase job placements?

The Economic Journal

(2010)

M. Bertrand et al.

How much should we trust differences-in-differences estimates?

Quarterly Journal of Economics

(2004)

R. Blundell et al.

Alternative approaches to evaluation in empirical microeconomics

Journal of Human Resources

(2009)

R. Blundell et al.

Evaluating the employment impact of a mandatory job search program

Journal of the European Economic Association

(2004)

Busso, M., DiNardo, J., McCrary, J., 2009. Finite sample properties of semiparametric estimators of average treatment...

Busso, M., DiNardo, J., McCrary, J., 2009. New evidence on the finite sample properties of propensity score matching...

M. Caliendo et al.

Sectoral heterogeneity in the employment effects of job creation schemes in Germany

Journal of Economics and Statistics

(2006)

M. Caliendo et al.

The employment effects of job creation schemes in Germany–a microeconometric evaluation

M. Caliendo et al.

Identifying effect heterogeneity to improve the efficiency of job creation schemes in Germany

Applied Economics

(2008)

D. Card et al.

Active labour market policy evaluations: a meta-analysis

Economic Journal

(2010)

Chen, X., Hong, H., Tarozzi, A., 2008. Semiparametric efficiency in gmm models of nonclassical measurement errors,...

R.K. Crump et al.

Dealing with limited overlap in estimation of average treatment effects

Biometrika

(2009)

R.H. Dehejia et al.

Causal effects in non-experimental studies: reevaluating the evaluation of training programmes

Journal of the American Statistical Association

(1999)

R.H. Dehejia et al.

Propensity score-matching methods for nonexperimental causal studies

Review of Economics and Statistics

(2002)

A. Diamond et al.

Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies

Mimeo

(2008)

J. DiNardo et al.

Labor market institutions and the distribution of wages, 1973–1992: a semiparametric approach

Econometrica

(1996)

C. Drake

Effects of misspecification of the propensity score on estimators of treatment effect

Biometrics

(1993)

J. Fan

Design-adaptive nonparametric regression

Journal of the American Statistical Association

(1992)

C.A. Flores et al.

Evaluating nonexperimental estimators for multiple treatments: evidence from experimental data

Mimeo

(2009)

M. Frölich

Finite-sample properties of propensity-score matching and weighting estimators

Review of Economics and Statistics

(2004)

M. Frölich

Matching estimators and optimal bandwidth choice

Statistics and Computing

(2005)

M. Frölich

Nonparametric regression for binary dependent variables

Econometrics Journal

(2007)

J. Galdo et al.

Bandwidth selection and the estimation of treatment effects with unbalanced data

Annales d’Économie et de Statistique

(2008)

M. Gerfin et al.

Microeconometric evaluation of the active labour market policy in Switzerland

The Economic Journal

(2002)

A.N. Glynn et al.

An introduction to the augmented inverse propensity weighted estimator

Political Analysis

(2010)

Graham, B.S., Pinto, C., Egel, D., 2011. Efficient Estimation of Data Combination Models by the Method of...

B.S. Graham et al.

Inverse probability tilting for moment condition models with missing data

Review of Economic Studies

(2012)

J. Hahn

On the role of the propensity score in efficient semiparametric estimation of average treatment effects

Econometrica

(1998)

L.P. Hansen

Large sample properties of generalized methods of moments estimators

Econometrica

(1982)

B. Hansen

Full matching in an observational study of coaching for the SAT

Journal of the American Statistical Association

(2004)

P. Hall et al.

Cross-validation and the estimation of conditional probability densities

Journal of the American Statistical Association

(2004)

