
Estimating Causal Effects

Structural Causal Model

Structural causal models (SCMs) are a type of model used in causal inference to represent the relationships between variables and how they cause each other.

Unlike a standard ML model, whose objective is accurate prediction, an SCM is built to recover the correct treatment effect between variables. SCMs provide a framework for testing hypotheses about how a change to one variable will affect other variables in the system. This lets us estimate causal effects from observational data, which is particularly important when a randomized controlled experiment is not possible or ethical.
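To make the idea of "changing one variable and observing the rest of the system" concrete, here is a minimal sketch of a toy SCM in Python. Everything in it is an assumption made for illustration: the three variables (a confounder Z, a treatment X, an outcome Y), the linear structural equations, the coefficients, and the helper names `sample_scm`, `mean`, and `slope`. It compares the naive regression slope of Y on X in observational data with the contrast obtained by intervening on X directly.

```python
import random


def sample_scm(n, do_x=None, seed=0):
    """Draw n samples from a hypothetical SCM with edges Z -> X, Z -> Y, X -> Y.

    If do_x is given, the structural equation for X is replaced by the
    constant do_x, which is how an intervention do(X = do_x) is modelled.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)                      # confounder
        if do_x is None:
            x = 0.8 * z + rng.gauss(0, 1)        # X listens to Z
        else:
            x = do_x                             # intervention cuts Z -> X
        y = 2.0 * x + 1.5 * z + rng.gauss(0, 1)  # assumed true effect of X on Y: 2.0
        rows.append((z, x, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def slope(rows):
    """Least-squares slope of Y on X, ignoring Z."""
    xs = [x for _, x, _ in rows]
    ys = [y for _, _, y in rows]
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Observational data: the slope mixes the effect of X with the influence of Z.
naive_slope = slope(sample_scm(20000))

# Interventional contrast E[Y | do(X=1)] - E[Y | do(X=0)].
y_do1 = mean(y for _, _, y in sample_scm(20000, do_x=1.0))
y_do0 = mean(y for _, _, y in sample_scm(20000, do_x=0.0))
effect = y_do1 - y_do0

print(f"naive slope: {naive_slope:.2f}, interventional effect: {effect:.2f}")
```

Because both interventional runs reuse the same seed, their noise terms cancel and the contrast recovers the coefficient written into the simulation, while the naive slope is pushed upward by the confounder. In practice the structural equations are unknown, which is why the estimation methods discussed in this article are needed.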

SCMs are widely used in fields such as economics, the social sciences, epidemiology, and computer science. Practical applications include understanding the effects of policy interventions, identifying the factors that contribute to disease outbreaks, modeling relationships in a supply chain, and measuring the impact of advertising spend.

Causal Effect Estimation

Causal effect estimation is a fundamental aspect of empirical research, where understanding the impact of interventions or policies is critical. The goal of causal effect estimation is to quantify the effect of a particular treatment or intervention on an outcome of interest while controlling for other factors that might affect the outcome. One approach to estimating causal effects is through randomized controlled trials (RCTs), where participants are randomly assigned to either a treatment or a control group. Randomization helps ensure that any differences observed in outcomes between the two groups are due to the treatment rather than other factors.
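As a rough illustration of why randomization works, the sketch below simulates a trial in which an unobserved factor affects the outcome but cannot affect who gets treated, because assignment is a coin flip. The data-generating process, the effect size of 2.0, and the function name `simulate_rct` are all assumptions chosen for the example.

```python
import random
import statistics


def simulate_rct(n=10000, true_effect=2.0, seed=1):
    """Simulate a hypothetical two-arm trial with coin-flip assignment."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)          # unobserved factor that affects the outcome
        t = rng.random() < 0.5       # random assignment, independent of u
        y = true_effect * t + 1.5 * u + rng.gauss(0, 1)
        (treated if t else control).append(y)
    return treated, control


treated, control = simulate_rct()

# With randomization, the difference in group means estimates the
# average treatment effect without adjusting for u.
ate_hat = statistics.fmean(treated) - statistics.fmean(control)
print(f"estimated average treatment effect: {ate_hat:.2f}")
```

The estimate should land near the simulated effect, up to sampling noise, even though `u` is never observed or adjusted for.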

In cases where RCTs are not feasible or ethical, instrumental variables (IV) analysis can be used to estimate causal effects. IV analysis uses an instrumental variable: a variable that is correlated with the treatment but affects the outcome only through the treatment. The variation in the treatment that is induced by the instrument is used to estimate the treatment effect, even when unobserved factors influence both the treatment and the outcome. The validity of the instrumental variable depends on the so-called “exclusion restriction” condition, which requires that the instrumental variable affects the outcome solely through its effect on the treatment. The instrument must also be unrelated to the unobserved confounders.
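The following sketch shows one simple IV estimator, the ratio of covariances cov(Z, Y) / cov(Z, X), on simulated data with a single instrument and a single treatment. The linear data-generating process, its coefficients, and the helper name `cov` are assumptions for illustration. The simulation is constructed so that the instrument satisfies the exclusion restriction by design.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


rng = random.Random(2)
z, x, y = [], [], []
for _ in range(20000):
    u = rng.gauss(0, 1)                         # unobserved confounder
    zi = rng.gauss(0, 1)                        # instrument, independent of u
    xi = 1.0 * zi + 1.0 * u + rng.gauss(0, 1)   # treatment depends on z and u
    yi = 2.0 * xi + 3.0 * u + rng.gauss(0, 1)   # z enters y only through x
    z.append(zi)
    x.append(xi)
    y.append(yi)

# Regressing y on x directly is biased because u drives both.
naive = cov(x, y) / cov(x, x)

# IV (ratio) estimator: uses only the variation in x that comes from z.
iv_estimate = cov(z, y) / cov(z, x)

print(f"naive slope: {naive:.2f}, IV estimate: {iv_estimate:.2f}")
```

In this simulation the assumed true effect is 2.0. The naive slope overshoots it, while the IV estimate should sit close to it. With several instruments or additional covariates, two-stage least squares is the usual generalization.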

Propensity score matching (PSM) is another approach to estimating causal effects. It creates a comparison group that is similar to the treatment group by matching participants on their propensity score, the estimated probability of receiving the treatment given a set of covariates or characteristics. This helps address selection bias, where participants who receive the treatment differ systematically from those who do not. However, PSM assumes that the covariates used to estimate the score are sufficient to control for all confounding variables, which may not always be the case.
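A minimal sketch of the matching procedure, using only the Python standard library, might look like the following. The simulated data, the two covariates, the gradient-ascent logistic regression, and the function names `fit_propensity`, `propensity`, and `matched_att` are assumptions made for this example. A real analysis would normally use an established statistics library and check covariate balance after matching.

```python
import bisect
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def fit_propensity(X, t, steps=150, lr=1.0):
    """Logistic regression for P(T=1 | x), fitted by batch gradient ascent."""
    w = [0.0] * (len(X[0]) + 1)            # intercept plus one weight per covariate
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, ti in zip(X, t):
            p = propensity(w, xi)
            grad[0] += ti - p
            for j, xj in enumerate(xi, start=1):
                grad[j] += (ti - p) * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity(w, xi):
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))


def matched_att(X, t, y, w):
    """1-nearest-neighbour matching, with replacement, on the estimated score.

    Returns the average treated-minus-matched-control outcome difference,
    an estimate of the average treatment effect on the treated.
    """
    scores = [propensity(w, xi) for xi in X]
    controls = sorted((s, yi) for s, ti, yi in zip(scores, t, y) if ti == 0)
    control_scores = [s for s, _ in controls]
    diffs = []
    for s, ti, yi in zip(scores, t, y):
        if ti == 1:
            k = bisect.bisect_left(control_scores, s)
            candidates = [c for c in (k - 1, k) if 0 <= c < len(controls)]
            best = min(candidates, key=lambda c: abs(control_scores[c] - s))
            diffs.append(yi - controls[best][1])
    return sum(diffs) / len(diffs)


# Simulated observational data: treatment uptake depends on the covariates.
rng = random.Random(3)
X, t, y = [], [], []
for _ in range(2000):
    x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
    ti = 1 if rng.random() < sigmoid(0.8 * x1 - 0.5 * x2) else 0
    X.append((x1, x2))
    t.append(ti)
    y.append(2.0 * ti + 2.0 * x1 - 1.0 * x2 + rng.gauss(0, 1))  # assumed effect: 2.0

treated_y = [yi for yi, ti in zip(y, t) if ti == 1]
control_y = [yi for yi, ti in zip(y, t) if ti == 0]
naive = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)
att_hat = matched_att(X, t, y, fit_propensity(X, t))

print(f"naive difference: {naive:.2f}, matched estimate: {att_hat:.2f}")
```

Here both covariates are observed, so the matching assumption holds by construction and the matched estimate should be much closer to the simulated effect than the naive difference. If a confounder were left out of `X`, the matched estimate would remain biased.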

Finally, the back-door criterion is a graph-based approach to estimating causal effects. It involves identifying a set of variables, known as an adjustment set, that blocks all "back-door" paths between the treatment and the outcome. The causal effect is then estimated by conditioning the data on these variables and marginalizing over them with the back-door adjustment formula. The back-door criterion requires strong assumptions about the causal structure of the data, and its validity depends on the accuracy of these assumptions.
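For a discrete adjustment set Z, the conditioning-and-marginalizing step can be written as E[Y | do(X=x)] = sum over z of E[Y | X=x, Z=z] * P(Z=z). The sketch below applies this formula to simulated data with one binary confounder. The causal graph (Z -> X, Z -> Y, X -> Y), the probabilities, the effect size of 1.0, and the function name `backdoor_mean` are assumptions for illustration.

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Simulated data from an assumed graph Z -> X, Z -> Y, X -> Y.
rng = random.Random(4)
rows = []
for _ in range(20000):
    z = 1 if rng.random() < 0.4 else 0                    # binary confounder
    x = 1 if rng.random() < (0.8 if z else 0.2) else 0    # treatment depends on z
    y = 1.0 * x + 3.0 * z + rng.gauss(0, 1)               # assumed effect of x: 1.0
    rows.append((z, x, y))


def backdoor_mean(rows, x_value):
    """Estimate E[Y | do(X = x_value)] as sum_z E[Y | X, Z=z] * P(Z=z)."""
    n = len(rows)
    total = 0.0
    for z_value in (0, 1):
        stratum = [y for z, x, y in rows if z == z_value and x == x_value]
        p_z = sum(1 for z, _, _ in rows if z == z_value) / n
        total += mean(stratum) * p_z
    return total


# Unadjusted comparison: confounded by z.
naive = (mean(y for _, x, y in rows if x == 1)
         - mean(y for _, x, y in rows if x == 0))

# Back-door adjusted contrast, using {Z} as the adjustment set.
adjusted = backdoor_mean(rows, 1) - backdoor_mean(rows, 0)

print(f"naive difference: {naive:.2f}, back-door adjusted: {adjusted:.2f}")
```

The adjustment is only valid because {Z} really does block every back-door path in the simulated graph. With a misspecified graph, the same arithmetic would still run but would no longer estimate the causal effect.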

In summary, estimating causal effects is essential to determine the effectiveness of interventions or policies. RCTs, IV analysis, PSM, and the back-door criterion are all useful methods to estimate causal effects, each with its advantages and limitations. Choosing the appropriate method depends on the research question, the available data, and the feasibility of each method.