Applied Science

Propensity score matching (PSM) is a statistical technique widely used in observational studies to estimate causal effects. Its main idea is to "match" each treated unit with control units that had similar characteristics before treatment, where similarity is summarized by a single number, the propensity score: the probability of receiving treatment given the observed covariates. The goal is to emulate the conditions of a randomized experiment as closely as possible, which reduces selection bias on the observed covariates and makes a causal reading of the estimates more credible.
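The matching idea can be sketched in a few lines. The example below is a minimal illustration, not a reference implementation: it assumes the propensity scores have already been estimated, uses one-nearest-neighbor matching with replacement, and targets the average treatment effect on the treated (ATT). The function name `match_att` and its arguments are hypothetical and chosen only for this sketch.

```python
from bisect import bisect_left


def match_att(scores, treated, outcomes):
    """Estimate the ATT by 1-nearest-neighbor matching (with replacement)
    on a precomputed propensity score.

    scores   -- propensity score of each unit (assumed already estimated)
    treated  -- 1 for treated units, 0 for controls
    outcomes -- observed outcome of each unit
    """
    # Sort controls by score so the nearest one can be found by bisection.
    controls = sorted((s, y) for s, d, y in zip(scores, treated, outcomes) if d == 0)
    if not controls:
        raise ValueError("need at least one control unit")
    control_scores = [s for s, _ in controls]

    diffs = []
    for s, d, y in zip(scores, treated, outcomes):
        if d != 1:
            continue
        i = bisect_left(control_scores, s)
        # The nearest control is one of the two neighbors of the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(controls)]
        j = min(candidates, key=lambda k: abs(control_scores[k] - s))
        diffs.append(y - controls[j][1])

    if not diffs:
        raise ValueError("need at least one treated unit")
    # Average treated-minus-matched-control difference.
    return sum(diffs) / len(diffs)
```

In practice one would usually also enforce a caliper (a maximum allowed score distance) and check covariate balance after matching; both are left out here to keep the sketch short.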

Equations of the Data-Generating Process

The control variables \(X_1\) and \(X_2\) are generated as standard normal random variables:

\[ X_1 \sim N(0, 1) \]
\[ X_2 \sim N(0, 1) \]

The probability of receiving the treatment, that is, the true propensity score, follows a logistic model in the control variables:

\[ P(D=1 \mid X_1, X_2) = \frac{1}{1 + e^{-(\gamma + \alpha X_1 + \beta X_2)}} \]

The treatment indicator \(D\) is then drawn as a Bernoulli random variable with this probability, so assignment is random but depends on \(X_1\) and \(X_2\).

The outcome variable \(Y\) is generated as:

\[ Y = \delta + \theta X_1 + \eta X_2 + \beta_{\text{effect}} D + \varepsilon \]

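where \(\delta = 1.0\) is the intercept, \(\theta\) and \(\eta\) are the coefficients on the control variables, \(\beta_{\text{effect}}\) is the treatment effect (distinct from the \(\beta\) in the treatment equation), and \(\varepsilon \sim N(0, 1)\) is a random error term.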
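The data-generating process above could be simulated along the following lines. This is a sketch under stated assumptions: only \(\delta = 1.0\) is given in the text, so the default values for `gamma`, `alpha`, `beta`, `theta`, `eta`, and `beta_effect` are placeholders picked for illustration, and the function name `simulate` is hypothetical. It uses only the Python standard library to stay dependency-free.

```python
import math
import random


def simulate(n, gamma=0.0, alpha=0.5, beta=0.5, delta=1.0,
             theta=1.0, eta=1.0, beta_effect=2.0, seed=0):
    """Draw n units from the data-generating process.

    Only delta = 1.0 comes from the text; every other default
    is an illustrative placeholder, not a value from the source.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Control variables: X1, X2 ~ N(0, 1)
        x1 = rng.gauss(0.0, 1.0)
        x2 = rng.gauss(0.0, 1.0)
        # Logistic treatment probability P(D = 1 | X1, X2)
        p = 1.0 / (1.0 + math.exp(-(gamma + alpha * x1 + beta * x2)))
        # Treatment: Bernoulli draw with probability p
        d = 1 if rng.random() < p else 0
        # Outcome: Y = delta + theta*X1 + eta*X2 + beta_effect*D + eps
        eps = rng.gauss(0.0, 1.0)
        y = delta + theta * x1 + eta * x2 + beta_effect * d + eps
        rows.append({"x1": x1, "x2": x2, "p": p, "d": d, "y": y})
    return rows
```

For example, `rows = simulate(1000)` returns a list of dictionaries, one per unit, holding the control variables, the true propensity score `p`, the treatment indicator `d`, and the outcome `y`. Because the same seed always yields the same sample, results are reproducible.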