In econometrics, a spurious regression arises when a linear regression is fitted between two or more non-stationary time series that are in fact unrelated. Even though there is no real economic relationship, the model can yield an artificially high coefficient of determination (R²) and misleadingly small p-values. Formally, suppose \( x_{t} \) and \( y_{t} \) are both integrated of order 1, written I(1), and the following regression is estimated:
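\[ y_{t} = \beta_{0} + \beta_{1}\,x_{t} + \varepsilon_{t}. \] Unless the two series are cointegrated, the residuals \( \varepsilon_{t} \) are themselves non-stationary, so the usual t and F statistics do not follow their standard distributions and inference based on them leads to spurious conclusions.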
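The effect is easy to reproduce by simulation. The sketch below is an illustration only, not part of the original argument: it regresses one independent random walk on another many times and counts how often the slope looks "significant" at the nominal 5% level, then repeats the exercise with stationary white noise for comparison. The helper names (`ols_simple`, `random_walk`, `white_noise`, `rejection_rate`), the sample size, the number of replications and the seed are all assumptions chosen for the example.

```python
import math
import random


def ols_simple(x, y):
    """OLS of y on a constant and x.

    Returns (intercept, slope, t-statistic of the slope, R^2, residuals).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    ssr = sum(e * e for e in resid)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    slope_se = math.sqrt(ssr / (n - 2) / sxx)
    return intercept, slope, slope / slope_se, 1.0 - ssr / sst, resid


def random_walk(n, rng):
    """I(1) series: cumulative sum of independent standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def white_noise(n, rng):
    """I(0) series: independent standard normal draws."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def rejection_rate(make_series, n_obs=200, n_sims=500, seed=0):
    """Share of simulations in which |t| on the slope exceeds 1.96,
    although x and y are generated independently of each other."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        x = make_series(n_obs, rng)
        y = make_series(n_obs, rng)
        t_stat = ols_simple(x, y)[2]
        if abs(t_stat) > 1.96:
            hits += 1
    return hits / n_sims


rw_rate = rejection_rate(random_walk)   # independent I(1) series
wn_rate = rejection_rate(white_noise)   # independent I(0) series
print(f"share 'significant', random walks: {rw_rate:.2f}")
print(f"share 'significant', white noise:  {wn_rate:.2f}")
```

With stationary inputs the rejection share should sit near the nominal 5%, whereas with independent random walks it is typically far higher, which is the spurious regression problem in miniature.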
Cointegration, by contrast, occurs when the series \( x_{t} \) and \( y_{t} \), although each is I(1), admit a linear combination that is stationary, that is, I(0). In other words, if:
\[ u_{t} = y_{t} - \beta_{0} - \beta_{1}\,x_{t} \] is a stationary process, then we say that \( x_{t} \) and \( y_{t} \) are cointegrated, and the regression is not spurious. In practice \( \beta_{0} \) and \( \beta_{1} \) are unknown, so \( u_{t} \) is replaced by the OLS residuals of the levels regression. To check the stationarity of \( u_{t} \), the Augmented Dickey-Fuller (ADF) test is applied, which tests for the presence of a unit root using a regression such as:
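\[ \Delta u_{t} = \alpha + \gamma\,u_{t-1} + \sum_{i=1}^{p} \phi_{i}\,\Delta u_{t-i} + \varepsilon_{t}. \] The null hypothesis is \( \gamma = 0 \) (a unit root) against the alternative \( \gamma < 0 \). If the t-statistic on \( \gamma \) falls below the critical value, the null hypothesis is rejected, and we conclude that \( u_{t} \) is stationary and, therefore, that the variables are cointegrated. Because \( u_{t} \) is built from estimated coefficients, the appropriate critical values are those tabulated for residual-based (Engle-Granger) tests, which are more negative than the standard Dickey-Fuller ones.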
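The two-step procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production test: it uses the simplest case \( p = 0 \) (no lagged differences), simulated data, and an approximate asymptotic 5% critical value of about -3.34 for two variables with a constant, taken from MacKinnon's tables. The names `ols_simple`, `random_walk`, `engle_granger_stat` and `EG_CRIT_5PCT` are hypothetical helpers introduced for the example.

```python
import math
import random

# Assumption: approximate asymptotic 5% critical value for a residual-based
# test with two variables and a constant (MacKinnon's tables).
EG_CRIT_5PCT = -3.34


def ols_simple(x, y):
    """OLS of y on a constant and x; returns (slope t-statistic, residuals)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    ssr = sum(e * e for e in resid)
    slope_se = math.sqrt(ssr / (n - 2) / sxx)
    return slope / slope_se, resid


def random_walk(n, rng):
    """I(1) series: cumulative sum of independent standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def engle_granger_stat(x, y):
    """Step 1: regress y on x in levels and keep the residuals u_t.
    Step 2: regress delta u_t on a constant and u_{t-1} (p = 0) and
    return the t-statistic on gamma."""
    _, u = ols_simple(x, y)
    delta_u = [u[t] - u[t - 1] for t in range(1, len(u))]
    u_lagged = u[:-1]
    gamma_t_stat, _ = ols_simple(u_lagged, delta_u)
    return gamma_t_stat


rng = random.Random(42)
n_obs = 500
x = random_walk(n_obs, rng)

# Cointegrated pair: y shares the stochastic trend of x plus stationary noise.
y_coint = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
# Unrelated pair: y is an independent random walk.
y_indep = random_walk(n_obs, rng)

stat_coint = engle_granger_stat(x, y_coint)
stat_indep = engle_granger_stat(x, y_indep)

for label, stat in [("cointegrated", stat_coint), ("independent", stat_indep)]:
    verdict = "reject unit root" if stat < EG_CRIT_5PCT else "cannot reject unit root"
    print(f"{label:>12}: t = {stat:6.2f} -> {verdict}")
```

For real data, the lag order \( p \) would normally be chosen by an information criterion and exact finite-sample critical values used; library routines such as `adfuller` and `coint` in `statsmodels.tsa.stattools` handle both.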