ZIVOT_ANDREWS
This function performs the Zivot-Andrews unit-root test, which extends unit-root diagnostics by allowing a single unknown structural break in the series.
It tests the null hypothesis of a unit root against the alternative that the series is stationary around deterministic terms (constant, trend, or both) that contain a one-time break.
It returns the test statistic, p-value, critical values, selected lag, and estimated break index.
Excel Usage
=ZIVOT_ANDREWS(x, trim, maxlag, za_regression, za_autolag)
- x (list[list], required): Time-series observations as a 2D range.
- trim (float, optional, default: 0.15): Fraction of observations trimmed from each end when searching for the break date.
- maxlag (int, optional, default: null): Maximum lag included in candidate regressions; null uses the default rule.
- za_regression (str, optional, default: "c"): Deterministic specification: c (constant), t (trend), or ct (constant and trend).
- za_autolag (str, optional, default: "AIC"): Lag-selection criterion: AIC, BIC, t-stat, or none.
Returns (list[list]): 2D key-value table summarizing Zivot-Andrews test outputs.
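To make the effect of `trim` concrete, the sketch below computes which observation indices remain eligible as break dates. It is only illustrative: the helper name `break_search_window` is hypothetical, and the assumption that the trimmed count is `n_obs * trim` truncated to an integer may not match the underlying library's exact rounding or endpoints.

```python
def break_search_window(n_obs, trim=0.15):
    """Return (first, last) candidate break indices, 0-based and inclusive.

    Assumption: the number of observations dropped at each end is
    n_obs * trim truncated to an integer. The real implementation may
    round or bound the window differently.
    """
    if trim < 0 or trim > 0.333:
        raise ValueError("trim must be between 0 and 0.333")
    trimmed = int(n_obs * trim)
    first, last = trimmed, n_obs - trimmed - 1
    if first > last:
        raise ValueError("no candidate break dates remain after trimming")
    return first, last


print(break_search_window(15))  # (2, 12)
print(break_search_window(30))  # (4, 25)
```

Under this assumption, the break indices reported in the examples on this page (3, 4, and 17) fall inside the corresponding windows.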
Example 1: Zivot-Andrews with default options
Inputs:
| x | trim | maxlag | za_regression | za_autolag |
|---|---|---|---|---|
| 1, 1.1, 1.2, 1.25, 1.3, 1.35, 1.33, 1.4, 1.45, 1.5, 1.55, 1.58, 1.6, 1.65, 1.7 | 0.15 | | c | AIC |
Excel formula:
=ZIVOT_ANDREWS({1,1.1,1.2,1.25,1.3,1.35,1.33,1.4,1.45,1.5,1.55,1.58,1.6,1.65,1.7}, 0.15, , "c", "AIC")
Expected output:
| Result | |
|---|---|
| za_stat | NaN |
| p_value | NaN |
| base_lag | 4 |
| break_index | 4 |
| critical_1% | -5.27644 |
| critical_5% | -4.81067 |
| critical_10% | -4.56618 |
Example 2: Zivot-Andrews using trend-only regression
Inputs:
| x | trim | maxlag | za_regression | za_autolag |
|---|---|---|---|---|
| 2, 2.04, 2.08, 2.12, 2.16, 2.2, 2.24, 2.28, 2.32, 2.36, 2.4, 2.44, 2.48, 2.52, 2.56, 2.6, 2.66, 2.72, 2.78, 2.84, 2.9, 2.96, 3.02, 3.08, 3.14, 3.2, 3.26, 3.32, 3.38, 3.44 | 0.15 | 1 | t | AIC |
Excel formula:
=ZIVOT_ANDREWS({2,2.04,2.08,2.12,2.16,2.2,2.24,2.28,2.32,2.36,2.4,2.44,2.48,2.52,2.56,2.6,2.66,2.72,2.78,2.84,2.9,2.96,3.02,3.08,3.14,3.2,3.26,3.32,3.38,3.44}, 0.15, 1, "t", "AIC")
Expected output:
| Result | |
|---|---|
| za_stat | NaN |
| p_value | NaN |
| base_lag | 1 |
| break_index | 17 |
| critical_1% | -5.03421 |
| critical_5% | -4.4058 |
| critical_10% | -4.13678 |
Example 3: Zivot-Andrews using constant and trend regression
Inputs:
| x | trim | maxlag | za_regression | za_autolag |
|---|---|---|---|---|
| 3, 3.05, 3.08, 3.12, 3.2, 3.25, 3.3, 3.38, 3.42, 3.5, 3.55, 3.6, 3.66, 3.72, 3.8 | 0.15 | 2 | ct | t-stat |
Excel formula:
=ZIVOT_ANDREWS({3,3.05,3.08,3.12,3.2,3.25,3.3,3.38,3.42,3.5,3.55,3.6,3.66,3.72,3.8}, 0.15, 2, "ct", "t-stat")
Expected output:
| Result | |
|---|---|
| za_stat | -5.0866 |
| p_value | 0.0485155 |
| base_lag | 0 |
| break_index | 3 |
| critical_1% | -5.57556 |
| critical_5% | -5.07332 |
| critical_10% | -4.82668 |
Example 4: Zivot-Andrews with fixed max lag and no autolag
Inputs:
| x | trim | maxlag | za_regression | za_autolag |
|---|---|---|---|---|
| 4, 4.03, 4.07, 4.1, 4.13, 4.17, 4.2, 4.24, 4.27, 4.31, 4.34, 4.38, 4.41, 4.45, 4.48, 4.52, 4.55, 4.6, 4.66, 4.71, 4.77, 4.82, 4.88, 4.93, 4.99, 5.04, 5.1, 5.15, 5.21, 5.26 | 0.15 | 1 | c | none |
Excel formula:
=ZIVOT_ANDREWS({4,4.03,4.07,4.1,4.13,4.17,4.2,4.24,4.27,4.31,4.34,4.38,4.41,4.45,4.48,4.52,4.55,4.6,4.66,4.71,4.77,4.82,4.88,4.93,4.99,5.04,5.1,5.15,5.21,5.26}, 0.15, 1, "c", "none")
Expected output:
| Result | |
|---|---|
| za_stat | -1.51838 |
| p_value | 0.999 |
| base_lag | 1 |
| break_index | 17 |
| critical_1% | -5.27644 |
| critical_5% | -4.81067 |
| critical_10% | -4.56618 |
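The expected outputs above are read by comparing `za_stat` with the critical values: the unit-root null is rejected at a given level when the statistic is more negative than that level's critical value. The following minimal sketch applies that rule to the numbers from Examples 3 and 4; the helper name `rejection_level` is hypothetical and not part of the function itself, and a NaN statistic (as in Examples 1 and 2) is treated as inconclusive.

```python
import math


def rejection_level(za_stat, critical_values):
    """Return the strictest level at which the unit-root null is rejected.

    critical_values maps labels such as "1%" to critical statistics.
    Returns None when the null is not rejected or the statistic is NaN.
    """
    if za_stat is None or math.isnan(za_stat):
        return None
    # Check the strictest level first; stop at the first one that rejects.
    for level in ("1%", "5%", "10%"):
        if level in critical_values and za_stat < critical_values[level]:
            return level
    return None


example_3 = {"1%": -5.57556, "5%": -5.07332, "10%": -4.82668}
example_4 = {"1%": -5.27644, "5%": -4.81067, "10%": -4.56618}

print(rejection_level(-5.0866, example_3))       # 5%
print(rejection_level(-1.51838, example_4))      # None
print(rejection_level(float("nan"), example_4))  # None
```

For Example 3 this agrees with the reported p-value of 0.0485155, which lies just below 0.05.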
Python Code
import numpy as np
from statsmodels.tsa.stattools import zivot_andrews as sm_zivot_andrews


def zivot_andrews(x, trim=0.15, maxlag=None, za_regression='c', za_autolag='AIC'):
    """
    Run the Zivot-Andrews unit-root test allowing one endogenous structural break.

    See: https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.zivot_andrews.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        x (list[list]): Time-series observations as a 2D range.
        trim (float, optional): Fraction of observations trimmed from each end when searching for the break date. Default is 0.15.
        maxlag (int, optional): Maximum lag included in candidate regressions; None uses the default rule. Default is None.
        za_regression (str, optional): Deterministic specification, c, t, or ct. Valid options: Constant, Trend, Constant+Trend. Default is 'c'.
        za_autolag (str, optional): Lag-selection criterion (AIC, BIC, t-stat, or none). Valid options: AIC, BIC, t-stat, None. Default is 'AIC'.

    Returns:
        list[list]: 2D key-value table summarizing Zivot-Andrews test outputs.
    """
    try:
        def to1d(values):
            # Flatten a scalar, 1D list, or 2D range into a list of floats,
            # silently skipping cells that cannot be converted.
            if isinstance(values, list):
                if all(isinstance(row, list) for row in values):
                    raw = [item for row in values for item in row]
                else:
                    raw = values
            else:
                raw = [values]
            out = []
            for item in raw:
                try:
                    out.append(float(item))
                except (TypeError, ValueError):
                    continue
            return out

        # Validate the options before touching the data.
        if trim < 0 or trim > 0.333:
            return "Error: trim must be between 0 and 0.333"
        if za_regression not in ("c", "t", "ct"):
            return "Error: regression must be 'c', 't', or 'ct'"

        # Map the spreadsheet-friendly "none" onto Python's None.
        if za_autolag in ("none", "None", "", None):
            autolag_arg = None
        elif za_autolag in ("AIC", "BIC", "t-stat"):
            autolag_arg = za_autolag
        else:
            return "Error: autolag must be 'AIC', 'BIC', 't-stat', or 'none'"

        series = to1d(x)
        if len(series) < 10:
            return "Error: x must contain at least ten numeric values"

        result = sm_zivot_andrews(
            np.asarray(series, dtype=float),
            trim=trim,
            maxlag=maxlag,
            regression=za_regression,
            autolag=autolag_arg,
        )

        # Unpack: statistic, p-value, critical values, lag, break index.
        za_stat = float(result[0])
        p_value = float(result[1])
        crit_values = result[2]
        base_lag = int(result[3])
        break_index = int(result[4])

        rows = [
            ["za_stat", za_stat],
            ["p_value", p_value],
            ["base_lag", base_lag],
            ["break_index", break_index],
        ]
        if isinstance(crit_values, dict):
            for key in ("1%", "5%", "10%"):
                if key in crit_values:
                    rows.append([f"critical_{key}", float(crit_values[key])])
        return rows
    except Exception as e:
        return f"Error: {str(e)}"
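Because the function returns either a key-value table or a string beginning with "Error:", callers may want to normalize the result before using it. The sketch below shows one possible way to do that; `parse_za_result` is a hypothetical helper, and the sample rows reuse the values from Example 3 rather than a live call.

```python
def parse_za_result(result):
    """Convert the returned key-value rows to a dict.

    Raises ValueError when the function reported an error string
    instead of a table.
    """
    if isinstance(result, str):
        raise ValueError(result)
    return {str(key): value for key, value in result}


# Rows copied from Example 3 (not produced by running the test here).
rows = [
    ["za_stat", -5.0866],
    ["p_value", 0.0485155],
    ["base_lag", 0],
    ["break_index", 3],
    ["critical_1%", -5.57556],
    ["critical_5%", -5.07332],
    ["critical_10%", -4.82668],
]

summary = parse_za_result(rows)
print(summary["break_index"])  # 3
print(summary["critical_5%"])  # -5.07332
```

Raising on the error string keeps a failed call from being mistaken for a valid table further downstream.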