WELCH_ANOVA
Welch ANOVA tests equality of means across groups without assuming homoscedasticity.
It extends one-way ANOVA by using a weighted F-ratio and adjusted degrees of freedom, improving robustness when group variances differ.
This wrapper accepts tabular data with headers and returns the Welch ANOVA result table.
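The weighted F-ratio and adjusted degrees of freedom described above can be sketched with the standard library alone. This is a minimal illustration of the textbook Welch formulas, not the code Pingouin runs; the helper name `welch_f` is hypothetical, and the p-value is left out because it needs an F-distribution CDF.

```python
from statistics import mean, variance

def welch_f(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA (hypothetical helper)."""
    k = len(groups)
    n = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Weight each group by n_j / s_j^2, using the sample variance (ddof = 1).
    w = [n_j / variance(g) for n_j, g in zip(n, groups)]
    w_sum = sum(w)
    # Variance-weighted grand mean.
    grand = sum(w_j * m_j for w_j, m_j in zip(w, means)) / w_sum
    # Weighted between-group mean square.
    numerator = sum(w_j * (m_j - grand) ** 2 for w_j, m_j in zip(w, means)) / (k - 1)
    # Correction term shared by the F denominator and the adjusted df.
    lam = sum((1 - w_j / w_sum) ** 2 / (n_j - 1) for w_j, n_j in zip(w, n))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    df2 = (k ** 2 - 1) / (3 * lam)
    return numerator / denominator, k - 1, df2

# Data from Example 3 on this page.
f_stat, df1, df2 = welch_f([[2.1, 2.0, 2.2], [3.0, 3.2, 3.1], [4.0, 4.1, 3.9]])
```

With the Example 3 data, this sketch gives values that agree with the F and ddof2 columns reported in that example.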
Excel Usage
=WELCH_ANOVA(data, dv, between)
data (list[list], required): Input table where the first row contains column names.
dv (str, required): Name of the dependent-variable column.
between (str, required): Name of the between-subject factor column.
Returns (list[list]): 2D table containing Welch ANOVA results.
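To make the roles of the three arguments concrete, the sketch below shows one way a header-first table could be split into per-group samples. The helper `split_groups` is hypothetical and only illustrates how `dv` and `between` select columns; the wrapper itself hands the table to pandas and Pingouin instead.

```python
def split_groups(table, dv, between):
    """Group the dv values by their between-factor label (hypothetical helper)."""
    headers = [str(h).strip() for h in table[0]]
    dv_idx = headers.index(dv)            # raises ValueError if dv is missing
    between_idx = headers.index(between)  # raises ValueError if between is missing
    groups = {}
    for row in table[1:]:
        groups.setdefault(row[between_idx], []).append(row[dv_idx])
    return groups

table = [["score", "group"], [4.1, "A"], [4.0, "A"], [5.2, "B"]]
groups = split_groups(table, "score", "group")
```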
Example 1: Three groups with unequal variances
Inputs:
| data | | dv | between |
|---|---|---|---|
| score | group | score | group |
| 4.1 | A | | |
| 4 | A | | |
| 4.3 | A | | |
| 5.2 | B | | |
| 6 | B | | |
| 5.8 | B | | |
| 7.1 | C | | |
| 8.4 | C | | |
| 7.7 | C | | |
Excel formula:
=WELCH_ANOVA({"score","group";4.1,"A";4,"A";4.3,"A";5.2,"B";6,"B";5.8,"B";7.1,"C";8.4,"C";7.7,"C"}, "score", "group")
Expected output:
| Source | ddof1 | ddof2 | F | p_unc | np2 |
|---|---|---|---|---|---|
| group | 2 | 3.0982 | 47.2455 | 0.00477524 | 0.940448 |
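The np2 column is Pingouin's partial eta-squared. The sketch below assumes it is computed from the classical (unweighted) one-way sums of squares, SS_between / (SS_between + SS_within); that assumption is checked here only against this example's numbers, and the helper name is hypothetical.

```python
from statistics import mean

def partial_eta_squared(groups):
    """Effect size from classical one-way sums of squares (hypothetical helper)."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    # Between-group sum of squares around the unweighted grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-group (residual) sum of squares.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return ss_between / (ss_between + ss_within)

# Data from Example 1.
np2 = partial_eta_squared([[4.1, 4.0, 4.3], [5.2, 6.0, 5.8], [7.1, 8.4, 7.7]])
```

For the Example 1 data this agrees with the np2 value in the table above to the printed precision.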
Example 2: Two-group Welch comparison through ANOVA interface
Inputs:
| data | | dv | between |
|---|---|---|---|
| response | arm | response | arm |
| 12.2 | control | | |
| 11.9 | control | | |
| 12.1 | control | | |
| 14.8 | treat | | |
| 15.5 | treat | | |
| 14.9 | treat | | |
Excel formula:
=WELCH_ANOVA({"response","arm";12.2,"control";11.9,"control";12.1,"control";14.8,"treat";15.5,"treat";14.9,"treat"}, "response", "arm")
Expected output:
| Source | ddof1 | ddof2 | F | p_unc | np2 |
|---|---|---|---|---|---|
| arm | 1 | 2.63435 | 162 | 0.00193996 | 0.975904 |
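With only two groups, Welch ANOVA reduces to Welch's unequal-variance t-test: F equals t squared and ddof2 equals the Welch-Satterthwaite degrees of freedom. The sketch below checks this on the Example 2 data using only the standard library; the helper `welch_t` is hypothetical and is not part of the wrapper.

```python
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test (hypothetical helper)."""
    # Squared standard errors of each group mean.
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(b) - mean(a)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Data from Example 2.
t, df = welch_t([12.2, 11.9, 12.1], [14.8, 15.5, 14.9])
f_equiv = t ** 2
```

Here `f_equiv` and `df` agree with the F and ddof2 columns of the table above.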
Example 3: Balanced groups with distinct means
Inputs:
| data | | dv | between |
|---|---|---|---|
| y | g | y | g |
| 2.1 | G1 | | |
| 2 | G1 | | |
| 2.2 | G1 | | |
| 3 | G2 | | |
| 3.2 | G2 | | |
| 3.1 | G2 | | |
| 4 | G3 | | |
| 4.1 | G3 | | |
| 3.9 | G3 | | |
Excel formula:
=WELCH_ANOVA({"y","g";2.1,"G1";2,"G1";2.2,"G1";3,"G2";3.2,"G2";3.1,"G2";4,"G3";4.1,"G3";3.9,"G3"}, "y", "g")
Expected output:
| Source | ddof1 | ddof2 | F | p_unc | np2 |
|---|---|---|---|---|---|
| g | 2 | 4 | 232.286 | 0.0000728733 | 0.989051 |
Example 4: Mild group differences
Inputs:
| data | | dv | between |
|---|---|---|---|
| metric | cluster | metric | cluster |
| 9.8 | K1 | | |
| 10.1 | K1 | | |
| 10 | K1 | | |
| 10.5 | K2 | | |
| 10.7 | K2 | | |
| 10.6 | K2 | | |
| 11 | K3 | | |
| 11.1 | K3 | | |
| 10.9 | K3 | | |
Excel formula:
=WELCH_ANOVA({"metric","cluster";9.8,"K1";10.1,"K1";10,"K1";10.5,"K2";10.7,"K2";10.6,"K2";11,"K3";11.1,"K3";10.9,"K3"}, "metric", "cluster")
Expected output:
| Source | ddof1 | ddof2 | F | p_unc | np2 |
|---|---|---|---|---|---|
| cluster | 2 | 3.89226 | 41.6337 | 0.00235779 | 0.949482 |
Python Code
import pandas as pd
from pingouin import welch_anova as pg_welch_anova


def welch_anova(data, dv, between):
    """
    Perform Welch ANOVA for unequal variances using Pingouin.

    See: https://pingouin-stats.org/build/html/generated/pingouin.welch_anova.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): Input table where the first row contains column names.
        dv (str): Name of the dependent-variable column.
        between (str): Name of the between-subject factor column.

    Returns:
        list[list]: 2D table containing Welch ANOVA results.
    """
    try:
        def to2d(x):
            return [[x]] if not isinstance(x, list) else x

        def build_dataframe(table):
            table = to2d(table)
            if not isinstance(table, list) or not table or not all(isinstance(row, list) for row in table):
                return None, "Error: data must be a non-empty 2D list"
            if len(table) < 2:
                return None, "Error: data must include a header row and at least one data row"
            headers = [str(h).strip() for h in table[0]]
            if any(h == "" for h in headers):
                return None, "Error: header row contains empty column names"
            if len(set(headers)) != len(headers):
                return None, "Error: header row contains duplicate column names"
            rows = []
            for row in table[1:]:
                if len(row) != len(headers):
                    return None, "Error: all data rows must match header width"
                rows.append([None if cell == "" else cell for cell in row])
            return pd.DataFrame(rows, columns=headers), None

        def dataframe_to_2d(df):
            out = [list(df.columns)]
            for values in df.itertuples(index=False, name=None):
                row = []
                for value in values:
                    if pd.isna(value):
                        row.append("")
                    elif isinstance(value, bool):
                        row.append(value)
                    elif isinstance(value, (int, float)):
                        row.append(float(value))
                    else:
                        row.append(str(value))
                out.append(row)
            return out

        frame, error = build_dataframe(data)
        if error:
            return error
        if dv not in frame.columns:
            return f"Error: dv column '{dv}' not found"
        if between not in frame.columns:
            return f"Error: between column '{between}' not found"
        result = pg_welch_anova(data=frame, dv=dv, between=between)
        return dataframe_to_2d(result)
    except Exception as e:
        return f"Error: {str(e)}"