ANOVA

One-way analysis of variance (ANOVA) tests the null hypothesis that the mean of a dependent variable is the same across all levels of a single categorical factor.

The test compares between-group variability to within-group variability through the F statistic:

F = \frac{\text{MS}_{\text{between}}}{\text{MS}_{\text{within}}}
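
To make the ratio concrete, the sketch below computes the F statistic by hand for the data of Example 1, using only the standard library. It illustrates the textbook formula and is not Pingouin's internal implementation; the variable names are chosen for this example.

```python
from collections import defaultdict

# Data from Example 1 as (score, group) pairs.
observations = [(5.1, "A"), (5.4, "A"), (6.3, "B"), (6.5, "B"), (7.0, "C"), (7.2, "C")]

# Collect the scores of each group.
groups = defaultdict(list)
for score, label in observations:
    groups[label].append(score)

n_total = len(observations)
k = len(groups)
grand_mean = sum(score for score, _ in observations) / n_total

# Between-group sum of squares: spread of group means around the grand mean.
ss_between = sum(
    len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values()
)
# Within-group sum of squares: spread of scores around their own group mean.
ss_within = sum(
    (x - sum(v) / len(v)) ** 2 for v in groups.values() for x in v
)

df_between = k - 1        # numerator degrees of freedom (ddof1)
df_within = n_total - k   # denominator degrees of freedom (ddof2)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(df_between, df_within, round(f_stat, 4))
```

The degrees of freedom and F value should agree with the ddof1, ddof2, and F columns shown in Example 1.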

This wrapper accepts a table with headers, selects the dependent variable and grouping column, and returns the ANOVA table produced by Pingouin.

Excel Usage

=ANOVA(data, dv, between, ss_type, detailed, effsize)
  • data (list[list], required): Input table where the first row contains column names.
  • dv (str, required): Name of the dependent-variable column.
  • between (str, required): Name of the between-subject factor column.
  • ss_type (int, optional, default: 2): Sum-of-squares type (typically 1, 2, or 3).
  • detailed (bool, optional, default: false): Whether to return a detailed ANOVA table.
  • effsize (str, optional, default: "np2"): Effect size metric (for example, np2, n2, ng2).

Returns (list[list]): 2D table containing the ANOVA results.
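
As a rough illustration of the expected input shape, the sketch below shows how a table with a header row can be split into named columns. It assumes that an Excel array constant reaches Python as a list of rows; the `columns` dictionary is a hypothetical helper structure for this example and not part of the wrapper.

```python
# Assumed Python-side form of {"score","group";5.1,"A";5.4,"A";...}:
# one inner list per spreadsheet row, header row first.
data = [
    ["score", "group"],
    [5.1, "A"],
    [5.4, "A"],
    [6.3, "B"],
    [6.5, "B"],
    [7, "C"],
    [7.2, "C"],
]

# Separate the header row from the data rows.
header, *rows = data

# Build a mapping from column name to the values in that column.
columns = {name: [row[i] for row in rows] for i, name in enumerate(header)}

# dv and between are then simply names of columns in this table.
dv, between = "score", "group"
print(columns[dv])
print(columns[between])
```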

Example 1: Balanced one-way design with three groups

Inputs:

data:
score  group
5.1    A
5.4    A
6.3    B
6.5    B
7      C
7.2    C

dv: score
between: group

Excel formula:

=ANOVA({"score","group";5.1,"A";5.4,"A";6.3,"B";6.5,"B";7,"C";7.2,"C"}, "score", "group")

Expected output:

Source  ddof1  ddof2  F        p_unc       np2
group   2      3      61.5882  0.00366618  0.976224

Example 2: Detailed table enabled

Inputs:

data:
score  group
10     X
11     X
9      X
14     Y
15     Y
16     Y

dv: score
between: group
detailed: true

Excel formula:

=ANOVA({"score","group";10,"X";11,"X";9,"X";14,"Y";15,"Y";16,"Y"}, "score", "group", 2, TRUE)

Expected output:

Source  SS    DF  MS    F     p_unc       np2
group   37.5  1   37.5  37.5  0.00360223  0.903614
Within  4     4   1

Example 3: Sum-of-squares type three

Inputs:

data:
score  group
2.1    G1
2.2    G1
3      G2
3.1    G2
4.2    G3
4.4    G3

dv: score
between: group
ss_type: 3

Excel formula:

=ANOVA({"score","group";2.1,"G1";2.2,"G1";3,"G2";3.1,"G2";4.2,"G3";4.4,"G3"}, "score", "group", 3)

Expected output:

Source  ddof1  ddof2  F        p_unc        np2
group   2      3      233.167  0.000511046  0.993608

Example 4: Alternate effect size metric

Inputs:

data:
outcome  condition
1.2      C1
1.4      C1
2        C2
2.2      C2
2.9      C3
3.1      C3

dv: outcome
between: condition
effsize: n2

Excel formula:

=ANOVA({"outcome","condition";1.2,"C1";1.4,"C1";2,"C2";2.2,"C2";2.9,"C3";3.1,"C3"}, "outcome", "condition", 2, FALSE, "n2")

Expected output:

Source     ddof1  ddof2  F        p_unc       n2
condition  2      3      72.3333  0.00289573  0.979684
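
The effect size in Example 4 can be checked by hand. The sketch below computes eta-squared (n2) and partial eta-squared (np2) from the textbook definitions for the Example 4 data; it is an illustration under those definitions, not Pingouin's code. In a one-way design the total sum of squares is the between-group plus the within-group sum of squares, so the two metrics coincide.

```python
# Data from Example 4, keyed by condition.
groups = {"C1": [1.2, 1.4], "C2": [2.0, 2.2], "C3": [2.9, 3.1]}

values = [x for scores in groups.values() for x in scores]
grand_mean = sum(values) / len(values)

# Sums of squares for the factor, the residual, and the whole sample.
ss_between = sum(
    len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
    for scores in groups.values()
)
ss_within = sum(
    (x - sum(scores) / len(scores)) ** 2
    for scores in groups.values()
    for x in scores
)
ss_total = sum((x - grand_mean) ** 2 for x in values)

# Eta-squared: share of total variability explained by the factor.
n2 = ss_between / ss_total
# Partial eta-squared: factor variability relative to factor plus error.
np2 = ss_between / (ss_between + ss_within)

print(round(n2, 6), round(np2, 6))
```

Both values should agree with the n2 column of the expected output above, up to rounding.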

Python Code

import pandas as pd
from pingouin import anova as pg_anova

def anova(data, dv, between, ss_type=2, detailed=False, effsize='np2'):
    """
    Perform one-way ANOVA on tabular data using Pingouin.

    See: https://pingouin-stats.org/build/html/generated/pingouin.anova.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): Input table where the first row contains column names.
        dv (str): Name of the dependent-variable column.
        between (str): Name of the between-subject factor column.
        ss_type (int, optional): Sum-of-squares type (typically 1, 2, or 3). Default is 2.
        detailed (bool, optional): Whether to return a detailed ANOVA table. Default is False.
        effsize (str, optional): Effect size metric (for example, np2, n2, ng2). Default is 'np2'.

    Returns:
        list[list]: 2D table containing the ANOVA results.
    """
    try:
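        # A single value may arrive as a scalar rather than a 2D list;
        # wrap it so the validation below always sees a list of rows.
        def to2d(x):
            return [[x]] if not isinstance(x, list) else x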

        def build_dataframe(table):
            table = to2d(table)
            if not isinstance(table, list) or not table or not all(isinstance(row, list) for row in table):
                return None, "Error: data must be a non-empty 2D list"
            if len(table) < 2:
                return None, "Error: data must include a header row and at least one data row"

            headers = [str(h).strip() for h in table[0]]
            if not headers or any(h == "" for h in headers):
                return None, "Error: header row contains empty column names"
            if len(set(headers)) != len(headers):
                return None, "Error: header row contains duplicate column names"

            rows = []
            for row in table[1:]:
                if len(row) != len(headers):
                    return None, "Error: all data rows must match header width"
                rows.append([None if cell == "" else cell for cell in row])

            return pd.DataFrame(rows, columns=headers), None

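        # Convert the result DataFrame into a header row plus data rows,
        # mapping missing values to "" and numbers to plain floats.
        def dataframe_to_2d(df):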
            out = [list(df.columns)]
            for values in df.itertuples(index=False, name=None):
                row = []
                for value in values:
                    if pd.isna(value):
                        row.append("")
                    elif isinstance(value, bool):
                        row.append(value)
                    elif isinstance(value, (int, float)):
                        row.append(float(value))
                    else:
                        row.append(str(value))
                out.append(row)
            return out

        frame, error = build_dataframe(data)
        if error:
            return error

        if dv not in frame.columns:
            return f"Error: dv column '{dv}' not found"
        if between not in frame.columns:
            return f"Error: between column '{between}' not found"

        result = pg_anova(
            data=frame,
            dv=dv,
            between=between,
            ss_type=int(ss_type),
            detailed=bool(detailed),
            effsize=effsize,
        )

        return dataframe_to_2d(result)
    except Exception as e:
        return f"Error: {str(e)}"
