LINEAR_REG

Linear regression estimates coefficients that minimize squared residual error between observed targets and a linear combination of the input features. It is the standard baseline regressor for continuous outcomes.

The ordinary least squares algorithm finds coefficients w that solve the unconstrained minimization problem:

\min_{w} ||y - Xw||^2_2

This wrapper accepts tabular feature data with rows as samples and columns as features, plus a numeric target supplied as a single row or single column. It returns the training R^2 together with fitted predictions, residuals, and learned coefficient arrays.

Excel Usage

=LINEAR_REG(data, target, fit_intercept, tol, positive)

data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
target (list[list], required): Numeric target values as a single row, single column, or scalar when only one sample is present.
fit_intercept (bool, optional, default: true): Whether to include an intercept term in the linear model.
tol (float, optional, default: 0.000001): Solver tolerance used when scipy’s LSQR path is active.
positive (bool, optional, default: false): Whether to constrain fitted coefficients to non-negative values.

Returns (dict): Excel data type containing training R^2, predictions, residuals, and fitted coefficient arrays.

Example 1: Fit linear regression on a two-feature linear trend

Inputs:

data		target	fit_intercept	tol	positive
0	0	1	true	0.000001	false
1	0	3
0	1	4
1	1	6
2	1	8
2	2	11

Excel formula:

=LINEAR_REG({0,0;1,0;0,1;1,1;2,1;2,2}, {1;3;4;6;8;11}, TRUE, 0.000001, FALSE)

Expected output:

{"type":"Double","basicValue":1,"properties":{"training_r2":{"type":"Double","basicValue":1},"mean_squared_error":{"type":"Double","basicValue":0},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":4}],[{"type":"Double","basicValue":6}],[{"type":"Double","basicValue":8}],[{"type":"Double","basicValue":11}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2},{"type":"Double","basicValue":3}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"matrix_rank":{"type":"Double","basicValue":2},"singular_values":{"type":"Array","elements":[[{"type":"Double","basicValue":2.34521}],[{"type":"Double","basicValue":1.1547}]]}}}

Example 2: Flatten a single-row numeric target range for linear regression

Inputs:

data	target						fit_intercept	tol	positive
0	1	3	5	7	9	11	true	0.000001	false
1
2
3
4
5

Excel formula:

=LINEAR_REG({0;1;2;3;4;5}, {1,3,5,7,9,11}, TRUE, 0.000001, FALSE)

Expected output:

{"type":"Double","basicValue":1,"properties":{"training_r2":{"type":"Double","basicValue":1},"mean_squared_error":{"type":"Double","basicValue":2.95823e-31},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":5}],[{"type":"Double","basicValue":7}],[{"type":"Double","basicValue":9}],[{"type":"Double","basicValue":11}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":-8.88178e-16}],[{"type":"Double","basicValue":-4.44089e-16}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":8.88178e-16}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"matrix_rank":{"type":"Double","basicValue":1},"singular_values":{"type":"Array","elements":[[{"type":"Double","basicValue":4.1833}]]}}}

Example 3: Fit a no-intercept linear model on a two-feature plane

Inputs:

data		target	fit_intercept	tol	positive
1	0	2	false	0.000001	false
0	1	3
1	1	5
2	1	7
1	2	8
2	2	10

Excel formula:

=LINEAR_REG({1,0;0,1;1,1;2,1;1,2;2,2}, {2;3;5;7;8;10}, FALSE, 0.000001, FALSE)

Expected output:

{"type":"Double","basicValue":1,"properties":{"training_r2":{"type":"Double","basicValue":1},"mean_squared_error":{"type":"Double","basicValue":1.35503e-29},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":5}],[{"type":"Double","basicValue":7}],[{"type":"Double","basicValue":8}],[{"type":"Double","basicValue":10}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":1.9984e-15}],[{"type":"Double","basicValue":8.88178e-16}],[{"type":"Double","basicValue":2.66454e-15}],[{"type":"Double","basicValue":5.32907e-15}],[{"type":"Double","basicValue":3.55271e-15}],[{"type":"Double","basicValue":5.32907e-15}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2},{"type":"Double","basicValue":3}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":0}]]},"matrix_rank":{"type":"Double","basicValue":2},"singular_values":{"type":"Array","elements":[[{"type":"Double","basicValue":4.47214}],[{"type":"Double","basicValue":1.41421}]]}}}

Example 4: Constrain coefficients to be non-negative during fitting

Inputs:

data		target	fit_intercept	tol	positive
0	0	2	true	0.000001	true
1	0	4
0	1	5
1	1	7
2	1	9
2	2	12

Excel formula:

=LINEAR_REG({0,0;1,0;0,1;1,1;2,1;2,2}, {2;4;5;7;9;12}, TRUE, 0.000001, TRUE)

Expected output:

{"type":"Double","basicValue":1,"properties":{"training_r2":{"type":"Double","basicValue":1},"mean_squared_error":{"type":"Double","basicValue":0},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":4}],[{"type":"Double","basicValue":5}],[{"type":"Double","basicValue":7}],[{"type":"Double","basicValue":9}],[{"type":"Double","basicValue":12}]]},"residuals":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":2},{"type":"Double","basicValue":3}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":2}]]}}}

Python Code

import numpy as np
from sklearn.linear_model import LinearRegression as SklearnLinearRegression

def linear_reg(data, target, fit_intercept=True, tol=1e-06, positive=False):
    """
    Fit an ordinary least-squares linear regression model and return training predictions.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
        target (list[list]): Numeric target values as a single row, single column, or scalar when only one sample is present.
        fit_intercept (bool, optional): Whether to include an intercept term in the linear model. Default is True.
        tol (float, optional): Solver tolerance used when scipy's LSQR path is active. Default is 1e-06.
        positive (bool, optional): Whether to constrain fitted coefficients to non-negative values. Default is False.

    Returns:
        dict: Excel data type containing training $R^2$, predictions, residuals, and fitted coefficient arrays.
    """
    def py(value):
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        return data_np, None

    def parse_target(value, sample_count):
        if not isinstance(value, list):
            labels = [value]
        elif not value:
            return None, "Error: target must be non-empty"
        elif all(not isinstance(item, list) for item in value):
            labels = value
        elif len(value) == 1:
            labels = value[0]
        elif all(isinstance(row, list) and len(row) == 1 for row in value):
            labels = [row[0] for row in value]
        else:
            return None, "Error: target must be a single row or column"

        if len(labels) != sample_count:
            return None, "Error: target length must match sample count"

        parsed = []
        for item in labels:
            item = py(item)
            if isinstance(item, bool) or not isinstance(item, (int, float)):
                return None, "Error: target values must be finite numeric scalars"
            if not np.isfinite(float(item)):
                return None, "Error: target values must be finite numeric scalars"
            parsed.append(float(item))
        return np.array(parsed, dtype=float), None

    def flat_float_list(values):
        return [float(py(item)) for item in np.asarray(values).reshape(-1).tolist()]

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        target_np, error = parse_target(target, data_np.shape[0])
        if error:
            return error

        if float(tol) <= 0:
            return "Error: tol must be greater than 0"

        fitted = SklearnLinearRegression(
            fit_intercept=bool(fit_intercept),
            tol=float(tol),
            positive=bool(positive)
        ).fit(data_np, target_np)

        prediction_array = np.asarray(fitted.predict(data_np), dtype=float)
        residual_array = target_np - prediction_array
        predictions = flat_float_list(prediction_array)
        residuals = flat_float_list(residual_array)
        training_r2 = float(fitted.score(data_np, target_np))
        mse = float(np.mean(np.square(residual_array)))

        properties = {
            "training_r2": {"type": "Double", "basicValue": training_r2},
            "mean_squared_error": {"type": "Double", "basicValue": mse},
            "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
            "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
            "predictions": {"type": "Array", "elements": col(predictions)},
            "residuals": {"type": "Array", "elements": col(residuals)},
            "coefficients": {"type": "Array", "elements": mat(np.atleast_2d(fitted.coef_).tolist())},
            "intercepts": {"type": "Array", "elements": col(np.atleast_1d(fitted.intercept_).tolist())}
        }
        if hasattr(fitted, "rank_"):
            properties["matrix_rank"] = {"type": "Double", "basicValue": float(fitted.rank_)}
        if hasattr(fitted, "singular_"):
            properties["singular_values"] = {"type": "Array", "elements": col(np.atleast_1d(fitted.singular_).tolist())}

        return {
            "type": "Double",
            "basicValue": training_r2,
            "properties": properties
        }
    except Exception as e:
        return f"Error: {str(e)}"

Online Calculator

data *

2D array of numeric feature data with rows as samples and columns as features.

target *

Numeric target values as a single row, single column, or scalar when only one sample is present.

fit_intercept

Whether to include an intercept term in the linear model.

tol

Solver tolerance used when scipy's LSQR path is active.

positive

Whether to constrain fitted coefficients to non-negative values.