LOGISTIC_CLS

Logistic regression is a linear classification method that models class membership probabilities from a weighted combination of the input features. For a binary classification problem, the probability of the positive class is modeled using the sigmoid function:

P(y=1 \mid x) = \frac{1}{1 + \exp(-(w^T x + b))}
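
As a quick check of the formula above, here is a minimal sketch that evaluates the sigmoid by hand. The helper name `positive_class_probability` is hypothetical and not part of the wrapper. The coefficient and intercept are the rounded values reported in Example 2 on this page.

```python
import math

def positive_class_probability(x, w, b):
    """Evaluate P(y=1 | x) = 1 / (1 + exp(-(w.x + b))) for one sample."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

# Rounded coefficient and intercept as reported in Example 2
w, b = [1.16081], -0.92865

p_first = positive_class_probability([0.0], w, b)  # first sample, x = 0
p_last = positive_class_probability([1.6], w, b)   # last sample, x = 1.6
```

Up to rounding, both values should agree with the second column of the `probabilities` array in Example 2, which holds the probability of class 1.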

The model parameters are estimated by minimizing the cross-entropy loss. A regularization penalty (L1 or L2) on the coefficients is added to that loss; it stabilizes the fitted coefficients and reduces overfitting on small or collinear datasets.
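
To make the loss concrete, here is a minimal sketch of one common way to write an L2-penalized cross-entropy objective, with C multiplying the data term. The function name and the exact scaling are assumptions for illustration; this page does not state the form the underlying estimator uses internally.

```python
import math

def l2_penalized_cross_entropy(w, b, X, y, C=1.0):
    """Assumed objective: C * sum(cross-entropy) + 0.5 * ||w||^2, for labels in {0, 1}."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi)) + b
        p = 1.0 / (1.0 + math.exp(-z))
        # Cross-entropy contribution of one sample
        total += -(yi * math.log(p) + (1 - yi) * math.log(1.0 - p))
    penalty = 0.5 * sum(wj * wj for wj in w)  # the intercept is not penalized here
    return C * total + penalty

# Data from Example 2 and its reported (rounded) fit
X = [[0.0], [0.2], [0.4], [1.2], [1.4], [1.6]]
y = [0, 0, 0, 1, 1, 1]
loss_at_zero = l2_penalized_cross_entropy([0.0], 0.0, X, y)
loss_at_fit = l2_penalized_cross_entropy([1.16081], -0.92865, X, y)
```

In this sketch the parameters reported in Example 2 score lower than the all-zero model and lower than nearby coefficient values. That is consistent with the assumed form but does not prove it. A smaller C shrinks the data term relative to the penalty, which is why smaller values mean stronger regularization.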

This wrapper accepts tabular feature data with rows as samples and columns as features, plus a target supplied as a single row or single column. It returns the training accuracy together with the sample, feature, and class counts, the learned classes, the fitted predictions, per-class prediction counts, class probabilities, and the fitted coefficients and intercepts.

Excel Usage

=LOGISTIC_CLS(data, target, penalty, C, solver, max_iter, fit_intercept, random_state)
  • data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
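  • target (list[list], required): Target labels as a single row or single column with one label per sample (a scalar is accepted only when there is one sample). Labels may be strings, booleans, or finite numbers, and must include at least two distinct classes.
  • penalty (str, optional, default: “l2”): Regularization penalty applied to the logistic model. Valid options: “l2”, “l1”. The “l1” penalty requires the “liblinear” or “saga” solver.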
  • C (float, optional, default: 1): Inverse regularization strength. Smaller values apply stronger regularization.
  • solver (str, optional, default: “lbfgs”): Optimization algorithm used to fit the classifier. Valid options: “lbfgs”, “liblinear”, “saga”.
  • max_iter (int, optional, default: 200): Maximum number of solver iterations.
  • fit_intercept (bool, optional, default: true): Whether to include an intercept term in the linear decision function.
  • random_state (int, optional, default: null): Integer seed for solvers that use randomness. Leave blank for the estimator default.

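Returns (dict): Excel data type whose basic value is the training accuracy and whose properties hold the predictions, probabilities, and fitted coefficient arrays. If the inputs are invalid or the fit fails, a string beginning with "Error:" is returned instead.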
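
The returned structure nests every value inside `type` / `basicValue` / `elements` wrappers, as the expected outputs in the examples show. Here is a minimal sketch of how it could be unpacked into plain Python values. The helpers `unwrap` and `unwrap_result` are hypothetical and not part of the wrapper.

```python
def unwrap(node):
    """Convert one wrapped cell or array node to a plain Python value."""
    if node.get("type") == "Array":
        return [[unwrap(cell) for cell in row] for row in node["elements"]]
    return node["basicValue"]

def unwrap_result(result):
    """Return {property name: plain value}; raise if the wrapper returned an error string."""
    if isinstance(result, str):
        raise ValueError(result)
    return {name: unwrap(node) for name, node in result["properties"].items()}

# A trimmed result in the same shape as the expected outputs on this page
sample = {
    "type": "Double",
    "basicValue": 1,
    "properties": {
        "accuracy": {"type": "Double", "basicValue": 1},
        "classes": {"type": "Array", "elements": [
            [{"type": "Double", "basicValue": 0}],
            [{"type": "Double", "basicValue": 1}],
        ]},
    },
}
plain = unwrap_result(sample)
```

Arrays come back as lists of rows, so a single-column property such as `classes` becomes `[[0], [1]]`.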

Example 1: Fit logistic regression for two string-labeled classes

Inputs:

data      target  penalty  C  solver  max_iter  fit_intercept  random_state
0    0    cold    l2       1  lbfgs   200       true           0
0    1    cold
1    0    cold
1    1    hot
2    1    hot
2    2    hot

Excel formula:

=LOGISTIC_CLS({0,0;0,1;1,0;1,1;2,1;2,2}, {"cold";"cold";"cold";"hot";"hot";"hot"}, "l2", 1, "lbfgs", 200, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"cold"},{"type":"Double","basicValue":3}],[{"type":"String","basicValue":"hot"},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.812399},{"type":"Double","basicValue":0.187601}],[{"type":"Double","basicValue":0.682647},{"type":"Double","basicValue":0.317353}],[{"type":"Double","basicValue":0.635306},{"type":"Double","basicValue":0.364694}],[{"type":"Double","basicValue":0.463897},{"type":"Double","basicValue":0.536103}],[{"type":"Double","basicValue":0.258211},{"type":"Double","basicValue":0.741789}],[{"type":"Double","basicValue":0.147418},{"type":"Double","basicValue":0.852582}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":0.910626},{"type":"Double","basicValue":0.699711}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.46567}]]}}}

Example 2: Classify one-dimensional samples with numeric labels

Inputs:

data  target  penalty  C  solver  max_iter  fit_intercept  random_state
0     0       l2       1  lbfgs   200       true           0
0.2   0
0.4   0
1.2   1
1.4   1
1.6   1

Excel formula:

=LOGISTIC_CLS({0;0.2;0.4;1.2;1.4;1.6}, {0;0;0;1;1;1}, "l2", 1, "lbfgs", 200, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}]]},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.716801},{"type":"Double","basicValue":0.283199}],[{"type":"Double","basicValue":0.667409},{"type":"Double","basicValue":0.332591}],[{"type":"Double","basicValue":0.61404},{"type":"Double","basicValue":0.38596}],[{"type":"Double","basicValue":0.38596},{"type":"Double","basicValue":0.61404}],[{"type":"Double","basicValue":0.332591},{"type":"Double","basicValue":0.667409}],[{"type":"Double","basicValue":0.283199},{"type":"Double","basicValue":0.716801}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.16081}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.92865}]]}}}

Example 3: Fit a three-class logistic model on separated groups

Inputs:

data        target  penalty  C  solver  max_iter  fit_intercept  random_state
0    0      left    l2       1  lbfgs   200       true           0
0.2  0.1    left
4    4      center
4.2  3.9    center
8    0      right
8.2  0.1    right

Excel formula:

=LOGISTIC_CLS({0,0;0.2,0.1;4,4;4.2,3.9;8,0;8.2,0.1}, {"left";"left";"center";"center";"right";"right"}, "l2", 1, "lbfgs", 200, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":3},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"right"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"right"}],[{"type":"String","basicValue":"right"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"center"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"left"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"right"},{"type":"Double","basicValue":2}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.0391472},{"type":"Double","basicValue":0.949817},{"type":"Double","basicValue":0.0110358}],[{"type":"Double","basicValue":0.0477367},{"type":"Double","basicValue":0.938711},{"type":"Double","basicValue":0.013552}],[{"type":"Double","basicValue":0.917197},{"type":"Double","basicValue":0.0437528},{"type":"Double","basicValue":0.0390499}],[{"type":"Double","basicValue":0.909183},{"type":"Double","basicValue":0.0430583},{"type":"Double","basicValue":0.0477584}],[{"type":"Double","basicValue":0.043504},{"type":"Double","basicValue":0.0136633},{"type":"Double","basicValue":0.942833}],[{"type":"Double","basicValue":0.0433284},{"type":"Double","basicValue":0.0110291},{"type":"Double","basicValue":0.945642}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":0.000202248},{"type":"Double","basicValue":0.676631}],[{"type":"Double","basicValue":-0.543183},{"type":"Double","basicValue":-0.337911}],[{"type":"Double","basicValue":0.54298},{"type":"Double","basicValue":-0.33872}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.640917}],[{"type":"Double","basicValue":2.54802}],[{"type":"Double","basicValue":-1.90711}]]}}}
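
With three classes the output has one coefficient row and one intercept per class. The probabilities in Example 3 are consistent with applying a softmax to the per-class linear scores, which the sketch below checks by hand. The helper name `softmax_probabilities` is hypothetical, and the numbers are the rounded values from the expected output above, with rows in class order center, left, right.

```python
import math

def softmax_probabilities(x, coefficients, intercepts):
    """Softmax over per-class linear scores w_k.x + b_k for one sample."""
    scores = [
        sum(wj * xj for wj, xj in zip(w, x)) + b
        for w, b in zip(coefficients, intercepts)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Rounded values reported in Example 3 (class order: center, left, right)
coefficients = [[0.000202248, 0.676631], [-0.543183, -0.337911], [0.54298, -0.33872]]
intercepts = [-0.640917, 2.54802, -1.90711]

p_origin = softmax_probabilities([0.0, 0.0], coefficients, intercepts)
p_center = softmax_probabilities([4.0, 4.0], coefficients, intercepts)
```

For the sample at (0, 0) the largest probability falls on "left", and for (4, 4) it falls on "center", matching the first and third rows of the `probabilities` array up to rounding.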

Example 4: Flatten a single-row boolean target range

Inputs:

data  target                              penalty  C  solver  max_iter  fit_intercept  random_state
0     false false false true true true    l2       1  lbfgs   200       true           0
0.3
0.6
1.4
1.7
2

Excel formula:

=LOGISTIC_CLS({0;0.3;0.6;1.4;1.7;2}, {FALSE,FALSE,FALSE,TRUE,TRUE,TRUE}, "l2", 1, "lbfgs", 200, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}]]},"predictions":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Boolean","basicValue":false},{"type":"Double","basicValue":3}],[{"type":"Boolean","basicValue":true},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.7675},{"type":"Double","basicValue":0.2325}],[{"type":"Double","basicValue":0.697605},{"type":"Double","basicValue":0.302395}],[{"type":"Double","basicValue":0.617178},{"type":"Double","basicValue":0.382822}],[{"type":"Double","basicValue":0.382734},{"type":"Double","basicValue":0.617266}],[{"type":"Double","basicValue":0.302317},{"type":"Double","basicValue":0.697683}],[{"type":"Double","basicValue":0.232434},{"type":"Double","basicValue":0.767566}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.19443}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.19425}]]}}}
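
Example 4 passes the target as a single row rather than a single column. Here is a simplified sketch of the shape normalization that `parse_target` performs in the Python code on this page. The helper name `flatten_target` is hypothetical, and the label validation steps are omitted.

```python
def flatten_target(target):
    """Normalize a scalar, flat list, single row, or single column to a flat list."""
    if not isinstance(target, list):
        return [target]                      # scalar
    if all(not isinstance(item, list) for item in target):
        return list(target)                  # already flat
    if len(target) == 1:
        return list(target[0])               # single row: [[a, b, c]]
    if all(isinstance(row, list) and len(row) == 1 for row in target):
        return [row[0] for row in target]    # single column: [[a], [b], [c]]
    raise ValueError("target must be a single row or column")

as_row = flatten_target([[False, False, False, True, True, True]])
as_column = flatten_target([[False], [False], [False], [True], [True], [True]])
```

Both shapes yield the same flat list of six labels, so the row form in Example 4 is treated the same way as a column would be.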

Python Code

import numpy as np
from sklearn.linear_model import LogisticRegression as SklearnLogisticRegression

def logistic_cls(data, target, penalty='l2', C=1, solver='lbfgs', max_iter=200, fit_intercept=True, random_state=None):
    """
    Fit a regularized logistic regression classifier and return training predictions.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
        target (list[list]): Target labels as a single row, single column, or scalar when only one sample is present.
        penalty (str, optional): Regularization penalty applied to the logistic model. Valid options: 'l2', 'l1' (case-insensitive). Default is 'l2'.
        C (float, optional): Inverse regularization strength. Smaller values apply stronger regularization. Default is 1.
        solver (str, optional): Optimization algorithm used to fit the classifier. Valid options: 'lbfgs', 'liblinear', 'saga' (case-insensitive); 'l1' requires 'liblinear' or 'saga'. Default is 'lbfgs'.
        max_iter (int, optional): Maximum number of solver iterations. Default is 200.
        fit_intercept (bool, optional): Whether to include an intercept term in the linear decision function. Default is True.
        random_state (int, optional): Integer seed for solvers that use randomness. Leave blank for the estimator default. Default is None.

    Returns:
        dict: Excel data type containing training accuracy, predictions, probabilities, and fitted coefficient arrays.
    """
    def py(value):
        # Convert NumPy scalars (np.float64, np.bool_, ...) to native Python values.
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        # Wrap one scalar as an Excel data-type cell.
        value = py(value)
        # bool is checked first because bool is a subclass of int in Python.
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": value}
        if isinstance(value, (int, float)):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        return data_np, None

    def parse_target(value, sample_count):
        # Normalize the accepted target shapes to a flat list of labels.
        if not isinstance(value, list):
            labels = [value]  # scalar
        elif not value:
            return None, "Error: target must be non-empty"
        elif all(not isinstance(item, list) for item in value):
            labels = value  # already a flat list
        elif len(value) == 1:
            labels = value[0]  # single row: [[a, b, c]]
        elif all(isinstance(row, list) and len(row) == 1 for row in value):
            labels = [row[0] for row in value]  # single column: [[a], [b], [c]]
        else:
            return None, "Error: target must be a single row or column"

        if len(labels) != sample_count:
            return None, "Error: target length must match sample count"

        parsed = []
        classes = []
        for item in labels:
            item = py(item)
            if isinstance(item, str):
                if not item.strip():
                    return None, "Error: target labels must not be blank"
            elif isinstance(item, bool):
                item = bool(item)
            elif isinstance(item, (int, float)) and not isinstance(item, bool):
                if not np.isfinite(float(item)):
                    return None, "Error: target labels must be finite"
                item = float(item) if isinstance(item, float) else int(item)
            else:
                return None, "Error: target labels must be scalar string, boolean, or numeric values"
            parsed.append(item)
            if not any(type(existing) is type(item) and existing == item for existing in classes):
                classes.append(item)

        if len(classes) < 2:
            return None, "Error: target must contain at least 2 classes"
        return parsed, None

    def count_table(predictions, classes):
        rows = [[{"type": "String", "basicValue": "class"}, {"type": "String", "basicValue": "count"}]]
        for class_label in classes:
            count = sum(type(prediction) is type(class_label) and prediction == class_label for prediction in predictions)
            rows.append([cell(class_label), {"type": "Double", "basicValue": float(count)}])
        return rows

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        target_values, error = parse_target(target, data_np.shape[0])
        if error:
            return error

        penalty_value = str(penalty).strip().lower()
        if penalty_value not in {"l1", "l2"}:
            return "Error: penalty must be 'l1' or 'l2'"

        if float(C) <= 0:
            return "Error: C must be greater than 0"

        solver_value = str(solver).strip().lower()
        if solver_value not in {"lbfgs", "liblinear", "saga"}:
            return "Error: solver must be 'lbfgs', 'liblinear', or 'saga'"
        if penalty_value == "l1" and solver_value not in {"liblinear", "saga"}:
            return "Error: solver must be 'liblinear' or 'saga' when penalty is 'l1'"

        if int(max_iter) < 1:
            return "Error: max_iter must be at least 1"

        fitted = SklearnLogisticRegression(
            penalty=penalty_value,
            C=float(C),
            solver=solver_value,
            max_iter=int(max_iter),
            fit_intercept=bool(fit_intercept),
            random_state=None if random_state in (None, "") else int(random_state)
        ).fit(data_np, target_values)

        prediction_array = fitted.predict(data_np)
        predictions = [py(item) for item in prediction_array.tolist()]
        classes = [py(item) for item in fitted.classes_.tolist()]
        accuracy = float(np.mean([
            type(prediction) is type(actual) and prediction == actual
            for prediction, actual in zip(predictions, target_values)
        ]))

        return {
            "type": "Double",
            "basicValue": accuracy,
            "properties": {
                "accuracy": {"type": "Double", "basicValue": accuracy},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "class_count": {"type": "Double", "basicValue": float(len(classes))},
                "classes": {"type": "Array", "elements": col(classes)},
                "predictions": {"type": "Array", "elements": col(predictions)},
                "prediction_counts": {"type": "Array", "elements": count_table(predictions, classes)},
                "probabilities": {"type": "Array", "elements": mat(fitted.predict_proba(data_np).tolist())},
                "coefficients": {"type": "Array", "elements": mat(fitted.coef_.tolist())},
                "intercepts": {"type": "Array", "elements": col(fitted.intercept_.tolist())}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"
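
The code above compares labels with `type(a) is type(b) and a == b` rather than plain `==`. Here is a small sketch of why that matters. The helper names `same_label` and `unique_labels` are hypothetical and only mirror the comparison used in `parse_target` and `count_table`.

```python
def same_label(a, b):
    """Type-aware equality: True and 1 are different labels, as are 1 and 1.0."""
    return type(a) is type(b) and a == b

def unique_labels(labels):
    """Collect distinct labels in first-seen order using type-aware equality."""
    seen = []
    for item in labels:
        if not any(same_label(existing, item) for existing in seen):
            seen.append(item)
    return seen

mixed = [True, 1, 1.0, "1", True]
distinct = unique_labels(mixed)
```

In Python, `True == 1 == 1.0`, so a plain `set(mixed)` would merge the boolean, integer, and float labels into one entry. The type-aware version keeps them apart, which keeps the validation and the prediction counts from silently treating a boolean label and a numeric label as the same class.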
