Classification

GAUSSIAN_NB

Gaussian naive Bayes is a probabilistic classifier that applies Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable:

P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^n P(x_i \mid y)

In Gaussian naive Bayes, the class-conditional likelihood of each feature is assumed to be Gaussian, with per-class mean \mu_{iy} and variance \sigma_{iy}^2:

P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_{iy}^2}} \exp\left(-\frac{(x_i - \mu_{iy})^2}{2\sigma_{iy}^2}\right)

This implementation accepts rows as samples and a target supplied as a single row or single column. It returns training accuracy together with predicted labels, class counts, fitted probabilities, class priors, and per-class Gaussian parameters.
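The fitted theta and variances arrays returned by the function are simply the per-class feature means and population variances (before smoothing). A minimal NumPy sketch on a small 1-D dataset, independent of the full implementation below:

```python
import numpy as np

X = np.array([[0.0], [0.2], [0.4], [1.2], [1.4], [1.6]])
y = np.array([0, 0, 0, 1, 1, 1])

# Per-class mean (theta) and population variance, as GaussianNB fits them
theta = np.array([X[y == c].mean(axis=0) for c in np.unique(y)])
var = np.array([X[y == c].var(axis=0) for c in np.unique(y)])

print(theta.ravel())              # [0.2 1.4]
print(np.round(var.ravel(), 6))   # [0.026667 0.026667]
```

These values match the theta and variances arrays in Example 2 below (up to the tiny var_smoothing term).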

Excel Usage

=GAUSSIAN_NB(data, target, var_smoothing)
  • data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
  • target (list[list], required): Target labels as a single row, single column, or scalar when only one sample is present.
  • var_smoothing (float, optional, default: 1e-9): Portion of the largest feature variance added for numerical stability.

Returns (dict): Excel data type containing training accuracy, predictions, probabilities, and fitted Gaussian parameter arrays.

Example 1: Fit Gaussian naive Bayes for two string-labeled classes

Inputs:

data target var_smoothing
0 0 low 1e-9
0.1 0.2 low
0.2 0 low
2 2 high
2.1 2.2 high
2.2 2 high

Excel formula:

=GAUSSIAN_NB({0,0;0.1,0.2;0.2,0;2,2;2.1,2.2;2.2,2}, {"low";"low";"low";"high";"high";"high"}, 1e-9)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"high"}],[{"type":"String","basicValue":"low"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"low"}],[{"type":"String","basicValue":"low"}],[{"type":"String","basicValue":"low"}],[{"type":"String","basicValue":"high"}],[{"type":"String","basicValue":"high"}],[{"type":"String","basicValue":"high"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"high"},{"type":"Double","basicValue":3}],[{"type":"String","basicValue":"low"},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":2.83256e-248},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1.05745e-215},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":3.23477e-222},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3.45682e-209}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":9.25966e-242}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3.027e-235}]]},"class_priors":{"type":"Array","elements":[[{"type":"Double","basicValue":0.5}],[{"type":"Double","basicValue":0.5}]]},"theta":{"type":"Array","elements":[[{"type":"Double","basicValue":2.1},{"type":"Double","basicValue":2.06667}],[{"type":"Double","basicValue":0.1},{"type":"Double","basicValue":0.0666667}]]},"variances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.00666667},{"type":"Double","basicValue":0.00888889}],[{"type":"Double","basicValue":0.00666667},{"type":"Double","basicValue":0.00888889}]]}}}

Example 2: Fit Gaussian naive Bayes for one-dimensional numeric labels

Inputs:

data target var_smoothing
0 0 1e-9
0.2 0
0.4 0
1.2 1
1.4 1
1.6 1

Excel formula:

=GAUSSIAN_NB({0;0.2;0.4;1.2;1.4;1.6}, {0;0;0;1;1;1}, 1e-9)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}]]},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":1},{"type":"Double","basicValue":2.31952e-16}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":1.87953e-12}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":1.523e-8}],[{"type":"Double","basicValue":1.523e-8},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1.87953e-12},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":2.31952e-16},{"type":"Double","basicValue":1}]]},"class_priors":{"type":"Array","elements":[[{"type":"Double","basicValue":0.5}],[{"type":"Double","basicValue":0.5}]]},"theta":{"type":"Array","elements":[[{"type":"Double","basicValue":0.2}],[{"type":"Double","basicValue":1.4}]]},"variances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.0266667}],[{"type":"Double","basicValue":0.0266667}]]}}}

Example 3: Fit Gaussian naive Bayes for three separated groups

Inputs:

data target var_smoothing
0 0 left 1e-9
0.2 0.1 left
4 4 center
4.2 3.9 center
8 0 right
8.2 0.1 right

Excel formula:

=GAUSSIAN_NB({0,0;0.2,0.1;4,4;4.2,3.9;8,0;8.2,0.1}, {"left";"left";"center";"center";"right";"right"}, 1e-9)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":3},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"right"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"right"}],[{"type":"String","basicValue":"right"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"center"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"left"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"right"},{"type":"Double","basicValue":2}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}]]},"class_priors":{"type":"Array","elements":[[{"type":"Double","basicValue":0.333333}],[{"type":"Double","basicValue":0.333333}],[{"type":"Double","basicValue":0.333333}]]},"theta":{"type":"Array","elements":[[{"type":"Double","basicValue":4.1},{"type":"Double","basicValue":3.95}],[{"type":"Double","basicValue":0.1},{"type":"Double","basicValue":0.05}],[{"type":"Double","basicValue":8.1},{"type":"Double","basicValue":0.05}]]},"variances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.01},{"type":"Double","basicValue":0.00250001}],[{"type":"Double","basicValue":0.01},{"type":"Double","basicValue":0.00250001}],[{"type":"Double","basicValue":0.01},{"type":"Double","basicValue":0.00250001}]]}}}

Example 4: Flatten a single-row boolean target range for Gaussian naive Bayes

Inputs:

data target var_smoothing
0 false false false true true true 1e-9
0.3
0.6
1.4
1.7
2

Excel formula:

=GAUSSIAN_NB({0;0.3;0.6;1.4;1.7;2}, {FALSE,FALSE,FALSE,TRUE,TRUE,TRUE}, 1e-9)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}]]},"predictions":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Boolean","basicValue":false},{"type":"Double","basicValue":3}],[{"type":"Boolean","basicValue":true},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":1},{"type":"Double","basicValue":7.35296e-11}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":8.0635e-8}],[{"type":"Double","basicValue":0.999912},{"type":"Double","basicValue":0.0000884192}],[{"type":"Double","basicValue":0.0000884192},{"type":"Double","basicValue":0.999912}],[{"type":"Double","basicValue":8.0635e-8},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":7.35296e-11},{"type":"Double","basicValue":1}]]},"class_priors":{"type":"Array","elements":[[{"type":"Double","basicValue":0.5}],[{"type":"Double","basicValue":0.5}]]},"theta":{"type":"Array","elements":[[{"type":"Double","basicValue":0.3}],[{"type":"Double","basicValue":1.7}]]},"variances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.06}],[{"type":"Double","basicValue":0.06}]]}}}

Python Code

import numpy as np
from sklearn.naive_bayes import GaussianNB as SklearnGaussianNB

def gaussian_nb(data, target, var_smoothing=1e-09):
    """
    Fit a Gaussian naive Bayes classifier and return training predictions.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.GaussianNB.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
        target (list[list]): Target labels as a single row, single column, or scalar when only one sample is present.
        var_smoothing (float, optional): Portion of the largest feature variance added for numerical stability. Default is 1e-09.

    Returns:
        dict: Excel data type containing training accuracy, predictions, probabilities, and fitted Gaussian parameter arrays.
    """
    def py(value):
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        return data_np, None

    def parse_target(value, sample_count):
        if not isinstance(value, list):
            labels = [value]
        elif not value:
            return None, "Error: target must be non-empty"
        elif all(not isinstance(item, list) for item in value):
            labels = value
        elif len(value) == 1:
            labels = value[0]
        elif all(isinstance(row, list) and len(row) == 1 for row in value):
            labels = [row[0] for row in value]
        else:
            return None, "Error: target must be a single row or column"

        if len(labels) != sample_count:
            return None, "Error: target length must match sample count"

        parsed = []
        classes = []
        for item in labels:
            item = py(item)
            if isinstance(item, str):
                if not item.strip():
                    return None, "Error: target labels must not be blank"
            elif isinstance(item, bool):
                item = bool(item)
            elif isinstance(item, (int, float)) and not isinstance(item, bool):
                if not np.isfinite(float(item)):
                    return None, "Error: target labels must be finite"
                item = float(item) if isinstance(item, float) else int(item)
            else:
                return None, "Error: target labels must be scalar string, boolean, or numeric values"
            parsed.append(item)
            if not any(type(existing) is type(item) and existing == item for existing in classes):
                classes.append(item)

        if len(classes) < 2:
            return None, "Error: target must contain at least 2 classes"
        return parsed, None

    def count_table(predictions, classes):
        rows = [[{"type": "String", "basicValue": "class"}, {"type": "String", "basicValue": "count"}]]
        for class_label in classes:
            count = sum(type(prediction) is type(class_label) and prediction == class_label for prediction in predictions)
            rows.append([cell(class_label), {"type": "Double", "basicValue": float(count)}])
        return rows

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        target_values, error = parse_target(target, data_np.shape[0])
        if error:
            return error

        if float(var_smoothing) <= 0:
            return "Error: var_smoothing must be greater than 0"

        fitted = SklearnGaussianNB(var_smoothing=float(var_smoothing)).fit(data_np, target_values)

        prediction_array = fitted.predict(data_np)
        predictions = [py(item) for item in prediction_array.tolist()]
        classes = [py(item) for item in fitted.classes_.tolist()]
        accuracy = float(np.mean([
            type(prediction) is type(actual) and prediction == actual
            for prediction, actual in zip(predictions, target_values)
        ]))

        return {
            "type": "Double",
            "basicValue": accuracy,
            "properties": {
                "accuracy": {"type": "Double", "basicValue": accuracy},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "class_count": {"type": "Double", "basicValue": float(len(classes))},
                "classes": {"type": "Array", "elements": col(classes)},
                "predictions": {"type": "Array", "elements": col(predictions)},
                "prediction_counts": {"type": "Array", "elements": count_table(predictions, classes)},
                "probabilities": {"type": "Array", "elements": mat(fitted.predict_proba(data_np).tolist())},
                "class_priors": {"type": "Array", "elements": col(fitted.class_prior_.tolist())},
                "theta": {"type": "Array", "elements": mat(fitted.theta_.tolist())},
                "variances": {"type": "Array", "elements": mat(fitted.var_.tolist())}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"


GB_CLASSIFY

Gradient boosting classification builds a sequence of shallow decision trees that iteratively correct earlier mistakes. At each stage m, a new weak learner h_m(x), scaled by a step length \gamma_m, is added to the current model to reduce the loss:

F_{m}(x) = F_{m-1}(x) + \gamma_m h_m(x)

This flexible nonlinear classifier is well suited to tabular data and exposes feature-importance estimates based on the total impurity reduction each feature contributes across the ensemble's splits.
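The additive update above can be observed directly through scikit-learn's staged_decision_function, which yields F_m(x) after each boosting stage. A minimal sketch using the 1-D data from Example 4 below:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

X = np.array([[0.0], [0.3], [0.6], [1.4], [1.7], [2.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = GradientBoostingClassifier(n_estimators=50, learning_rate=0.1,
                                   max_depth=2, random_state=0).fit(X, y)

# One array of raw scores F_m(x) per stage: F_m = F_{m-1} + lr * gamma_m h_m
stages = list(model.staged_decision_function(X))
print(len(stages))  # 50
# The final stage matches the fully fitted decision function
print(np.allclose(np.ravel(stages[-1]), np.ravel(model.decision_function(X))))
```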

This wrapper accepts rows as samples and a target supplied as a single row or single column. It returns training accuracy together with predicted labels, class counts, class probabilities, and fitted feature importances.

Excel Usage

=GB_CLASSIFY(data, target, n_estimators, learning_rate, max_depth, subsample, random_state)
  • data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
  • target (list[list], required): Target labels as a single row, single column, or scalar when only one sample is present.
  • n_estimators (int, optional, default: 100): Number of boosting stages to fit.
  • learning_rate (float, optional, default: 0.1): Shrinkage factor applied to each boosting stage.
  • max_depth (int, optional, default: 3): Maximum depth of each individual regression tree.
  • subsample (float, optional, default: 1): Fraction of samples used to fit each boosting stage.
  • random_state (int, optional, default: null): Integer seed for reproducible boosting and tree construction. Leave blank for the estimator default.

Returns (dict): Excel data type containing training accuracy, predictions, probabilities, and fitted feature importances.

Example 1: Fit a gradient boosting classifier for two string-labeled groups

Inputs:

data target n_estimators learning_rate max_depth subsample random_state
0 0 cold 50 0.1 2 1 0
0 1 cold
1 0 cold
2 2 hot
2 3 hot
3 2 hot

Excel formula:

=GB_CLASSIFY({0,0;0,1;1,0;2,2;2,3;3,2}, {"cold";"cold";"cold";"hot";"hot";"hot"}, 50, 0.1, 2, 1, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"cold"},{"type":"Double","basicValue":3}],[{"type":"String","basicValue":"hot"},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.996749},{"type":"Double","basicValue":0.00325104}],[{"type":"Double","basicValue":0.996749},{"type":"Double","basicValue":0.00325104}],[{"type":"Double","basicValue":0.996749},{"type":"Double","basicValue":0.00325104}],[{"type":"Double","basicValue":0.00325104},{"type":"Double","basicValue":0.996749}],[{"type":"Double","basicValue":0.00325104},{"type":"Double","basicValue":0.996749}],[{"type":"Double","basicValue":0.00325104},{"type":"Double","basicValue":0.996749}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.293415}],[{"type":"Double","basicValue":0.706585}]]},"estimator_count":{"type":"Double","basicValue":50}}}
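For a two-class model with the default log-loss, the reported probabilities are the logistic sigmoid of the accumulated decision function, which is where values like 0.996749 come from. A sketch confirming this on Example 1's inputs:

```python
import numpy as np
from scipy.special import expit
from sklearn.ensemble import GradientBoostingClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [2, 2], [2, 3], [3, 2]])
y = ["cold", "cold", "cold", "hot", "hot", "hot"]

model = GradientBoostingClassifier(n_estimators=50, learning_rate=0.1,
                                   max_depth=2, subsample=1.0,
                                   random_state=0).fit(X, y)

# Column 1 of predict_proba is P(classes_[1]) = P("hot"),
# the sigmoid of the raw decision function F(x)
proba = model.predict_proba(X)[:, 1]
manual = expit(model.decision_function(X))
print(np.allclose(proba, manual))  # True
```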

Example 2: Fit gradient boosting for one-dimensional numeric labels

Inputs:

data target n_estimators learning_rate max_depth subsample random_state
0 0 50 0.1 2 1 0
0.2 0
0.4 0
1.2 1
1.4 1
1.6 1

Excel formula:

=GB_CLASSIFY({0;0.2;0.4;1.2;1.4;1.6}, {0;0;0;1;1;1}, 50, 0.1, 2, 1, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}]]},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.996749},{"type":"Double","basicValue":0.00325104}],[{"type":"Double","basicValue":0.996749},{"type":"Double","basicValue":0.00325104}],[{"type":"Double","basicValue":0.996749},{"type":"Double","basicValue":0.00325104}],[{"type":"Double","basicValue":0.00325104},{"type":"Double","basicValue":0.996749}],[{"type":"Double","basicValue":0.00325104},{"type":"Double","basicValue":0.996749}],[{"type":"Double","basicValue":0.00325104},{"type":"Double","basicValue":0.996749}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"estimator_count":{"type":"Double","basicValue":50}}}

Example 3: Fit a gradient boosting classifier for three separated groups

Inputs:

data target n_estimators learning_rate max_depth subsample random_state
0 0 left 50 0.1 2 1 0
0.2 0.1 left
4 4 center
4.2 3.9 center
8 0 right
8.2 0.1 right

Excel formula:

=GB_CLASSIFY({0,0;0.2,0.1;4,4;4.2,3.9;8,0;8.2,0.1}, {"left";"left";"center";"center";"right";"right"}, 50, 0.1, 2, 1, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":3},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"right"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"right"}],[{"type":"String","basicValue":"right"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"center"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"left"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"right"},{"type":"Double","basicValue":2}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.000505015},{"type":"Double","basicValue":0.99899},{"type":"Double","basicValue":0.000505015}],[{"type":"Double","basicValue":0.000505015},{"type":"Double","basicValue":0.99899},{"type":"Double","basicValue":0.000505015}],[{"type":"Double","basicValue":0.99899},{"type":"Double","basicValue":0.000505015},{"type":"Double","basicValue":0.000505015}],[{"type":"Double","basicValue":0.99899},{"type":"Double","basicValue":0.000505015},{"type":"Double","basicValue":0.000505015}],[{"type":"Double","basicValue":0.000505015},{"type":"Double","basicValue":0.000505015},{"type":"Double","basicValue":0.99899}],[{"type":"Double","basicValue":0.000505015},{"type":"Double","basicValue":0.000505015},{"type":"Double","basicValue":0.99899}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.666667}],[{"type":"Double","basicValue":0.333333}]]},"estimator_count":{"type":"Double","basicValue":50}}}

Example 4: Flatten a single-row boolean target range for gradient boosting classification

Inputs:

data target n_estimators learning_rate max_depth subsample random_state
0 false false false true true true 50 0.1 2 1 0
0.3
0.6
1.4
1.7
2

Excel formula:

=GB_CLASSIFY({0;0.3;0.6;1.4;1.7;2}, {FALSE,FALSE,FALSE,TRUE,TRUE,TRUE}, 50, 0.1, 2, 1, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}]]},"predictions":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Boolean","basicValue":false},{"type":"Double","basicValue":3}],[{"type":"Boolean","basicValue":true},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.996749},{"type":"Double","basicValue":0.00325104}],[{"type":"Double","basicValue":0.996749},{"type":"Double","basicValue":0.00325104}],[{"type":"Double","basicValue":0.996749},{"type":"Double","basicValue":0.00325104}],[{"type":"Double","basicValue":0.00325104},{"type":"Double","basicValue":0.996749}],[{"type":"Double","basicValue":0.00325104},{"type":"Double","basicValue":0.996749}],[{"type":"Double","basicValue":0.00325104},{"type":"Double","basicValue":0.996749}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"estimator_count":{"type":"Double","basicValue":50}}}

Python Code

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier as SklearnGradientBoostingClassifier

def gb_classify(data, target, n_estimators=100, learning_rate=0.1, max_depth=3, subsample=1, random_state=None):
    """
    Fit a gradient boosting classifier and return training predictions.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
        target (list[list]): Target labels as a single row, single column, or scalar when only one sample is present.
        n_estimators (int, optional): Number of boosting stages to fit. Default is 100.
        learning_rate (float, optional): Shrinkage factor applied to each boosting stage. Default is 0.1.
        max_depth (int, optional): Maximum depth of each individual regression tree. Default is 3.
        subsample (float, optional): Fraction of samples used to fit each boosting stage. Default is 1.
        random_state (int, optional): Integer seed for reproducible boosting and tree construction. Leave blank for the estimator default. Default is None.

    Returns:
        dict: Excel data type containing training accuracy, predictions, probabilities, and fitted feature importances.
    """
    def py(value):
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        return data_np, None

    def parse_target(value, sample_count):
        if not isinstance(value, list):
            labels = [value]
        elif not value:
            return None, "Error: target must be non-empty"
        elif all(not isinstance(item, list) for item in value):
            labels = value
        elif len(value) == 1:
            labels = value[0]
        elif all(isinstance(row, list) and len(row) == 1 for row in value):
            labels = [row[0] for row in value]
        else:
            return None, "Error: target must be a single row or column"

        if len(labels) != sample_count:
            return None, "Error: target length must match sample count"

        parsed = []
        classes = []
        for item in labels:
            item = py(item)
            if isinstance(item, str):
                if not item.strip():
                    return None, "Error: target labels must not be blank"
            elif isinstance(item, bool):
                item = bool(item)
            elif isinstance(item, (int, float)) and not isinstance(item, bool):
                if not np.isfinite(float(item)):
                    return None, "Error: target labels must be finite"
                item = float(item) if isinstance(item, float) else int(item)
            else:
                return None, "Error: target labels must be scalar string, boolean, or numeric values"
            parsed.append(item)
            if not any(type(existing) is type(item) and existing == item for existing in classes):
                classes.append(item)

        if len(classes) < 2:
            return None, "Error: target must contain at least 2 classes"
        return parsed, None

    def count_table(predictions, classes):
        rows = [[{"type": "String", "basicValue": "class"}, {"type": "String", "basicValue": "count"}]]
        for class_label in classes:
            count = sum(type(prediction) is type(class_label) and prediction == class_label for prediction in predictions)
            rows.append([cell(class_label), {"type": "Double", "basicValue": float(count)}])
        return rows

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        target_values, error = parse_target(target, data_np.shape[0])
        if error:
            return error

        if int(n_estimators) < 1:
            return "Error: n_estimators must be at least 1"
        if float(learning_rate) <= 0:
            return "Error: learning_rate must be greater than 0"
        if int(max_depth) < 1:
            return "Error: max_depth must be at least 1"
        if float(subsample) <= 0 or float(subsample) > 1:
            return "Error: subsample must be greater than 0 and at most 1"

        fitted = SklearnGradientBoostingClassifier(
            n_estimators=int(n_estimators),
            learning_rate=float(learning_rate),
            max_depth=int(max_depth),
            subsample=float(subsample),
            random_state=None if random_state in (None, "") else int(random_state)
        ).fit(data_np, target_values)

        prediction_array = fitted.predict(data_np)
        predictions = [py(item) for item in prediction_array.tolist()]
        classes = [py(item) for item in fitted.classes_.tolist()]
        accuracy = float(np.mean([
            type(prediction) is type(actual) and prediction == actual
            for prediction, actual in zip(predictions, target_values)
        ]))

        return {
            "type": "Double",
            "basicValue": accuracy,
            "properties": {
                "accuracy": {"type": "Double", "basicValue": accuracy},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "class_count": {"type": "Double", "basicValue": float(len(classes))},
                "classes": {"type": "Array", "elements": col(classes)},
                "predictions": {"type": "Array", "elements": col(predictions)},
                "prediction_counts": {"type": "Array", "elements": count_table(predictions, classes)},
                "probabilities": {"type": "Array", "elements": mat(fitted.predict_proba(data_np).tolist())},
                "feature_importances": {"type": "Array", "elements": col(fitted.feature_importances_.tolist())},
                "estimator_count": {"type": "Double", "basicValue": float(fitted.n_estimators_)}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"


KNN_CLASSIFY

K-nearest neighbors classification predicts the label of a sample based on the labels of its k closest neighbors in the training set. The distance between samples x and y is typically measured using the Minkowski distance:

d(x, y) = \left( \sum_{i=1}^n |x_i - y_i|^p \right)^{1/p}

When p=2, this corresponds to the standard Euclidean distance, and when p=1, to the Manhattan distance. The function also returns class probabilities, derived from the proportion of (optionally distance-weighted) neighbor votes supporting each class.
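
The Minkowski distance above can be sketched directly; this minimal example (names are illustrative, not part of the wrapper) shows how p=2 recovers the Euclidean distance and p=1 the Manhattan distance:

```python
import numpy as np

def minkowski_distance(x, y, p=2):
    """Minkowski distance between two equal-length numeric vectors."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

print(minkowski_distance([0, 0], [3, 4], p=2))  # 5.0 (Euclidean)
print(minkowski_distance([0, 0], [3, 4], p=1))  # 7.0 (Manhattan)
```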

This wrapper accepts rows as samples and a target supplied as a single row or single column. It returns training accuracy together with predicted labels, class counts, class probabilities, and the resolved distance metric.

Excel Usage

=KNN_CLASSIFY(data, target, n_neighbors, knn_weights, knn_metric, p)
  • data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
  • target (list[list], required): Target labels as a single row, single column, or scalar when only one sample is present.
  • n_neighbors (int, optional, default: 5): Number of nearest neighbors used for each vote.
  • knn_weights (str, optional, default: “uniform”): Weighting scheme used when aggregating neighbor votes.
  • knn_metric (str, optional, default: “minkowski”): Distance metric used to compare samples.
  • p (int, optional, default: 2): Power parameter for the Minkowski metric.

Returns (dict): Excel data type containing training accuracy, predictions, probabilities, and k-nearest-neighbor summary properties.
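
The knn_weights option changes how neighbor votes are aggregated. A minimal sketch of the difference, using a hypothetical 1-D training set where the two schemes disagree:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Three "low" points and one distant "high" point (illustrative data).
X = np.array([[0.0], [1.0], [2.0], [10.0]])
y = ["low", "low", "low", "high"]

uniform_model = KNeighborsClassifier(n_neighbors=4, weights="uniform").fit(X, y)
distance_model = KNeighborsClassifier(n_neighbors=4, weights="distance").fit(X, y)

# The query point 9.5 sits next to the lone "high" sample but is outvoted 3-to-1.
print(uniform_model.predict([[9.5]])[0])   # "low": each neighbor counts equally
print(distance_model.predict([[9.5]])[0])  # "high": votes are weighted by 1/distance
```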

Example 1: Classify two string-labeled groups with uniform neighbor votes

Inputs:

data target n_neighbors knn_weights knn_metric p
0 low 3 uniform euclidean 2
0.1 low
0.2 low
1.5 high
1.6 high
1.7 high

Excel formula:

=KNN_CLASSIFY({0;0.1;0.2;1.5;1.6;1.7}, {"low";"low";"low";"high";"high";"high"}, 3, "uniform", "euclidean", 2)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"high"}],[{"type":"String","basicValue":"low"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"low"}],[{"type":"String","basicValue":"low"}],[{"type":"String","basicValue":"low"}],[{"type":"String","basicValue":"high"}],[{"type":"String","basicValue":"high"}],[{"type":"String","basicValue":"high"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"high"},{"type":"Double","basicValue":3}],[{"type":"String","basicValue":"low"},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}]]},"neighbor_count":{"type":"Double","basicValue":3},"effective_metric":{"type":"String","basicValue":"euclidean"}}}

Example 2: Use distance weighting for numeric target labels

Inputs:

data target n_neighbors knn_weights knn_metric p
0 0 0 3 distance euclidean 2
0 0.2 0
0.2 0 0
2 2 1
2.1 2 1
2 2.1 1

Excel formula:

=KNN_CLASSIFY({0,0;0,0.2;0.2,0;2,2;2.1,2;2,2.1}, {0;0;0;1;1;1}, 3, "distance", "euclidean", 2)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}]]},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}]]},"neighbor_count":{"type":"Double","basicValue":3},"effective_metric":{"type":"String","basicValue":"euclidean"}}}

Example 3: Fit k-nearest neighbors for three groups with Manhattan distance

Inputs:

data target n_neighbors knn_weights knn_metric p
0 0 left 1 uniform manhattan 1
0.1 0.2 left
4 4 center
4.1 3.9 center
8 0 right
8.1 0.2 right

Excel formula:

=KNN_CLASSIFY({0,0;0.1,0.2;4,4;4.1,3.9;8,0;8.1,0.2}, {"left";"left";"center";"center";"right";"right"}, 1, "uniform", "manhattan", 1)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":3},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"right"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"right"}],[{"type":"String","basicValue":"right"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"center"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"left"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"right"},{"type":"Double","basicValue":2}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}]]},"neighbor_count":{"type":"Double","basicValue":1},"effective_metric":{"type":"String","basicValue":"manhattan"}}}

Example 4: Flatten a single-row boolean target range for k-nearest neighbors

Inputs:

data target n_neighbors knn_weights knn_metric p
0 false false false true true true 1 uniform euclidean 2
0.3
0.6
1.4
1.7
2

Excel formula:

=KNN_CLASSIFY({0;0.3;0.6;1.4;1.7;2}, {FALSE,FALSE,FALSE,TRUE,TRUE,TRUE}, 1, "uniform", "euclidean", 2)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}]]},"predictions":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Boolean","basicValue":false},{"type":"Double","basicValue":3}],[{"type":"Boolean","basicValue":true},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}]]},"neighbor_count":{"type":"Double","basicValue":1},"effective_metric":{"type":"String","basicValue":"euclidean"}}}

Python Code

import numpy as np
from sklearn.neighbors import KNeighborsClassifier as SklearnKNeighborsClassifier

def knn_classify(data, target, n_neighbors=5, knn_weights='uniform', knn_metric='minkowski', p=2):
    """
    Fit a k-nearest neighbors classifier and return training predictions.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
        target (list[list]): Target labels as a single row, single column, or scalar when only one sample is present.
        n_neighbors (int, optional): Number of nearest neighbors used for each vote. Default is 5.
        knn_weights (str, optional): Weighting scheme used when aggregating neighbor votes. Valid options: Uniform, Distance. Default is 'uniform'.
        knn_metric (str, optional): Distance metric used to compare samples. Valid options: Minkowski, Euclidean, Manhattan. Default is 'minkowski'.
        p (int, optional): Power parameter for the Minkowski metric. Default is 2.

    Returns:
        dict: Excel data type containing training accuracy, predictions, probabilities, and k-nearest-neighbor summary properties.
    """
    def py(value):
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        return data_np, None

    def parse_target(value, sample_count):
        if not isinstance(value, list):
            labels = [value]
        elif not value:
            return None, "Error: target must be non-empty"
        elif all(not isinstance(item, list) for item in value):
            labels = value
        elif len(value) == 1:
            labels = value[0]
        elif all(isinstance(row, list) and len(row) == 1 for row in value):
            labels = [row[0] for row in value]
        else:
            return None, "Error: target must be a single row or column"

        if len(labels) != sample_count:
            return None, "Error: target length must match sample count"

        parsed = []
        classes = []
        for item in labels:
            item = py(item)
            if isinstance(item, str):
                if not item.strip():
                    return None, "Error: target labels must not be blank"
            elif isinstance(item, bool):
                item = bool(item)
            elif isinstance(item, (int, float)) and not isinstance(item, bool):
                if not np.isfinite(float(item)):
                    return None, "Error: target labels must be finite"
                item = float(item) if isinstance(item, float) else int(item)
            else:
                return None, "Error: target labels must be string, boolean, or numeric scalars"
            parsed.append(item)
            if not any(type(existing) is type(item) and existing == item for existing in classes):
                classes.append(item)

        if len(classes) < 2:
            return None, "Error: target must contain at least 2 classes"
        return parsed, None

    def count_table(predictions, classes):
        rows = [[{"type": "String", "basicValue": "class"}, {"type": "String", "basicValue": "count"}]]
        for class_label in classes:
            count = sum(type(prediction) is type(class_label) and prediction == class_label for prediction in predictions)
            rows.append([cell(class_label), {"type": "Double", "basicValue": float(count)}])
        return rows

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        target_values, error = parse_target(target, data_np.shape[0])
        if error:
            return error

        neighbor_total = int(n_neighbors)
        if neighbor_total < 1:
            return "Error: n_neighbors must be at least 1"
        if neighbor_total > data_np.shape[0]:
            return "Error: n_neighbors cannot exceed the number of samples"

        weights_value = str(knn_weights).strip().lower()
        if weights_value not in {"uniform", "distance"}:
            return "Error: weights must be 'uniform' or 'distance'"

        metric_value = str(knn_metric).strip().lower()
        if metric_value not in {"minkowski", "euclidean", "manhattan"}:
            return "Error: metric must be 'minkowski', 'euclidean', or 'manhattan'"
        if int(p) < 1:
            return "Error: p must be at least 1"

        fitted = SklearnKNeighborsClassifier(
            n_neighbors=neighbor_total,
            weights=weights_value,
            metric=metric_value,
            p=int(p)
        ).fit(data_np, target_values)

        prediction_array = fitted.predict(data_np)
        predictions = [py(item) for item in prediction_array.tolist()]
        classes = [py(item) for item in fitted.classes_.tolist()]
        accuracy = float(np.mean([
            type(prediction) is type(actual) and prediction == actual
            for prediction, actual in zip(predictions, target_values)
        ]))

        return {
            "type": "Double",
            "basicValue": accuracy,
            "properties": {
                "accuracy": {"type": "Double", "basicValue": accuracy},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "class_count": {"type": "Double", "basicValue": float(len(classes))},
                "classes": {"type": "Array", "elements": col(classes)},
                "predictions": {"type": "Array", "elements": col(predictions)},
                "prediction_counts": {"type": "Array", "elements": count_table(predictions, classes)},
                "probabilities": {"type": "Array", "elements": mat(fitted.predict_proba(data_np).tolist())},
                "neighbor_count": {"type": "Double", "basicValue": float(neighbor_total)},
                "effective_metric": {"type": "String", "basicValue": str(fitted.effective_metric_)}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"


LINEAR_SVC

Linear support vector classification fits a max-margin linear decision boundary. The decision function for a sample x is defined as:

f(x) = w^T x + b

The model parameters w and b are determined by minimizing a combination of the squared norm of w and the hinge loss across all training samples:

\min_{w, b} \frac{1}{2} \|w\|^2 + C \sum_{i=1}^n \max(0, 1 - y_i(w^T x_i + b))

This wrapper accepts rows as samples and a target supplied as a single row or single column. It returns training accuracy together with predicted labels, class counts, decision scores, and fitted coefficient arrays.
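
The primal objective above can be evaluated directly for given parameters; this sketch (a hypothetical helper, assuming labels y_i in {-1, +1}) computes the regularization term plus the C-scaled hinge loss:

```python
import numpy as np

def svm_objective(w, b, X, y, C=1.0):
    """Primal SVM objective: 0.5*||w||^2 + C * sum of hinge losses."""
    margins = y * (X @ w + b)                 # y_i * f(x_i) for each sample
    hinge = np.maximum(0.0, 1.0 - margins)    # hinge loss per sample
    return 0.5 * np.dot(w, w) + C * np.sum(hinge)

# Two perfectly separated samples: both margins equal 1, so the hinge terms vanish
# and only the regularization term 0.5*||w||^2 remains.
X = np.array([[0.0, 0.0], [2.0, 2.0]])
y = np.array([-1.0, 1.0])
print(svm_objective(np.array([0.5, 0.5]), -1.0, X, y))  # 0.25
```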

Excel Usage

=LINEAR_SVC(data, target, penalty, loss, C, max_iter, tol, fit_intercept, random_state)
  • data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
  • target (list[list], required): Target labels as a single row, single column, or scalar when only one sample is present.
  • penalty (str, optional, default: “l2”): Norm used in the linear SVM penalty term.
  • loss (str, optional, default: “squared_hinge”): Hinge-style loss function used during fitting.
  • C (float, optional, default: 1): Inverse regularization strength. Smaller values apply stronger regularization.
  • max_iter (int, optional, default: 1000): Maximum number of optimization iterations.
  • tol (float, optional, default: 0.0001): Convergence tolerance for the optimizer.
  • fit_intercept (bool, optional, default: true): Whether to include an intercept term in the linear decision function.
  • random_state (int, optional, default: null): Integer seed used when the underlying solver shuffles data. Leave blank for the estimator default.

Returns (dict): Excel data type containing training accuracy, predictions, decision scores, and fitted coefficient arrays.

Example 1: Fit linear support vector classification for two string-labeled classes

Inputs:

data target penalty loss C max_iter tol fit_intercept random_state
0 0 cold l2 squared_hinge 1 4000 0.0001 true 0
0 1 cold
1 0 cold
2 2 hot
2 3 hot
3 2 hot

Excel formula:

=LINEAR_SVC({0,0;0,1;1,0;2,2;2,3;3,2}, {"cold";"cold";"cold";"hot";"hot";"hot"}, "l2", "squared_hinge", 1, 4000, 0.0001, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"cold"},{"type":"Double","basicValue":3}],[{"type":"String","basicValue":"hot"},{"type":"Double","basicValue":3}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.01639}],[{"type":"Double","basicValue":-0.590164}],[{"type":"Double","basicValue":-0.590164}],[{"type":"Double","basicValue":0.688525}],[{"type":"Double","basicValue":1.11475}],[{"type":"Double","basicValue":1.11475}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":0.42623},{"type":"Double","basicValue":0.42623}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.01639}]]}}}

Example 2: Use hinge loss for one-dimensional numeric labels

Inputs:

data target penalty loss C max_iter tol fit_intercept random_state
0 0 l2 hinge 1 4000 0.0001 true 0
0.2 0
0.4 0
1.2 1
1.4 1
1.6 1

Excel formula:

=LINEAR_SVC({0;0.2;0.4;1.2;1.4;1.6}, {0;0;0;1;1;1}, "l2", "hinge", 1, 4000, 0.0001, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}]]},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.00001}],[{"type":"Double","basicValue":-0.71429}],[{"type":"Double","basicValue":-0.428575}],[{"type":"Double","basicValue":0.714285}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1.28572}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.42858}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.00001}]]}}}

Example 3: Fit linear support vector classification for three groups

Inputs:

data target penalty loss C max_iter tol fit_intercept random_state
0 0 left l2 squared_hinge 1 4000 0.0001 true 0
0.2 0.1 left
4 4 center
4.2 3.9 center
8 0 right
8.2 0.1 right

Excel formula:

=LINEAR_SVC({0,0;0.2,0.1;4,4;4.2,3.9;8,0;8.2,0.1}, {"left";"left";"center";"center";"right";"right"}, "l2", "squared_hinge", 1, 4000, 0.0001, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":3},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"right"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"right"}],[{"type":"String","basicValue":"right"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"center"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"left"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"right"},{"type":"Double","basicValue":2}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.801853},{"type":"Double","basicValue":0.811928},{"type":"Double","basicValue":-0.804821}],[{"type":"Double","basicValue":-0.75962},{"type":"Double","basicValue":0.744738},{"type":"Double","basicValue":-0.788505}],[{"type":"Double","basicValue":0.992554},{"type":"Double","basicValue":-0.971388},{"type":"Double","basicValue":-1.04172}],[{"type":"Double","basicValue":0.939812},{"type":"Double","basicValue":-0.994626},{"type":"Double","basicValue":-0.969078}],[{"type":"Double","basicValue":-1.01204},{"type":"Double","basicValue":-0.996617},{"type":"Double","basicValue":0.97425}],[{"type":"Double","basicValue":-0.969803},{"type":"Double","basicValue":-1.06381},{"type":"Double","basicValue":0.990566}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.0262729},{"type":"Double","basicValue":0.474875}],[{"type":"Double","basicValue":-0.226068},{"type":"Double","basicValue":-0.219761}],[{"type":"Double","basicValue":0.222384},{"type":"Double","basicValue":-0.281607}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.801853}],[{"type":"Double","basicValue":0.811928}],[{"type":"Double","basicValue":-0.804821}]]}}}

Example 4: Flatten a single-row boolean target range for linear support vector classification

Inputs:

data target penalty loss C max_iter tol fit_intercept random_state
0 false false false true true true l2 squared_hinge 1 4000 0.0001 true 0
0.3
0.6
1.4
1.7
2

Excel formula:

=LINEAR_SVC({0;0.3;0.6;1.4;1.7;2}, {FALSE,FALSE,FALSE,TRUE,TRUE,TRUE}, "l2", "squared_hinge", 1, 4000, 0.0001, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}]]},"predictions":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Boolean","basicValue":false},{"type":"Double","basicValue":3}],[{"type":"Boolean","basicValue":true},{"type":"Double","basicValue":3}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.918239}],[{"type":"Double","basicValue":-0.614465}],[{"type":"Double","basicValue":-0.310692}],[{"type":"Double","basicValue":0.499371}],[{"type":"Double","basicValue":0.803145}],[{"type":"Double","basicValue":1.10692}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.01258}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.918239}]]}}}

Python Code

import numpy as np
from sklearn.svm import LinearSVC as SklearnLinearSVC

def linear_svc(data, target, penalty='l2', loss='squared_hinge', C=1, max_iter=1000, tol=0.0001, fit_intercept=True, random_state=None):
    """
    Fit a linear support vector classifier and return training predictions.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
        target (list[list]): Target labels as a single row, single column, or scalar when only one sample is present.
        penalty (str, optional): Norm used in the linear SVM penalty term. Valid options: L2, L1. Default is 'l2'.
        loss (str, optional): Hinge-style loss function used during fitting. Valid options: Squared Hinge, Hinge. Default is 'squared_hinge'.
        C (float, optional): Inverse regularization strength. Smaller values apply stronger regularization. Default is 1.
        max_iter (int, optional): Maximum number of optimization iterations. Default is 1000.
        tol (float, optional): Convergence tolerance for the optimizer. Default is 0.0001.
        fit_intercept (bool, optional): Whether to include an intercept term in the linear decision function. Default is True.
        random_state (int, optional): Integer seed used when the underlying solver shuffles data. Leave blank for the estimator default. Default is None.

    Returns:
        dict: Excel data type containing training accuracy, predictions, decision scores, and fitted coefficient arrays.
    """
    def py(value):
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        return data_np, None

    def parse_target(value, sample_count):
        if not isinstance(value, list):
            labels = [value]
        elif not value:
            return None, "Error: target must be non-empty"
        elif all(not isinstance(item, list) for item in value):
            labels = value
        elif len(value) == 1:
            labels = value[0]
        elif all(isinstance(row, list) and len(row) == 1 for row in value):
            labels = [row[0] for row in value]
        else:
            return None, "Error: target must be a single row or column"

        if len(labels) != sample_count:
            return None, "Error: target length must match sample count"

        parsed = []
        classes = []
        for item in labels:
            item = py(item)
            if isinstance(item, str):
                if not item.strip():
                    return None, "Error: target labels must not be blank"
            elif isinstance(item, bool):
                item = bool(item)
            elif isinstance(item, (int, float)) and not isinstance(item, bool):
                if not np.isfinite(float(item)):
                    return None, "Error: target labels must be finite"
                item = float(item) if isinstance(item, float) else int(item)
            else:
                return None, "Error: target labels must be strings, booleans, or numbers"
            parsed.append(item)
            if not any(type(existing) is type(item) and existing == item for existing in classes):
                classes.append(item)

        if len(classes) < 2:
            return None, "Error: target must contain at least 2 classes"
        return parsed, None

    def count_table(predictions, classes):
        rows = [[{"type": "String", "basicValue": "class"}, {"type": "String", "basicValue": "count"}]]
        for class_label in classes:
            count = sum(type(prediction) is type(class_label) and prediction == class_label for prediction in predictions)
            rows.append([cell(class_label), {"type": "Double", "basicValue": float(count)}])
        return rows

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        target_values, error = parse_target(target, data_np.shape[0])
        if error:
            return error

        penalty_value = str(penalty).strip().lower()
        if penalty_value not in {"l1", "l2"}:
            return "Error: penalty must be 'l1' or 'l2'"
        loss_value = str(loss).strip().lower()
        if loss_value not in {"hinge", "squared_hinge"}:
            return "Error: loss must be 'hinge' or 'squared_hinge'"
        if penalty_value == "l1" and loss_value == "hinge":
            return "Error: penalty 'l1' cannot be combined with loss 'hinge'"
        if float(C) <= 0:
            return "Error: C must be greater than 0"
        if int(max_iter) < 1:
            return "Error: max_iter must be at least 1"
        if float(tol) <= 0:
            return "Error: tol must be greater than 0"

        fitted = SklearnLinearSVC(
            penalty=penalty_value,
            loss=loss_value,
            C=float(C),
            max_iter=int(max_iter),
            tol=float(tol),
            fit_intercept=bool(fit_intercept),
            random_state=None if random_state in (None, "") else int(random_state),
            dual="auto"
        ).fit(data_np, target_values)

        prediction_array = fitted.predict(data_np)
        predictions = [py(item) for item in prediction_array.tolist()]
        classes = [py(item) for item in fitted.classes_.tolist()]
        scores = np.asarray(fitted.decision_function(data_np))
        score_rows = [[float(value)] for value in scores.tolist()] if scores.ndim == 1 else scores.tolist()
        accuracy = float(np.mean([
            type(prediction) is type(actual) and prediction == actual
            for prediction, actual in zip(predictions, target_values)
        ]))

        return {
            "type": "Double",
            "basicValue": accuracy,
            "properties": {
                "accuracy": {"type": "Double", "basicValue": accuracy},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "class_count": {"type": "Double", "basicValue": float(len(classes))},
                "classes": {"type": "Array", "elements": col(classes)},
                "predictions": {"type": "Array", "elements": col(predictions)},
                "prediction_counts": {"type": "Array", "elements": count_table(predictions, classes)},
                "decision_scores": {"type": "Array", "elements": mat(score_rows)},
                "coefficients": {"type": "Array", "elements": mat(np.atleast_2d(fitted.coef_).tolist())},
                "intercepts": {"type": "Array", "elements": col(np.atleast_1d(fitted.intercept_).tolist())}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"


LOGISTIC_CLS

Logistic regression is a linear classification method that models class membership probabilities from a weighted combination of the input features. For a binary classification problem, the probability of the positive class is modeled using the sigmoid function:

P(y=1 \mid x) = \frac{1}{1 + \exp(-(w^T x + b))}

Regularization (L1 or L2) helps stabilize the fitted coefficients and reduce overfitting on small or collinear datasets. The model parameters are estimated by minimizing the cross-entropy loss.
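The sigmoid mapping above can be sketched in plain Python. The weights `w` and intercept `b` below are hypothetical placeholders, not fitted values from any example in this document:

```python
import math

def sigmoid(z):
    # Logistic (sigmoid) function: maps any real score to (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted weights and intercept for two features
w = [0.9, 0.7]
b = -1.5

def p_positive(x):
    # P(y=1 | x) = sigmoid(w^T x + b)
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

print(p_positive([0, 0]))  # below 0.5: negative class more likely
print(p_positive([2, 2]))  # above 0.5: positive class more likely
```

A score of exactly zero gives probability 0.5, which is the decision boundary.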

This wrapper accepts tabular feature data with rows as samples and columns as features, plus a target supplied as a single row or single column. It returns the training accuracy together with the learned classes, fitted predictions, class counts, class probabilities, and fitted coefficient arrays.

Excel Usage

=LOGISTIC_CLS(data, target, penalty, C, solver, max_iter, fit_intercept, random_state)
  • data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
  • target (list[list], required): Target labels as a single row, single column, or scalar when only one sample is present.
  • penalty (str, optional, default: “l2”): Regularization penalty applied to the logistic model.
  • C (float, optional, default: 1): Inverse regularization strength. Smaller values apply stronger regularization.
  • solver (str, optional, default: “lbfgs”): Optimization algorithm used to fit the classifier.
  • max_iter (int, optional, default: 200): Maximum number of solver iterations.
  • fit_intercept (bool, optional, default: true): Whether to include an intercept term in the linear decision function.
  • random_state (int, optional, default: null): Integer seed for solvers that use randomness. Leave blank for the estimator default.

Returns (dict): Excel data type containing training accuracy, predictions, probabilities, and fitted coefficient arrays.

Example 1: Fit logistic regression for two string-labeled classes

Inputs:

data target penalty C solver max_iter fit_intercept random_state
0 0 cold l2 1 lbfgs 200 true 0
0 1 cold
1 0 cold
1 1 hot
2 1 hot
2 2 hot

Excel formula:

=LOGISTIC_CLS({0,0;0,1;1,0;1,1;2,1;2,2}, {"cold";"cold";"cold";"hot";"hot";"hot"}, "l2", 1, "lbfgs", 200, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"cold"},{"type":"Double","basicValue":3}],[{"type":"String","basicValue":"hot"},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.812399},{"type":"Double","basicValue":0.187601}],[{"type":"Double","basicValue":0.682647},{"type":"Double","basicValue":0.317353}],[{"type":"Double","basicValue":0.635306},{"type":"Double","basicValue":0.364694}],[{"type":"Double","basicValue":0.463897},{"type":"Double","basicValue":0.536103}],[{"type":"Double","basicValue":0.258211},{"type":"Double","basicValue":0.741789}],[{"type":"Double","basicValue":0.147418},{"type":"Double","basicValue":0.852582}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":0.910626},{"type":"Double","basicValue":0.699711}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.46567}]]}}}

Example 2: Classify one-dimensional samples with numeric labels

Inputs:

data target penalty C solver max_iter fit_intercept random_state
0 0 l2 1 lbfgs 200 true 0
0.2 0
0.4 0
1.2 1
1.4 1
1.6 1

Excel formula:

=LOGISTIC_CLS({0;0.2;0.4;1.2;1.4;1.6}, {0;0;0;1;1;1}, "l2", 1, "lbfgs", 200, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}]]},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.716801},{"type":"Double","basicValue":0.283199}],[{"type":"Double","basicValue":0.667409},{"type":"Double","basicValue":0.332591}],[{"type":"Double","basicValue":0.61404},{"type":"Double","basicValue":0.38596}],[{"type":"Double","basicValue":0.38596},{"type":"Double","basicValue":0.61404}],[{"type":"Double","basicValue":0.332591},{"type":"Double","basicValue":0.667409}],[{"type":"Double","basicValue":0.283199},{"type":"Double","basicValue":0.716801}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.16081}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.92865}]]}}}
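Example 2 can be cross-checked against scikit-learn directly (assuming scikit-learn is installed). Exact coefficient values may vary slightly across library versions, so this sketch only checks the qualitative behavior:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same one-dimensional data and numeric labels as Example 2
X = np.array([[0.0], [0.2], [0.4], [1.2], [1.4], [1.6]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression(penalty="l2", C=1.0, solver="lbfgs",
                           max_iter=200, random_state=0)
model.fit(X, y)

print(model.score(X, y))              # training accuracy
print(model.coef_, model.intercept_)  # positive slope, negative intercept
```

The classes are separated by a gap between 0.4 and 1.2, so the training accuracy is 1 and the single coefficient is positive.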

Example 3: Fit a three-class logistic model on separated groups

Inputs:

data target penalty C solver max_iter fit_intercept random_state
0 0 left l2 1 lbfgs 200 true 0
0.2 0.1 left
4 4 center
4.2 3.9 center
8 0 right
8.2 0.1 right

Excel formula:

=LOGISTIC_CLS({0,0;0.2,0.1;4,4;4.2,3.9;8,0;8.2,0.1}, {"left";"left";"center";"center";"right";"right"}, "l2", 1, "lbfgs", 200, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":3},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"right"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"right"}],[{"type":"String","basicValue":"right"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"center"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"left"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"right"},{"type":"Double","basicValue":2}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.0391472},{"type":"Double","basicValue":0.949817},{"type":"Double","basicValue":0.0110358}],[{"type":"Double","basicValue":0.0477367},{"type":"Double","basicValue":0.938711},{"type":"Double","basicValue":0.013552}],[{"type":"Double","basicValue":0.917197},{"type":"Double","basicValue":0.0437528},{"type":"Double","basicValue":0.0390499}],[{"type":"Double","basicValue":0.909183},{"type":"Double","basicValue":0.0430583},{"type":"Double","basicValue":0.0477584}],[{"type":"Double","basicValue":0.043504},{"type":"Double","basicValue":0.0136633},{"type":"Double","basicValue":0.942833}],[{"type":"Double","basicValue":0.0433284},{"type":"Double","basicValue":0.0110291},{"type":"Double","basicValue":0.945642}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":0.000202248},{"type":"Double","basicValue":0.676631}],[{"type":"Double","basicValue":-0.543183},{"type":"Double","basicValue":-0.337911}],[{"type":"Double","basicValue":0.54298},{"type":"Double","basicValue":-0.33872}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.640917}],[{"type":"Double","basicValue":2.54802}],[{"type":"Double","basicValue":-1.90711}]]}}}

Example 4: Flatten a single-row boolean target range

Inputs:

data target penalty C solver max_iter fit_intercept random_state
0 false false false true true true l2 1 lbfgs 200 true 0
0.3
0.6
1.4
1.7
2

Excel formula:

=LOGISTIC_CLS({0;0.3;0.6;1.4;1.7;2}, {FALSE,FALSE,FALSE,TRUE,TRUE,TRUE}, "l2", 1, "lbfgs", 200, TRUE, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}]]},"predictions":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Boolean","basicValue":false},{"type":"Double","basicValue":3}],[{"type":"Boolean","basicValue":true},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.7675},{"type":"Double","basicValue":0.2325}],[{"type":"Double","basicValue":0.697605},{"type":"Double","basicValue":0.302395}],[{"type":"Double","basicValue":0.617178},{"type":"Double","basicValue":0.382822}],[{"type":"Double","basicValue":0.382734},{"type":"Double","basicValue":0.617266}],[{"type":"Double","basicValue":0.302317},{"type":"Double","basicValue":0.697683}],[{"type":"Double","basicValue":0.232434},{"type":"Double","basicValue":0.767566}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.19443}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.19425}]]}}}

Python Code

import numpy as np
from sklearn.linear_model import LogisticRegression as SklearnLogisticRegression

def logistic_cls(data, target, penalty='l2', C=1, solver='lbfgs', max_iter=200, fit_intercept=True, random_state=None):
    """
    Fit a regularized logistic regression classifier and return training predictions.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
        target (list[list]): Target labels as a single row, single column, or scalar when only one sample is present.
        penalty (str, optional): Regularization penalty applied to the logistic model. Valid options: 'l1', 'l2'. Default is 'l2'.
        C (float, optional): Inverse regularization strength. Smaller values apply stronger regularization. Default is 1.
        solver (str, optional): Optimization algorithm used to fit the classifier. Valid options: 'lbfgs', 'liblinear', 'saga'. Default is 'lbfgs'.
        max_iter (int, optional): Maximum number of solver iterations. Default is 200.
        fit_intercept (bool, optional): Whether to include an intercept term in the linear decision function. Default is True.
        random_state (int, optional): Integer seed for solvers that use randomness. Leave blank for the estimator default. Default is None.

    Returns:
        dict: Excel data type containing training accuracy, predictions, probabilities, and fitted coefficient arrays.
    """
    def py(value):
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        return data_np, None

    def parse_target(value, sample_count):
        if not isinstance(value, list):
            labels = [value]
        elif not value:
            return None, "Error: target must be non-empty"
        elif all(not isinstance(item, list) for item in value):
            labels = value
        elif len(value) == 1:
            labels = value[0]
        elif all(isinstance(row, list) and len(row) == 1 for row in value):
            labels = [row[0] for row in value]
        else:
            return None, "Error: target must be a single row or column"

        if len(labels) != sample_count:
            return None, "Error: target length must match sample count"

        parsed = []
        classes = []
        for item in labels:
            item = py(item)
            if isinstance(item, str):
                if not item.strip():
                    return None, "Error: target labels must not be blank"
            elif isinstance(item, bool):
                item = bool(item)
            elif isinstance(item, (int, float)) and not isinstance(item, bool):
                if not np.isfinite(float(item)):
                    return None, "Error: target labels must be finite"
                item = float(item) if isinstance(item, float) else int(item)
            else:
                return None, "Error: target labels must be scalar string, boolean, or numeric values"
            parsed.append(item)
            if not any(type(existing) is type(item) and existing == item for existing in classes):
                classes.append(item)

        if len(classes) < 2:
            return None, "Error: target must contain at least 2 classes"
        return parsed, None

    def count_table(predictions, classes):
        rows = [[{"type": "String", "basicValue": "class"}, {"type": "String", "basicValue": "count"}]]
        for class_label in classes:
            count = sum(type(prediction) is type(class_label) and prediction == class_label for prediction in predictions)
            rows.append([cell(class_label), {"type": "Double", "basicValue": float(count)}])
        return rows

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        target_values, error = parse_target(target, data_np.shape[0])
        if error:
            return error

        penalty_value = str(penalty).strip().lower()
        if penalty_value not in {"l1", "l2"}:
            return "Error: penalty must be 'l1' or 'l2'"

        if float(C) <= 0:
            return "Error: C must be greater than 0"

        solver_value = str(solver).strip().lower()
        if solver_value not in {"lbfgs", "liblinear", "saga"}:
            return "Error: solver must be 'lbfgs', 'liblinear', or 'saga'"
        if penalty_value == "l1" and solver_value not in {"liblinear", "saga"}:
            return "Error: solver must be 'liblinear' or 'saga' when penalty is 'l1'"

        if int(max_iter) < 1:
            return "Error: max_iter must be at least 1"

        fitted = SklearnLogisticRegression(
            penalty=penalty_value,
            C=float(C),
            solver=solver_value,
            max_iter=int(max_iter),
            fit_intercept=bool(fit_intercept),
            random_state=None if random_state in (None, "") else int(random_state)
        ).fit(data_np, target_values)

        prediction_array = fitted.predict(data_np)
        predictions = [py(item) for item in prediction_array.tolist()]
        classes = [py(item) for item in fitted.classes_.tolist()]
        accuracy = float(np.mean([
            type(prediction) is type(actual) and prediction == actual
            for prediction, actual in zip(predictions, target_values)
        ]))

        return {
            "type": "Double",
            "basicValue": accuracy,
            "properties": {
                "accuracy": {"type": "Double", "basicValue": accuracy},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "class_count": {"type": "Double", "basicValue": float(len(classes))},
                "classes": {"type": "Array", "elements": col(classes)},
                "predictions": {"type": "Array", "elements": col(predictions)},
                "prediction_counts": {"type": "Array", "elements": count_table(predictions, classes)},
                "probabilities": {"type": "Array", "elements": mat(fitted.predict_proba(data_np).tolist())},
                "coefficients": {"type": "Array", "elements": mat(fitted.coef_.tolist())},
                "intercepts": {"type": "Array", "elements": col(fitted.intercept_.tolist())}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"


RF_CLASSIFY

Random forest classification combines the predictions of many decision trees, each trained on a bootstrap resample of the data and a random subset of the features. For a classification problem with B trees, the ensemble prediction \hat{y} is determined by majority vote:

\hat{y} = \text{mode}\{T_1(x), T_2(x), \dots, T_B(x)\}

This approach reduces model variance without significantly increasing bias, making it a strong default for nonlinear tabular classification. It also exposes feature-importance estimates based on impurity reduction at split nodes.
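The majority-vote rule above can be sketched in plain Python; the per-tree votes here are hypothetical, standing in for the outputs T_1(x), ..., T_B(x):

```python
from collections import Counter

def majority_vote(tree_predictions):
    # Ensemble prediction: the most common label among the trees' votes
    counts = Counter(tree_predictions)
    return counts.most_common(1)[0][0]

# Hypothetical votes from five trees for one sample
votes = ["hot", "cold", "hot", "hot", "cold"]
print(majority_vote(votes))  # "hot"
```

In scikit-learn the forest actually averages the trees' class-probability estimates rather than counting hard votes, but for well-separated data the two rules agree.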

This wrapper accepts rows as samples and a target supplied as a single row or single column. It returns training accuracy together with predicted labels, class counts, class probabilities, and fitted feature importances.

Excel Usage

=RF_CLASSIFY(data, target, n_estimators, rf_criterion, max_depth, min_samples_leaf, random_state)
  • data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
  • target (list[list], required): Target labels as a single row, single column, or scalar when only one sample is present.
  • n_estimators (int, optional, default: 100): Number of trees in the forest.
  • rf_criterion (str, optional, default: “gini”): Split quality measure used by each decision tree.
  • max_depth (int, optional, default: null): Maximum depth of each tree. Leave blank for unconstrained depth.
  • min_samples_leaf (int, optional, default: 1): Minimum number of samples required in each leaf.
  • random_state (int, optional, default: null): Integer seed for reproducible tree sampling. Leave blank for the estimator default.

Returns (dict): Excel data type containing training accuracy, predictions, probabilities, and fitted feature importances.

Example 1: Fit a random forest classifier for two string-labeled groups

Inputs:

data target n_estimators rf_criterion max_depth min_samples_leaf random_state
0 0 cold 25 gini 3 1 0
0 1 cold
1 0 cold
2 2 hot
2 3 hot
3 2 hot

Excel formula:

=RF_CLASSIFY({0,0;0,1;1,0;2,2;2,3;3,2}, {"cold";"cold";"cold";"hot";"hot";"hot"}, 25, "gini", 3, 1, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"cold"},{"type":"Double","basicValue":3}],[{"type":"String","basicValue":"hot"},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0.04},{"type":"Double","basicValue":0.96}],[{"type":"Double","basicValue":0.04},{"type":"Double","basicValue":0.96}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.48}],[{"type":"Double","basicValue":0.52}]]},"estimator_count":{"type":"Double","basicValue":25}}}

Example 2: Use entropy splits for one-dimensional numeric labels

Inputs:

data target n_estimators rf_criterion max_depth min_samples_leaf random_state
0 0 25 entropy 3 1 0
0.2 0
0.4 0
1.2 1
1.4 1
1.6 1

Excel formula:

=RF_CLASSIFY({0;0.2;0.4;1.2;1.4;1.6}, {0;0;0;1;1;1}, 25, "entropy", 3, 1, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}]]},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"estimator_count":{"type":"Double","basicValue":25}}}
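Example 2 can be cross-checked against scikit-learn directly (assuming scikit-learn is installed); tree structure can vary across library versions, so only the training accuracy is asserted here:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Same one-dimensional data and numeric labels as Example 2
X = np.array([[0.0], [0.2], [0.4], [1.2], [1.4], [1.6]])
y = np.array([0, 0, 0, 1, 1, 1])

model = RandomForestClassifier(n_estimators=25, criterion="entropy",
                               max_depth=3, min_samples_leaf=1,
                               random_state=0)
model.fit(X, y)
print(model.score(X, y))  # training accuracy
```

The two classes are separated by a gap between 0.4 and 1.2, so every tree can split them cleanly and the ensemble reaches training accuracy 1.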

Example 3: Fit a random forest classifier for three separated groups

Inputs:

data target n_estimators rf_criterion max_depth min_samples_leaf random_state
0 0 left 25 gini 3 1 0
0.2 0.1 left
4 4 center
4.2 3.9 center
8 0 right
8.2 0.1 right

Excel formula:

=RF_CLASSIFY({0,0;0.2,0.1;4,4;4.2,3.9;8,0;8.2,0.1}, {"left";"left";"center";"center";"right";"right"}, 25, "gini", 3, 1, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":3},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"right"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"right"}],[{"type":"String","basicValue":"right"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"center"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"left"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"right"},{"type":"Double","basicValue":2}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":0.04},{"type":"Double","basicValue":0.92},{"type":"Double","basicValue":0.04}],[{"type":"Double","basicValue":0.04},{"type":"Double","basicValue":0.88},{"type":"Double","basicValue":0.08}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0.04},{"type":"Double","basicValue":0.96}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0.12},{"type":"Double","basicValue":0.88}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":0.603273}],[{"type":"Double","basicValue":0.396727}]]},"estimator_count":{"type":"Double","basicValue":25}}}

Example 4: Flatten a single-row boolean target range for random forest classification

Inputs:

data target n_estimators rf_criterion max_depth min_samples_leaf random_state
0 false false false true true true 25 gini 3 1 0
0.3
0.6
1.4
1.7
2

Excel formula:

=RF_CLASSIFY({0;0.3;0.6;1.4;1.7;2}, {FALSE,FALSE,FALSE,TRUE,TRUE,TRUE}, 25, "gini", 3, 1, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}]]},"predictions":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Boolean","basicValue":false},{"type":"Double","basicValue":3}],[{"type":"Boolean","basicValue":true},{"type":"Double","basicValue":3}]]},"probabilities":{"type":"Array","elements":[[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1}]]},"feature_importances":{"type":"Array","elements":[[{"type":"Double","basicValue":1}]]},"estimator_count":{"type":"Double","basicValue":25}}}

Python Code

import numpy as np
from sklearn.ensemble import RandomForestClassifier as SklearnRandomForestClassifier

def rf_classify(data, target, n_estimators=100, rf_criterion='gini', max_depth=None, min_samples_leaf=1, random_state=None):
    """
    Fit a random forest classifier and return training predictions.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
        target (list[list]): Target labels as a single row, single column, or scalar when only one sample is present.
        n_estimators (int, optional): Number of trees in the forest. Default is 100.
        rf_criterion (str, optional): Split quality measure used by each decision tree. Valid options: 'gini', 'entropy', 'log_loss'. Default is 'gini'.
        max_depth (int, optional): Maximum depth of each tree. Leave blank for unconstrained depth. Default is None.
        min_samples_leaf (int, optional): Minimum number of samples required in each leaf. Default is 1.
        random_state (int, optional): Integer seed for reproducible tree sampling. Leave blank for the estimator default. Default is None.

    Returns:
        dict: Excel data type containing training accuracy, predictions, probabilities, and fitted feature importances.
    """
    def py(value):
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        return data_np, None

    def parse_target(value, sample_count):
        if not isinstance(value, list):
            labels = [value]
        elif not value:
            return None, "Error: target must be non-empty"
        elif all(not isinstance(item, list) for item in value):
            labels = value
        elif len(value) == 1:
            labels = value[0]
        elif all(isinstance(row, list) and len(row) == 1 for row in value):
            labels = [row[0] for row in value]
        else:
            return None, "Error: target must be a single row or column"

        if len(labels) != sample_count:
            return None, "Error: target length must match sample count"

        parsed = []
        classes = []
        for item in labels:
            item = py(item)
            if isinstance(item, str):
                if not item.strip():
                    return None, "Error: target labels must not be blank"
            elif isinstance(item, bool):
                item = bool(item)
            elif isinstance(item, (int, float)) and not isinstance(item, bool):
                if not np.isfinite(float(item)):
                    return None, "Error: target labels must be finite"
                item = float(item) if isinstance(item, float) else int(item)
            else:
                return None, "Error: target labels must be scalar string, boolean, or numeric values"
            parsed.append(item)
            if not any(type(existing) is type(item) and existing == item for existing in classes):
                classes.append(item)

        if len(classes) < 2:
            return None, "Error: target must contain at least 2 classes"
        return parsed, None

    def count_table(predictions, classes):
        rows = [[{"type": "String", "basicValue": "class"}, {"type": "String", "basicValue": "count"}]]
        for class_label in classes:
            count = sum(type(prediction) is type(class_label) and prediction == class_label for prediction in predictions)
            rows.append([cell(class_label), {"type": "Double", "basicValue": float(count)}])
        return rows

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        target_values, error = parse_target(target, data_np.shape[0])
        if error:
            return error

        if int(n_estimators) < 1:
            return "Error: n_estimators must be at least 1"
        criterion_value = str(rf_criterion).strip().lower()
        if criterion_value not in {"gini", "entropy", "log_loss"}:
            return "Error: rf_criterion must be 'gini', 'entropy', or 'log_loss'"
        depth = None if max_depth in (None, "") else int(max_depth)
        if depth is not None and depth < 1:
            return "Error: max_depth must be at least 1 when provided"
        if int(min_samples_leaf) < 1:
            return "Error: min_samples_leaf must be at least 1"

        fitted = SklearnRandomForestClassifier(
            n_estimators=int(n_estimators),
            criterion=criterion_value,
            max_depth=depth,
            min_samples_leaf=int(min_samples_leaf),
            random_state=None if random_state in (None, "") else int(random_state)
        ).fit(data_np, target_values)

        prediction_array = fitted.predict(data_np)
        predictions = [py(item) for item in prediction_array.tolist()]
        classes = [py(item) for item in fitted.classes_.tolist()]
        accuracy = float(np.mean([
            type(prediction) is type(actual) and prediction == actual
            for prediction, actual in zip(predictions, target_values)
        ]))

        return {
            "type": "Double",
            "basicValue": accuracy,
            "properties": {
                "accuracy": {"type": "Double", "basicValue": accuracy},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "class_count": {"type": "Double", "basicValue": float(len(classes))},
                "classes": {"type": "Array", "elements": col(classes)},
                "predictions": {"type": "Array", "elements": col(predictions)},
                "prediction_counts": {"type": "Array", "elements": count_table(predictions, classes)},
                "probabilities": {"type": "Array", "elements": mat(fitted.predict_proba(data_np).tolist())},
                "feature_importances": {"type": "Array", "elements": col(fitted.feature_importances_.tolist())},
                "estimator_count": {"type": "Double", "basicValue": float(len(fitted.estimators_))}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"

Online Calculator

2D array of numeric feature data with rows as samples and columns as features.
Target labels as a single row, single column, or scalar when only one sample is present.
Number of trees in the forest.
Split quality measure used by each decision tree.
Maximum depth of each tree. Leave blank for unconstrained depth.
Minimum number of samples required in each leaf.
Integer seed for reproducible tree sampling. Leave blank for the estimator default.

RIDGE_CLASSIFY

Ridge classification converts binary target labels to \{-1, 1\} and treats the task as a regression problem with L2 regularization; multiclass targets are handled as a multi-output regression with one column per class. The weights w are found by minimizing the penalized squared error:

\min_{w} \|Xw - y\|^2 + \alpha \|w\|^2

The predicted class is the one with the largest linear decision score. It is a fast baseline for dense tabular classification tasks when calibrated probabilities are not required.
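The objective above can be sketched in closed form with NumPy. This is an illustration only, not the RIDGE_CLASSIFY wrapper itself (which delegates to scikit-learn's RidgeClassifier), and `ridge_classify_sketch` is a hypothetical helper: encode binary labels as -1/+1, solve the regularized normal equations for w, and classify by the sign of the decision score. Unlike scikit-learn's estimator, this sketch also penalizes the intercept, so its scores differ slightly from the examples below.

```python
import numpy as np

def ridge_classify_sketch(X, labels, alpha=1.0):
    # Binary case only: map the two class labels onto -1 and +1.
    X = np.asarray(X, dtype=float)
    classes = sorted(set(labels))
    y = np.where(np.array(labels) == classes[1], 1.0, -1.0)
    # Append an intercept column so the decision function is Xb @ w.
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    # w = (X^T X + alpha*I)^{-1} X^T y minimizes ||Xw - y||^2 + alpha*||w||^2.
    w = np.linalg.solve(Xb.T @ Xb + alpha * np.eye(Xb.shape[1]), Xb.T @ y)
    scores = Xb @ w
    # Positive score -> the label encoded as +1; negative -> the other.
    preds = [classes[1] if s > 0 else classes[0] for s in scores]
    return preds, scores

preds, scores = ridge_classify_sketch(
    [[0, 0], [0, 1], [1, 0], [2, 2], [2, 3], [3, 2]],
    ["cold", "cold", "cold", "hot", "hot", "hot"])
# Separable data: the predictions recover the training labels.
```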

This wrapper accepts rows as samples and a target supplied as a single row or single column. It returns training accuracy together with predicted labels, class counts, decision scores, and fitted coefficient arrays.

Excel Usage

=RIDGE_CLASSIFY(data, target, alpha, ridge_solver, fit_intercept, tol, random_state)
  • data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
  • target (list[list], required): Target labels as a single row, single column, or scalar when only one sample is present.
  • alpha (float, optional, default: 1): L2 regularization strength applied to the classifier.
  • ridge_solver (str, optional, default: “auto”): Linear algebra solver used to fit the classifier.
  • fit_intercept (bool, optional, default: true): Whether to include an intercept term in the linear decision function.
  • tol (float, optional, default: 0.0001): Convergence tolerance for iterative solvers.
  • random_state (int, optional, default: null): Integer seed for stochastic solvers. Leave blank for the estimator default.

Returns (dict): Excel data type containing training accuracy, predictions, decision scores, and fitted coefficient arrays.

Example 1: Fit ridge classification for two string-labeled classes

Inputs:

data target alpha ridge_solver fit_intercept tol random_state
0 0 cold 1 auto true 0.0001 0
0 1 cold
1 0 cold
2 2 hot
2 3 hot
3 2 hot

Excel formula:

=RIDGE_CLASSIFY({0,0;0,1;1,0;2,2;2,3;3,2}, {"cold";"cold";"cold";"hot";"hot";"hot"}, 1, "auto", TRUE, 0.0001, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"cold"},{"type":"Double","basicValue":3}],[{"type":"String","basicValue":"hot"},{"type":"Double","basicValue":3}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.17073}],[{"type":"Double","basicValue":-0.731707}],[{"type":"Double","basicValue":-0.731707}],[{"type":"Double","basicValue":0.585366}],[{"type":"Double","basicValue":1.02439}],[{"type":"Double","basicValue":1.02439}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":0.439024},{"type":"Double","basicValue":0.439024}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.17073}]]}}}

Example 2: Flatten a single-row numeric target for ridge classification

Inputs:

data target alpha ridge_solver fit_intercept tol random_state
0 0 0 0 1 1 1 1 auto true 0.0001 0
0.2
0.4
1.2
1.4
1.6

Excel formula:

=RIDGE_CLASSIFY({0;0.2;0.4;1.2;1.4;1.6}, {0,0,0,1,1,1}, 1, "auto", TRUE, 0.0001, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}]]},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.86747}],[{"type":"Double","basicValue":-0.650602}],[{"type":"Double","basicValue":-0.433735}],[{"type":"Double","basicValue":0.433735}],[{"type":"Double","basicValue":0.650602}],[{"type":"Double","basicValue":0.86747}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":1.08434}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.86747}]]}}}

Example 3: Fit ridge classification for three separated groups

Inputs:

data target alpha ridge_solver fit_intercept tol random_state
0 0 left 1 auto true 0.0001 0
0.2 0.1 left
4 4 center
4.2 3.9 center
8 0 right
8.2 0.1 right

Excel formula:

=RIDGE_CLASSIFY({0,0;0.2,0.1;4,4;4.2,3.9;8,0;8.2,0.1}, {"left";"left";"center";"center";"right";"right"}, 1, "auto", TRUE, 0.0001, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":3},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"right"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"right"}],[{"type":"String","basicValue":"right"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"center"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"left"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"right"},{"type":"Double","basicValue":2}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.992335},{"type":"Double","basicValue":1.00431},{"type":"Double","basicValue":-1.01198}],[{"type":"Double","basicValue":-0.943513},{"type":"Double","basicValue":0.930727},{"type":"Double","basicValue":-0.987214}],[{"type":"Double","basicValue":0.960875},{"type":"Double","basicValue":-0.955539},{"type":"Double","basicValue":-1.00534}],[{"type":"Double","basicValue":0.912022},{"type":"Double","basicValue":-0.980309},{"type":"Double","basicValue":-0.931713}],[{"type":"Double","basicValue":-0.992936},{"type":"Double","basicValue":-0.962803},{"type":"Double","basicValue":0.955739}],[{"type":"Double","basicValue":-0.944113},{"type":"Double","basicValue":-1.03639},{"type":"Double","basicValue":0.980501}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.0000750657},{"type":"Double","basicValue":0.488378}],[{"type":"Double","basicValue":-0.245889},{"type":"Double","basicValue":-0.244073}],[{"type":"Double","basicValue":0.245964},{"type":"Double","basicValue":-0.244304}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.992335}],[{"type":"Double","basicValue":1.00431}],[{"type":"Double","basicValue":-1.01198}]]}}}

Example 4: Fit ridge classification with boolean target labels

Inputs:

data target alpha ridge_solver fit_intercept tol random_state
0 false 1 auto true 0.0001 0
0.3 false
0.6 false
1.4 true
1.7 true
2 true

Excel formula:

=RIDGE_CLASSIFY({0;0.3;0.6;1.4;1.7;2}, {FALSE;FALSE;FALSE;TRUE;TRUE;TRUE}, 1, "auto", TRUE, 0.0001, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}]]},"predictions":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Boolean","basicValue":false},{"type":"Double","basicValue":3}],[{"type":"Boolean","basicValue":true},{"type":"Double","basicValue":3}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.976744}],[{"type":"Double","basicValue":-0.683721}],[{"type":"Double","basicValue":-0.390698}],[{"type":"Double","basicValue":0.390698}],[{"type":"Double","basicValue":0.683721}],[{"type":"Double","basicValue":0.976744}]]},"coefficients":{"type":"Array","elements":[[{"type":"Double","basicValue":0.976744}]]},"intercepts":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.976744}]]}}}

Python Code

import numpy as np
from sklearn.linear_model import RidgeClassifier as SklearnRidgeClassifier

def ridge_classify(data, target, alpha=1, ridge_solver='auto', fit_intercept=True, tol=0.0001, random_state=None):
    """
    Fit a ridge classifier and return training predictions with decision scores.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RidgeClassifier.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
        target (list[list]): Target labels as a single row, single column, or scalar when only one sample is present.
        alpha (float, optional): L2 regularization strength applied to the classifier. Default is 1.
        ridge_solver (str, optional): Linear algebra solver used to fit the classifier. Valid options: 'auto', 'svd', 'lsqr', 'sag', 'saga'. Default is 'auto'.
        fit_intercept (bool, optional): Whether to include an intercept term in the linear decision function. Default is True.
        tol (float, optional): Convergence tolerance for iterative solvers. Default is 0.0001.
        random_state (int, optional): Integer seed for stochastic solvers. Leave blank for the estimator default. Default is None.

    Returns:
        dict: Excel data type containing training accuracy, predictions, decision scores, and fitted coefficient arrays.
    """
    def py(value):
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        return data_np, None

    def parse_target(value, sample_count):
        if not isinstance(value, list):
            labels = [value]
        elif not value:
            return None, "Error: target must be non-empty"
        elif all(not isinstance(item, list) for item in value):
            labels = value
        elif len(value) == 1:
            labels = value[0]
        elif all(isinstance(row, list) and len(row) == 1 for row in value):
            labels = [row[0] for row in value]
        else:
            return None, "Error: target must be a single row or column"

        if len(labels) != sample_count:
            return None, "Error: target length must match sample count"

        parsed = []
        classes = []
        for item in labels:
            item = py(item)
            if isinstance(item, str):
                if not item.strip():
                    return None, "Error: target labels must not be blank"
            elif isinstance(item, bool):
                item = bool(item)
            elif isinstance(item, (int, float)) and not isinstance(item, bool):
                if not np.isfinite(float(item)):
                    return None, "Error: target labels must be finite"
                item = float(item) if isinstance(item, float) else int(item)
            else:
                return None, "Error: target labels must be scalar string, boolean, or numeric values"
            parsed.append(item)
            if not any(type(existing) is type(item) and existing == item for existing in classes):
                classes.append(item)

        if len(classes) < 2:
            return None, "Error: target must contain at least 2 classes"
        return parsed, None

    def count_table(predictions, classes):
        rows = [[{"type": "String", "basicValue": "class"}, {"type": "String", "basicValue": "count"}]]
        for class_label in classes:
            count = sum(type(prediction) is type(class_label) and prediction == class_label for prediction in predictions)
            rows.append([cell(class_label), {"type": "Double", "basicValue": float(count)}])
        return rows

    def score_rows(values):
        values = np.asarray(values)
        return [[float(value)] for value in values.tolist()] if values.ndim == 1 else values.tolist()

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        target_values, error = parse_target(target, data_np.shape[0])
        if error:
            return error

        solver_value = str(ridge_solver).strip().lower()
        if solver_value not in {"auto", "svd", "lsqr", "sag", "saga"}:
            return "Error: ridge_solver must be 'auto', 'svd', 'lsqr', 'sag', or 'saga'"
        if float(alpha) < 0:
            return "Error: alpha must be non-negative"
        if float(tol) <= 0:
            return "Error: tol must be greater than 0"

        fitted = SklearnRidgeClassifier(
            alpha=float(alpha),
            solver=solver_value,
            fit_intercept=bool(fit_intercept),
            tol=float(tol),
            random_state=None if random_state in (None, "") else int(random_state)
        ).fit(data_np, target_values)

        prediction_array = fitted.predict(data_np)
        predictions = [py(item) for item in prediction_array.tolist()]
        classes = [py(item) for item in fitted.classes_.tolist()]
        accuracy = float(np.mean([
            type(prediction) is type(actual) and prediction == actual
            for prediction, actual in zip(predictions, target_values)
        ]))

        return {
            "type": "Double",
            "basicValue": accuracy,
            "properties": {
                "accuracy": {"type": "Double", "basicValue": accuracy},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "class_count": {"type": "Double", "basicValue": float(len(classes))},
                "classes": {"type": "Array", "elements": col(classes)},
                "predictions": {"type": "Array", "elements": col(predictions)},
                "prediction_counts": {"type": "Array", "elements": count_table(predictions, classes)},
                "decision_scores": {"type": "Array", "elements": mat(score_rows(fitted.decision_function(data_np)))},
                "coefficients": {"type": "Array", "elements": mat(np.atleast_2d(fitted.coef_).tolist())},
                "intercepts": {"type": "Array", "elements": col(np.atleast_1d(fitted.intercept_).tolist())}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"

Online Calculator

2D array of numeric feature data with rows as samples and columns as features.
Target labels as a single row, single column, or scalar when only one sample is present.
L2 regularization strength applied to the classifier.
Linear algebra solver used to fit the classifier.
Whether to include an intercept term in the linear decision function.
Convergence tolerance for iterative solvers.
Integer seed for stochastic solvers. Leave blank for the estimator default.

SVC_CLASSIFY

Support vector classification (SVC) fits separating hyperplanes that maximize the margin between classes. Through the “kernel trick,” SVC implicitly maps samples into a higher-dimensional feature space and fits a linear boundary there, which corresponds to a curved decision boundary in the original input space. The decision function for a sample x is:

f(x) = \sum_{i=1}^n \alpha_i y_i K(x_i, x) + b

where K(x_i, x) is the kernel function. For the common Radial Basis Function (RBF) kernel, it is defined as:

K(x, x') = \exp(-\gamma \|x - x'\|^2)

This wrapper accepts rows as samples and a target supplied as a single row or single column. It returns training accuracy together with predicted labels, class counts, decision scores, and support-vector summary properties.

Excel Usage

=SVC_CLASSIFY(data, target, C, svc_kernel, degree, svc_gamma, tol, random_state)
  • data (list[list], required): 2D array of numeric feature data with rows as samples and columns as features.
  • target (list[list], required): Target labels as a single row, single column, or scalar when only one sample is present.
  • C (float, optional, default: 1): Inverse regularization strength. Smaller values apply stronger regularization.
  • svc_kernel (str, optional, default: “rbf”): Kernel function used to build the separating boundary.
  • degree (int, optional, default: 3): Polynomial degree when the polynomial kernel is used.
  • svc_gamma (str, optional, default: “scale”): Gamma scaling mode for non-linear kernels.
  • tol (float, optional, default: 0.001): Convergence tolerance for the optimizer.
  • random_state (int, optional, default: null): Integer seed for operations that use randomness. Leave blank for the estimator default.

Returns (dict): Excel data type containing training accuracy, predictions, decision scores, and support-vector summary properties.

Example 1: Fit an RBF support vector classifier for two string-labeled groups

Inputs:

data target C svc_kernel degree svc_gamma tol random_state
0 0 cold 1 rbf 3 scale 0.001 0
0 1 cold
1 0 cold
2 2 hot
2 3 hot
3 2 hot

Excel formula:

=SVC_CLASSIFY({0,0;0,1;1,0;2,2;2,3;3,2}, {"cold";"cold";"cold";"hot";"hot";"hot"}, 1, "rbf", 3, "scale", 0.001, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"cold"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}],[{"type":"String","basicValue":"hot"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"cold"},{"type":"Double","basicValue":3}],[{"type":"String","basicValue":"hot"},{"type":"Double","basicValue":3}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-0.999733}],[{"type":"Double","basicValue":-0.999851}],[{"type":"Double","basicValue":-0.999782}],[{"type":"Double","basicValue":0.999772}],[{"type":"Double","basicValue":0.999772}],[{"type":"Double","basicValue":0.999822}]]},"support_vector_count":{"type":"Double","basicValue":6},"support_counts":{"type":"Array","elements":[[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":3}]]},"support_indices":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":4}],[{"type":"Double","basicValue":5}]]}}}

Example 2: Fit a linear support vector classifier for one-dimensional numeric labels

Inputs:

data target C svc_kernel degree svc_gamma tol random_state
0 0 1 linear 3 scale 0.001 0
0.2 0
0.4 0
1.2 1
1.4 1
1.6 1

Excel formula:

=SVC_CLASSIFY({0;0.2;0.4;1.2;1.4;1.6}, {0;0;0;1;1;1}, 1, "linear", 3, "scale", 0.001, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}]]},"predictions":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.33333}],[{"type":"Double","basicValue":-1}],[{"type":"Double","basicValue":-0.666667}],[{"type":"Double","basicValue":0.666667}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1.33333}]]},"support_vector_count":{"type":"Double","basicValue":4},"support_counts":{"type":"Array","elements":[[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":2}]]},"support_indices":{"type":"Array","elements":[[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":4}]]}}}

Example 3: Fit a support vector classifier for three separated groups

Inputs:

data target C svc_kernel degree svc_gamma tol random_state
0 0 left 1 rbf 3 scale 0.001 0
0.2 0.1 left
4 4 center
4.2 3.9 center
8 0 right
8.2 0.1 right

Excel formula:

=SVC_CLASSIFY({0,0;0.2,0.1;4,4;4.2,3.9;8,0;8.2,0.1}, {"left";"left";"center";"center";"right";"right"}, 1, "rbf", 3, "scale", 0.001, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":2},"class_count":{"type":"Double","basicValue":3},"classes":{"type":"Array","elements":[[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"right"}]]},"predictions":{"type":"Array","elements":[[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"left"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"center"}],[{"type":"String","basicValue":"right"}],[{"type":"String","basicValue":"right"}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"String","basicValue":"center"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"left"},{"type":"Double","basicValue":2}],[{"type":"String","basicValue":"right"},{"type":"Double","basicValue":2}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":0.847182},{"type":"Double","basicValue":2.22222},{"type":"Double","basicValue":-0.178542}],[{"type":"Double","basicValue":0.851956},{"type":"Double","basicValue":2.22115},{"type":"Double","basicValue":-0.17988}],[{"type":"Double","basicValue":2.22194},{"type":"Double","basicValue":0.835983},{"type":"Double","basicValue":-0.168615}],[{"type":"Double","basicValue":2.22125},{"type":"Double","basicValue":-0.167408},{"type":"Double","basicValue":0.836299}],[{"type":"Double","basicValue":0.850081},{"type":"Double","basicValue":-0.17988},{"type":"Double","basicValue":2.22184}],[{"type":"Double","basicValue":0.848998},{"type":"Double","basicValue":-0.179853},{"type":"Double","basicValue":2.22222}]]},"support_vector_count":{"type":"Double","basicValue":6},"support_counts":{"type":"Array","elements":[[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":2}]]},"support_indices":{"type":"Array","elements":[[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":4}],[{"type":"Double","basicValue":5}]]}}}
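With three classes, `decision_scores` has one column per class because scikit-learn's SVC aggregates its one-vs-one votes into per-class scores when `decision_function_shape` is `'ovr'` (the default). A minimal sketch of that shape, assuming scikit-learn and NumPy are installed:

```python
import numpy as np
from sklearn.svm import SVC

# Same inputs as Example 3: three well-separated groups.
X = np.array([[0, 0], [0.2, 0.1], [4, 4], [4.2, 3.9], [8, 0], [8.2, 0.1]])
y = ["left", "left", "center", "center", "right", "right"]

clf = SVC(C=1.0, kernel="rbf", gamma="scale", tol=0.001).fit(X, y)

scores = clf.decision_function(X)
print(scores.shape)         # one row per sample, one column per class
print(list(clf.classes_))   # columns follow sorted class order
```

The sorted class order (`center`, `left`, `right`) is why `classes` and the score columns appear alphabetically rather than in the order the labels were supplied.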

Example 4: Flatten a single-row boolean target range for support vector classification

Inputs:

data target C svc_kernel degree svc_gamma tol random_state
0 false false false true true true 1 linear 3 scale 0.001 0
0.3
0.6
1.4
1.7
2

Excel formula:

=SVC_CLASSIFY({0;0.3;0.6;1.4;1.7;2}, {FALSE,FALSE,FALSE,TRUE,TRUE,TRUE}, 1, "linear", 3, "scale", 0.001, 0)

Expected output:

{"type":"Double","basicValue":1,"properties":{"accuracy":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":1},"class_count":{"type":"Double","basicValue":2},"classes":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}]]},"predictions":{"type":"Array","elements":[[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":false}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}],[{"type":"Boolean","basicValue":true}]]},"prediction_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"class"},{"type":"String","basicValue":"count"}],[{"type":"Boolean","basicValue":false},{"type":"Double","basicValue":3}],[{"type":"Boolean","basicValue":true},{"type":"Double","basicValue":3}]]},"decision_scores":{"type":"Array","elements":[[{"type":"Double","basicValue":-1.42857}],[{"type":"Double","basicValue":-1}],[{"type":"Double","basicValue":-0.571429}],[{"type":"Double","basicValue":0.571428}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1.42857}]]},"support_vector_count":{"type":"Double","basicValue":4},"support_counts":{"type":"Array","elements":[[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":2}]]},"support_indices":{"type":"Array","elements":[[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":2}],[{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":4}]]}}}
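The single-row target in Example 4 is accepted because the function flattens a one-row or one-column 2D range into a flat label list before fitting. This simplified sketch mirrors that rule (`flatten_target` is a hypothetical helper, not part of the function):

```python
def flatten_target(value):
    # Simplified mirror of the parse rule: a single row is taken as-is,
    # a single column is collapsed to one label per row.
    if len(value) == 1:                      # one row, e.g. [[F, F, T]]
        return value[0]
    if all(len(row) == 1 for row in value):  # one column, e.g. [[F], [F], [T]]
        return [row[0] for row in value]
    raise ValueError("target must be a single row or column")

print(flatten_target([[False, False, True]]))      # row form
print(flatten_target([[False], [False], [True]]))  # column form: same result
```

Either orientation yields the same label list, so row and column targets produce identical fits.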

Python Code

import numpy as np
from sklearn.svm import SVC as SklearnSVC

def svc_classify(data, target, C=1, svc_kernel='rbf', degree=3, svc_gamma='scale', tol=0.001, random_state=None):
    """
    Fit a support vector classifier and return training predictions.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of numeric feature data with rows as samples and columns as features.
        target (list[list]): Target labels as a single row, single column, or scalar when only one sample is present.
        C (float, optional): Inverse regularization strength. Smaller values apply stronger regularization. Default is 1.
        svc_kernel (str, optional): Kernel function used to build the separating boundary. Valid options: 'rbf', 'linear', 'poly', 'sigmoid'. Default is 'rbf'.
        degree (int, optional): Polynomial degree when the polynomial kernel is used. Default is 3.
        svc_gamma (str, optional): Gamma scaling mode for non-linear kernels. Valid options: 'scale', 'auto'. Default is 'scale'.
        tol (float, optional): Convergence tolerance for the optimizer. Default is 0.001.
        random_state (int, optional): Integer seed for operations that use randomness. Leave blank for the estimator default. Default is None.

    Returns:
        dict: Excel data type containing training accuracy, predictions, decision scores, and support-vector summary properties.
    """
    def py(value):
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        return data_np, None

    def parse_target(value, sample_count):
        if not isinstance(value, list):
            labels = [value]
        elif not value:
            return None, "Error: target must be non-empty"
        elif all(not isinstance(item, list) for item in value):
            labels = value
        elif len(value) == 1:
            labels = value[0]
        elif all(isinstance(row, list) and len(row) == 1 for row in value):
            labels = [row[0] for row in value]
        else:
            return None, "Error: target must be a single row or column"

        if len(labels) != sample_count:
            return None, "Error: target length must match sample count"

        parsed = []
        classes = []
        for item in labels:
            item = py(item)
            if isinstance(item, str):
                if not item.strip():
                    return None, "Error: target labels must not be blank"
            elif isinstance(item, bool):
                item = bool(item)
            elif isinstance(item, (int, float)) and not isinstance(item, bool):
                if not np.isfinite(float(item)):
                    return None, "Error: target labels must be finite"
                item = float(item) if isinstance(item, float) else int(item)
            else:
                return None, "Error: target labels must be scalar string, boolean, or numeric values"
            parsed.append(item)
            if not any(type(existing) is type(item) and existing == item for existing in classes):
                classes.append(item)

        if len(classes) < 2:
            return None, "Error: target must contain at least 2 classes"
        return parsed, None

    def count_table(predictions, classes):
        rows = [[{"type": "String", "basicValue": "class"}, {"type": "String", "basicValue": "count"}]]
        for class_label in classes:
            count = sum(type(prediction) is type(class_label) and prediction == class_label for prediction in predictions)
            rows.append([cell(class_label), {"type": "Double", "basicValue": float(count)}])
        return rows

    def score_rows(values):
        values = np.asarray(values)
        return [[float(value)] for value in values.tolist()] if values.ndim == 1 else values.tolist()

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        target_values, error = parse_target(target, data_np.shape[0])
        if error:
            return error

        if float(C) <= 0:
            return "Error: C must be greater than 0"
        kernel_value = str(svc_kernel).strip().lower()
        if kernel_value not in {"rbf", "linear", "poly", "sigmoid"}:
            return "Error: svc_kernel must be 'rbf', 'linear', 'poly', or 'sigmoid'"
        if int(degree) < 1:
            return "Error: degree must be at least 1"
        gamma_value = str(svc_gamma).strip().lower()
        if gamma_value not in {"scale", "auto"}:
            return "Error: svc_gamma must be 'scale' or 'auto'"
        if float(tol) <= 0:
            return "Error: tol must be greater than 0"

        fitted = SklearnSVC(
            C=float(C),
            kernel=kernel_value,
            degree=int(degree),
            gamma=gamma_value,
            tol=float(tol),
            random_state=None if random_state in (None, "") else int(random_state)
        ).fit(data_np, target_values)

        prediction_array = fitted.predict(data_np)
        predictions = [py(item) for item in prediction_array.tolist()]
        classes = [py(item) for item in fitted.classes_.tolist()]
        accuracy = float(np.mean([
            type(prediction) is type(actual) and prediction == actual
            for prediction, actual in zip(predictions, target_values)
        ]))

        return {
            "type": "Double",
            "basicValue": accuracy,
            "properties": {
                "accuracy": {"type": "Double", "basicValue": accuracy},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "class_count": {"type": "Double", "basicValue": float(len(classes))},
                "classes": {"type": "Array", "elements": col(classes)},
                "predictions": {"type": "Array", "elements": col(predictions)},
                "prediction_counts": {"type": "Array", "elements": count_table(predictions, classes)},
                "decision_scores": {"type": "Array", "elements": mat(score_rows(fitted.decision_function(data_np)))},
                "support_vector_count": {"type": "Double", "basicValue": float(len(fitted.support_))},
                "support_counts": {"type": "Array", "elements": col(fitted.n_support_.tolist())},
                "support_indices": {"type": "Array", "elements": col(fitted.support_.tolist())}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"
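One detail worth noting in the accuracy and count logic above: labels are compared with a type check in addition to equality, because Python treats `True == 1` as true, which would conflate boolean and numeric classes. A minimal illustration (`same_label` is a hypothetical helper named here for clarity):

```python
def same_label(a, b):
    # Same type-strict comparison used in the accuracy and count_table logic.
    return type(a) is type(b) and a == b

print(True == 1)             # True: plain equality conflates bool and int
print(same_label(True, 1))   # False: type-strict comparison keeps them apart
print(same_label(1.0, 1.0))  # True: identical type and value
```

This is why a boolean target such as Example 4's never matches a numeric class label of 1, even though the two compare equal under `==`.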
