NMF_FACTOR

Nonnegative matrix factorization decomposes a nonnegative data matrix into a product of two smaller nonnegative matrices. It is commonly used for topic-like count matrices, additive parts-based representations, and compact nonnegative latent factors.

The decomposition approximates the data matrix V:

V \approx W H

where W contains the nonnegative factor weights (one row per sample, one column per component) and H contains the nonnegative components (one row per component, one column per feature).
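The shapes involved can be illustrated with a small NumPy sketch. The matrices below are hand-picked values for illustration only, not output from the wrapper: with 3 samples, 4 features, and 2 components, W is 3-by-2 and H is 2-by-4.

```python
import numpy as np

# Hypothetical factors chosen by hand for illustration (not fitted values).
W = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [0.0, 3.0]])             # 3 samples x 2 components
H = np.array([[1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])   # 2 components x 4 features

V = W @ H                              # 3 samples x 4 features
print(V.shape)
print(V)
```

Because both factors are nonnegative, every entry of the product is a sum of nonnegative terms, which is what gives the factorization its additive, parts-based reading.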

This wrapper accepts rows as samples and columns as features. It returns the sample-by-component weight matrix together with the learned component matrix, reconstruction error, and iteration count for the fitted factorization.
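As a rough sketch of what the wrapper delegates to, the following calls scikit-learn's NMF estimator directly on a rank-one matrix. The variable names are illustrative; the estimator attributes used (`components_`, `reconstruction_err_`, `n_iter_`) are the ones the Python source further down reads.

```python
import numpy as np
from sklearn.decomposition import NMF

# Rows are samples, columns are features (every row is a multiple of [1, 2]).
V = np.array([[1, 2], [2, 4], [3, 6], [4, 8], [5, 10]], dtype=float)

model = NMF(n_components=1, init="nndsvda", solver="cd", max_iter=600, random_state=0)
W = model.fit_transform(V)   # sample-by-component weights, shape (5, 1)
H = model.components_        # component-by-feature matrix, shape (1, 2)

print(W.shape, H.shape)
print(model.reconstruction_err_)  # Frobenius norm of V - W @ H
print(model.n_iter_)              # number of iterations actually run
```

Since this matrix has rank one, a single component should reproduce it almost exactly, so the reported error is expected to be close to zero.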

Excel Usage

=NMF_FACTOR(data, n_components, nmf_init, nmf_solver, max_iter, random_state)
  • data (list[list], required): 2D array of nonnegative numeric input data with rows as samples and columns as features.
  • n_components (int, optional, default: 2): Number of latent components to learn.
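  • nmf_init (str, optional, default: "nndsvda"): Initialization strategy for the nonnegative factors. Valid options: "nndsvda", "nndsvd", "nndsvdar", "random".
  • nmf_solver (str, optional, default: "cd"): Numerical solver used to optimize the factorization. Valid options: "cd" (coordinate descent) or "mu" (multiplicative update).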
  • max_iter (int, optional, default: 400): Maximum number of optimization iterations.
  • random_state (int, optional, default: null): Integer seed used by randomized initialization paths. Leave blank for the estimator default.

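Returns (dict): Excel data type whose basic value is the reconstruction error, with properties for the factor weights, learned components, sample, feature, and component counts, and the iteration count. On invalid input the function returns an error message string instead.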
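The returned value uses the nested typed-cell layout visible in the expected outputs of the examples below. One possible way to flatten it into plain Python values is sketched here; `unwrap` is a hypothetical helper, not part of the wrapper, and the `result` dict is a hand-written stand-in with the same layout.

```python
def unwrap(node):
    """Convert a typed cell, array, or entity dict into plain Python values."""
    if node.get("type") == "Array":
        return [[unwrap(item) for item in row] for row in node["elements"]]
    if "properties" in node:
        return {key: unwrap(value) for key, value in node["properties"].items()}
    return node["basicValue"]

# Hand-written stand-in that mimics the wrapper's result layout.
result = {
    "type": "Double",
    "basicValue": 0.5,
    "properties": {
        "reconstruction_error": {"type": "Double", "basicValue": 0.5},
        "weights": {"type": "Array", "elements": [
            [{"type": "Double", "basicValue": 1.0}],
            [{"type": "Double", "basicValue": 2.0}],
        ]},
    },
}

plain = unwrap(result)
print(plain["reconstruction_error"])  # 0.5
print(plain["weights"])               # [[1.0], [2.0]]
```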

Example 1: Factor a small count-like matrix into two nonnegative components

Inputs:

data     n_components  nmf_init  nmf_solver  max_iter  random_state
4 1 0 0  2             nndsvda   cd          600       0
5 1 0 0
0 0 3 4
0 0 4 5
3 1 0 0
0 0 5 4

Excel formula:

=NMF_FACTOR({4,1,0,0;5,1,0,0;0,0,3,4;0,0,4,5;3,1,0,0;0,0,5,4}, 2, "nndsvda", "cd", 600, 0)

Expected output:

{"type":"Double","basicValue":1.22303,"properties":{"reconstruction_error":{"type":"Double","basicValue":1.22303},"component_count":{"type":"Double","basicValue":2},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":4},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.23956}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.53187}],[{"type":"Double","basicValue":0.32396},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0.416078},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0.947249}],[{"type":"Double","basicValue":0.412981},{"type":"Double","basicValue":3.06249e-8}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0},{"type":"Double","basicValue":10.4789},{"type":"Double","basicValue":11.2079}],[{"type":"Double","basicValue":3.23389},{"type":"Double","basicValue":0.777898},{"type":"Double","basicValue":4.30763e-9},{"type":"Double","basicValue":0}]]},"n_iter":{"type":"Double","basicValue":7}}}
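The reported reconstruction error can be cross-checked by hand. The sketch below copies the rounded weights and components from the expected output above and computes the Frobenius norm of the residual, which is how scikit-learn defines `reconstruction_err_` for its default loss. Because the inputs are rounded, the result should only agree with 1.22303 to a few decimal places.

```python
import numpy as np

# Input data from Example 1.
V = np.array([[4, 1, 0, 0], [5, 1, 0, 0], [0, 0, 3, 4],
              [0, 0, 4, 5], [3, 1, 0, 0], [0, 0, 5, 4]], dtype=float)

# Rounded weights and components copied from the expected output.
W = np.array([[0, 1.23956], [0, 1.53187], [0.32396, 0],
              [0.416078, 0], [0, 0.947249], [0.412981, 3.06249e-8]])
H = np.array([[0, 0, 10.4789, 11.2079],
              [3.23389, 0.777898, 4.30763e-9, 0]])

# np.linalg.norm on a 2D array defaults to the Frobenius norm.
error = np.linalg.norm(V - W @ H)
print(error)
```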

Example 2: Fit one nonnegative factor on a one-trend matrix

Inputs:

data  n_components  nmf_init  nmf_solver  max_iter  random_state
1 2   1             nndsvda   cd          600       0
2 4
3 6
4 8
5 10

Excel formula:

=NMF_FACTOR({1,2;2,4;3,6;4,8;5,10}, 1, "nndsvda", "cd", 600, 0)

Expected output:

{"type":"Double","basicValue":2.49492e-15,"properties":{"reconstruction_error":{"type":"Double","basicValue":2.49492e-15},"component_count":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":5},"feature_count":{"type":"Double","basicValue":2},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0.5491}],[{"type":"Double","basicValue":1.0982}],[{"type":"Double","basicValue":1.6473}],[{"type":"Double","basicValue":2.1964}],[{"type":"Double","basicValue":2.7455}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":1.82116},{"type":"Double","basicValue":3.64232}]]},"n_iter":{"type":"Double","basicValue":5}}}

Example 3: Use seeded random initialization for two-factor fitting

Inputs:

data n_components nmf_init nmf_solver max_iter random_state
2 0 1 2 random cd 600 7
3 0 1
0 4 2
0 5 3
2 0 2
0 4 3

Excel formula:

=NMF_FACTOR({2,0,1;3,0,1;0,4,2;0,5,3;2,0,2;0,4,3}, 2, "random", "cd", 600, 7)

Expected output:

{"type":"Double","basicValue":1.06794,"properties":{"reconstruction_error":{"type":"Double","basicValue":1.06794},"component_count":{"type":"Double","basicValue":2},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":3},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0.911616},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1.27198},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3.66795}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":4.80079}],[{"type":"Double","basicValue":1.06152},{"type":"Double","basicValue":0.337401}],[{"type":"Double","basicValue":0.0791118},{"type":"Double","basicValue":4.06045}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":2.16697},{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.14776}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.03418},{"type":"Double","basicValue":0.63678}]]},"n_iter":{"type":"Double","basicValue":10}}}
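With "random" initialization the starting factors depend on the seed, so fixing random_state is what makes a run repeatable. A minimal sketch of that behavior, calling scikit-learn directly with the Example 3 data (`fit_weights` is a hypothetical helper introduced here for illustration):

```python
import numpy as np
from sklearn.decomposition import NMF

# Input data from Example 3.
V = np.array([[2, 0, 1], [3, 0, 1], [0, 4, 2],
              [0, 5, 3], [2, 0, 2], [0, 4, 3]], dtype=float)

def fit_weights(seed):
    # Fit a two-component model from a seeded random start and return W.
    model = NMF(n_components=2, init="random", solver="cd",
                max_iter=600, random_state=seed)
    return model.fit_transform(V)

first = fit_weights(7)
second = fit_weights(7)
print(np.allclose(first, second))  # same seed, same weights
```

A different seed may converge to a different factorization, for example with the two components swapped, so comparisons across seeds should not assume a fixed component order.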

Example 4: Fit a nonnegative matrix with the multiplicative update solver

Inputs:

data n_components nmf_init nmf_solver max_iter random_state
3 1 0 2 nndsvda mu 800 0
4 1 0
0 2 5
0 3 6
3 1 1
0 2 4

Excel formula:

=NMF_FACTOR({3,1,0;4,1,0;0,2,5;0,3,6;3,1,1;0,2,4}, 2, "nndsvda", "mu", 800, 0)

Expected output:

{"type":"Double","basicValue":0.471063,"properties":{"reconstruction_error":{"type":"Double","basicValue":0.471063},"component_count":{"type":"Double","basicValue":2},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":3},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0.0169444},{"type":"Double","basicValue":1.07715}],[{"type":"Double","basicValue":1.2837e-8},{"type":"Double","basicValue":1.41179}],[{"type":"Double","basicValue":1.17558},{"type":"Double","basicValue":0.0000200891}],[{"type":"Double","basicValue":1.46493},{"type":"Double","basicValue":0.0153092}],[{"type":"Double","basicValue":0.218198},{"type":"Double","basicValue":1.04456}],[{"type":"Double","basicValue":0.976617},{"type":"Double","basicValue":0.0102081}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":2.41145e-13},{"type":"Double","basicValue":1.92801},{"type":"Double","basicValue":4.14787}],[{"type":"Double","basicValue":2.82985},{"type":"Double","basicValue":0.721546},{"type":"Double","basicValue":0.0127197}]]},"n_iter":{"type":"Double","basicValue":70}}}

Python Code

import numpy as np
from sklearn.decomposition import NMF as SklearnNMF

def nmf_factor(data, n_components=2, nmf_init='nndsvda', nmf_solver='cd', max_iter=400, random_state=None):
    """
    Fit nonnegative matrix factorization and return factor weights with learned components.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of nonnegative numeric input data with rows as samples and columns as features.
        n_components (int, optional): Number of latent components to learn. Default is 2.
        nmf_init (str, optional): Initialization strategy for the nonnegative factors. Valid options: 'nndsvda', 'nndsvd', 'nndsvdar', 'random'. Default is 'nndsvda'.
        nmf_solver (str, optional): Numerical solver used to optimize the factorization. Valid options: 'cd' (coordinate descent), 'mu' (multiplicative update). Default is 'cd'.
        max_iter (int, optional): Maximum number of optimization iterations. Default is 400.
        random_state (int, optional): Integer seed used by randomized initialization paths. Leave blank for the estimator default. Default is None.

    Returns:
        dict: Excel data type containing factor weights, learned components, and reconstruction error.
    """
    def py(value):
        # Unwrap NumPy scalars into native Python values.
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        # Wrap a scalar as a typed Excel cell. bool is checked first because it subclasses int.
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        # Wrap a 1D sequence as a single-column array of typed cells.
        return [[cell(value)] for value in values]

    def mat(values):
        # Wrap a 2D sequence as a matrix of typed cells.
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        value = [[value]] if not isinstance(value, list) else value
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        if np.any(data_np < 0):
            return None, "Error: data must contain only nonnegative numeric values"
        if data_np.shape[0] < 2:
            return None, "Error: data must contain at least 2 samples"
        return data_np, None

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        component_total = int(n_components)
        max_components = min(data_np.shape[0], data_np.shape[1])
        if component_total < 1 or component_total > max_components:
            return f"Error: n_components must be between 1 and {max_components}"

        init_value = str(nmf_init).strip().lower()
        if init_value not in {"nndsvda", "nndsvd", "nndsvdar", "random"}:
            return "Error: nmf_init must be 'nndsvda', 'nndsvd', 'nndsvdar', or 'random'"

        solver_value = str(nmf_solver).strip().lower()
        if solver_value not in {"cd", "mu"}:
            return "Error: nmf_solver must be 'cd' or 'mu'"

        if int(max_iter) < 1:
            return "Error: max_iter must be at least 1"

        fitted = SklearnNMF(
            n_components=component_total,
            init=init_value,
            solver=solver_value,
            max_iter=int(max_iter),
            random_state=None if random_state in (None, "") else int(random_state)
        )

        weights_np = np.asarray(fitted.fit_transform(data_np), dtype=float)
        components_np = np.asarray(fitted.components_, dtype=float)
        reconstruction_error = float(fitted.reconstruction_err_)

        return {
            "type": "Double",
            "basicValue": reconstruction_error,
            "properties": {
                "reconstruction_error": {"type": "Double", "basicValue": reconstruction_error},
                "component_count": {"type": "Double", "basicValue": float(components_np.shape[0])},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "weights": {"type": "Array", "elements": mat(weights_np.tolist())},
                "components": {"type": "Array", "elements": mat(components_np.tolist())},
                "n_iter": {"type": "Double", "basicValue": float(fitted.n_iter_)}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"
