NMF_FACTOR

Nonnegative matrix factorization decomposes a nonnegative data matrix into a product of two smaller nonnegative matrices. It is commonly used for topic-like count matrices, additive parts-based representations, and compact nonnegative latent factors.

The decomposition approximates the data matrix V:

V \approx W H

where W contains the nonnegative factor weights (one row per sample, one column per component) and H contains the nonnegative components (one row per component, one column per feature).
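For reference, the shapes and the quantity reported as the reconstruction error can be written out explicitly. This is a sketch that assumes the Frobenius loss with no regularization, which is the scikit-learn default and is not overridden by the estimator call in the Python code on this page. The symbols n (samples), m (features), and k (n_components) are introduced here for illustration only.

```latex
V \in \mathbb{R}_{\ge 0}^{n \times m}, \quad
W \in \mathbb{R}_{\ge 0}^{n \times k}, \quad
H \in \mathbb{R}_{\ge 0}^{k \times m}

\min_{W \ge 0,\; H \ge 0} \; \tfrac{1}{2} \, \lVert V - W H \rVert_F^2,
\qquad
\text{reconstruction error} = \lVert V - W H \rVert_F
```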

This wrapper accepts rows as samples and columns as features. It returns the sample-by-component weight matrix together with the learned component matrix, reconstruction error, and iteration count for the fitted factorization.
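The row and column convention can be checked directly against scikit-learn, which the wrapper calls internally. The following is a minimal sketch rather than the wrapper itself; it assumes NumPy and scikit-learn are installed and reuses the small one-trend matrix from Example 2 on this page.

```python
import numpy as np
from sklearn.decomposition import NMF

# Rows are samples, columns are features (one-trend data from Example 2).
V = np.array([[1, 2], [2, 4], [3, 6], [4, 8], [5, 10]], dtype=float)

model = NMF(n_components=1, init="nndsvda", solver="cd", max_iter=600, random_state=0)
W = model.fit_transform(V)  # sample-by-component weights
H = model.components_       # component-by-feature matrix

print(W.shape, H.shape)           # (samples, components) and (components, features)
print(model.reconstruction_err_)  # Frobenius norm of V - W @ H
print(model.n_iter_)              # iterations actually run
```

The wrapper packages these same four quantities into the Excel data type described under Returns.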

Excel Usage

=NMF_FACTOR(data, n_components, nmf_init, nmf_solver, max_iter, random_state)

data (list[list], required): 2D array of nonnegative numeric input data with rows as samples and columns as features.
n_components (int, optional, default: 2): Number of latent components to learn.
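nmf_init (str, optional, default: "nndsvda"): Initialization strategy for the nonnegative factors. Valid options: "nndsvda", "nndsvd", "nndsvdar", "random".
nmf_solver (str, optional, default: "cd"): Numerical solver used to optimize the factorization. Valid options: "cd" (coordinate descent), "mu" (multiplicative update).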
max_iter (int, optional, default: 400): Maximum number of optimization iterations.
random_state (int, optional, default: null): Integer seed used by randomized initialization paths. Leave blank for the estimator default.

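Returns (dict): Excel data type whose basic value is the reconstruction error, with properties for the factor weights, learned components, matrix dimensions, and iteration count. Invalid input returns a string beginning with "Error:" instead.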
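When the function is called from Python rather than Excel, the nested entity can be flattened into plain lists. The sketch below is illustrative only: `unwrap` is a hypothetical helper that is not part of the wrapper, and `sample` is an abbreviated entity built from the first two weight rows of Example 2.

```python
def unwrap(result):
    """Convert the returned entity into plain Python values (hypothetical helper)."""
    if isinstance(result, str):
        # The wrapper reports problems as strings that start with "Error:"
        raise ValueError(result)
    props = result["properties"]

    def to_matrix(entry):
        # "Array" entries hold rows of {"type": ..., "basicValue": ...} cells
        return [[c["basicValue"] for c in row] for row in entry["elements"]]

    return {
        "reconstruction_error": props["reconstruction_error"]["basicValue"],
        "weights": to_matrix(props["weights"]),
        "components": to_matrix(props["components"]),
        "n_iter": int(props["n_iter"]["basicValue"]),
    }


# Abbreviated entity in the same layout as the expected outputs on this page
sample = {
    "type": "Double",
    "basicValue": 2.49492e-15,
    "properties": {
        "reconstruction_error": {"type": "Double", "basicValue": 2.49492e-15},
        "weights": {"type": "Array", "elements": [
            [{"type": "Double", "basicValue": 0.5491}],
            [{"type": "Double", "basicValue": 1.0982}],
        ]},
        "components": {"type": "Array", "elements": [
            [{"type": "Double", "basicValue": 1.82116},
             {"type": "Double", "basicValue": 3.64232}],
        ]},
        "n_iter": {"type": "Double", "basicValue": 5.0},
    },
}

flat = unwrap(sample)
print(flat["weights"])     # [[0.5491], [1.0982]]
print(flat["components"])  # [[1.82116, 3.64232]]
```

Raising on the error string is a design choice for Python callers; in Excel the string is simply shown in the cell.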

Example 1: Factor a small count-like matrix into two nonnegative components

Inputs:

data				n_components	nmf_init	nmf_solver	max_iter	random_state
4	1	0	0	2	nndsvda	cd	600	0
5	1	0	0
0	0	3	4
0	0	4	5
3	1	0	0
0	0	5	4

Excel formula:

=NMF_FACTOR({4,1,0,0;5,1,0,0;0,0,3,4;0,0,4,5;3,1,0,0;0,0,5,4}, 2, "nndsvda", "cd", 600, 0)

Expected output:

{"type":"Double","basicValue":1.22303,"properties":{"reconstruction_error":{"type":"Double","basicValue":1.22303},"component_count":{"type":"Double","basicValue":2},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":4},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.23956}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.53187}],[{"type":"Double","basicValue":0.32396},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0.416078},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0.947249}],[{"type":"Double","basicValue":0.412981},{"type":"Double","basicValue":3.06249e-8}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0},{"type":"Double","basicValue":10.4789},{"type":"Double","basicValue":11.2079}],[{"type":"Double","basicValue":3.23389},{"type":"Double","basicValue":0.777898},{"type":"Double","basicValue":4.30763e-9},{"type":"Double","basicValue":0}]]},"n_iter":{"type":"Double","basicValue":7}}}
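The reported reconstruction error can be cross-checked from the weights and components in the output above. This sketch assumes NumPy is available and uses the values as displayed (rounded to about six significant digits), so the result should land close to, but not exactly on, the reported 1.22303.

```python
import numpy as np

# Input data from Example 1
V = np.array([[4, 1, 0, 0], [5, 1, 0, 0], [0, 0, 3, 4],
              [0, 0, 4, 5], [3, 1, 0, 0], [0, 0, 5, 4]], dtype=float)

# Weights and components copied from the expected output (rounded as displayed)
W = np.array([[0, 1.23956], [0, 1.53187], [0.32396, 0],
              [0.416078, 0], [0, 0.947249], [0.412981, 3.06249e-8]])
H = np.array([[0, 0, 10.4789, 11.2079],
              [3.23389, 0.777898, 4.30763e-9, 0]])

# Frobenius norm of the residual (the default norm for a 2D array)
error = np.linalg.norm(V - W @ H)
print(error)
```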

Example 2: Fit one nonnegative factor on a one-trend matrix

Inputs:

data		n_components	nmf_init	nmf_solver	max_iter	random_state
1	2	1	nndsvda	cd	600	0
2	4
3	6
4	8
5	10

Excel formula:

=NMF_FACTOR({1,2;2,4;3,6;4,8;5,10}, 1, "nndsvda", "cd", 600, 0)

Expected output:

{"type":"Double","basicValue":2.49492e-15,"properties":{"reconstruction_error":{"type":"Double","basicValue":2.49492e-15},"component_count":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":5},"feature_count":{"type":"Double","basicValue":2},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0.5491}],[{"type":"Double","basicValue":1.0982}],[{"type":"Double","basicValue":1.6473}],[{"type":"Double","basicValue":2.1964}],[{"type":"Double","basicValue":2.7455}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":1.82116},{"type":"Double","basicValue":3.64232}]]},"n_iter":{"type":"Double","basicValue":5}}}
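Because every row of this matrix is a multiple of (1, 2), a single component reproduces it almost exactly. The plain-Python sketch below multiplies the displayed weights by the displayed component and measures the largest deviation from the input; the remaining gap comes from the rounding of the printed values, not from the fit.

```python
# Values copied from the expected output above (rounded as displayed)
weights = [0.5491, 1.0982, 1.6473, 2.1964, 2.7455]
component = [1.82116, 3.64232]
data = [[1, 2], [2, 4], [3, 6], [4, 8], [5, 10]]

# With one component, each reconstructed row is weight * component
reconstructed = [[w * h for h in component] for w in weights]

max_abs_diff = max(
    abs(r - d)
    for rec_row, data_row in zip(reconstructed, data)
    for r, d in zip(rec_row, data_row)
)
print(max_abs_diff)  # small; limited by the rounding of the displayed values
```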

Example 3: Use seeded random initialization for two-factor fitting

Inputs:

data			n_components	nmf_init	nmf_solver	max_iter	random_state
2	0	1	2	random	cd	600	7
3	0	1
0	4	2
0	5	3
2	0	2
0	4	3

Excel formula:

=NMF_FACTOR({2,0,1;3,0,1;0,4,2;0,5,3;2,0,2;0,4,3}, 2, "random", "cd", 600, 7)

Expected output:

{"type":"Double","basicValue":1.06794,"properties":{"reconstruction_error":{"type":"Double","basicValue":1.06794},"component_count":{"type":"Double","basicValue":2},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":3},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0.911616},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1.27198},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3.66795}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":4.80079}],[{"type":"Double","basicValue":1.06152},{"type":"Double","basicValue":0.337401}],[{"type":"Double","basicValue":0.0791118},{"type":"Double","basicValue":4.06045}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":2.16697},{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.14776}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.03418},{"type":"Double","basicValue":0.63678}]]},"n_iter":{"type":"Double","basicValue":10}}}

Example 4: Fit a nonnegative matrix with the multiplicative update solver

Inputs:

data			n_components	nmf_init	nmf_solver	max_iter	random_state
3	1	0	2	nndsvda	mu	800	0
4	1	0
0	2	5
0	3	6
3	1	1
0	2	4

Excel formula:

=NMF_FACTOR({3,1,0;4,1,0;0,2,5;0,3,6;3,1,1;0,2,4}, 2, "nndsvda", "mu", 800, 0)

Expected output:

{"type":"Double","basicValue":0.471063,"properties":{"reconstruction_error":{"type":"Double","basicValue":0.471063},"component_count":{"type":"Double","basicValue":2},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":3},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0.0169444},{"type":"Double","basicValue":1.07715}],[{"type":"Double","basicValue":1.2837e-8},{"type":"Double","basicValue":1.41179}],[{"type":"Double","basicValue":1.17558},{"type":"Double","basicValue":0.0000200891}],[{"type":"Double","basicValue":1.46493},{"type":"Double","basicValue":0.0153092}],[{"type":"Double","basicValue":0.218198},{"type":"Double","basicValue":1.04456}],[{"type":"Double","basicValue":0.976617},{"type":"Double","basicValue":0.0102081}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":2.41145e-13},{"type":"Double","basicValue":1.92801},{"type":"Double","basicValue":4.14787}],[{"type":"Double","basicValue":2.82985},{"type":"Double","basicValue":0.721546},{"type":"Double","basicValue":0.0127197}]]},"n_iter":{"type":"Double","basicValue":70}}}
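The output above contains entries such as 2.41145e-13 and 1.2837e-8 where the coordinate descent examples show exact zeros. If those near-zero values are distracting when displayed, they can be clipped after the fit. The helper below is a hypothetical post-processing sketch, not part of the wrapper, and the tolerance of 1e-6 is an arbitrary assumption that should be chosen with the scale of the data in mind.

```python
def zero_small(matrix, tol=1e-6):
    """Return a copy with entries smaller than tol replaced by 0.0 (hypothetical helper)."""
    return [[0.0 if abs(x) < tol else x for x in row] for row in matrix]


# First component row and second weight row from the expected output above
components_row = [2.41145e-13, 1.92801, 4.14787]
weights_row = [1.2837e-8, 1.41179]

print(zero_small([components_row]))  # [[0.0, 1.92801, 4.14787]]
print(zero_small([weights_row]))     # [[0.0, 1.41179]]
```

Small but meaningful loadings, such as the 0.0000200891 weight in the same output, stay untouched at this tolerance.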

Python Code

import numpy as np
from sklearn.decomposition import NMF as SklearnNMF

def nmf_factor(data, n_components=2, nmf_init='nndsvda', nmf_solver='cd', max_iter=400, random_state=None):
    """
    Fit nonnegative matrix factorization and return factor weights with learned components.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of nonnegative numeric input data with rows as samples and columns as features.
        n_components (int, optional): Number of latent components to learn. Default is 2.
        nmf_init (str, optional): Initialization strategy for the nonnegative factors. Valid options: 'nndsvda', 'nndsvd', 'nndsvdar', 'random'. Default is 'nndsvda'.
        nmf_solver (str, optional): Numerical solver used to optimize the factorization. Valid options: 'cd' (coordinate descent), 'mu' (multiplicative update). Default is 'cd'.
        max_iter (int, optional): Maximum number of optimization iterations. Default is 400.
        random_state (int, optional): Integer seed used by randomized initialization paths. Leave blank for the estimator default. Default is None.

    Returns:
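        dict: Excel data type whose basic value is the reconstruction error, with properties for factor weights, learned components, matrix dimensions, and iteration count. On invalid input, a string starting with "Error:" is returned instead.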
    """
    def py(value):
        # Unwrap NumPy scalars into native Python numbers
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        # Wrap a scalar as an Excel data-type cell
        value = py(value)
        # bool is tested first because it is a subclass of int
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": value}
        if isinstance(value, (int, float)):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def mat(values):
        # Wrap a 2D list as rows of Excel data-type cells
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        # A scalar (single Excel cell) is wrapped into a 1x1 matrix
        if not isinstance(value, list):
            value = [[value]]
        if not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        if np.any(data_np < 0):
            return None, "Error: data must contain only nonnegative numeric values"
        if data_np.shape[0] < 2:
            return None, "Error: data must contain at least 2 samples"
        return data_np, None

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        component_total = int(n_components)
        max_components = min(data_np.shape[0], data_np.shape[1])
        if component_total < 1 or component_total > max_components:
            return f"Error: n_components must be between 1 and {max_components}"

        init_value = str(nmf_init).strip().lower()
        if init_value not in {"nndsvda", "nndsvd", "nndsvdar", "random"}:
            return "Error: nmf_init must be 'nndsvda', 'nndsvd', 'nndsvdar', or 'random'"

        solver_value = str(nmf_solver).strip().lower()
        if solver_value not in {"cd", "mu"}:
            return "Error: nmf_solver must be 'cd' or 'mu'"

        if int(max_iter) < 1:
            return "Error: max_iter must be at least 1"

        fitted = SklearnNMF(
            n_components=component_total,
            init=init_value,
            solver=solver_value,
            max_iter=int(max_iter),
            random_state=None if random_state in (None, "") else int(random_state)
        )

        weights_np = np.asarray(fitted.fit_transform(data_np), dtype=float)
        components_np = np.asarray(fitted.components_, dtype=float)
        reconstruction_error = float(fitted.reconstruction_err_)

        return {
            "type": "Double",
            "basicValue": reconstruction_error,
            "properties": {
                "reconstruction_error": {"type": "Double", "basicValue": reconstruction_error},
                "component_count": {"type": "Double", "basicValue": float(components_np.shape[0])},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "weights": {"type": "Array", "elements": mat(weights_np.tolist())},
                "components": {"type": "Array", "elements": mat(components_np.tolist())},
                "n_iter": {"type": "Double", "basicValue": float(fitted.n_iter_)}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"
