NMF_FACTOR
Nonnegative matrix factorization decomposes a nonnegative data matrix into a product of two smaller nonnegative matrices. It is commonly used for topic-like count matrices, additive parts-based representations, and compact nonnegative latent factors.
The decomposition approximates the data matrix V:
V \approx W H
where W contains the nonnegative factor weights (one row per sample, one column per component) and H contains the nonnegative components (one row per component, one column per feature).
This wrapper accepts rows as samples and columns as features. It returns the sample-by-component weight matrix together with the learned component matrix, reconstruction error, and iteration count for the fitted factorization.
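The shapes and the reconstruction error described above can be sketched by calling scikit-learn's `NMF` estimator directly. This is a minimal illustration, not the wrapper itself; it assumes the estimator's default Frobenius loss, and the variable names (`V`, `W`, `H`, `frobenius_error`) are chosen here for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF

# Six samples (rows) by four features (columns); every entry is nonnegative.
V = np.array([
    [4, 1, 0, 0],
    [5, 1, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 4, 5],
    [3, 1, 0, 0],
    [0, 0, 5, 4],
], dtype=float)

model = NMF(n_components=2, init="nndsvda", solver="cd", max_iter=600, random_state=0)
W = model.fit_transform(V)  # sample-by-component weights, shape (6, 2)
H = model.components_       # component-by-feature matrix, shape (2, 4)

# With the default Frobenius loss, the reported error is the norm of V - W H.
frobenius_error = np.linalg.norm(V - W @ H)

print(W.shape, H.shape)
print(np.isclose(frobenius_error, model.reconstruction_err_))
```

The product `W @ H` has the same shape as `V`, which is what makes the entrywise comparison behind the reconstruction error possible.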
Excel Usage
=NMF_FACTOR(data, n_components, nmf_init, nmf_solver, max_iter, random_state)
- data (list[list], required): 2D array of nonnegative numeric input data with rows as samples and columns as features.
- n_components (int, optional, default: 2): Number of latent components to learn.
- nmf_init (str, optional, default: "nndsvda"): Initialization strategy for the nonnegative factors.
- nmf_solver (str, optional, default: "cd"): Numerical solver used to optimize the factorization.
- max_iter (int, optional, default: 400): Maximum number of optimization iterations.
- random_state (int, optional, default: null): Integer seed used by randomized initialization paths. Leave blank for the estimator default.

Returns (dict): Excel data type containing factor weights, learned components, and reconstruction error.
Example 1: Factor a small count-like matrix into two nonnegative components
Inputs:
| data | | | | n_components | nmf_init | nmf_solver | max_iter | random_state |
|---|---|---|---|---|---|---|---|---|
| 4 | 1 | 0 | 0 | 2 | nndsvda | cd | 600 | 0 |
| 5 | 1 | 0 | 0 | | | | | |
| 0 | 0 | 3 | 4 | | | | | |
| 0 | 0 | 4 | 5 | | | | | |
| 3 | 1 | 0 | 0 | | | | | |
| 0 | 0 | 5 | 4 | | | | | |
Excel formula:
=NMF_FACTOR({4,1,0,0;5,1,0,0;0,0,3,4;0,0,4,5;3,1,0,0;0,0,5,4}, 2, "nndsvda", "cd", 600, 0)
Expected output:
{"type":"Double","basicValue":1.22303,"properties":{"reconstruction_error":{"type":"Double","basicValue":1.22303},"component_count":{"type":"Double","basicValue":2},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":4},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.23956}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.53187}],[{"type":"Double","basicValue":0.32396},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0.416078},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0.947249}],[{"type":"Double","basicValue":0.412981},{"type":"Double","basicValue":3.06249e-8}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":0},{"type":"Double","basicValue":0},{"type":"Double","basicValue":10.4789},{"type":"Double","basicValue":11.2079}],[{"type":"Double","basicValue":3.23389},{"type":"Double","basicValue":0.777898},{"type":"Double","basicValue":4.30763e-9},{"type":"Double","basicValue":0}]]},"n_iter":{"type":"Double","basicValue":7}}}
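The returned value nests each matrix as an `Array` whose `elements` are rows of typed cells, so downstream Python code usually needs to strip the cell wrappers first. The sketch below shows one way to do that; `unpack_matrix` is a hypothetical helper introduced here, and `result` is an abbreviated copy of the Example 1 output that keeps only two properties.

```python
def unpack_matrix(array_entity):
    """Convert {"type": "Array", "elements": [[cell, ...], ...]} into nested plain values."""
    return [[cell["basicValue"] for cell in row] for row in array_entity["elements"]]


# Abbreviated copy of the Example 1 result (only n_iter and components are kept).
result = {
    "type": "Double",
    "basicValue": 1.22303,
    "properties": {
        "n_iter": {"type": "Double", "basicValue": 7},
        "components": {
            "type": "Array",
            "elements": [
                [
                    {"type": "Double", "basicValue": 0},
                    {"type": "Double", "basicValue": 0},
                    {"type": "Double", "basicValue": 10.4789},
                    {"type": "Double", "basicValue": 11.2079},
                ],
                [
                    {"type": "Double", "basicValue": 3.23389},
                    {"type": "Double", "basicValue": 0.777898},
                    {"type": "Double", "basicValue": 4.30763e-9},
                    {"type": "Double", "basicValue": 0},
                ],
            ],
        },
    },
}

components = unpack_matrix(result["properties"]["components"])
n_iter = result["properties"]["n_iter"]["basicValue"]
print(len(components), len(components[0]), n_iter)
```

The top-level `basicValue` duplicates `reconstruction_error`, which is why the cell displays the error as a plain number while the factor matrices remain available under `properties`.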
Example 2: Fit one nonnegative factor on a one-trend matrix
Inputs:
| data | | n_components | nmf_init | nmf_solver | max_iter | random_state |
|---|---|---|---|---|---|---|
| 1 | 2 | 1 | nndsvda | cd | 600 | 0 |
| 2 | 4 | | | | | |
| 3 | 6 | | | | | |
| 4 | 8 | | | | | |
| 5 | 10 | | | | | |
Excel formula:
=NMF_FACTOR({1,2;2,4;3,6;4,8;5,10}, 1, "nndsvda", "cd", 600, 0)
Expected output:
{"type":"Double","basicValue":2.49492e-15,"properties":{"reconstruction_error":{"type":"Double","basicValue":2.49492e-15},"component_count":{"type":"Double","basicValue":1},"sample_count":{"type":"Double","basicValue":5},"feature_count":{"type":"Double","basicValue":2},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0.5491}],[{"type":"Double","basicValue":1.0982}],[{"type":"Double","basicValue":1.6473}],[{"type":"Double","basicValue":2.1964}],[{"type":"Double","basicValue":2.7455}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":1.82116},{"type":"Double","basicValue":3.64232}]]},"n_iter":{"type":"Double","basicValue":5}}}
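As a rough sanity check on Example 2, the rounded weights and components printed above can be multiplied back together. The sketch below assumes only NumPy and copies the displayed values, so the product matches the input to display precision rather than to the reported error of about 2.5e-15.

```python
import numpy as np

# Input matrix from Example 2: the second column is twice the first.
V = np.array([[1, 2], [2, 4], [3, 6], [4, 8], [5, 10]], dtype=float)

# Rounded factors copied from the expected output.
W = np.array([[0.5491], [1.0982], [1.6473], [2.1964], [2.7455]])
H = np.array([[1.82116, 3.64232]])

reconstruction = W @ H                 # shape (5, 2), same as V
max_abs_gap = np.abs(V - reconstruction).max()

# The gap comes from rounding the displayed factors, not from the fit itself.
print(reconstruction.shape, max_abs_gap < 1e-3)
```

Because the data has a single trend, one component is enough: every row of `V` is a multiple of the single row of `H`.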
Example 3: Use seeded random initialization for two-factor fitting
Inputs:
| data | | | n_components | nmf_init | nmf_solver | max_iter | random_state |
|---|---|---|---|---|---|---|---|
| 2 | 0 | 1 | 2 | random | cd | 600 | 7 |
| 3 | 0 | 1 | | | | | |
| 0 | 4 | 2 | | | | | |
| 0 | 5 | 3 | | | | | |
| 2 | 0 | 2 | | | | | |
| 0 | 4 | 3 | | | | | |
Excel formula:
=NMF_FACTOR({2,0,1;3,0,1;0,4,2;0,5,3;2,0,2;0,4,3}, 2, "random", "cd", 600, 7)
Expected output:
{"type":"Double","basicValue":1.06794,"properties":{"reconstruction_error":{"type":"Double","basicValue":1.06794},"component_count":{"type":"Double","basicValue":2},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":3},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0.911616},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1.27198},{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3.66795}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":4.80079}],[{"type":"Double","basicValue":1.06152},{"type":"Double","basicValue":0.337401}],[{"type":"Double","basicValue":0.0791118},{"type":"Double","basicValue":4.06045}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":2.16697},{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.14776}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":1.03418},{"type":"Double","basicValue":0.63678}]]},"n_iter":{"type":"Double","basicValue":10}}}
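The role of `random_state` with `"random"` initialization can be illustrated by fitting the Example 3 data twice with the same seed. This sketch calls scikit-learn's `NMF` directly rather than the wrapper; the helper name `fit` is an assumption made for the illustration.

```python
import numpy as np
from sklearn.decomposition import NMF

V = np.array([
    [2, 0, 1],
    [3, 0, 1],
    [0, 4, 2],
    [0, 5, 3],
    [2, 0, 2],
    [0, 4, 3],
], dtype=float)


def fit(seed):
    """Fit a two-component NMF with random initialization and the given seed."""
    model = NMF(n_components=2, init="random", solver="cd", max_iter=600, random_state=seed)
    weights = model.fit_transform(V)
    return weights, model.components_


W_a, H_a = fit(7)
W_b, H_b = fit(7)

# Repeating the fit with the same seed reproduces the same factors.
same_seed_matches = bool(np.allclose(W_a, W_b) and np.allclose(H_a, H_b))
print(same_seed_matches)
```

A different seed, or a blank `random_state`, may lead to different factors, for instance with the component order swapped, so fixing the seed is the safer choice when results need to be repeatable.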
Example 4: Fit a nonnegative matrix with the multiplicative update solver
Inputs:
| data | | | n_components | nmf_init | nmf_solver | max_iter | random_state |
|---|---|---|---|---|---|---|---|
| 3 | 1 | 0 | 2 | nndsvda | mu | 800 | 0 |
| 4 | 1 | 0 | | | | | |
| 0 | 2 | 5 | | | | | |
| 0 | 3 | 6 | | | | | |
| 3 | 1 | 1 | | | | | |
| 0 | 2 | 4 | | | | | |
Excel formula:
=NMF_FACTOR({3,1,0;4,1,0;0,2,5;0,3,6;3,1,1;0,2,4}, 2, "nndsvda", "mu", 800, 0)
Expected output:
{"type":"Double","basicValue":0.471063,"properties":{"reconstruction_error":{"type":"Double","basicValue":0.471063},"component_count":{"type":"Double","basicValue":2},"sample_count":{"type":"Double","basicValue":6},"feature_count":{"type":"Double","basicValue":3},"weights":{"type":"Array","elements":[[{"type":"Double","basicValue":0.0169444},{"type":"Double","basicValue":1.07715}],[{"type":"Double","basicValue":1.2837e-8},{"type":"Double","basicValue":1.41179}],[{"type":"Double","basicValue":1.17558},{"type":"Double","basicValue":0.0000200891}],[{"type":"Double","basicValue":1.46493},{"type":"Double","basicValue":0.0153092}],[{"type":"Double","basicValue":0.218198},{"type":"Double","basicValue":1.04456}],[{"type":"Double","basicValue":0.976617},{"type":"Double","basicValue":0.0102081}]]},"components":{"type":"Array","elements":[[{"type":"Double","basicValue":2.41145e-13},{"type":"Double","basicValue":1.92801},{"type":"Double","basicValue":4.14787}],[{"type":"Double","basicValue":2.82985},{"type":"Double","basicValue":0.721546},{"type":"Double","basicValue":0.0127197}]]},"n_iter":{"type":"Double","basicValue":70}}}
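To see how the two solver options compare on the same input, the Example 4 data can be fitted once with `"cd"` and once with `"mu"`. The sketch below uses scikit-learn's `NMF` directly and only collects a summary per solver; the `results` dictionary and its keys are names chosen here, and no particular error values are assumed.

```python
import numpy as np
from sklearn.decomposition import NMF

V = np.array([
    [3, 1, 0],
    [4, 1, 0],
    [0, 2, 5],
    [0, 3, 6],
    [3, 1, 1],
    [0, 2, 4],
], dtype=float)

results = {}
for solver in ("cd", "mu"):
    model = NMF(n_components=2, init="nndsvda", solver=solver, max_iter=800, random_state=0)
    W = model.fit_transform(V)
    results[solver] = {
        "error": float(model.reconstruction_err_),
        "n_iter": int(model.n_iter_),
        # Both solvers keep every entry of W and H at or above zero.
        "nonnegative": bool((W >= 0).all() and (model.components_ >= 0).all()),
    }

for solver, summary in results.items():
    print(solver, summary)
```

The iteration counts and final errors generally differ between the solvers, so `n_iter` is worth checking against `max_iter` when switching from one to the other.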
Python Code
import numpy as np
from sklearn.decomposition import NMF as SklearnNMF


def nmf_factor(data, n_components=2, nmf_init='nndsvda', nmf_solver='cd', max_iter=400, random_state=None):
    """
    Fit nonnegative matrix factorization and return factor weights with learned components.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.NMF.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of nonnegative numeric input data with rows as samples and columns as features.
        n_components (int, optional): Number of latent components to learn. Default is 2.
        nmf_init (str, optional): Initialization strategy for the nonnegative factors. Valid options: 'nndsvda', 'nndsvd', 'nndsvdar', 'random'. Default is 'nndsvda'.
        nmf_solver (str, optional): Numerical solver used to optimize the factorization. Valid options: 'cd' (coordinate descent), 'mu' (multiplicative update). Default is 'cd'.
        max_iter (int, optional): Maximum number of optimization iterations. Default is 400.
        random_state (int, optional): Integer seed used by randomized initialization paths. Leave blank for the estimator default. Default is None.

    Returns:
        dict: Excel data type containing factor weights, learned components, and reconstruction error.
    """
    def py(value):
        # Unwrap NumPy scalars into plain Python values.
        return value.item() if isinstance(value, np.generic) else value

    def cell(value):
        # Wrap a scalar as an Excel data-type cell; bool is checked before int.
        value = py(value)
        if isinstance(value, bool):
            return {"type": "Boolean", "basicValue": bool(value)}
        if isinstance(value, (int, float)):
            return {"type": "Double", "basicValue": float(value)}
        return {"type": "String", "basicValue": str(value)}

    def col(values):
        return [[cell(value)] for value in values]

    def mat(values):
        return [[cell(value) for value in row] for row in values]

    def parse_data(value):
        # A single scalar from Excel is treated as a 1x1 range.
        value = [[value]] if not isinstance(value, list) else value
        if not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        data_np = np.array(value, dtype=float)
        if data_np.ndim != 2 or data_np.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(data_np).all():
            return None, "Error: data must contain only finite numeric values"
        if np.any(data_np < 0):
            return None, "Error: data must contain only nonnegative numeric values"
        if data_np.shape[0] < 2:
            return None, "Error: data must contain at least 2 samples"
        return data_np, None

    try:
        data_np, error = parse_data(data)
        if error:
            return error

        # The component count is capped by the smaller matrix dimension.
        component_total = int(n_components)
        max_components = min(data_np.shape[0], data_np.shape[1])
        if component_total < 1 or component_total > max_components:
            return f"Error: n_components must be between 1 and {max_components}"

        init_value = str(nmf_init).strip().lower()
        if init_value not in {"nndsvda", "nndsvd", "nndsvdar", "random"}:
            return "Error: nmf_init must be 'nndsvda', 'nndsvd', 'nndsvdar', or 'random'"

        solver_value = str(nmf_solver).strip().lower()
        if solver_value not in {"cd", "mu"}:
            return "Error: nmf_solver must be 'cd' or 'mu'"

        if int(max_iter) < 1:
            return "Error: max_iter must be at least 1"

        fitted = SklearnNMF(
            n_components=component_total,
            init=init_value,
            solver=solver_value,
            max_iter=int(max_iter),
            random_state=None if random_state in (None, "") else int(random_state)
        )

        # fit_transform returns W (samples x components); components_ holds H.
        weights_np = np.asarray(fitted.fit_transform(data_np), dtype=float)
        components_np = np.asarray(fitted.components_, dtype=float)
        reconstruction_error = float(fitted.reconstruction_err_)

        return {
            "type": "Double",
            "basicValue": reconstruction_error,
            "properties": {
                "reconstruction_error": {"type": "Double", "basicValue": reconstruction_error},
                "component_count": {"type": "Double", "basicValue": float(components_np.shape[0])},
                "sample_count": {"type": "Double", "basicValue": float(data_np.shape[0])},
                "feature_count": {"type": "Double", "basicValue": float(data_np.shape[1])},
                "weights": {"type": "Array", "elements": mat(weights_np.tolist())},
                "components": {"type": "Array", "elements": mat(components_np.tolist())},
                "n_iter": {"type": "Double", "basicValue": float(fitted.n_iter_)}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"