SPECTRAL_CLUSTER
Spectral clustering builds an affinity graph from the input samples, computes a low-dimensional embedding from the graph Laplacian, and then partitions that embedding into clusters.
The algorithm computes the unnormalized graph Laplacian L from the affinity matrix A and degree matrix D (D_{ii} = \sum_j A_{ij}):
L = D - A
It then uses the eigenvectors corresponding to the smallest eigenvalues of the Laplacian (or a normalized version like L_{sym} = I - D^{-1/2} A D^{-1/2}) to define a lower-dimensional subspace where k-means or another label assignment strategy is applied.
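The Laplacian step above can be sketched with NumPy alone. This is a minimal illustration, not the code path scikit-learn uses internally: the helper name `spectral_embedding_sketch` and the four-node affinity matrix are assumptions made for the example, and only the unnormalized Laplacian L = D - A is shown.

```python
import numpy as np

def spectral_embedding_sketch(affinity, n_components):
    """Return the smallest eigenvalues of L = D - A and their eigenvectors."""
    A = np.asarray(affinity, dtype=float)
    degrees = A.sum(axis=1)                # D_ii = sum_j A_ij
    L = np.diag(degrees) - A               # unnormalized graph Laplacian
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(L)
    return eigenvalues[:n_components], eigenvectors[:, :n_components]

# Hypothetical affinity graph with two disconnected pairs: {0, 1} and {2, 3}
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
values, vectors = spectral_embedding_sketch(A, 2)
# Each row of `vectors` is the embedding of one node; nodes in the same
# connected component receive the same embedding row.
```

Because the graph has two connected components, the two smallest eigenvalues are numerically zero, and rows 0 and 1 of the embedding coincide, as do rows 2 and 3. A label assignment step such as k-means then only has to separate two distinct points.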
This wrapper accepts data with rows as samples and columns as features. It returns the fitted labels, a compact label count table, and the discovered cluster count while intentionally omitting the full affinity matrix to keep results compact.
Excel Usage
=SPECTRAL_CLUSTER(data, n_clusters, spec_affinity, gamma, n_neighbors, spec_assign, random_state)
- data (list[list], required): 2D array of input data with rows as samples and columns as features.
- n_clusters (int, optional, default: 8): Number of clusters to extract from the spectral embedding.
- spec_affinity (str, optional, default: “rbf”): Strategy for constructing the affinity graph.
- gamma (float, optional, default: 1): Kernel coefficient used for the RBF affinity.
- n_neighbors (int, optional, default: 10): Number of neighbors used when affinity is nearest_neighbors.
- spec_assign (str, optional, default: “kmeans”): Method used to convert the embedding into discrete labels.
- random_state (int, optional, default: null): Integer seed for reproducible spectral initialization. Leave blank for non-deterministic runs.
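To make the role of gamma concrete, the sketch below computes an RBF affinity matrix by hand, using the usual kernel form exp(-gamma * ||x - y||^2). The helper name `rbf_affinity` and the three sample points are assumptions for illustration; the wrapper itself delegates this step to scikit-learn.

```python
import numpy as np

def rbf_affinity(data, gamma=1.0):
    """Pairwise RBF affinities: A_ij = exp(-gamma * ||x_i - x_j||^2)."""
    X = np.asarray(data, dtype=float)
    diff = X[:, None, :] - X[None, :, :]   # all pairwise differences
    sq_dist = np.sum(diff ** 2, axis=-1)   # squared Euclidean distances
    return np.exp(-gamma * sq_dist)

# Hypothetical points: two close together and one far away
points = [[0, 0], [0, 1], [5, 5]]
A = rbf_affinity(points, gamma=1.0)
# A[0, 1] is exp(-1), while A[0, 2] is exp(-50), which is effectively zero
```

A larger gamma makes affinities decay faster with distance, so only very close samples stay strongly connected in the graph.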
Returns (dict): Excel data type containing the cluster count, labels, label counts, and the key spectral settings used. If validation or fitting fails, the function returns a string beginning with "Error:" instead.
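When the function is called from Python, the returned structure can be unpacked with a small helper. The sketch below is an assumption about how a caller might do this: `extract_labels` is a hypothetical name, and `sample_result` is a hand-written, shortened stand-in for a real return value.

```python
def extract_labels(result):
    """Flatten the single-column 'labels' array into a list of ints."""
    if isinstance(result, str):
        # The wrapper reports failures as strings that begin with "Error:"
        raise ValueError(result)
    rows = result["properties"]["labels"]["elements"]
    return [int(row[0]["basicValue"]) for row in rows]

# Shortened, hand-written result with the same shape as the documented output
sample_result = {
    "type": "Double",
    "basicValue": 2.0,
    "properties": {
        "cluster_count": {"type": "Double", "basicValue": 2.0},
        "labels": {
            "type": "Array",
            "elements": [
                [{"type": "Double", "basicValue": 0.0}],
                [{"type": "Double", "basicValue": 0.0}],
                [{"type": "Double", "basicValue": 1.0}],
            ],
        },
    },
}
labels = extract_labels(sample_result)
```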
Example 1: Split two separated point clouds with the RBF affinity
Inputs:
| data | | n_clusters | spec_affinity | spec_assign | random_state |
|---|---|---|---|---|---|
| 0 | 0 | 2 | rbf | kmeans | 0 |
| 0 | 1 | ||||
| 1 | 0 | ||||
| 5 | 5 | ||||
| 5 | 6 | ||||
| 6 | 5 |
Excel formula:
=SPECTRAL_CLUSTER({0,0;0,1;1,0;5,5;5,6;6,5}, 2, "rbf", 1, 10, "kmeans", 0)
Expected output:
{"type":"Double","basicValue":2,"properties":{"cluster_count":{"type":"Double","basicValue":2},"labels":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"label_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"label"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"affinity":{"type":"String","basicValue":"rbf"},"assign_labels":{"type":"String","basicValue":"kmeans"}}}
Example 2: Use nearest-neighbor affinity on two compact groups
Inputs:
| data | | n_clusters | spec_affinity | n_neighbors | spec_assign | random_state |
|---|---|---|---|---|---|---|
| 0 | 0 | 2 | nearest_neighbors | 5 | kmeans | 0 |
| 0 | 1 | |||||
| 1 | 0 | |||||
| 5 | 5 | |||||
| 5 | 6 | |||||
| 6 | 5 |
Excel formula:
=SPECTRAL_CLUSTER({0,0;0,1;1,0;5,5;5,6;6,5}, 2, "nearest_neighbors", 1, 5, "kmeans", 0)
Expected output:
{"type":"Double","basicValue":2,"properties":{"cluster_count":{"type":"Double","basicValue":2},"labels":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"label_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"label"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"affinity":{"type":"String","basicValue":"nearest_neighbors"},"assign_labels":{"type":"String","basicValue":"kmeans"}}}
Example 3: Use discretization to assign labels in the spectral embedding
Inputs:
| data | | n_clusters | spec_affinity | spec_assign | random_state |
|---|---|---|---|---|---|
| 1 | 1 | 2 | rbf | discretize | 1 |
| 1.2 | 0.8 | ||||
| 0.8 | 1.1 | ||||
| 8 | 8 | ||||
| 8.2 | 7.9 | ||||
| 7.8 | 8.1 |
Excel formula:
=SPECTRAL_CLUSTER({1,1;1.2,0.8;0.8,1.1;8,8;8.2,7.9;7.8,8.1}, 2, "rbf", 1, 10, "discretize", 1)
Expected output:
{"type":"Double","basicValue":2,"properties":{"cluster_count":{"type":"Double","basicValue":2},"labels":{"type":"Array","elements":[[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}]]},"label_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"label"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"affinity":{"type":"String","basicValue":"rbf"},"assign_labels":{"type":"String","basicValue":"discretize"}}}
Example 4: Use cluster QR label extraction on two separated groups
Inputs:
| data | n_clusters | spec_affinity | spec_assign | random_state |
|---|---|---|---|---|
| 0 | 2 | rbf | cluster_qr | 2 |
| 0.2 | ||||
| 0.4 | ||||
| 4.8 | ||||
| 5 | ||||
| 5.2 |
Excel formula:
=SPECTRAL_CLUSTER({0;0.2;0.4;4.8;5;5.2}, 2, "rbf", 1, 10, "cluster_qr", 2)
Expected output:
{"type":"Double","basicValue":2,"properties":{"cluster_count":{"type":"Double","basicValue":2},"labels":{"type":"Array","elements":[[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":0}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}],[{"type":"Double","basicValue":1}]]},"label_counts":{"type":"Array","elements":[[{"type":"String","basicValue":"label"},{"type":"String","basicValue":"count"}],[{"type":"Double","basicValue":0},{"type":"Double","basicValue":3}],[{"type":"Double","basicValue":1},{"type":"Double","basicValue":3}]]},"affinity":{"type":"String","basicValue":"rbf"},"assign_labels":{"type":"String","basicValue":"cluster_qr"}}}
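The label ids in the expected outputs above are arbitrary: Example 3 assigns 1 to the first group, while the other examples assign 0. When checking results, it is usually the grouping that matters rather than the ids. The sketch below shows one way to compare two labelings up to renaming; `canonical_labels` and `same_partition` are hypothetical helper names, not part of the wrapper.

```python
def canonical_labels(labels):
    """Renumber labels in order of first appearance: 0, 1, 2, ..."""
    mapping = {}
    return [mapping.setdefault(label, len(mapping)) for label in labels]

def same_partition(labels_a, labels_b):
    """True when both labelings group the samples identically."""
    return canonical_labels(labels_a) == canonical_labels(labels_b)

# Labels from Example 3 versus labels from Example 1
matches = same_partition([1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1])
# A labeling that splits the samples differently
differs = same_partition([0, 0, 1, 1, 1, 1], [0, 0, 0, 1, 1, 1])
```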
Python Code
import numpy as np
from sklearn.cluster import SpectralClustering as SklearnSpectralClustering

def spectral_cluster(data, n_clusters=8, spec_affinity='rbf', gamma=1, n_neighbors=10, spec_assign='kmeans', random_state=None):
    """
    Cluster samples by partitioning a graph-based spectral embedding.

    See: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.SpectralClustering.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): 2D array of input data with rows as samples and columns as features.
        n_clusters (int, optional): Number of clusters to extract from the spectral embedding. Default is 8.
        spec_affinity (str, optional): Strategy for constructing the affinity graph. Valid options: 'rbf', 'nearest_neighbors'. Default is 'rbf'.
        gamma (float, optional): Kernel coefficient used for the RBF affinity. Default is 1.
        n_neighbors (int, optional): Number of neighbors used when affinity is nearest_neighbors. Default is 10.
        spec_assign (str, optional): Method used to convert the embedding into discrete labels. Valid options: 'kmeans', 'discretize', 'cluster_qr'. Default is 'kmeans'.
        random_state (int, optional): Integer seed for reproducible spectral initialization. Leave blank for non-deterministic runs. Default is None.

    Returns:
        dict: Excel data type containing cluster counts, labels, label counts, and the key spectral settings used.
    """
    def to2d(value):
        # Wrap a scalar so that a single cell is treated as a 1x1 matrix
        return [[value]] if not isinstance(value, list) else value

    def parse_matrix(value):
        # Validate the input and convert it to a finite, rectangular float matrix
        value = to2d(value)
        if not isinstance(value, list) or not value or not all(isinstance(row, list) and row for row in value):
            return None, "Error: data must be a non-empty 2D list"
        if len({len(row) for row in value}) != 1:
            return None, "Error: data must be a rectangular 2D list"
        matrix = np.array(value, dtype=float)
        if matrix.ndim != 2 or matrix.size == 0:
            return None, "Error: data must be a non-empty 2D list"
        if not np.isfinite(matrix).all():
            return None, "Error: data must contain only finite numeric values"
        return matrix, None

    def as_column(values):
        # One Excel cell per row, so the labels spill as a single column
        return [[{"type": "Double", "basicValue": float(item)}] for item in values]

    def label_count_table(labels):
        # Header row followed by one (label, count) row per distinct label
        unique_labels, counts = np.unique(labels, return_counts=True)
        rows = [[{"type": "String", "basicValue": "label"}, {"type": "String", "basicValue": "count"}]]
        rows.extend(
            [[{"type": "Double", "basicValue": float(label)}, {"type": "Double", "basicValue": float(count)}]
             for label, count in zip(unique_labels.tolist(), counts.tolist())]
        )
        return rows

    try:
        data_np, error = parse_matrix(data)
        if error:
            return error

        # Validate the remaining arguments before fitting
        cluster_total = int(n_clusters)
        if cluster_total < 1:
            return "Error: n_clusters must be at least 1"
        if cluster_total > data_np.shape[0]:
            return "Error: n_clusters cannot exceed the number of samples"
        affinity_value = str(spec_affinity).strip()
        if affinity_value not in {"rbf", "nearest_neighbors"}:
            return "Error: affinity must be 'rbf' or 'nearest_neighbors'"
        label_mode = str(spec_assign).strip()
        if label_mode not in {"kmeans", "discretize", "cluster_qr"}:
            return "Error: assign_labels must be 'kmeans', 'discretize', or 'cluster_qr'"
        if float(gamma) <= 0:
            return "Error: gamma must be greater than 0"
        if int(n_neighbors) < 1:
            return "Error: n_neighbors must be at least 1"
        if affinity_value == "nearest_neighbors" and int(n_neighbors) >= data_np.shape[0]:
            return "Error: n_neighbors must be smaller than the number of samples when affinity is nearest_neighbors"

        # A blank cell or None means no fixed seed
        seed = None if random_state in (None, "") else int(random_state)

        fitted = SklearnSpectralClustering(
            n_clusters=cluster_total,
            affinity=affinity_value,
            gamma=float(gamma),
            n_neighbors=int(n_neighbors),
            assign_labels=label_mode,
            random_state=seed,
            n_init=10
        ).fit(data_np)

        labels = fitted.labels_
        # Number of distinct labels actually assigned
        cluster_count = int(np.unique(labels).size)

        return {
            "type": "Double",
            "basicValue": float(cluster_count),
            "properties": {
                "cluster_count": {"type": "Double", "basicValue": float(cluster_count)},
                "labels": {"type": "Array", "elements": as_column(labels.tolist())},
                "label_counts": {"type": "Array", "elements": label_count_table(labels)},
                "affinity": {"type": "String", "basicValue": affinity_value},
                "assign_labels": {"type": "String", "basicValue": label_mode}
            }
        }
    except Exception as e:
        return f"Error: {str(e)}"