DENDROGRAM

Excel Usage

=DENDROGRAM(data, dendrogram_method, metric, dendro_orient, title, xlabel, ylabel)
  • data (list[list], required): Numeric data for clustering. Rows are observations, columns are variables.
  • dendrogram_method (str, optional, default: “ward”): Linkage method for clustering.
  • metric (str, optional, default: “euclidean”): Distance metric to use.
  • dendro_orient (str, optional, default: “top”): Orientation of the dendrogram.
  • title (str, optional, default: “”): Chart title.
  • xlabel (str, optional, default: “”): Label for X-axis.
  • ylabel (str, optional, default: “”): Label for Y-axis.

Returns (object): Matplotlib Figure object (standard Python) or a base64-encoded PNG data URI string (Pyodide). On failure, returns a string beginning with "Error".
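
Because the return type depends on the runtime, calling code may want to branch on it. The following is a minimal sketch, assuming the Pyodide result carries the `data:image/png;base64,` prefix used in the Python code for this function and that failures come back as strings beginning with "Error". `classify_result` is a hypothetical helper, not part of DENDROGRAM.

```python
import base64

PNG_PREFIX = "data:image/png;base64,"

def classify_result(result):
    """Hypothetical helper: return ("png", bytes), ("error", msg), or ("figure", obj)."""
    if isinstance(result, str):
        if result.startswith(PNG_PREFIX):
            # Strip the data-URI prefix and decode the base64 payload to raw PNG bytes.
            return "png", base64.b64decode(result[len(PNG_PREFIX):])
        # Assumption: any other string is one of the function's error messages.
        return "error", result
    # Anything that is not a string is treated as a Matplotlib Figure.
    return "figure", result

# Stand-in payload: only the 8-byte PNG signature, not a full image.
sample = PNG_PREFIX + base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
kind, payload = classify_result(sample)
```

The decoded bytes can then be written to a `.png` file or passed to an image viewer.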

Example 1: Basic dendrogram with 2D data

Inputs:

data
1 2
2 3
10 11
11 12

Excel formula:

=DENDROGRAM({1,2;2,3;10,11;11,12})

Expected output:

"chart"

Example 2: Dendrogram with complete linkage

Inputs:

data     dendrogram_method
1  2     complete
2  3
10 11
11 12

Excel formula:

=DENDROGRAM({1,2;2,3;10,11;11,12}, "complete")

Expected output:

"chart"

Example 3: Horizontal dendrogram (left)

Inputs:

data     dendro_orient   title
1  2     left            Horizontal Dendrogram
2  3
5  5
10 11
11 12

Excel formula:

=DENDROGRAM({1,2;2,3;5,5;10,11;11,12}, "ward", "euclidean", "left", "Horizontal Dendrogram")

Expected output:

"chart"

Example 4: Dendrogram with cityblock metric

Inputs:

data     dendrogram_method   metric
1  2     average             cityblock
2  3
10 11
11 12

Excel formula:

=DENDROGRAM({1,2;2,3;10,11;11,12}, "average", "cityblock")

Expected output:

"chart"

Python Code

import sys
import matplotlib
IS_PYODIDE = sys.platform == "emscripten"
if IS_PYODIDE:
    matplotlib.use('Agg')
import matplotlib.pyplot as plt
import io
import base64
import numpy as np
from scipy.cluster.hierarchy import dendrogram as hierarchy_dendrogram
from scipy.cluster.hierarchy import linkage

def dendrogram(data, dendrogram_method='ward', metric='euclidean', dendro_orient='top', title='', xlabel='', ylabel=''):
    """
    Performs hierarchical (agglomerative) clustering and returns a dendrogram as an image.

    See: https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.dendrogram.html

    This example function is provided as-is without any representation of accuracy.

    Args:
        data (list[list]): Numeric data for clustering. Rows are observations, columns are variables.
        dendrogram_method (str, optional): Linkage method for clustering. Valid options: 'ward', 'single', 'complete', 'average', 'weighted', 'centroid', 'median'. The 'ward', 'centroid', and 'median' methods require the 'euclidean' metric. Default is 'ward'.
        metric (str, optional): Distance metric to use. Valid options: 'euclidean', 'cityblock', 'cosine', 'correlation', 'chebyshev', 'canberra', 'braycurtis', 'mahalanobis'. Default is 'euclidean'.
        dendro_orient (str, optional): Orientation of the dendrogram. Valid options: 'top', 'bottom', 'left', 'right'. Default is 'top'.
        title (str, optional): Chart title. Default is ''.
        xlabel (str, optional): Label for X-axis. Default is ''.
        ylabel (str, optional): Label for Y-axis. Default is ''.

    Returns:
        object: Matplotlib Figure object (standard Python) or a base64-encoded PNG data URI string (Pyodide). On failure, a string beginning with "Error".
    """
    def to2d(x):
        return [[x]] if not isinstance(x, list) else x

    try:
        data = to2d(data)

        if not isinstance(data, list) or not all(isinstance(row, list) for row in data):
            return "Error: Invalid input - data must be a 2D list"

        # Extract numeric data: skip rows containing non-numeric cells,
        # drop None cells, and skip rows that end up empty
        arr_clean = []
        for row in data:
            try:
                values = [float(x) for x in row if x is not None]
            except (TypeError, ValueError):
                continue
            if values:
                arr_clean.append(values)

        if not arr_clean:
            return "Error: No valid numeric data found"

        # Pad short rows with zeros so the array is rectangular
        max_cols = max(len(row) for row in arr_clean)
        arr = np.array([row + [0.0] * (max_cols - len(row)) for row in arr_clean])

        if arr.shape[0] < 2:
            return "Error: At least two data points are required for clustering"

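        # Perform hierarchical clustering (SciPy expects lowercase option names)
        dendrogram_method = str(dendrogram_method).strip().lower()
        metric = str(metric).strip().lower()
        dendro_orient = str(dendro_orient).strip().lower()
        try:
            linkage_matrix = linkage(arr, method=dendrogram_method, metric=metric)
        except Exception as e:
            return f"Error during clustering: {str(e)}"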

        # Create plot
        fig, ax = plt.subplots(figsize=(10, 6))

        # Plot dendrogram
        hierarchy_dendrogram(linkage_matrix, orientation=dendro_orient, ax=ax)

        chart_title = title if title else f"Hierarchical Clustering Dendrogram ({dendrogram_method})"
        ax.set_title(chart_title)

        if xlabel:
            ax.set_xlabel(xlabel)
        elif dendro_orient in ['top', 'bottom']:
            ax.set_xlabel("Sample Index")
        else:
            ax.set_xlabel("Distance")

        if ylabel:
            ax.set_ylabel(ylabel)
        elif dendro_orient in ['top', 'bottom']:
            ax.set_ylabel("Distance")
        else:
            ax.set_ylabel("Sample Index")

        fig.tight_layout()

        # Return based on platform
        if IS_PYODIDE:
            buf = io.BytesIO()
            fig.savefig(buf, format='png', dpi=100, bbox_inches='tight')
            buf.seek(0)
            img_base64 = base64.b64encode(buf.read()).decode('utf-8')
            plt.close(fig)
            return f"data:image/png;base64,{img_base64}"
        else:
            return fig
    except Exception as e:
        return f"Error: {str(e)}"
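
SciPy's `linkage` accepts only lowercase method names, and the 'ward', 'centroid', and 'median' methods work only with the 'euclidean' metric when raw observations are passed. A caller may want to check these combinations up front to get a clearer message than a generic clustering error. The following is a minimal sketch under those assumptions; `check_options` and its constants are hypothetical names, not part of the function above.

```python
VALID_METHODS = {"ward", "single", "complete", "average",
                 "weighted", "centroid", "median"}
EUCLIDEAN_ONLY = {"ward", "centroid", "median"}
VALID_ORIENTATIONS = {"top", "bottom", "left", "right"}

def check_options(method="ward", metric="euclidean", orientation="top"):
    """Hypothetical helper: return normalized options or raise ValueError."""
    # Normalize case and whitespace so "Ward" and " top " are accepted.
    method = str(method).strip().lower()
    metric = str(metric).strip().lower()
    orientation = str(orientation).strip().lower()

    if method not in VALID_METHODS:
        raise ValueError(f"Unknown linkage method: {method!r}")
    if orientation not in VALID_ORIENTATIONS:
        raise ValueError(f"Unknown orientation: {orientation!r}")
    if method in EUCLIDEAN_ONLY and metric != "euclidean":
        raise ValueError(
            f"Method {method!r} requires the 'euclidean' metric, got {metric!r}"
        )
    # The metric name itself is left for SciPy to validate.
    return method, metric, orientation

normalized = check_options("Ward", "Euclidean", "Top")
```

For example, `check_options("ward", "cityblock")` raises a `ValueError`, whereas `check_options("average", "cityblock")` passes, matching the combination used in Example 4.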
