# MultiDendrograms


A hierarchical clustering tool

MultiDendrograms is a simple yet powerful application for the hierarchical clustering of real data, distributed under an open source license. Starting from a distance (or weight) matrix, MultiDendrograms calculates its dendrogram using the most common agglomerative hierarchical clustering algorithms, allows many of the graphical representation parameters to be tuned, and lets the results be exported easily to file. A summary of its characteristics:

- Multiplatform: developed in Java, runs on all operating systems (e.g. Windows, Linux and MacOS).

- Graphical user interface: data selection, hierarchical clustering options, dendrogram representation parameters, navigation across the dendrogram, deviation measures.

- Hierarchical Clustering algorithms implemented: variable-group Single Linkage, Complete Linkage, Unweighted average, Weighted average, Unweighted centroid, Weighted centroid and Joint between-within.

- Representation parameters: size, orientation, labels, axis, etc.

- Deviation measures: Cophenetic Correlation Coefficient, Normalized Mean Squared Error and Normalized Mean Absolute Error.

- Export: ultrametric matrix, dendrogram details in text and Newick tree formats.

- Plot: dendrogram image in JPG, PNG and EPS formats.
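The three deviation measures listed above compare the original distances with the ultrametric distances read off the dendrogram. The following is a minimal sketch, assuming the usual definitions (Pearson correlation for the Cophenetic Correlation Coefficient, and errors normalized by the sum of squared or absolute original distances); the function name and the exact normalization are assumptions and may differ from what the program computes.

```python
import math

def deviation_measures(original, ultrametric):
    """Compare original distances with ultrametric (cophenetic) distances.

    Both arguments are equal-length lists holding the upper-triangle
    entries of the two matrices in the same order.
    """
    n = len(original)
    mean_d = sum(original) / n
    mean_u = sum(ultrametric) / n
    pairs = list(zip(original, ultrametric))

    # Cophenetic correlation: Pearson correlation between the two lists.
    cov = sum((d - mean_d) * (u - mean_u) for d, u in pairs)
    var_d = sum((d - mean_d) ** 2 for d in original)
    var_u = sum((u - mean_u) ** 2 for u in ultrametric)
    ccc = cov / math.sqrt(var_d * var_u)

    # Normalized errors (assumed normalization: relative to the original distances).
    nmse = sum((u - d) ** 2 for d, u in pairs) / sum(d ** 2 for d in original)
    nmae = sum(abs(u - d) for d, u in pairs) / sum(abs(d) for d in original)
    return ccc, nmse, nmae

# Three objects: d(1,2)=2, d(1,3)=4, d(2,3)=3.
# Single linkage merges 1 and 2 at height 2, then joins 3 at height 3.
ccc, nmse, nmae = deviation_measures([2.0, 4.0, 3.0], [2.0, 3.0, 3.0])
```

A correlation close to 1 and errors close to 0 indicate that the dendrogram distorts the original distances only slightly.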

MultiDendrograms implements the variable-group algorithms in order to solve the non-uniqueness problem found in the standard pair-group algorithms and implementations. This problem arises when two or more minimum distances between different clusters are equal during the amalgamation process. The standard approach consists in choosing a pair, thereby breaking the ties between distances, and proceeding in the same way until the final hierarchical classification is obtained. However, different clusterings are possible depending on the criterion used to break the ties (usually a pair is just chosen at random!).
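The non-uniqueness problem can be reproduced with a toy example. The sketch below is a plain pair-group complete linkage written for illustration only (it is not the program's code, and the `pick` parameter is a hypothetical hook for the tie-breaking rule). With the tied distances d(A,B) = d(B,C) = 2, the two tie-breaking choices lead to different hierarchies.

```python
from itertools import combinations

def pair_group_complete_linkage(dist, labels, pick):
    """Pair-group complete linkage; `pick` chooses one pair among tied minima."""
    clusters = [frozenset([x]) for x in labels]

    def d(a, b):
        # Complete linkage: largest distance between members of the two clusters.
        return max(dist[frozenset((x, y))] for x in a for y in b)

    merges = []
    while len(clusters) > 1:
        pairs = list(combinations(clusters, 2))
        lowest = min(d(a, b) for a, b in pairs)
        # Exact equality is fine for this toy data; real data may need a tolerance.
        tied = [(a, b) for a, b in pairs if d(a, b) == lowest]
        a, b = pick(tied)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((sorted(a | b), lowest))
    return merges

dist = {
    frozenset(("A", "B")): 2.0,
    frozenset(("B", "C")): 2.0,
    frozenset(("A", "C")): 4.0,
}
first = pair_group_complete_linkage(dist, "ABC", pick=lambda tied: tied[0])
last = pair_group_complete_linkage(dist, "ABC", pick=lambda tied: tied[-1])
print(first)  # A and B are merged first
print(last)   # B and C are merged first
```

Both runs end with the same top-level cluster, but the first fusion differs, so the resulting dendrograms are not the same.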

The variable-group algorithms group more than two clusters at the same time when ties occur, giving rise to a graphical representation called a multidendrogram. Their main properties are:

- When there are no ties, the variable-group algorithms give the same results as the pair-group ones.

- They always give a uniquely determined solution.

- In the multidendrogram representation of the results, one can explicitly observe the occurrence of ties during the agglomerative process. Furthermore, the height of any fusion interval (the bands in the program) indicates the degree of heterogeneity inside the corresponding cluster.
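The grouping step behind these properties can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: it assumes that clusters linked by a chain of tied minimum distances are fused together in a single step, and the helper name `tied_groups` is hypothetical. How the distances to the new cluster and the fusion interval are then computed depends on the chosen linkage and is not covered here.

```python
def tied_groups(clusters, d):
    """Return the minimum distance and the groups of clusters tied at it."""
    n = len(clusters)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    lowest = min(d(clusters[i], clusters[j]) for i, j in pairs)

    # Union-find: clusters joined by a tied minimum distance share a root.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in pairs:
        if d(clusters[i], clusters[j]) == lowest:
            parent[find(i)] = find(j)

    groups = {}
    for i, c in enumerate(clusters):
        groups.setdefault(find(i), []).append(c)
    # Only groups with at least two members are fused in this step.
    return lowest, [g for g in groups.values() if len(g) > 1]

dist = {
    frozenset(("A", "B")): 2.0, frozenset(("B", "C")): 2.0,
    frozenset(("A", "C")): 4.0, frozenset(("A", "D")): 5.0,
    frozenset(("B", "D")): 5.0, frozenset(("C", "D")): 5.0,
}
lowest, groups = tied_groups(["A", "B", "C", "D"],
                             lambda a, b: dist[frozenset((a, b))])
print(lowest, groups)
```

Here A, B and C are fused in one step because A-B and B-C are tied at the minimum, so no arbitrary choice between the two pairs is needed. When the minimum is attained by a single pair, the function returns just that pair, which matches the pair-group behaviour.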

Last updated on October 17th, 2011
