The section on hierarchical clustering needs improvement #3

@pedrohbraga

The section on hierarchical clustering needs substantial improvement in both clarity and content.

The teaching of the clustering algorithms in this workshop would improve considerably with worked examples that build step by step from a distance matrix calculated for a few organisms (e.g., four or five). The first step would show the first merge together with the first branch length estimate and the first distance matrix update, followed by the second step, and so on until the final step, where the dendrogram is displayed.
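As a rough sketch of what such a step-by-step example could look like, the snippet below runs single-linkage clustering on five organisms and reports each merge and its height. The organisms `A` to `E` and all distances are made up for illustration, and the helper names (`pair`, `single_linkage_steps`) are hypothetical. It is written in plain Python only to keep it self-contained; the workshop material would presumably do the same in R.

```python
# Made-up pairwise distances among five hypothetical organisms.
labels = ["A", "B", "C", "D", "E"]
raw = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0, ("A", "E"): 9.0,
    ("B", "C"): 5.0, ("B", "D"): 9.0, ("B", "E"): 8.0,
    ("C", "D"): 4.0, ("C", "E"): 5.0,
    ("D", "E"): 3.0,
}


def pair(a, b):
    """Order-independent dictionary key for a pair of clusters."""
    return frozenset((a, b))


def single_linkage_steps(labels, raw):
    dist = {pair(a, b): d for (a, b), d in raw.items()}
    clusters = list(labels)
    steps = []
    while len(clusters) > 1:
        # 1. Find the closest pair among the current clusters.
        d, a, b = min(
            (dist[pair(x, y)], x, y)
            for i, x in enumerate(clusters)
            for y in clusters[i + 1:]
        )
        merged = f"({a},{b})"
        # 2. Update the distance matrix: single linkage keeps the minimum.
        for c in clusters:
            if c not in (a, b):
                dist[pair(merged, c)] = min(dist[pair(a, c)], dist[pair(b, c)])
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        # 3. Record which clusters merged and at what height.
        steps.append((a, b, d))
    return steps


steps = single_linkage_steps(labels, raw)
for number, (a, b, d) in enumerate(steps, start=1):
    print(f"Step {number}: merge {a} and {b} at height {d}")
```

Each pass through the loop corresponds to one slide: the pair that merges, the height of the new node, and the updated distances to the remaining clusters.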

The same working example could be used to compare other types of linkages.
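To illustrate how one example can serve several linkages, here is a minimal sketch that reruns the clustering on a single made-up five-organism distance matrix and changes only the rule used to measure the distance between two clusters. The data and the names `LINKAGES` and `merge_heights` are assumptions for illustration, not part of the workshop.

```python
from itertools import combinations, product

# Made-up pairwise distances among five hypothetical organisms.
labels = ["A", "B", "C", "D", "E"]
raw = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0, ("A", "E"): 9.0,
    ("B", "C"): 5.0, ("B", "D"): 9.0, ("B", "E"): 8.0,
    ("C", "D"): 4.0, ("C", "E"): 5.0,
    ("D", "E"): 3.0,
}


def d(x, y):
    """Look up the distance between two organisms in either order."""
    return raw[(x, y)] if (x, y) in raw else raw[(y, x)]


# Each linkage reduces all between-cluster pairwise distances to one number.
LINKAGES = {
    "single": min,                                  # nearest members
    "complete": max,                                # farthest members
    "average": lambda ds: sum(ds) / len(ds),        # mean over all pairs (UPGMA)
}


def merge_heights(linkage):
    combine = LINKAGES[linkage]
    clusters = [(x,) for x in labels]               # start with singletons
    heights = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest linkage distance.
        h, a, b = min(
            (combine([d(x, y) for x, y in product(a, b)]), a, b)
            for a, b in combinations(clusters, 2)
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        heights.append((a + b, h))
    return heights


for name in LINKAGES:
    print(name, [round(h, 2) for _, h in merge_heights(name)])
```

With these particular distances the merge order is the same for all three linkages and only the heights differ, which makes the effect of the linkage rule easy to see side by side.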

These methods and the distinctions among them would also be easier to explain if the formulas were included in the slides.
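For reference, the standard definitions could be written along these lines, where $A$ and $B$ are clusters, $d(a,b)$ is the distance between two objects, and $\bar{x}_A$ is the centroid of cluster $A$ (the notation is an assumption, since the slides may use different symbols):

```latex
\begin{aligned}
d_{\text{single}}(A,B)   &= \min_{a \in A,\; b \in B} d(a,b) \\
d_{\text{complete}}(A,B) &= \max_{a \in A,\; b \in B} d(a,b) \\
d_{\text{UPGMA}}(A,B)    &= \frac{1}{|A|\,|B|} \sum_{a \in A} \sum_{b \in B} d(a,b) \\
\Delta_{\text{Ward}}(A,B) &= \frac{|A|\,|B|}{|A| + |B|}\, \lVert \bar{x}_A - \bar{x}_B \rVert^{2}
\end{aligned}
```

The first three differ only in how the pairwise distances between members are summarized, whereas Ward's criterion merges the pair of clusters that gives the smallest increase $\Delta_{\text{Ward}}$ in the within-cluster sum of squares.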

In addition, this workshop only covers single-linkage clustering, complete-linkage clustering, and Ward's criterion. The unweighted pair group method with arithmetic mean (UPGMA) is widely used in ecology and evolution and could be covered in this section.
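A minimal UPGMA sketch, assuming the usual conventions: the distance from a new cluster to each remaining cluster is the size-weighted mean of the two old distances, and a new node is placed at half the merge distance, so that each branch length is the node height minus the height of the child. The organisms, distances, and function name `upgma` are made up for illustration.

```python
# Made-up pairwise distances among five hypothetical organisms.
labels = ["A", "B", "C", "D", "E"]
raw = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0, ("A", "E"): 9.0,
    ("B", "C"): 5.0, ("B", "D"): 9.0, ("B", "E"): 8.0,
    ("C", "D"): 4.0, ("C", "E"): 5.0,
    ("D", "E"): 3.0,
}


def upgma(labels, raw):
    dist = {frozenset(p): v for p, v in raw.items()}
    size = {x: 1 for x in labels}        # number of organisms per cluster
    height = {x: 0.0 for x in labels}    # tips sit at height zero
    clusters = list(labels)
    merges = []
    while len(clusters) > 1:
        # Closest pair of current clusters.
        d, a, b = min(
            (dist[frozenset((x, y))], x, y)
            for i, x in enumerate(clusters)
            for y in clusters[i + 1:]
        )
        new = f"({a},{b})"
        size[new] = size[a] + size[b]
        height[new] = d / 2              # node height is half the distance
        # Size-weighted average of the old distances to every other cluster.
        for c in clusters:
            if c not in (a, b):
                dist[frozenset((new, c))] = (
                    size[a] * dist[frozenset((a, c))]
                    + size[b] * dist[frozenset((b, c))]
                ) / size[new]
        # Record the node, its height, and the two branch lengths below it.
        merges.append(
            (new, height[new], height[new] - height[a], height[new] - height[b])
        )
        clusters = [c for c in clusters if c not in (a, b)] + [new]
    return merges


merges = upgma(labels, raw)
for node, h, left, right in merges:
    print(f"{node}: height {h:.3f}, branch lengths {left:.3f} and {right:.3f}")
```

Weighting by cluster size is what makes the update equal to the plain mean over all pairs of original organisms, which is the sense in which every organism counts equally.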

An explanation of how to decide how many groups to keep should be added to this section.
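One simple heuristic that such an explanation could start from is to cut the dendrogram where the gap between consecutive merge heights is largest. The sketch below assumes this heuristic; it is only one of several possible criteria, and the function name and example heights are hypothetical.

```python
def groups_by_largest_gap(heights):
    """Suggest a number of groups from merge heights.

    `heights` lists the merge heights in the order the merges happen
    (non-decreasing), so n objects give n - 1 heights.
    Returns (number_of_groups, cut_height).
    """
    n_objects = len(heights) + 1
    # Gap between each merge and the next one.
    gaps = [heights[i + 1] - heights[i] for i in range(len(heights) - 1)]
    # Index of the last merge performed before the largest jump.
    i = max(range(len(gaps)), key=gaps.__getitem__)
    # Cut halfway across that jump.
    cut = (heights[i] + heights[i + 1]) / 2
    # i + 1 merges have happened below the cut.
    return n_objects - (i + 1), cut


# Hypothetical merge heights for five organisms.
k, cut = groups_by_largest_gap([2.0, 3.0, 5.0, 10.0])
print(f"Keep {k} groups by cutting at height {cut}")
```

When all gaps are similar, the function simply returns the first one, which is a useful reminder for participants that the data do not always support a clear number of groups.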

A short explanation of the distance metrics, with a few comparisons among them, should also be provided.
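A small comparison of this kind could contrast Euclidean distance with Bray-Curtis dissimilarity on species abundances. The three sites and their counts below are made up for illustration, chosen so that the two measures disagree about which sites are most alike.

```python
import math


def euclidean(x, y):
    """Straight-line distance between two abundance vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: 0 = identical, 1 = no shared abundance."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))


# Made-up abundances of three species at three hypothetical sites.
site1 = [0, 1, 1]
site2 = [1, 0, 0]   # shares no species with site1
site3 = [0, 4, 8]   # same species as site1, but more abundant

for name, metric in [("Euclidean", euclidean), ("Bray-Curtis", bray_curtis)]:
    print(
        f"{name}: d(1,2) = {metric(site1, site2):.3f}, "
        f"d(1,3) = {metric(site1, site3):.3f}"
    )
```

Here Euclidean distance places site 1 closer to site 2, with which it shares no species, than to site 3, while Bray-Curtis gives the opposite ranking. The choice of metric therefore changes the clustering before any linkage is applied.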

Finally, an interactive exercise should be added to this section to help participants assimilate this content.

If possible, other visualization methods for the dendrograms could be added, e.g., ggdendro::ggdendrogram() or the dendextend package.
