- Cluster Analysis of Genomic Data with Applications in R
- Abstract:
- In this paper, we provide an overview of existing partitioning and
hierarchical clustering algorithms in R. We discuss statistical issues and
methods in choosing the number of clusters, the choice of clustering
algorithm, and the choice of dissimilarity matrix. In particular, we
illustrate how the bootstrap can be used in cluster analysis to assess
the reproducibility of the clusters and the overall variability of the
clustering procedure. We also show how to
visualize a clustering result by plotting ordered dissimilarity matrices
in R. We present a new R package, hopach, which implements the
hybrid clustering method, Hierarchical Ordered Partitioning And Collapsing
Hybrid (HOPACH). The methodology combines the strengths of
both partitioning and agglomerative hierarchical clustering methods. At
each node, a cluster is split into two or more smaller clusters with an
enforced ordering of the clusters. Collapsing steps uniting the two
closest clusters into one cluster are used to correct for errors made in
the partitioning steps. The hopach function uses the median split
silhouette (MSS) criterion to automatically choose (i) the number of
children at each node, (ii) which clusters to collapse, and (iii) the main
clusters (pruning the tree to produce a partition of homogeneous
clusters). The methodology is illustrated with gene expression data.
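
As a rough illustration of the silhouette idea that the MSS criterion builds on, the following Python sketch computes ordinary silhouette widths for a fixed partition and summarizes them by their median. It is not the hopach implementation and does not compute the split silhouette itself; the function names (`euclidean`, `silhouettes`), the Euclidean dissimilarity, and the toy data are all assumptions made for illustration.

```python
from statistics import mean, median


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def silhouettes(points, labels, dist=euclidean):
    """Return the silhouette width of each point for a given partition.

    For point i, a is its mean dissimilarity to the other members of its
    own cluster and b is the smallest mean dissimilarity to any other
    cluster; the width is (b - a) / max(a, b).
    """
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own or len(clusters) < 2:
            # Convention assumed here: singletons (or a single cluster) score 0.
            widths.append(0.0)
            continue
        a = mean(dist(points[i], points[j]) for j in own)
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return widths


# Toy data (hypothetical): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]

# A partition that matches the two pairs, and one that mixes them.
good = median(silhouettes(points, [0, 0, 1, 1]))
bad = median(silhouettes(points, [0, 1, 0, 1]))

print("median silhouette, matching partition:", round(good, 3))
print("median silhouette, mixed partition:   ", round(bad, 3))
```

A partition that groups nearby points together yields a higher median silhouette than one that mixes them, which is the kind of comparison a silhouette-based criterion relies on when choosing among candidate partitions.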
- Subject Area:
- Computation, Computational Biology/Bioinformatics, Human Genetics, Microarrays, Multivariate Analysis
- Suggested Citation:
- Katherine S. Pollard and Mark J. van der Laan,
"Cluster Analysis of Genomic Data with Applications in R"
(January 2005).
U.C. Berkeley Division of Biostatistics Working Paper Series.
Working Paper 167.
http://www.bepress.com/ucbbiostat/paper167