Difference between revisions of "K-Means Clustering"

Revision as of 15:08, 13 January 2014

Overview

This component provides an interface for running K-Means Clustering on a GenePattern server (http://www.broadinstitute.org/cancer/software/genepattern/), along with a viewer for the results.

As described in the GenePattern documentation:

"K-Means clustering is a clustering algorithm that classifies or groups objects into a specified number of clusters. Initially, k cluster centroids (centers) are randomly selected from the given data set and each data point is assigned to the cluster of the nearest cluster center. Each cluster center is then recalculated to be the mean value of its members and all data points are re-assigned to the cluster with the closest centroid. This process is repeated until the distance between consecutive cluster centers converges".
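The steps in the quoted description can be illustrated with a minimal Python sketch. This is not the GenePattern module's code or the component's implementation; the function names (`squared_distance`, `k_means`), the parameters (`max_iterations`, `tolerance`, `seed`), the use of squared Euclidean distance, and the tiny example data are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iterations=100, tolerance=1e-9, seed=0):
    """Illustrative K-Means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: randomly select k data points as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iterations):
        # Step 2: assign each point to the cluster of the nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 3: recalculate each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    [sum(column) / len(members) for column in zip(*members)]
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Step 4: stop once no centroid moves by more than the tolerance.
        shift = max(
            squared_distance(old, new)
            for old, new in zip(centroids, new_centroids)
        )
        centroids = new_centroids
        if shift <= tolerance:
            break
    return centroids, labels


# Hypothetical data: six two-dimensional profiles forming two obvious groups.
profiles = [
    [1.0, 1.1], [0.9, 1.0], [1.1, 0.9],
    [5.0, 5.1], [5.1, 4.9], [4.9, 5.0],
]
centroids, labels = k_means(profiles, k=2)
```

Here `labels[i]` is the index of the cluster assigned to `profiles[i]`. Because the initial centroids are chosen at random, the cluster numbering can change with the seed, and on less clearly separated data the final grouping can change as well.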


Prerequisites

The K-Means component must be loaded in the Component Configuration Manager.

A gene expression dataset must be loaded in the Workspace.


References

J. B. MacQueen (1967). Some Methods for Classification and Analysis of Multivariate Observations. Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, 1:281-297.