Difference between revisions of "Hierarchical Clustering"

 

Revision as of 18:57, 28 July 2009




Overview

geWorkbench implements its own code for agglomerative hierarchical clustering.


Note: hierarchical clustering is memory intensive. With the default memory settings (to change them, see the FAQ entry "How do I increase the amount of memory available to Java to run geWorkbench?"), clustering more than about 2000 markers is not recommended.
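
To give a feel for why memory use climbs quickly with the number of markers, the sketch below estimates the size of a pairwise distance table. It assumes an implementation that keeps every distinct pairwise distance as an 8-byte floating-point value; this page does not say how geWorkbench stores distances, so the function name and the numbers are illustrative only, and the real footprint may be larger.

```python
def pairwise_distance_bytes(n_items: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold the n*(n-1)/2 distinct pairwise distances.

    Hypothetical helper: assumes one 8-byte value per pair, which is an
    assumption for illustration, not a description of geWorkbench itself.
    """
    return n_items * (n_items - 1) // 2 * bytes_per_value


if __name__ == "__main__":
    for n in (500, 2000, 8000):
        mib = pairwise_distance_bytes(n) / 2**20
        print(f"{n:>5} markers -> {mib:8.1f} MiB of distances")
```

The growth is quadratic: under these assumptions, going from 2000 to 8000 markers multiplies the storage by about 16.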


Parameters

Clustering Method

Single Linkage

Average Linkage

Total Linkage
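
The three clustering methods listed above differ in how the distance between two clusters is derived from the distances between their members. The sketch below uses the standard textbook definitions, and assumes that "Total Linkage" corresponds to what is usually called complete linkage (the largest pairwise distance); this page does not define the terms, so treat the function names and that mapping as assumptions.

```python
from itertools import product
from statistics import mean


def single_linkage(a, b, dist):
    """Smallest distance between any member of a and any member of b."""
    return min(dist(x, y) for x, y in product(a, b))


def average_linkage(a, b, dist):
    """Mean distance over all cross-cluster pairs."""
    return mean(dist(x, y) for x, y in product(a, b))


def total_linkage(a, b, dist):
    """Largest cross-cluster distance (assumed equivalent to complete linkage)."""
    return max(dist(x, y) for x, y in product(a, b))


if __name__ == "__main__":
    # One-dimensional toy clusters with absolute difference as the distance.
    cluster_a, cluster_b = [0, 1], [3, 5]
    gap = lambda x, y: abs(x - y)
    for rule in (single_linkage, average_linkage, total_linkage):
        print(rule.__name__, rule(cluster_a, cluster_b, gap))
```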

Clustering Dimension

Marker

Microarray

Both

Clustering Metric

Euclidean

Pearson's

Spearman's
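
The three metrics named above can be sketched as follows. Euclidean distance is used directly; for Pearson's and Spearman's correlation the sketch assumes the common convention of using one minus the correlation as the distance. How geWorkbench converts a correlation into a distance is not stated on this page, so that conversion and all function names here are assumptions.

```python
import math


def euclidean(x, y):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_r(x, y):
    """Pearson correlation coefficient (undefined for constant profiles)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def pearson_distance(x, y):
    """Assumed convention: distance = 1 - Pearson correlation."""
    return 1 - pearson_r(x, y)


def spearman_distance(x, y):
    """Assumed convention: 1 - Pearson correlation of the ranks."""
    return 1 - pearson_r(ranks(x), ranks(y))
```

With these conventions, two profiles that rise and fall together have a correlation-based distance near 0 even when their absolute values differ, whereas Euclidean distance is sensitive to magnitude.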

All Arrays

All Markers

Example

Running the calculation

This example starts from the set of markers produced in the ANOVA example. Follow the steps of that example to produce the starting marker set, or create or select another set of markers of your own.

1. If following the ANOVA example, activate the set of markers labeled "Significant Genes [1786]" (which contains 1786 markers).


T HC set activation.png


2. Set the parameters as shown in the following figure.


T HC setup.png

  • Clustering Method: Average Linkage.
  • Clustering Dimension: Marker.
  • Clustering Metric: Euclidean.

3. Click Analyze.

4. A progress bar will be visible during the calculation.


T HC computing.png


The results are placed in the Project Folders component and labeled "Hierarchical Clustering", and can be displayed in the Dendrogram component.
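
For readers who want to see what the settings used in this example compute, here is a minimal sketch of agglomerative clustering with average linkage and Euclidean distance, applied to markers (rows). It is a naive textbook version written for illustration, not geWorkbench's own code; the function names and the toy expression values are invented.

```python
import math
from itertools import combinations


def euclidean(x, y):
    """Euclidean distance between two rows."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def average_linkage(rows, a, b):
    """Mean distance over all cross-cluster pairs of row indices."""
    total = sum(euclidean(rows[i], rows[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def agglomerate(rows):
    """Repeatedly merge the two closest clusters until one remains.

    Returns the merges in order as (members_a, members_b, distance),
    where members are tuples of row indices.
    """
    clusters = [(i,) for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(rows, *pair))
        merges.append((a, b, average_linkage(rows, a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Toy matrix (made-up values): rows are markers, columns are microarrays.
expression = [
    [1.0, 2.0, 3.0],  # marker 0
    [1.0, 2.0, 4.0],  # marker 1
    [9.0, 9.0, 9.0],  # marker 2
    [9.0, 8.0, 9.0],  # marker 3
]

# Clustering Dimension "Marker": cluster the rows.
by_marker = agglomerate(expression)

# Clustering Dimension "Microarray": cluster the columns by transposing first.
by_array = agglomerate([list(col) for col in zip(*expression)])

if __name__ == "__main__":
    for a, b, d in by_marker:
        print(a, b, round(d, 3))
```

The sequence of merges is what a dendrogram draws: each merge becomes a branch point whose height is the merge distance. This naive version recomputes every cluster-to-cluster distance on each pass, which is fine for a toy matrix but far too slow for thousands of markers.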


Displaying results in the Dendrogram component

T HC Dendrogram add-to-set.png


T HC Dendrogram EnableSelection.png

T HC Dendrogram marked.png


T HC Dendrogram selecting.png

T HC Dendrogram selection.png


T HC MarkerSets-ClusterTree.png