Difference between revisions of "SOM"


Revision as of 17:18, 6 June 2006



Note

(June 6, 2006) A problem has been identified in the Hierarchical Clustering analysis with the implementations of Average Linkage and Total Linkage. This problem is present in the current version (1.0.3) of geWorkbench and all previous versions. This problem will be fixed in the next release. The implementation of single linkage is believed to be working correctly.

Background

Clustering methods can allow identification of groups of markers with similar expression. geWorkbench supports two clustering methods:

  1. Self-organizing maps (SOMs)
  2. Hierarchical Clustering

Self-organizing maps group the markers into a user-specified number of bins. In geWorkbench, a SOM visualizer component displays the results graphically. Hierarchical clustering constructs a tree-like relationship among the expression patterns of all markers present. Results are viewed in the Dendrogram component.


SOM Example

  • Load the microarray dataset "webmatrix_quantile_log2_dev1_mv0.exp", available in tutorial_data.zip on the Download page.
  • In the Arrays/Phenotypes component pulldown menu, select the group labeled "Class".
  • Activate two sets of arrays to compare, e.g. GC B-cell and non-GC B-cell, by checking the boxes before the names (these are chosen here because they are the smallest groups).
  • Go to the Analysis component, and select SOM Analysis.

Parameters:

  • Rows, Columns - give the number of bins into which to separate the different marker expression patterns.
  • Radius -
  • Iterations -
  • Alpha -
  • Function - Bubble or Gaussian
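To illustrate how parameters of this kind typically act in a self-organizing map, here is a minimal pure-Python sketch. It is not the geWorkbench implementation. It assumes the conventional SOM meanings: Radius as the initial neighborhood radius on the grid, Iterations as the number of training steps, Alpha as the initial learning rate, and Function as the neighborhood kernel. The function names, the linear decay schedule, and the initialization from randomly chosen profiles are all choices made for this example.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to profile x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def train_som(data, rows=3, cols=3, radius=1.0, iterations=1000,
              alpha=0.05, function="bubble", seed=0):
    """Train a rows x cols SOM (illustrative only, not geWorkbench code).

    Returns a dict mapping each grid node (row, col) to its weight vector.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Assumption: initialize each node with a randomly chosen input profile.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    for t in range(iterations):
        frac = 1.0 - t / iterations        # decays linearly from 1 toward 0
        lr = alpha * frac                  # current learning rate
        rad = max(radius * frac, 1e-9)     # current neighborhood radius
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        for node, w in weights.items():
            grid_dist = math.dist(node, bmu)
            if function == "bubble":
                # Bubble: every node within the radius is updated equally.
                h = 1.0 if grid_dist <= rad else 0.0
            else:
                # Gaussian: the update strength falls off smoothly.
                h = math.exp(-(grid_dist ** 2) / (2 * rad ** 2))
            for i in range(dim):
                w[i] += lr * h * (x[i] - w[i])
    return weights


def assign_bins(data, weights):
    """Place each profile (by index) in the bin of its closest node."""
    bins = {node: [] for node in weights}
    for idx, x in enumerate(data):
        bins[best_matching_unit(weights, x)].append(idx)
    return bins
```

With 3 rows and 3 columns the sketch produces nine bins, and every marker profile ends up in exactly one of them. Changing the grid size or the neighborhood function changes how the profiles are distributed across the bins.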

The default parameters are shown below. We will accept these parameters.

T SOM Analysis Parameters.png


The resulting display of nine clusters is shown below. The user should experiment with different parameters to attempt to discern informative groupings.


T SOM display.png

Right-click any individual graph and choose "Add to Set" to add its markers to a new Set in the Markers component. Each new Set is given a name starting with "Cluster Grid", and the number of markers it contains is shown.


Hierarchical Clustering - Example

  • In the Arrays/Phenotypes component, select the set of arrays labeled "Class".
  • Activate two classes of arrays to compare, the GC B-cell and non-GC B-cell, by checking the boxes before the names.
  • Go to the Analysis component, and select Hierarchical Clustering.
  • At the bottom of the Analysis component, uncheck the All Arrays box so that the array selection made above is used.


  • In Hierarchical Clustering, set the parameters to:
    • Clustering Method: Total Linkage
    • Clustering Dimension: Both
    • Clustering Metric: Euclidean
  • Click Analyze.
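The clustering method setting controls how the distance between two clusters is computed from the distances between their members. The following pure-Python sketch shows the idea for single, average, and total (also called complete) linkage with a Euclidean metric. It is an illustration only, not the geWorkbench code, and the function names and the naive merge loop are assumptions made for this example.

```python
import math


def linkage_distance(a, b, points, method):
    """Distance between clusters a and b (tuples of indices into points)."""
    dists = [math.dist(points[i], points[j]) for i in a for j in b]
    if method == "single":
        return min(dists)                 # closest pair of members
    if method == "total":
        return max(dists)                 # farthest pair (complete linkage)
    if method == "average":
        return sum(dists) / len(dists)    # mean over all pairs
    raise ValueError("unknown linkage method: " + method)


def hierarchical_cluster(points, method="total"):
    """Agglomerative clustering; returns a list of (left, right, distance)."""
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest linkage distance.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage_distance(clusters[x], clusters[y], points, method)
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((clusters[x], clusters[y], d))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return merges
```

The list of merges describes the tree that a dendrogram draws. To cluster in both dimensions, a sketch like this could be run once on the marker profiles (rows) and once on the array profiles (columns of the same matrix).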

The results will be displayed in the Dendrogram component.

T HierarchicalClustering BCregion.png

Scrolling down a bit reveals a large area of interest that shows clear differences between groups of arrays. We will select two clearly differentiated clusters. Check the Enable Zoom checkbox, then highlight the first cluster of 12 markers as shown here:

T HierarchicalClustering BC12Markers.png


Then left-click to select this subset of the dendrogram. It will be displayed alone.

T HierarchicalClustering BC12MarkersZoom.png


Now right-click and select "Add to Set". In the Markers component, the selected genes are added as Cluster Tree [12], where 12 is the number of markers selected.


Repeat for the similar region just below, which contains another 44 markers.


T HierarchicalClustering BC44Markers.png


The Markers component will now contain two new sets of markers, as shown below:

T Markers ClusterTree12and44.png