SOM

Revision as of 19:12, 28 July 2009 by Smith (talk | contribs) (Tutorial - Clustering moved to Tutorial - SOM)



Overview

Clustering methods identify groups of markers with similar expression profiles. A common application is to search for genes that appear to be co-regulated. A list of such markers, saved to the Markers component, can then be used in further steps, such as retrieving upstream sequences, Gene Ontology analysis, or viewing annotations.


The SOM (Self-Organizing Maps) method clusters markers into a user-specified number of bins. In geWorkbench, a SOM visualizer component displays the results graphically.

SOM Example

  • Note - for the distance calculations used in SOM analysis to be valid, the data must have been normalized such that the scale of variation over each array is equal. (More details to be added here).
  • Load the microarray dataset "webmatrix_quantile_log2_dev1_mv0.exp", available in the tutorial_data.zip download.
  • In the Arrays/Phenotypes component pulldown menu, select the group labeled "Class".
  • Activate two sets of arrays to compare, e.g. GC B-cell and non-GC B-cell, by checking the boxes next to their names (these two are chosen here because they are the smallest groups).
  • Go to the Analysis component, and select SOM Analysis.
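
The normalization note at the top of the list above can be illustrated with a short sketch. This is only one possible reading of "the scale of variation over each array is equal": each array (column) is centered and scaled to unit standard deviation. The function name `scale_arrays` and the list-of-rows layout are assumptions made for this illustration and are not part of geWorkbench; the tutorial dataset itself is already normalized.

```python
import statistics


def scale_arrays(matrix):
    """Center each array (column) and scale it to unit standard deviation.

    `matrix` is a list of rows, one row per marker and one value per array.
    This layout is an assumption made for the sketch, not a geWorkbench format.
    """
    n_arrays = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_arrays)]
    means = [statistics.fmean(col) for col in columns]
    sds = [statistics.pstdev(col) for col in columns]
    return [
        [
            # A constant array has no variation, so map it to 0.0.
            (row[j] - means[j]) / sds[j] if sds[j] > 0 else 0.0
            for j in range(n_arrays)
        ]
        for row in matrix
    ]


# Two arrays measured on very different scales end up comparable.
example = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = scale_arrays(example)
```

After scaling, distances between marker profiles are no longer dominated by whichever array has the largest spread.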

Parameters:

  • Rows, Columns - give the number of bins into which to separate the different marker expression patterns.
  • Radius
  • Iterations
  • Alpha
  • Function - Bubble or Gaussian
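
To give a feel for how parameters like these typically enter a SOM calculation, here is a minimal textbook-style sketch in plain Python. It is not the geWorkbench implementation. It assumes that Radius is the starting neighborhood radius on the grid, that Alpha is the starting learning rate, and that both decay linearly over the iterations. The function names and the default values are placeholders chosen for illustration, not the geWorkbench defaults.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to profile x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def train_som(data, rows=3, cols=3, radius=2.0, iterations=1000,
              alpha=0.5, function="bubble", seed=0):
    """Train a rows x cols SOM on `data` (a list of expression profiles).

    All default values are illustrative placeholders (assumptions).
    """
    rng = random.Random(seed)
    # One weight vector per grid node, initialised from randomly chosen profiles.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    for t in range(iterations):
        decay = 1.0 - t / iterations       # assumed linear decay schedule
        rate = alpha * decay               # current learning rate
        rad = max(radius * decay, 0.5)     # current neighborhood radius
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        for node, w in weights.items():
            d = math.dist(node, bmu)       # distance on the grid, not in data space
            if function == "bubble":
                h = 1.0 if d <= rad else 0.0   # all-or-nothing neighborhood
            else:
                h = math.exp(-(d * d) / (2 * rad * rad))  # Gaussian falloff
            if h > 0.0:
                for i in range(len(w)):
                    w[i] += rate * h * (x[i] - w[i])
    return weights


def assign_clusters(weights, data):
    """Place each profile index into the bin of its best matching unit."""
    clusters = {node: [] for node in weights}
    for idx, x in enumerate(data):
        clusters[best_matching_unit(weights, x)].append(idx)
    return clusters


# Tiny synthetic example: six profiles measured over three arrays.
profiles = [[0.1, 0.2, 0.1], [0.0, 0.1, 0.2], [0.2, 0.1, 0.0],
            [5.0, 5.1, 4.9], [5.2, 4.8, 5.0], [4.9, 5.0, 5.1]]
som = train_som(profiles, rows=3, cols=3, function="gaussian")
bins = assign_clusters(som, profiles)
```

In this sketch a 3 x 3 grid yields nine bins, and every profile lands in exactly one of them. The Bubble function updates all nodes within the radius equally, while the Gaussian function weights the update by grid distance from the best matching node.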

The default parameters are shown below. This example uses them unchanged.

[Figure: T SOM Analysis Parameters.png]


The resulting display of nine clusters is shown below. Experiment with different parameter settings to look for informative groupings.


[Figure: T SOM display.png]


To save a cluster, right-click its graph and choose "Add to Set". This adds the markers in that cluster to a new set in the Markers component. Each such set is given a name starting with "Cluster Grid", and the number of markers it contains is shown.


Saving the list of genes

For use in future examples, you can save a list of genes from the Markers panel:

  • Highlight the entry for the list in the Markers component (e.g. Cluster Tree[84]).
  • Right-click and select "Save".
  • Enter a name. We have saved the list from the hierarchical clustering example as "cluster_tree_total_pearsons_84_markers.csv" (the .csv extension is added automatically). This list is available under this name in the tutorial data download.