Difference between revisions of "Tutorial - GO Term Enrichment"


Revision as of 23:48, 24 April 2006



Overview

The Gene Ontology component allows exploration of the Gene Ontology (GO) terms represented within a list of genes. Several display options are available. The entire GO can be displayed as a tree (TreeView), with the selected genes shown within the tree. Alternatively, the list of genes can be displayed sorted by overrepresentation P-value (TableView). This P-value is calculated by comparing the observed number of hits to a category with the number expected from that category's representation among all annotated markers on the microarray type as a whole, e.g. HG_U95Av2.
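The observed-versus-expected comparison can be illustrated with a short sketch. The text above does not say which statistical test the component uses. The version below assumes a one-sided hypergeometric test, a common choice for GO overrepresentation, and the function names and toy numbers are hypothetical rather than taken from geWorkbench.

```python
from math import comb


def expected_hits(list_size, category_size, population_size):
    """Hits expected by chance if the list were drawn at random from the chip."""
    return list_size * category_size / population_size


def overrepresentation_p_value(hits, list_size, category_size, population_size):
    """Upper-tail hypergeometric probability P(X >= hits).

    ASSUMPTION: a hypergeometric model; the tutorial does not name the test.
    population_size: annotated markers on the microarray type as a whole
    category_size:   annotated markers on the chip belonging to the GO category
    list_size:       annotated markers in the selected list
    hits:            markers in the list that fall in the category
    """
    max_hits = min(list_size, category_size)
    total = comb(population_size, list_size)
    tail = sum(
        comb(category_size, k)
        * comb(population_size - category_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Toy numbers for illustration only (not taken from the tutorial data).
p = overrepresentation_p_value(hits=2, list_size=2, category_size=5, population_size=10)
e = expected_hits(list_size=2, category_size=5, population_size=10)
print(f"expected hits: {e:.2f}, P-value: {p:.4f}")
```

A small P-value indicates that the category contains more markers from the list than its share of the chip would predict.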

Example

In this example we will use the file "webmatrix_quantile_log2_dev1_mv0.exp" available in the tutorial data section (coming soon). The two lists of markers were obtained as shown in the Clustering tutorial. The first list contains 12 markers and the second 44; together they make up one larger cluster of 56 markers. The marker lists can also be loaded directly from the files cluster_tree_12.csv and cluster_tree_44.csv found in the tutorial data section (coming soon).

  • Activate both of the lists by checking their boxes in the Markers component as shown:

[Image: T_Markers_ClusterTree12and44sel.png]

  • In the Gene Ontology component, choose the type of GO term to display: Component, Function or Process. In this example we will select the Process tab.
  • Click on Map List(s).
  • The picture below shows that 33 of the 56 total markers were placed in functional categories. Scroll and click on individual tree nodes to explore the tree. The largest single category, with 23 hits, is binding.
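The mapping step above can be pictured as a simple tally: merge the active marker lists, keep the markers that have at least one GO annotation, and count the markers falling in each category. The sketch below is only an illustration of that idea. The function name, marker IDs and annotation table are hypothetical and do not reflect how geWorkbench stores or maps its data.

```python
from collections import Counter


def tally_categories(marker_lists, annotations):
    """Merge marker lists and count annotated markers per GO category.

    marker_lists: iterable of lists of marker IDs (the activated lists)
    annotations:  dict mapping marker ID -> list of GO term names
                  (hypothetical structure, for illustration only)
    Returns (total markers, markers with an annotation, Counter of terms).
    """
    markers = set().union(*marker_lists)
    annotated = {m for m in markers if annotations.get(m)}
    counts = Counter(
        term for m in annotated for term in set(annotations[m])
    )
    return len(markers), len(annotated), counts


# Toy data: made-up marker IDs and terms, not the tutorial's lists.
list_a = ["m1", "m2", "m3"]
list_b = ["m3", "m4"]
toy_annotations = {
    "m1": ["binding", "transport"],
    "m2": ["binding"],
    "m4": [],  # present in the table but without any GO term
}

total, mapped, counts = tally_categories([list_a, list_b], toy_annotations)
largest_term, largest_hits = counts.most_common(1)[0]
print(f"{mapped} of {total} markers mapped; largest category: "
      f"{largest_term} ({largest_hits} hits)")
```

In the tutorial's terms, the first two returned values correspond to the "33 of the 56" figure, and the most common entry corresponds to the largest category.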


[Image: T_GeneOntology_TreeView.png]

[Image: T_GeneOntology_TreeView_detail.png]