- 1 Overview
- 2 geWorkbench Configuration
- 3 Workspace and Data Management
- 4 Microarray Data Displays
- 5 Gene and Pathway Annotations
- 6 Statistical tests, clustering and classification
- 6.1 Gene Ontology Term Over-representation Analysis
- 6.2 T Test
- 6.3 Analysis of Variance (ANOVA)
- 6.4 Hierarchical Clustering Dendrogram
- 6.5 SOM Clustering
- 6.6 Consensus Clustering (GenePattern)
- 6.7 Classification (GenePattern)
- 7 Sequence Analysis / Pattern Discovery
- 8 Network Discovery and Visualization
- 9 Molecular Structure
The geWorkbench graphical interface with a CNKB query result displayed in Cytoscape.
Component Configuration Manager
Individual components can be loaded as needed.
Workspace and Data Management
Data sets are loaded into the Workspace. Individual analysis results are stored under their parent dataset.
Sets of arrays can be defined and included or excluded from particular analyses (via checkboxes). Sets can be individually marked as belonging to case (here indicated with a red thumbtack) or control groups.
In addition, routines such as ANOVA or t-test return lists of significant markers to the Markers component.
Retrieval from caArray
Microarray data can be retrieved directly from instances of caArray.
Microarray Data Displays
The Microarray Viewer displaying marker values for a selected array.
Tabular Microarray Viewer
The Tabular Microarray Viewer displays expression values in spreadsheet format.
CEL Image Viewer
Allows viewing of Affymetrix CEL files.
The Color Mosaic component displaying a result from ANOVA analysis. It can also directly display the loaded expression data, or subsets of that data created using Marker and Array sets.
Expression Profile plotting values for selected markers and arrays. Individual values can be seen by hovering over a desired data point.
Expression Value Distribution
The dataset has been quantile normalized and log2 transformed.
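The two preprocessing steps named in this caption can be sketched as follows. This is a minimal illustration using only the Python standard library, not geWorkbench's own implementation; the function names and the tiny input values are hypothetical, and ties are broken by position rather than averaged.

```python
import math

def quantile_normalize(arrays):
    """Give every array the same value distribution (the mean of each rank)."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Mean of the values found at each rank across all arrays.
    rank_means = [sum(a[i] for a in sorted_arrays) / len(arrays) for i in range(n)]
    normalized = []
    for a in arrays:
        # Marker indices ordered by ascending value (ties broken by position).
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized

def log2_transform(arrays):
    """Apply log2 to every value; all values must be positive."""
    return [[math.log2(v) for v in a] for a in arrays]

# Hypothetical data: two arrays, three markers each.
arrays = [[2.0, 8.0, 4.0], [16.0, 4.0, 64.0]]
normalized = quantile_normalize(arrays)
logged = log2_transform(normalized)
print(normalized)
```

After normalization every array contains the same set of values, so their distributions coincide; only the assignment of values to markers differs between arrays.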
Compare multiple markers or arrays with the standard Scatter Plot analysis.
Array vs Array
Marker vs Marker
Gene and Pathway Annotations
Retrieve and display gene and pathway information from bioDBNet.
Marker Annotations - BioCarta Pathways
Displays BioCarta images retrieved via bioDBNet.
Statistical tests, clustering and classification
Gene Ontology Term Over-representation Analysis
A t-test result displayed on a "volcano plot": log significance vs. log fold change.
The t-test result can also be displayed in the Color Mosaic component.
(Visualization preference setting: Relative)
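For readers unfamiliar with volcano plots, the coordinates of one marker can be sketched as below. This assumes the common convention of log2 fold change on the x-axis and -log10 of the p-value on the y-axis; the exact scaling geWorkbench uses is not stated here. The p-value is taken as an input rather than computed, and the function name and sample values are hypothetical.

```python
import math

def volcano_point(case_values, control_values, p_value):
    """Return (log2 fold change, -log10 p) for one marker.

    Expression values are assumed to be positive and on a linear scale.
    """
    mean_case = sum(case_values) / len(case_values)
    mean_control = sum(control_values) / len(control_values)
    log_fold_change = math.log2(mean_case / mean_control)
    log_significance = -math.log10(p_value)
    return log_fold_change, log_significance

# Hypothetical marker: fourfold higher in cases, p = 0.001.
x, y = volcano_point([8.0, 8.0], [2.0, 2.0], 0.001)
print(x, y)
```

Markers with both a large fold change and a small p-value land in the upper-left and upper-right corners of the plot, which is what gives it its shape.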
Analysis of Variance (ANOVA)
Detects markers for which a statistically significant difference exists in a data set containing multiple classes of samples.
Color Mosaic View
(Visualization preference setting: Relative)
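The statistic behind a one-way ANOVA for a single marker can be sketched as follows. This is a textbook F statistic in plain Python, not geWorkbench's code; the function name and the sample values are hypothetical, and converting F to a p-value (which needs the F distribution) is left out.

```python
def one_way_anova_f(groups):
    """F statistic for one marker measured in several sample classes.

    `groups` is a list of lists, one list of expression values per class.
    Raises ZeroDivisionError if there is no within-class variation.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own class mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical marker measured in three classes of three arrays each.
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(f_stat)
```

A large F means the class means differ by more than the within-class scatter would explain, which is the condition under which a marker is reported as significant.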
Hierarchical Clustering Dendrogram
A Dendrogram displays the results of the Hierarchical clustering analysis.
Self-Organizing Map (SOM) clustering results are displayed as a series of expression profiles corresponding to the discovered groupings.
Consensus Clustering (GenePattern)
GenePattern components that perform classification on microarray datasets have been adapted to geWorkbench.
K-Nearest Neighbors (KNN)
Example of classifier result
The classifiers return groups of markers to the Markers component:
Sequence Analysis / Pattern Discovery
Retrieve genomic and protein sequences for selected markers. Retrieved sequences can be individually selected and added to the project as new data nodes.
The Sequence Alignment component submits BLAST jobs to the NCBI server and displays the results such that individual hits can be used in further analysis steps.
Use the SPLASH algorithm to discover sparse amino acid or nucleic acid patterns in a loaded sequence.
Motif discovery and display
The Pattern Discovery component itself with results displayed in the sequence viewer.
The Pattern Discovery component with results displayed as a histogram of support for selected discovered motifs across the sequence data set. Support indicates what fraction of the sequences are matched by the motif within a sliding window about a given location.
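One way to read the support measure is sketched below. It rests on assumptions not confirmed by this page: the motif is expressed as a regular expression, a match counts if it starts inside the window, and the window is symmetric around the location. The function and parameter names are hypothetical, and this is not the SPLASH algorithm itself.

```python
import re

def support_at(sequences, motif_regex, position, half_window):
    """Fraction of sequences with a motif match starting near `position`."""
    # A lookahead finds every start position, including overlapping matches.
    pattern = re.compile("(?=" + motif_regex + ")")
    lo, hi = position - half_window, position + half_window
    hits = 0
    for seq in sequences:
        if any(lo <= m.start() <= hi for m in pattern.finditer(seq)):
            hits += 1
    return hits / len(sequences)

# Hypothetical sequences and motif.
sequences = ["ACGTACGT", "TTTTACGA", "GGGGGGGG"]
near_4 = support_at(sequences, "ACG", position=4, half_window=1)
near_0 = support_at(sequences, "ACG", position=0, half_window=1)
print(near_4, near_0)
```

Evaluating this at successive positions along the sequences yields the values that a support histogram would plot.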
Individual motifs from the JASPAR Transcription Factor Binding Profile Database can be scanned against loaded genomic sequences.
Motif selection and Logo display
Result of a scan against a single sequence
Sequence-level display of match
MatrixREDUCE is a tool for inferring the binding specificity and nuclear concentration of transcription factors from microarray data.
Network Discovery and Visualization
Cytoscape - ARACNe Network display
The adjacency matrix generated by an ARACNe network reverse engineering run displayed in Cytoscape.
Cellular Network Knowledge Base (CNKB)
Results of queries against the CNKB can be filtered based on confidence values using the throttle graph.
Query results and Throttle Graph
CNKB query results displayed in Cytoscape
Display of CNKB interactions in Cytoscape.
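The confidence-based filtering that the throttle graph controls amounts to a simple threshold, sketched below. The record layout (marker, interactor, confidence) and all names are placeholders for illustration, not the actual CNKB result format.

```python
def filter_by_confidence(interactions, threshold):
    """Keep interactions whose confidence is at least `threshold`."""
    return [edge for edge in interactions if edge[2] >= threshold]

# Placeholder records, not real CNKB output: (marker, interactor, confidence).
interactions = [
    ("MARKER_1", "GENE_A", 0.95),
    ("MARKER_1", "GENE_B", 0.40),
    ("MARKER_2", "GENE_C", 0.75),
]
kept = filter_by_confidence(interactions, 0.7)
print(kept)
```

Raising the threshold leaves fewer, higher-confidence interactions to display in the network view.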
Master Regulator Analysis
JMOL Structure Viewer
JMOL is a viewer for PDB protein structure files.
Mark-Us - Protein Functional Annotation
Mark-Us is a web server that assists in assessing the biochemical function of a given protein structure. Mark-Us identifies related protein structures and sequences, detects protein cavities, and calculates the surface electrostatic potentials and amino acid conservation profile.
Pudge is a server for the prediction of the 3D structures of proteins. While the server can be run without any user intervention, it is primarily designed to be interactive and to allow functional information to be used as a guide to the modeling.
Pudge allows a pipeline of modeling and evaluation steps, depicted below, to be set up and run.
Pudge has a number of output types; the following illustrates a sequence alignment: