Grid Services

Revision as of 13:09, 11 October 2011 by Smith (List of grid services)



Overview

Most analytic routines available in geWorkbench (e.g. clustering, t-test) are implemented directly in the geWorkbench desktop application and run on your local computer.

In cooperation with caBIG(R), the National Cancer Institute's Cancer Biomedical Informatics Grid program, a number of the geWorkbench analysis components have also been adapted to run as services on caGrid, the primary infrastructure component of caBIG. In accordance with caBIG principles, each has a well-defined object design and a public application programming interface (API) via which data can be exchanged. Annotations describing each service, object and parameter are stored in the caDSR (NCI's Cancer Data Standards Repository), using standard vocabulary terms available from the Enterprise Vocabulary Services (EVS).

Some analyses are available only as remote services and have no local implementation.

Each geWorkbench analysis component that has an associated grid service will show a Services tab in the Analysis framework, adjacent to the Parameters tab.

Services tab

Grid services empty.png

  • Local - When the "Local" radio button is selected, the calculation is performed directly within geWorkbench, provided the analysis has a local implementation. Some analyses do not.
  • Change Index Service - Index Services maintain lists of available grid services. geWorkbench is delivered with the URL of a Columbia Index Service preconfigured, which provides access to demonstration grid service implementations.
  • Change Dispatcher - The Dispatcher is a geWorkbench server-side component which provides connectivity between geWorkbench and caGrid. geWorkbench is delivered with the URL of a Columbia Dispatcher Service preconfigured.


Search Grid Services

Grid service ANOVA.png

When the Search Grid Services button is pushed, the list of available services of the desired type will be retrieved from the specified index service. The list will appear in the area below, with each available service preceded by a radio button. The desired remote grid service can be selected using these radio buttons.
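
The lookup described above can be pictured as filtering a registry by analysis type. The following is a minimal sketch only, not the caGrid or geWorkbench API: the `ServiceEntry` type, the `search_services` function, and the example URLs are all hypothetical, and an in-memory list stands in for the index service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceEntry:
    url: str       # placeholder endpoint, not a real service address
    analysis: str  # analysis type the service provides, e.g. "ANOVA"


def search_services(index, analysis):
    """Return the entries whose analysis type matches, ignoring case."""
    wanted = analysis.strip().lower()
    return [entry for entry in index if entry.analysis.lower() == wanted]


# In-memory stand-in for the list an index service would maintain.
INDEX = [
    ServiceEntry("http://example.org/grid/anova-1", "ANOVA"),
    ServiceEntry("http://example.org/grid/anova-2", "ANOVA"),
    ServiceEntry("http://example.org/grid/som-1", "SOM"),
]

# Each match corresponds to one radio button in the results list.
for entry in search_services(INDEX, "ANOVA"):
    print(entry.url)
```

An empty result corresponds to the case where the index service knows of no service of the requested type.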


Service Details

Grid service ANOVA selected.png

Once a particular grid service has been selected (via its radio button), the details of the service will be displayed in the lower window.

Running a grid job

  1. The Index Service and Dispatcher URLs are set to default geWorkbench services. If needed, choose an appropriate alternate Index service and/or Dispatcher service.
  2. Push the Search Grid Services button.
  3. Select an available grid service.
  4. Return to the Parameters tab, and when ready, push the Analyze button.

Some services require login credentials. If so, a dialog will appear asking for a Username and Password. If you possess the appropriate credentials for the service you have selected, enter them here and push OK.

Grid services Username.png


A message may appear indicating that the job is being submitted.

T Grid Services Submitting request.png


While the job is running, a node marked "Pending" will be placed in the Project Folders component, preceded by an hourglass icon. Note that the progress bar that appears when analyses are run locally within geWorkbench will not appear for grid jobs.

Grid services pending.png

Further aspects of running grid jobs

  1. The grid job, once started, is independent of geWorkbench. The dispatcher component cooperates with geWorkbench to track job status. A geWorkbench workspace containing running grid jobs can be saved and later restored. When the saved workspace is reloaded, geWorkbench will resume monitoring the job for completion, and retrieve the finished results if available.
  2. Once a grid job has been started, its execution cannot be canceled from within geWorkbench. However, the "pending" node can be removed from the Project Folders component. In this case, geWorkbench will not receive any results when the calculation actually completes.
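
The two points above can be illustrated with a small model of pending-job tracking. This is a sketch under stated assumptions, not geWorkbench's implementation: `FakeDispatcher`, `Workspace`, and the JSON save format are hypothetical names and choices made for illustration.

```python
import json


class FakeDispatcher:
    """Stand-in for the dispatcher: holds results of finished jobs."""

    def __init__(self):
        self._results = {}

    def finish(self, job_id, result):
        self._results[job_id] = result

    def poll(self, job_id):
        # None means the job has not finished yet.
        return self._results.get(job_id)


class Workspace:
    """Tracks pending job ids; results arrive only for ids still tracked."""

    def __init__(self, pending=None):
        self.pending = set(pending or [])
        self.results = {}

    def submit(self, job_id):
        self.pending.add(job_id)

    def remove_pending(self, job_id):
        # The remote job keeps running; its result is simply never fetched.
        self.pending.discard(job_id)

    def save(self):
        return json.dumps({"pending": sorted(self.pending)})

    @classmethod
    def restore(cls, text):
        return cls(json.loads(text)["pending"])

    def check(self, dispatcher):
        # sorted() returns a copy, so removing from the set here is safe.
        for job_id in sorted(self.pending):
            result = dispatcher.poll(job_id)
            if result is not None:
                self.results[job_id] = result
                self.pending.remove(job_id)


dispatcher = FakeDispatcher()
ws = Workspace()
ws.submit("job-a")
ws.submit("job-b")
ws.remove_pending("job-b")         # user deletes the "Pending" node
saved = ws.save()                  # workspace saved while job-a is running

dispatcher.finish("job-a", "result-a")
dispatcher.finish("job-b", "result-b")

restored = Workspace.restore(saved)
restored.check(dispatcher)         # monitoring resumes after reload
print(restored.results)
```

In this model the restored workspace picks up the result for the job it still tracks, while the job whose pending node was removed completes remotely without its result being retrieved.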


List of grid services

Some analyses can be run either locally within geWorkbench or via an external grid service, while others are available only remotely. Some grid services hosted at Columbia require a username and password. The following table summarizes this information for geWorkbench components that can use grid services.


Component                Local avail.  Remote service type  Grid username/password req'd.
ANOVA                    yes           grid                 yes
ARACNe                   yes           grid                 yes
Hierarchical Clustering  yes           grid                 yes
MarkUs                   no            web or grid          no
MINDy                    yes           grid                 yes
MatrixREDUCE             yes           grid                 yes
SkyBase                  no            grid                 no
SkyLine                  no            grid                 yes
SOM                      yes           grid                 yes
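
For scripted checks, the table above can be expressed as a small lookup structure. This is an illustrative sketch: the values are copied from the table, while the names `GRID_COMPONENTS`, `remote_only`, and `needs_login` are hypothetical and not part of geWorkbench.

```python
# component -> (local available, remote service type, credentials required)
GRID_COMPONENTS = {
    "ANOVA": (True, "grid", True),
    "ARACNe": (True, "grid", True),
    "Hierarchical Clustering": (True, "grid", True),
    "MarkUs": (False, "web or grid", False),
    "MINDy": (True, "grid", True),
    "MatrixREDUCE": (True, "grid", True),
    "SkyBase": (False, "grid", False),
    "SkyLine": (False, "grid", True),
    "SOM": (True, "grid", True),
}


def remote_only():
    """Components with no local implementation, sorted by name."""
    return sorted(name for name, (local, _, _) in GRID_COMPONENTS.items() if not local)


def needs_login(component):
    """True if the table lists a username/password as required."""
    return GRID_COMPONENTS[component][2]


print(remote_only())
print(needs_login("SkyLine"))
```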