Home

Revision as of 11:04, 13 September 2010 by Ut2102 (Quick Start)

Quick Start

The Quick Start guide to geWorkbench shows how to get started using geWorkbench right away. We are continuing to develop new material for this guide.

Overview

Welcome to geWorkbench. The current version is 2.1.0, released September 10th, 2010.


The latest Release Notes and downloads can be obtained from https://gforge.nci.nih.gov/frs/?group_id=78. Installation instructions can be found on the Download and Installation page of this Wiki.


geWorkbench (genomics Workbench) is a Java-based open-source platform for integrated genomics. Using a component architecture, it allows individually developed plug-ins to be configured into complex bioinformatic applications. At present there are more than 70 available plug-ins supporting the visualization and analysis of gene expression and sequence data. Example use cases include:

  • loading data from local or remote data sources.
  • visualizing gene expression, molecular interaction networks, protein sequence and protein structure data in a variety of ways.
  • providing access to client- and server-side computational analysis tools such as t-test analysis, hierarchical clustering, self-organizing maps, regulatory network reconstruction, BLAST searches, pattern/motif discovery, etc.
  • validating computational hypotheses through the integration of gene and pathway annotation information from curated sources as well as through Gene Ontology enrichment analysis.


geWorkbench is the bioinformatics platform of MAGNet, the National Center for the Multi-scale Analysis of Genomic and Cellular Networks (one of the 7 National Centers for Biomedical Computing funded through the NIH Roadmap). Additionally, geWorkbench is supported by caBIG®, NCI's cancer Biomedical Informatics Grid initiative.


End-user and developer support for geWorkbench is provided through the caBIG® Molecular Analysis Tools Knowledge Center, a component of the caBIG® Enterprise Support Network.

Graphical User Interface

[Screenshot: geWorkbench graphical user interface (GeWB GUI Cytoscape.png)]


Summary of changes in geWorkbench release 2.1.0

Release Date: September 10, 2010.

  • BLAST
    • A major upgrade of the built-in BLAST interface now provides almost all query options available on the NCBI BLAST website.
    • geWorkbench can again retrieve full or partial sequences for BLAST hits. A recent change at NCBI had caused this feature to stop working; this is fixed in this release.
  • Filtering
    • A new Coefficient of Variation data filter has been added. This scales expression profile standard deviations by their means, so that profiles can be filtered on a directly comparable measure of variation.
  • Expandable Gene Ontology tree views were added to the following:
    • Gene Ontology Enrichment viewer
    • Cellular Network Knowledge Base (CNKB) viewer
  • System Information tool
    • A menu item was added which provides system information such as Java memory allocated and used, path to the current JRE, and Operating System details.
  • Arrays component
    • The members of an array set can now be saved as a list to a file on disk, matching functionality already present for markers.
  • Online Help chapter updates (help files built into geWorkbench)
    • BLAST (Sequence Alignment component) - fully revised.
    • Filtering - added section for Coefficient of Variation filter.
    • MINDy - added section on using ARACNe preprocessing.
    • Pattern Discovery - fully revised.
  • Cytoscape component
    • Updated to Cytoscape version 2.7.0.
  • Bugs
    • A number of bugs were fixed; full details are available in the Release Notes.
  • Refactoring
    • A project of ongoing refactoring and simplification was continued in order to enhance long-term maintainability and performance of the code.
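
The Coefficient of Variation filter listed above divides each expression profile's standard deviation by its mean. The following Python sketch only illustrates that calculation and is not geWorkbench's own code. The function names, the use of the sample standard deviation, and the choice to keep profiles at or above a cutoff are all assumptions made for illustration.

```python
from statistics import mean, stdev


def coefficient_of_variation(profile):
    """Return the sample standard deviation of a profile divided by its mean."""
    m = mean(profile)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return stdev(profile) / abs(m)


def filter_by_cv(profiles, min_cv):
    """Keep only profiles whose coefficient of variation is at least min_cv.

    The direction of the cutoff is an assumption made for this sketch.
    """
    return {
        name: values
        for name, values in profiles.items()
        if coefficient_of_variation(values) >= min_cv
    }


# Hypothetical expression profiles: one marker per key, one value per array.
profiles = {
    "marker_a": [100.0, 110.0, 90.0, 100.0],  # high mean, little variation
    "marker_b": [10.0, 30.0, 5.0, 35.0],      # low mean, large relative variation
}

kept = filter_by_cv(profiles, min_cv=0.5)
print(sorted(kept))
```

Because the standard deviation is scaled by the mean, the two markers can be compared directly even though their absolute expression levels differ by an order of magnitude.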

Summary of changes in geWorkbench release 2.0.2

Release Date: July 16, 2010.

  • Fixed problem with genSpace logging.
  • Fully revised Online Help chapter for MINDy.

Summary of changes in geWorkbench release 2.0.1

Release Date: June 25, 2010

  • Fixed a problem with caGrid connectivity.
  • Fully revised Online Help chapter for the Cellular Networks Knowledge Base (CNKB) component.

Summary of changes in geWorkbench release 2.0.0

Release Date: June 9th, 2010

Major new features

  • Filtering - completely revamped. It now works directly for all modes and allows specification of the minimum percentage of matching arrays required before filtering occurs.
  • File parsers added:
    • MAGE-TAB data matrix
    • GEO SOFT format - added series (GSE) and curated matrix (GDS).
  • Java 6 - Moved from Java 5 to Java 6. geWorkbench now requires Java 6. It works on both 32-bit and 64-bit VMs (JREs).
  • Look and Feel - Switched to new, more modern Look and Feel (Nimbus). geWorkbench appearance now consistent across all platforms.
  • caBIO component updated from 4.2 to 4.3.
  • Cellular Network Knowledge Base (CNKB) - Revamped interface to allow choice of interactome and data types.
  • More than 250 additional "bug reports" were closed. These included improvements in the usability of numerous components, and actual bug fixes.
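
The revamped filtering listed above lets the user set a minimum percentage of matching arrays before a filter takes effect. The following Python sketch shows one plausible reading of that rule and is not geWorkbench's own code. It assumes that a marker is removed when the share of arrays meeting the filter criterion reaches the threshold; the function names and the example criterion are hypothetical.

```python
def percent_matching(values, criterion):
    """Return the percentage of array values for which criterion(value) is true."""
    if not values:
        raise ValueError("a marker needs at least one array value")
    matches = sum(1 for value in values if criterion(value))
    return 100.0 * matches / len(values)


def filter_markers(data, criterion, min_percent):
    """Remove markers for which at least min_percent of arrays match criterion.

    Treating the threshold as "remove at or above" is an assumption of this sketch.
    """
    return {
        marker: values
        for marker, values in data.items()
        if percent_matching(values, criterion) < min_percent
    }


# Hypothetical data set: one marker per key, one expression value per array.
data = {
    "marker_a": [5.0, 200.0, 300.0, 250.0],  # 1 of 4 arrays is below 50
    "marker_b": [10.0, 20.0, 30.0, 400.0],   # 3 of 4 arrays are below 50
}

# Example criterion: the expression value is below 50.
remaining = filter_markers(data, lambda value: value < 50.0, min_percent=50.0)
print(sorted(remaining))
```

With a 50% threshold, a single low value does not remove a marker, while a marker that is low in most arrays is filtered out.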

New Components

  • SkyBase - A database that stores the homology models built by SkyLine analysis for all NESG PSI2 protein structures. It is queried using FASTA-format protein sequence files.
  • SkyLine - A high-throughput comparative modeling pipeline. It creates structural homology models for protein sequences with similarity to a protein with an experimentally determined 3-D structure. The input is a PDB file. (Depends on an internal server; external use is not yet enabled.)
  • Pudge - Interface to a protein structure prediction server which integrates tools used at different stages of the structural prediction process. Modeling starts with a FASTA-format protein sequence file.


Other major changes

  • caArray - Improved memory usage on downloads from caArray.
  • CNKB - Can now return markers directly from CNKB without use of Cytoscape.
  • Color Mosaic - enhancements to display (bug 2147):
    • toggle array names on/off
    • search on array name, accession, or label
  • Component Configuration Manager - now can filter display list by categories: Analysis, Viewer, Normalizer, Filter.
  • Cytoscape - Corrected mapping between gene names in Cytoscape display and markers in Marker Sets panel (now uses Entrez IDs).
  • Dendrogram - can now create Array subsets as well as marker subsets.
  • Markers and Arrays - Hover text available in Markers and Arrays phenotypes to visualize long names if needed.
  • Marker Annotation - search results can be saved to a text file, including relevant URLs and BioCarta pathway names.
  • File loading - Now checks for "out of memory" errors during file loading.
  • GUI - in switching to the new Look and Feel, fixed many text highlighting problems that had previously been seen only on Macintosh but had now also appeared on Windows.
  • File parser menu - The file parser selection menu now shows valid file extensions for each type.
  • Promoter - JASPAR promoter motifs now filterable by taxon.
  • Sequence alignment (BLAST) - many enhancements, including additional databases to match those listed at NCBI and improved handling of results from searches containing long query sequences.


See also the list of changes in previous releases.