Release Notes geWorkbench V2.2.2 August 19th, 2011 Joint Centers for Systems Biology, Columbia University New York, NY 10032 http://www.geworkbench.org ================================================================ Contents ================================================================ 1.0 geWorkbench Installation Notes 2.0 geWorkbench Introduction and History 3.0 Release History: New Features and Updates 4.0 Known Issues/Defects 5.0 Bug Reports and Support 6.0 Documentation and Files 7.0 geWorkbench Web Pages ================================================================ 1.0 geWorkbench Installation Notes ================================================================ System Requirements: Java: The Java 6 JRE is required. On Windows and Linux it can be installed separately or together with geWorkbench. On MacOSX, the Java 6 JRE is included with MacOSX versions 10.5 (with updates) and 10.6. Please note that Java 6 is, in places, referred to by Sun as Java 1.6. 32-bit and 64-bit versions of Java can be used on appropriate platforms. See http://java.sun.com/javase/downloads/index.jsp Memory: At least 2 GB is recommended. geWorkbench by default will request up to 1 GB of memory for the Java VM. Operating System: Windows XP/Vista/Windows 7 (32 or 64-bit): no special requirements. MacOSX: Version 10.5 (with updates) or higher is required to provide the Java 6 JRE. Linux: no special requirements known. All three platform-specific versions of geWorkbench (Windows, Linux, and Macintosh) provide an installation wizard (generated using InstallAnywhere). A generic version of geWorkbench, which does not use any installer, is also available. Additional installation details are provided below, and at www.geworkbench.org. All user documentation is maintained in online form at www.geworkbench.org. geWorkbench, unless otherwise noted for particular components, can be run on both 32 and 64-bit operating systems and JREs. Platform-specific release details: 1. Windows (XP/Vista/Windows 7) Special note for Vista/Windows 7 - if you run the installer on Vista or Windows 7, please install geWorkbench to your user home directory, e.g. c:\Users\username\geWorkbench_2.2.2, where "username" is your login name, rather than to C:\Program Files\geWorkbench_2.2.2. File: geWorkbench_v2.2.2_Windows_installer_with_JRE6.exe Includes the 32-bit Sun Java 6 JRE. File: geWorkbench_v2.2.2_Windows_installer_noJRE.exe No JRE is included, you must make sure that an appropriate Java 6 JRE is installed on your system before installing geWorkbench. Download and double-click the installer file to begin installation. 2. MacOSX File: geWorkbench_v2.2.2_MacOSX_installer.zip. This version uses the Java 6 JRE included with recent updates to the MacOSX operating system. Double-click geWorkbench_v2.2.2_MacOSX_installer.zip to begin installation. Notes - Requires Mac OS X 10.5 (with updates) or higher. 3. Linux File: geWorkbench_v2.2.2_Linux_installer_with_JRE6.bin. Includes the 32-bit Sun Java 6 JRE. The Linux version of geWorkbench relies on X-Windows being installed and running. If you are running Linux on a server and e.g. Windows on your desktop, you will also need to run an X-windows server on your desktop machine. Further information can be found on the Download and Installation page of www.geworkench.org. After downloading, cd (if needed) to the directory to which you downloaded the installer. To begin the installation, type the command: "sh ./geWorkbench_v2.2.2_Linux_installer_with_JRE6.bin". This will extract geWorkbench into a new directory called geWorkbench_2.2.2. To run geWorkbench, and assuming you are using the Linux bash shell, issue the command: "./rungeWorkbench_2.2.2" 4. Generic - A non-installer-based version of geWorkbench is supplied in a Zip file which should work on any platform. File: geWorkbench_v2.2.2_Generic.zip Installation: Unzip the file. It will create a directory geWorkbench_2.2.2. Running geWorkbench (generic): You must have the Java 6 JRE installed and the JRE must be in the path for geWorkbench. Windows: you can double click on the file "launch_geWorkbench.bat" to launch geWorkbench, or run it from a command window. Linux/Unix: Execute the script "launch_geworkbench.sh". Any: Alternatively, if you have Apache Ant installed, you can type "ant run" in the geWorkbench directory. ================================================================ 2.0 - geWorkbench Introduction and History ================================================================ geWorkBench, an open source bioinformatics platform written in Java, makes sophisticated tools for data management, analysis and visualization available to the community in a convenient fashion. geWorkbench evolved from a project, caWorkbench, which was originally sponsored by the National Cancer Institute Center for Bioinformatics (NCICB). Some of the most fully developed capabilities of the platform include microarray data analysis, regulatory network inference, sequence analysis, transcription factor binding site analysis, and pattern discovery. ================================================================ 3.0 Release History: New Features and Updates ================================================================ geWorkbench 2.2.2 (2011-08-19) ============================== ---Summary--- geWorkbench 2.2.2 is a bug fix release. It corrects three issues with the Gene Ontology Analysis and Gene Ontology Viewer components. In certain circumstances after browsing the GO tree, if genes for a term were copied to a new Marker set, genes with no EntrezID were also copied. In the second issue, running GO analysis after restoring a saved workspace did not work. In the third issue, creating a set of new markers from a GO term did not update pulldown menus properly, e.g. in the ARACNe hub selection pulldown. A new feature was added such that the list of arrays available for download from caArray ("remote" option on file open box) are presented in alphabetical order. The installation instructions for Windows 7 and Vista have been updated. As only users with administrative privileges on their machines can install to C:\, we are now recommending installation to the user's home directory. --- Changes/fixes in release 2.2.2 --- File Loading - #2705 - caArray array list searchability Gene Ontology - #2710 - GO tree browser returns additional markers - #2711 - Gene Ontology analysis fails after reloading workspace - #2713 - Marker set from GO does not appear in ARACNe Hub Markers box when selected There are no further changes included in this release. geWorkbench 2.2.1 (2011-07-29) ============================== ---Summary--- geWorkbench 2.2.2provides a number of improvements, especially as related to network import, display and export, sequence retrieval, and pattern discovery. If a network is generated or loaded that may be to large to view in Cytocape, it can now instead by viewed as a text file. The new "Fold Change" analysis component is included, which was to be released in version 2.2.0 but was omitted. A new feature to overlay a t-test result onto a Cytoscape network, which was inadvertantly disabled in release 2.2.0, now works again. The Gene Ontology analysis component now supports uses of alternate ontology files (e.g. from the GO website). Sequence retrieval for DNA sequences is now done using the UCSC refGene table, and is available for all organisms with genomes supported by UCSC. --- New features and changes in release 2.2.1--- ANOVA - #2618 - Control buttons repositioned for easier use. - #2694 - The simple range check as a test of whether data was log-normalized was removed. ARACNe - #2494 - Bonferroni correction option added. - The "Choose edges with hightest MI" option was renamed to "Merge multiple probesets". - #2648 - The adjacency matrix is no longer made symmetric. Each line now starts with a hub gene. - Similarily, the adjacency matrix, when saved to file, is not made symmetric. - #2679 - The "merge multiple probesets" option was not recorded in the dataset history. - Merging of multiple probesets is now just within a hub gene, not global. - #2684 - When a subset of markers was activated, the grid service version used an incorrect offset into the marker list. BLAST - #2665 - Previous search results were not cleared if a following search returned no hits. Color Mosaic - #2401 - Export button and right-click menu - both now support export of significance results (p-values) from both t-test and ANOVA. CNKB - Export of network adjacency matrix from Project Folder omitted gene/probeset names. Cytoscape - t-test overlay on network did not work in release 2.2.0 due to a late change. Fixed. - #2650 - Display all networks at the gene level. When multiple probesets represent a gene in the network, the network will be summarized at the gene level. As a result, the count of the number of nodes and edges displayed in Cytoscape may differ from the counts present in, e.g. a complete ARACNe adjacency matrix (ARACNe probeset merging option not used). - #2589, #2638 - If a network might be too large to view in Cytoscape, the user is offered an alternate visualization (text). The threshold for warning can be changed by the user. This replaces warnings that were previously built in to several components e.g. ARACNe. Now, an adjacency matrix is always created, regardless of size. - #2597, #2598, #2599 - Inconsistencies in how selected nodes in Cytoscape were reflected as markers in the Markers component were found and fixed. File Open Dialog - #2698 - The "merge" checkbox is no longer disabled when a file type is selected which cannot be merged. There were synchronization problems. Fold-Change Analysis - New component accidentally omitted from release 2.2.0. Network Import - Networks can now be imported either from Adjacency matrix files (#2432) or from SIF-format (#2434) files. The networks can be represented by gene symbols, Entrez IDs, or probeset names, or by some other identifier (#2432). Networks are represented in memory with the same identifiers that were imported. That is, if a network is read in using gene symbols, then it is represented at this level within geWorkbench. Probesets for each gene can be obtained as needed for particular analyses. - #2608, #2645 - On import, .SIF and .ADJ file extensions are automatically recognized. - On export, .ADJ is automatically added to file. Gene Ontology Analysis - #2512 - The analysis now supports using an alternate annotation file rather than an Affymetrix annotation file. Organism annotation files from the geneontology.org website can now be used. GenSpace - The J2EE framework used in geWorkbench 2.2.0 was replaced with one based on SOAP. - #2666 - The "Remove Friend" feature did not properly update the GUI. Marker Annotations - #2658 - a missing CGAP URL was added to the export file. MarkUs - #2622 - Added support for job description and email notification fields. Master Regulator Analysis - #2610 - MRA can now be run from gene-level adjacency matrices created by CNKB. - #2301 - When saving results (target lists for MRs), a ".csv" file chooser is used and will append this extension automatically. MINDy - #2659 - If MINDy was run remotely on the grid service, sorting on the modulators column of the results table did not work. - #2657 - The "All Markers" checkbox was removed, as its function has been taken over by a separate filter control. - #2314 - An inconsistency in how modulators and targets were selected/deselected with respect to the "Displayed Targets Filter" was fixed. Pattern Discovery - #2682 - Some parameters were not properly captured in the dataset history. - #2672 - Pattern nodes were not always properly restored when the workspace was saved and restored. - #2664 - In Exhaustive discovery mode, the "min support" parameter is now input only as an integer representing the floor for support. The option to enter the minimum exhaustive support as a percent of initial support was removed. - #2663 - If discovered patterns were saved to file and reloaded, they were not always displayed. - #2593 - Problems that occurred when a run was canceled or the server was unreachable were fixed. - The Pattern Discovery (Splash) server URL was changed to splash.c2b2.columbia.edu. PCA - #2670 - PCA did not run if geWorkbench was installed to a path containing a space. Project Folders - #2626 - Mousing-over a network node will display the number of nodes and edges in hover text. - #2646 - When saving a network, the ".adj" file filter will append ".adj" to the file created. Sequence Retriever (#2630, #2631, #2632) - For DNA, now queries the "refGene" table rather than UCSC "known genes", so can now retrieve genomic sequence for all UCSC supported species. - The transcription start point was off by one in query and display. - Identifiers for retrieved sequences were being truncated if long. - The "local" option for sequence retrieval was removed. - The sequence location "spinner" controls can now be set to zero, and step in increments of 100. geWorkbench Installer - When run in Windows, the installer version of geWorkbench now displays its correct icon in the Windows task bar. ---Online Help updates--- Help files were updated for the following components: - ANOVA - ARACNe - BLAST - CNKB - Cytoscape - Filtering - JMOL - MarkUs - Master Regulator Analysis - MINDy - Normalization - Pattern Discovery - Promoter Analysis - Pudge - t-test ---Versions of external files/components included in release 2.2.1--- - Cytoscape 2.7.0 (no change). - gene_ontology.1_2.obo downloaded 2011-06-29 from geneontology.org. - Ontologizer.jar version 2.0, file released 2010-03-10 (no change). - Jaspar_CORE (http://jaspar.genereg.net/) SQL files last updated on server 2009-10. (no change). - JMOL 12.0.45 (updated). geWorkbench 2.2.0 (2011-05-24) ============================== ---Summary--- geWorkbench 2.2.0 is a major release containing more than 180 new features, enhancements and bug fixes. The most important of each are summarize below. New filters were added to give the user options to deal with many-to-many relationships between genes and markers (probesets). New network comparison and manipulation features were added to the Cytoscape component. Signficant improvements were made to the Master Regulator Analysis component to enable the use of recently published procedures. The Gene Ontology component can now serve as a full, standalone GO term browser. Options for import and export of interaction networks (interactomes) were added. --- New features and changes in release 2.2.0--- CNKB - #2389 - export complete CNKB interactomes (SIF or ADJ formats). Cytoscape - #2424 - 1) calculate the Pearson's correlation coefficient for the expression profiles of each pair of nodes connected by an edge in an interaction network. Filter edges to display based on the magnitude of the correlation coefficient. - #2424 - 2) create a new subnetwork containing only edges exceeding correlation threshold calculated in (1). - #2429 - From an existing network, create a subnetwork containing only nodes in a marker set defined in the Markers component. - Color edges by interaction type. GenSpace - numerous improvements. File Parsers - #2388 - import an ARACNe adjacency matrix from a file. Filters - #2444 - Multiple probeset per gene filter - for genes with multiple probesets (markers), remove all but one probeset based on: retain only (a) highest coefficient of variation, (b) highest mean, (c) highest median. - #2445 - Multiple Entrez GeneID Filter: Filter out markers which are annotated to (a) no Entrez gene id, or (b) multiple Entrez gene ids. Fold-change Analysis - #2431 - a new component that performs fold-change analysis and places markers that pass the specified threshold into two new sets in the Markers component: one for positive fold-change, and the other for negative fold-change. Gene Ontology - The Gene Ontology component is now always available when a microarray dataset has been loaded along with its annotation file. - The GO Tree can be browsed or searched for any term. - The markers annotated to any term can be returned to a new set in the Markers component. - #1875 - most recent Gene Ontology OBO file now downloaded automatically from internet when geWorkbench started, with option to instead use a specified file from disk. GenePattern GSEA - analysis component added. Master Regulator Analyis - #2523 and others. Master Regulator Analysis component fully revised. Now allow any list of markers to be used for the phenotype signature. Bar code graph revised to match style of published work. ---Enhancements and selected bug fixes--- ARACNe - #2366 - in ARACNe, bootstrapping is re-enabled but only single threaded. - #2482 - ARACNe results can now be pruned to retain only highest MI edge per gene-gene pair, or return all edges. BLAST - #2419 - continued improvements to BLAST interface to match NCBI website functionality and to improve usability. CCM - "Sequence Analysis" -> "BLAST Analysis", "Alignment Viewer" -> "BLAST Alignment Viewer". Gene Ontology Viewer - #2391 - When a marker set is returned for a GO term, the set is given the term name. GEO Soft - #2402, #2462, #2465 - GEO Soft parsers improved to handle various special cases - multiple platforms, missing values, mixed sample and data matrix files... Grid Services - #1773 - simplified grid service activation (removed one radio button). JMOL - #2505 - updated to JMOL version 12.0.35. Markers/Arrays component - #2430 - dynamic filtering of displayed marker or array list as search term is entered. MarkUs - #2500 - Add ability to retrieve prior MarkUs jobs by job id - #2509 - add private key option to MarkUs job submission MINDy - #2214 - corrected sign of modulation effect in table displays. Pattern Discovery #2119 - corrected display problems in Pattern Discovery related to regular expressions and use of substitution matrices. Preferences - #2393 - Added ability to reorder data sorted by marker name, gene name, or original order (set in preferences, affects all components). Sequence Retriever - #2518 - fixed problem with obtaining name of latest human genome build from Santa Cruz. Tabular Microarray Viewer - #2253 - Tabular Microarray viewer now allows adjustable precision in display, and choice of fixed or scientific notation. t-test - #1626 - changed math package used to correct precision problem with p-value calculation at very small p-values. Volcano Plot - #2492 - extreme point color range corrected. ---Online Help updates--- - ARACNe - better descriptions of some features. - CNKB - fully updated to document new features. - Cytoscape - fully updated to document new features. - Master Regulator Analysis - full update to reflect - changed handling of signature genes - changes to "bar code" graph generation - MINDy - updated to reflect bug fix. - PatternDiscovery - Replaced all screenshots to reflect GUI improvements. - t-test - fully updated. ---Refactoring--- -A major refactoring occured throughout the core and many components. ---Versions of external files/components included in release 2.2.0--- - Cytoscape 2.7.0 (no change). - gene_ontology.1_2.obo downloaded 2011-04-19 from geneontology.org. - Ontologizer.jar version 2.0, file released 2010-03-10 (no change). - Jaspar_CORE (http://jaspar.genereg.net/) SQL files last updated on server 2009-10. (no change). - JMOL 12.0.35 (updated). geWorkbench 2.1.0 (2010-09-10) ============================== --- New features and changes in release 2.1.0--- - BLAST interface (Sequence Alignment component) upgraded to include all parameter options offered on NCBI BLAST website (#2019, #2323). - CNKB component and Gene Ontology Enrichment Viewer - Introduced an "Expand all" functionality for GO Tree view (#2303). - Coefficient of Variation data filter added. - System Information display added (Main menu->Help->System Info), including Java memory allocated and used (#2340). - Arrays component - can now save list of array subsets (#2297). - Cytoscape - updated to version 2.7.0 (#2163). - Online Help - BLAST (Sequence Alignment) full update of chapter to match new functionality. - Filtering - added section for Coefficient of Variation filter. - MINDy - added section on using ARACNe preprocessing. - Pattern Discovery - chapter replaced with new material from Wiki. - Fixed list formatting problem in chapters ported from wiki. ---bug fixes (selected)--- - Annotation file parser - duplicate entries in annotation file can now be skipped (#1624). - ARACNe bootstrap threading error - changed to single-threaded operation (#2366). - BLAST (Sequence Alignment) - Fixed problem where due to a change at the NCBI BLAST site geWorkbench could not download sequences of BLAST hits (#2351). - CCM - Updated filter descriptions to match functional changes made in release 2.0.0. - CNKB - display only databases with available interactions (#2231). - CNKB - fix incorrect Gene Ontology tree display (#1210, #2303). - Filtering components -check for valid inputs (#2242). - Hierarchical Clustering - Fixed problem when saving a workspace that contains a large dendrogram display (#2190). - JMOL - Fixed a performance problem (#2305). - Pattern Discovery - temporary pattern node prevents workspace save (#2363). - SkyBase - add input validation (#2325, #2356). - SkyBase - fix save sequence functionality (#2357). - t-test - fixed inconsistencies in interface and functioning regarding choice of using the t-distribution vs permutations (#2365). - Welcome screen is now version aware - it will always appear first time when a new version is run (#2294). ---Refactoring--- - Hierarchical Clustering - major revamp in course of fixing bug #2190 above. - Plugin Object - code cleaned and refactored (#2312). - SequenceView Object - code cleaned and refactored (#2278). - SwingWorker - replace/remove deprecated and duplicate copies of SwingWorker (#2315). ---Versions of external files/components included in release 2.1.0--- - Cytoscape 2.7.0. - gene_ontology.1_2.obo downloaded 2010-09-03 from geneontology.org. - Ontologizer.jar version 2.0, file released 2010-03-10. - Jaspar_CORE (http://jaspar.genereg.net/) SQL files last updated on server 2009-10. - JMOL 12.0 RC10. geWorkbench 2.0.2 (2010-07-16) ============================== - Fixes a problem which prevented the genSpace component from posting events to its server. - Full update of the MINDy Online Help chapter. - DPI options in MINDy disabled, as not needed. geWorkbench 2.0.1 (2010-06-25) ============================== - Minor client-side changes to allow grid components to communicate with grid services behind the Columbia firewall. - Updates the CNKB Online Help chapter. geWorkbench 2.0.0 (2010-06-09) ============================== ---New components--- - Skyline - A high-throughput comparative modeling pipeline. It creates structural homology models for protein sequences with similarity to a protein with an experimentally determined 3-D structure. The input is a PDB file. - Skybase - SkyBase is a database that stores the homology models built by SkyLine analysis for all NESG PSI2 protein structures. It is queried using FASTA-format protein sequence files. - Pudge - Interface to a protein structure prediction server which integrates tools used at different stages of the structural prediction process. Modeling starts with a FASTA-format protein sequence file. ---Other major new features in release 2.0.0--- - Cellular Network Knowledge Base (CNKB) - Revamped interface to allow choice of interactome and data types. - File parsers added: MAGE-TAB data matix GEO Soft format - added series (GSE) and curated matrix (GDS). - Filtering - completely revamped - now works directly for all modes, allows specification of minimum % matching arrays before filtering occurs. - More than 250 "bug reports" were closed. These included many new features, improvements in the usability of numerous components, and actual bug fixes. - Java 6 - Moved from Java 5 to Java 6. geWorkbench now requires Java 6. Works on both 32 bit and 64 bit VMs (JREs). - Look and Feel - Switched to new, more modern Look and Feel (Nimbus). geWorkbench appearance now consistent across all platforms. - caBIO component updated from 4.2 to 4.3. ---Other major changes in release 2.0.0--- - caArray - Improved memory usage on downloads from caArray. - CNKB - Can now return markers direct from CNKB without use of Cytoscape. - Color Mosaic - enhancements to display (bug 2147): toggle array names on/off search on array name, accession, or label - Component Configuration Manager - now can filter display list by categories: Analysis, Viewer, Normalizer, Filter - Cytoscape - Corrected mapping between gene names in Cytoscape display and markers in Marker Sets panel (now uses Entrez IDs). - Dendrogram - can now create Array subsets as well as marker subsets. - File loading - Checking for "out of memory" errors during file loading. - File Parser menu - The file parser selection menu now shows valid file extensions for each type. - GUI - in switching to new L&F, fixed many text highlighting problems that were previously seen on Macintosh only but now appeared on Windows also. - Markers and Arrays - Hover text available in Markers and Arrays phenotypes to visualize long names if needed. - Marker Annotation - search results can be saved to a text file, including relevant URLs and pathway BioCarta pathway names. - Online Help - New or fully updated chapters added for: - Component Configuration Manager - Filtering - Normalization - Promoter - JASPAR promoter motifs now filterable by taxon. - Sequence alignment (BLAST) - many enhancements, including added additional databases to match those listed at NCBI improved handling of results from searches containing long query sequences. ---Versions of external files/components included in release 2.0.0--- - Cytoscape 2.4. - gene_ontology.1_2.obo downloaded 5/24/2010 from geneontology.org. - Ontologizer.jar version 2.0, file released 3/10/2010. - Jaspar_CORE (http://jaspar.genereg.net/) SQL files last updated on server 10/2009. - JMOL - updated to 12.0 RC10. geWorkbench 1.8.0 (2009-11-05) ============================== ---New components--- - Gene Ontology Enrichment - Analysis and visual components. Analysis component is built on Ontologizer 2.0. ---Other changes in release 1.8.0--- - caArray - Update caArray component to use caArray 2.3.0 Java API. Please note that geWorkbench 1.8.0 is not compatible with earlier versions of caArray. - CNKB - The network graph generated by CNKB was only showing nodes centered about a focus node. Now all accepted nodes will be displayed. - Dataset History - Additions for several modules. - Grid Services - A number of fixes to grid services were made. - Marker Annotations - Fixed a problem with retrieving marker annotations when microarray data downloaded from caArray. - Mark-Us - JMOL dependency added for molecule display. - Promoter - Update JASPAR motifs to release of December 2007. -Note on October 12, 2009 a new version of JASPAR was released which made an incompatible change in the file format. - Promoter - component now displays logos using the "Schneider" method, including his "small-value correction", rather than using a previous "in-house" method. - Promoter - the displayed data now does not include the effects of the pseudo-count normalization process. - Promoter - Added ability to specify pseudocount or select previous hard-coded option of square root of number of sequences. - Promoter - Loaded TFs now are properly added to the list of available TFs. - Sequence Alignment (BLAST) - PFP filtering option removed - Usability fixes - operation of cancel buttons, progress bar. - Release Notes - Added specific installation instructions. ---Online Help chapters updated--- - ANOVA - ARACNe - CNKB - Marker Annotations - Master Regulator Analysis - Promoter - Sequence Alignment (BLAST) geWorkbench 1.7.0 (2009-07-17) ============================== ---New components--- - MarkUs - The MarkUs component assists in the assessment of the biochemical function for a given protein structure. It serves as an interface to the Mark-Us web server at Columbia. Mark-Us identifies related protein structures and sequences, detects protein cavities, and calculates the surface electrostatic potentials and amino acid conservation profile. - MRA - The Master Regulator Analysis component attempts to identify transcription factors which control the regulation of a set of differentially expressed target genes (TGs). Differential expression is determined using a t-test on microarray gene expression profiles from 2 cellular phenotypes, e.g. experimental and control. - Pudge - Interface to a protein structure prediction server (Honig lab) which integrates tools used at different stages of the structural prediction process. - ARACNe2 - upgraded to ARACNe2 distribution from Califano lab, which adds selectable modes (Preprocessing, Discovery, Complete) and a new algorithm (Adaptive Partitioning). Preprocessing allows determination of key parameters from actual input dataset. - caGrid v1.3 - Upgrading of grid services to caGrid v1.3 + introduction of caTransfer for large data tranfers. - Component Configuration Manager - allows individual components to be loaded into or unloaded from geWorkbench. - genSpace collaborative framework - discovery and visualization of workflows. Implemented user registration and preferences. - SVM 3.0 (GenePattern) - Support Vector machines for classification. ---Other changes in release 1.7.0--- - Analysis - Parameter saving implemented in all components. If current settings match a saved set, it is highlighted. - ARACNe - improved description of DPI in Online Help. - caArray - query filtering on Array Provider, Organism and Investigator implemented. - caArray - can now add a local annotation file to caArray data downloads. - caGrid - caGrid connectivity is now built directly in to supported components rather than being a separate component itself. - caScript - The caScript editor is no longer supported. - Color Mosaic - now interactive with the Marker Sets list and Selection set. - Cytoscape - Upgrade to Cytoscape version 2.4 for network visualization and interaction. - Cytoscape - Set operations on genes being returned from Cytoscape network visualizations, via right-click menu. - Cytoscape - Changes to tag-for-visualization - e.g., now only one way, from marker set to Cytoscape, not vice-versa. - Gene Ontology file - the OBO 1.2 file format is supported. - Marker Annotations - Direct access to the NCI Cancer Gene Index was added. It supplies detailed literature-based annotations on a curated set of cancer-related genes. - Marker Annotations - add export to CSV file. - Marker Sets component - a set copy function was added. - MINDy - many improvements to display and results filtering - including marker set filtering. - Scatter Plot - Up to 100 overlapping points can be displayed in a single tooltip. - Various - A number of components were refactored. - Workspace saving - now works properly for all components. geWorkbench 1.6.3 (2009-01-08) ============================== - geWorkbench 1.6.3 fixes several caArray related issues: - connection issue that may cause a time-out on some machines. - incorrect caching of caArray query results. - duplicate query process removed. geWorkbench 1.6.2 (2008-11-14) ============================== - geWorkbench 1.6.2 provides improved proxy communication with its grid service dispatcher component (see Mantis bug 1631). - A problem was fixed in the server-side grid implementation of hierarchical clustering (Mantis bug 1598). geWorkbench 1.6.1 (2008-11-07) ============================== - A Java servlet now provides connectivity to the Cellular Networks Knowledge Base database through the firewall. - Online help for the Sequence Retriever component was added. - The GenePix annotation parser was augmented to include more data fields. - Added a missing GenSpace component. - The GenSpace component was moved from the visual area to the command area. - Volcano plot scaling was fixed to display extreme P-values (E-45). geWorkbench 1.6.0 (2008-10-24) ============================== - Adds Mindy component - The GO Terms component is not included in this release. It will return in a future release. - Fixed a problem (caused by a change in a server-side URL) with retrieving annotations for genes in Biocarta pathway diagrams (bug 1577). - The default caArray server was set to the production server at NCI (array.nci.nih.gov, port 8080) (bug 1602). The URL for the staging array was updated to array-stage.nci.nih.gov. - An incorrect argument was being sent to NCBI's BLAST server. Due to recent changes there implementing stricter checking, blastn would no longer run. (bug 1597). - Corrected a problem where, when using the adjusted Bonferroni correction, or the Westphal-Young with MaxT, only values with positive fold-changes were returned and displayed (bug 1603). - Added a feature whereby the user is warned before any operation that will alter the dataset, e.g. before filtering out markers, or before a log2 transformation. - Added a feature to allow adding a new empty marker set. This can then be used to receive markers selected interactively in Cytoscape (bug 1541). - Fixed a problem displaying patterns in the sequence viewer after running Pattern Discovery (SPLASH) (bug 1415). - Fixed a problem with displaying adjacency matrices generated by ARACNE in the Cytoscape component (bug 1449). - Numerous changes were made to improve responsiveness, including when - selecting a marker in a large dataset (bug 1346), - right-clicking on Project with a large dataset (bug 1337), - saving a workspace (bug 1525), and - starting an analysis (bug 1544). - Remaining changes, not listed here in detail, included - internal issues within geWorkbench, - improved verification of parameters and set selections before beginning a calculation, - improvements to the graphical user interfaces of many components, and - corrections to the grid implementations of analytical services (Hierarchical Clustering, SOM, ANOVA etc). geWorkbench 1.5.1 (2008-09-23) ============================== - It addresses changes in the APIs for the caArray and caBIO data services since geWorkbench 1.5 was released. geWorkbench 1.5.1 can currently connect with caArray 2.1 and caBIO 4.0/4.1. - It also includes an update to parse the new release 26 of Affymetrix annotation files. - Fixes a problem where annotation information was not associated with arrays that were merged. geWorkbench 1.5.0 (2008-07-03) ============================== ---New Modules--- - ARACNE – gene network reverse engineering (from Andrea Califano's lab at Columbia University, http://wiki.c2b2.columbia.edu/califanolab/index.php/Software). - ANOVA – Analysis of variance, ported from TIGR's MEV, http://www.tm4.org/mev.html). - caArray2.0 connectivity – query for and download data from caArray 2.0 directly into geWorkbench. - Cellular Networks Knowledge Base – database of molecular interactions. (from Andrea Califano's lab at Columbia University, http://amdec-bioinfo.cu-genome.org/html/BCellInteractome.html). - GenSpace - provide social networking capabilities and allow you to connect with other geWorkbench users. - MatrixReduce – transcription factor binding site prediction (from Harmen Bussemaker's lab at Columbia University, http://bussemaker.bio.columbia.edu/software/MatrixREDUCE/). - Analysis components ported from GenePattern (http://www.genepattern.org) - Principle Component Analysis (PCA) - K-nearest neighbors (KNN) - Weighted Voting (WV) ---New File types supported--- - The NCBI GEO series matrix file for microarray data (tab-delimited) ---New server side architecture--- - Invocation of caGrid services is now delagated to an independent component (the Dispatcher). This makes it possible to exit geWorkbench after submitting a long-running job and then automatically pick up any results next time the application starts. ---Other changes--- - The Marker and Array/Phenotypes components now support algebraic operations (union, intersection, xor) on marker and array groups. - Upon exiting the application, the user is prompted to store their workspace. - Workspace persistence problems have been resolved. - The Marker Annotations component has been enhanced in several ways: -- The integration with caBIO has been updated to use API Version 4.0 -- The caBIO Pathway component (previously an independent geWorkbench component that would display BioCarta pathway images) has been integrated into the Marker Annotations component. -- Markers can be returned from BioCarta pathway diagrams. -- A new option is provided to choose between human or mouse CGAP annotation pages. ================================================================ 4.0 Known Issues/Defects ================================================================ ---Running BLAST on multiple sequences--- The NCBI BLAST server may return an error when multiple sequences are searched from geWorkbench. The queries are sent serially, one at a time). This appears to depend on the load on the NCIB BLAST server. ---Affymetrix Annotation files--- Due to licensing restrictions, Affymetrix annotation files cannot be included in this distribution. geWorkbench users who are working with Affymetrix chip data should retrieve the latest version of the appropriate annotation file for the chip type they using directly from Affymetrix. geWorkbench uses the CSV format annotation files. Affymetrix annotation files can be downloaded from their support site, at www.affymetrix.com. Although there are frequent changes to their website, the files can be found starging on the Support tab of the website: (1) the Technical Documentation section, e.g.: * Support->Affymetrix Microarray Solutions->Technical Documentation->Annotation Files (2) Under "Support by Product". An example file from the Affymetrix site is "HG_U95Av2.na32.annot.csv.zip". This file must be unzipped before use. You can place the file in any convenient directory. When you load a new data file, you will be asked for the location of the annotation file and can browse to it. ---Grid Computations--- The reference implementations of the server-side grid-enabled algorithms currently are running on a single front-end server not meant for heavy computational use. That server is not configured for computing on large datasets or for long-running jobs. ================================================================ 5.0 Bug Reports and Support ================================================================ Support is provided via online forums at the NCI's Molecular Analysis Tools Knowledge Center. See https://cabig-kc.nci.nih.gov/Molecular/forums/ FAQs and other articles are also available at https://cabig-kc.nci.nih.gov/Molecular/KC/index.php/Main_Page#geWorkbench Finally, please see the geWorkbench project page for additional known issues and FAQs. www.geworkbench.org. ================================================================ 6.0 Documentation and Files ================================================================ The documents and support files in this distribution include: geWorkbench Release Notes: ReleaseNotes_2.2.2.txt (this file) geWorkbench License: geWorkbenchLicense.txt Online Help: Within geWorkbench, users can access "Help Topics" by clicking the top menu. It has detailed information about each module. For other documentation not directly included as part of the distribution, see the following section (7.0) Web Resources. ================================================================ 7.0 geWorkbench Web Resources ================================================================ The geWorkbench team maintains a Wiki containing extensive documentation, a User Manual, tutorials and training slides. It is available at: http://www.geworkbench.org