
1 General notes on previous geWorkbench releases
2 Other geWorkbench planning pages
3 Release Schedule for 2.5.1
4 Release Schedule for 2.5.0
5 Role Assignments
6 Things to remember
7 Things to check in generating a release
8 External components in 2.5.0
9 geWorkbench 2.5.0 Grid Service URLs
- 9.1 Production URLs for 2.5.0
- 9.2 Development URLs
10 System Testing
11 Java Version
- 11.1 Known Incompatibilities with Java 1.7.
12 Known caArray issue that keeps coming up
- 12.1 Problem 1
- 12.2 Problem 2
13 Changes in release 2.5.0
14 Components

General notes on previous geWorkbench releases

General notes, feature requests and FAQ page - This page was started with material from the time of release 1.7.0 and will be updated continually.
Links to other release pages:

Other geWorkbench planning pages

Entry point for geWorkbench releases

geWorkbench Development Notes - contains notes for several changes that were part of release 2.2.0.

GeWorkbench TODO List - items that need to be planned or documented.

System Test Results Log

Nikhil's page for geWorkbench web notes: http://wiki.c2b2.columbia.edu/informatics/index.php/GeWorkbench-Web

The geWorkbench Roadmap (local version) contains possible directions for future development.

caBIG has a separate geWorkbench Roadmap page that we must maintain.

caBIG/NCI also provides the official download page for geWorkbench

Release Schedule for 2.5.1

Actual release date: 11/01/2013
Updated release: 11/12/2013 - added missing VIPER file at 8 pm, no other changes.

Release Schedule for 2.5.0

Code freeze actual: 9/17/2013
System testing started: 9/23/2013
System testing end target: 9/27/2013
System Testing concluded: 10/4/2013
Bug fixes concluded: 10/9/2013
Final release target: 9/30/2013
Actual release date: 10/11/2013

Role Assignments

Release Manager – Kenneth Smith
Release Engineer – Zhou Ji
Tech Lead – Zhou Ji
Tester – Udo Többen, and the rest of the bunch
Test Manager – Udo Többen
Technical Writer – Mary VanGinhoven

Things to remember

Best practices for defect management - See also Aris's email of 8/20/09 on this topic.
geWorkbench Roadmap page at NCICB - keep up to date with actual plans and developments - at https://cabig-kc.nci.nih.gov/Molecular/KC/index.php/GeWorkbench_Roadmap
InstallAnywhere JRE update packs: http://www.flexerasoftware.com/products/installanywhere/files-utilities.htm
Our local caArray is at afapp1.c2b2.columbia.edu port 38080 (web interface).

The Perl script to convert MediaWiki geWorkbench tutorial pages to the format needed for the geWorkbench Java Help system.

Things to check in generating a release

Windows control panel version number, set in InstallAnywhere
Update the Welcome message
Update the title bar version indication (application.properties)
Update the GenomeSpace launcher at the Broad Institute.

External components in 2.5.0

caArray - caArray client external v1.0 (new version, compatible ONLY with caArray 2.5.0+).
caGrid - caGrid version 1.4 (no change).
Cytoscape - Version 2.8.2 (no change).
GeneOntology OBO file - (9/13/2013) Updated as of this date in the data directory. However, geWorkbench downloads the latest version to the geWorkbench root each time.
JASPAR - Version released October 12, 2009 (no change). We use the following files from the JASPAR CORE SQL tables directory (http://jaspar.genereg.net/html/DOWNLOAD/jaspar_CORE/non_redundant/all_species/sql_tables/):
- MATRIX.txt
- MATRIX_ANNOTATION.txt
- MATRIX_DATA.txt
JMol - version 13.0.18 (updated).
Ontologizer - Ontologizer.jar version 2.0, file released 2010-03-10 (no change).

Note - caBIO support is removed in geWorkbench 2.5.0, so the caBIO client is no longer included. The caBIO service was discontinued by NCI. caBIO was replaced by calls to the bioDBNet web service, which returns an XML file.

geWorkbench 2.5.0 Grid Service URLs

The default Index Service and Dispatcher Service are hard-coded in the configuration file "conf/application.properties". Updating these defaults is part of the release process: for the production version, the production URLs must be entered.

Production URLs for 2.5.0

Index and Dispatcher

Default Index Service: http://cagridnode.c2b2.columbia.edu:8080/v2.5.0/wsrf/services/DefaultIndexService
Default Dispatcher Service: http://cagridnode.c2b2.columbia.edu:8080/v2.5.0/wsrf/services/cagrid/Dispatcher
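
As a release-time sanity check, the defaults could be verified with a short script along the following lines. This is only a sketch: the property keys "index.service.url" and "dispatcher.service.url" are hypothetical placeholders, because the actual key names used in conf/application.properties are not recorded on this page. Substitute the real keys before relying on it.

```python
# Sketch: check that the grid defaults in a properties file are the
# production URLs listed above. The two property keys below are
# HYPOTHETICAL placeholders, not the real geWorkbench key names.

PRODUCTION_DEFAULTS = {
    "index.service.url": "http://cagridnode.c2b2.columbia.edu:8080/v2.5.0/wsrf/services/DefaultIndexService",
    "dispatcher.service.url": "http://cagridnode.c2b2.columbia.edu:8080/v2.5.0/wsrf/services/cagrid/Dispatcher",
}


def parse_properties(text):
    """Parse simple 'key=value' lines; skips blank lines and '#'/'!' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def find_mismatches(props, expected=PRODUCTION_DEFAULTS):
    """Return {key: (found, expected)} for every key whose value differs."""
    return {
        key: (props.get(key), want)
        for key, want in expected.items()
        if props.get(key) != want
    }
```

An empty result from find_mismatches means both defaults already point at the production services; any entry in the result names a key that still needs to be updated before the release build.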

Standard geWorkbench Grid Services

http://geworkbench1.c2b2.columbia.edu:8080/wsrf/services/caGrid/ServiceName

where ServiceName is e.g. Anova, Aracne, etc.
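
The pattern above can be expanded mechanically. The helper below is a minimal illustration and is not part of geWorkbench; only "Anova" and "Aracne" are named on this page, so the exact spelling of any other service name is an assumption to check against the index service.

```python
# Sketch: expand the standard grid-service URL pattern for one service name.
# The helper name and the validation rule are illustrative assumptions.

BASE = "http://geworkbench1.c2b2.columbia.edu:8080/wsrf/services/caGrid/"


def service_url(service_name):
    """Return the production URL for one standard geWorkbench grid service."""
    name = service_name.strip()
    if not name or "/" in name:
        raise ValueError(f"not a plain service name: {service_name!r}")
    return BASE + name
```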

MarkUs and Skyline Service URLs

bhapp.c2b2.columbia.edu:8080/wsrf/services/cagrid/MarkUs
bhapp.c2b2.columbia.edu:8080/wsrf/services/cagrid/SkyLine

MarkUs and Skyline Result URLs

Both MarkUs and Skyline reference a web service to retrieve results. The results remain on the remote server and only the information requested is returned to geWorkbench. As a consequence, the results of these two analyses cannot be preserved indefinitely, even by saving a workspace.

The result URLs are independent of the caGrid index/dispatcher URL, but are tightly linked to bhapp.c2b2.columbia.edu.

MarkUs result URL: http://bhapp.c2b2.columbia.edu/MarkUs/cgi-bin/browse.pl?pdb_id=MUS... This is the URL of the MarkUs web site, and it will not change when we move our services around.

Skyline result URL: http://cagridnode.c2b2.columbia.edu:8080/luna/SkyLineData/output which is a proxy forward to bhapp.c2b2.columbia.edu:8080/SkyLineData/output

When the caGrid index service moves to a new server, we only need to change the Tomcat configuration of these two services so that they register with the new index service. No change to the geWorkbench code is needed for them to work.

Development URLs

All grid services except MarkUs and Skyline use the development index service and dispatcher, and have development grid services.

Default Index Service: http://afdev.c2b2.columbia.edu:8080/wsrf/services/DefaultIndexService
Default Dispatcher Service: http://afdev.c2b2.columbia.edu:8080/wsrf/services/cagrid/Dispatcher

System Testing

See results at http://afdev.cgc.cpmc.columbia.edu/systemtest/BrowseLogs.php

Java Version

geWorkbench 2.5.0 was developed and tested using the Java 6 JDK and JRE.

Known Incompatibilities with Java 1.7.

caArray - (#3107) With Java 7, the experiment list downloads but is not displayed until the user hits "Cancel".

Known caArray issue that keeps coming up

(This section is unchanged from the entry for release 2.3.0).

There is a problem with the caArray server code, in that as long as a geWorkbench session is running, the server retains the last used username/password, if any have been submitted.

See bugs 2022, 2555.

Two problems can arise:

Problem 1

Unfortunately, that defect still exists in the new API. The only situation you will see it in is as follows:

User A connects to caArray via geWorkbench and enters his/her credentials.
An anonymous user then connects to caArray using the same geWorkbench instance. This anonymous user can still see User A's protected data.

The bug does not affect any other situations. E.g., if the users are using different instances of geWorkbench, there is no problem. If the second user is passing in a new set of credentials, it's not a problem. It is only a problem when the first user is credentialled and the second user is anonymous, and they are both connecting through the same geWorkbench one after another.

Thanks! Rashmi

Problem 2

Once a username and password have been entered and submitted to caArray, you cannot go back to using no username/password, except by restarting geWorkbench. However, you can still enter a different username/password combination. This is a property of the caArray server-side code. Thus, if you have no valid username/password and enter an incorrect one, you will need to restart geWorkbench before you can query caArray public experiments again (no login required).
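
One way to picture both problems is the toy model below. It is not caArray or geWorkbench code: the class and method names are invented for illustration, and it only encodes the behaviour described in this section (credentials are retained for the life of one geWorkbench instance, and an anonymous request falls back to them).

```python
# Toy model of the retained-credentials behaviour described above.
# NOT caArray code; every name here is invented for illustration.

class StickyCredentialSession:
    """Models one running geWorkbench instance talking to caArray."""

    def __init__(self):
        # Last (username, password) submitted in this instance, if any.
        self._retained = None

    def connect(self, username=None, password=None):
        """Return the identity the server ends up using for this request."""
        if username is not None:
            # A new set of credentials always replaces the retained one.
            self._retained = (username, password)
        # An anonymous request falls back to whatever was retained.
        return self._retained[0] if self._retained else "anonymous"
```

In this model, an anonymous connect() issued after User A has logged in still resolves to User A (Problem 1), and once any credentials, valid or not, have been submitted, only a fresh StickyCredentialSession (that is, a restart) returns to anonymous access (Problem 2).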

Changes in release 2.5.0

Major changes

New Components

ceRNA Query
Consensus Clustering (GenePattern 3.0)
DEMAND
LINCS Query
LINCS Color Mosaic Viewer
MARINa Results Viewer (Barcode)
Viper

Dropped Components

Cancer Genome Index

Other

caBIO dropped, replaced by interface to bioDBNet.
"Project" abstraction dropped.
BLAST parser updated to full use of XML
cutenet component deleted from SVN

Changes in functionality (requiring documentation updates)

Bug fixes (no documentation change)

Components

List of Included Components

Data Management:

Arrays/Phenotypes
Markers
Preferences
Project Panel
Session manager - no one knows what this is - probably a SOAP interface. But it is definitely needed! (check for 2.4.0)

File input formats

Affy File Format
CEL File Loader
Exp. Format
FASTA Format
Genepix File Format
PDB Structure Format
Tab-delimited (RMA Express Format)

Connectivity

caArray2 - compatible with caArray 2.5.0 and higher. The caArray client jar is NOT backwards-compatible with any earlier versions.

Data filters

Filtering
Affy Detection Call Filter
Coefficient of Variation (new)
Deviation Filter
Expression Threshold Filter
Genepix Filter (Two channel filter)
Genepix Flag Filter
Missing Values Filter
Multiple Probeset Filter
Entrez GeneID Filter

Normalization

HouseKeeping Genes Normalizer
Normalization
Log2 Transformation
Marker Centering Normalizer
Mean Variance Normalizer
Missing Values (Normalizer)
Microarray Centering Normalizer
Quantile Normalizer
Threshold Normalizer

Experiment Information

Dataset Annotation
Dataset History
Experiment Info
Version Information

Analysis/Visualization

Alignment Results
Analysis
ANOVA
ARACNe2 - adds Adaptive Partitioning algorithm and Preprocessing mode.
CELImageViewer
Cellular Networks Knowledge Base
Color Mosaic
Component Configuration Manager
Cytoscape_V2_8
Dendrogram
Expression Profiles
Expression Value Distribution
Fold-change Analysis
Gene Ontology Enrichment Analysis and Display
genSpace collaborative framework
Hierarchical Clustering Analysis
IDEA
Image Viewer
Jmol
Marker Annotations
MarkUs - Analysis and Viewer
MRA - Master Regulator Analysis
MatrixREDUCE
Microarray Viewer
MINDy - Analysis and Viewer
Pattern Discovery
Position Histogram
Pudge
Promoter
SAM
Scatter Plot
Sequence
Sequence Alignment
Sequence Retriever
SOM Analysis
SOM Clusters
t Test Analysis
Tabular Microarray Viewer
Volcano Plot

GenePattern components
- PCA (GenePattern) - Analysis and Viewer
- K-nearest neighbors (GenePattern)
- SVM 3.0 (GenePattern) - Analysis and Viewer - include, we need to develop online help and tutorial (Aris).
- WV - Weighted Voting (GenePattern)
- GSEA

Excluded and Dropped Components

The release creation script in build.xml now explicitly includes components by name (previously it excluded components by name). The following is a list of modules known to be excluded.

Excluded components

The following components are excluded for a variety of reasons, most often due to lack of formal requirements documentation and/or associated system test scripts. Some of them should be scheduled for inclusion in the next production release. For modules not found in the current all.xml, a path to the component is shown.

Still under development:

CART (GenePattern) - this component has not yet been released. It is part of another component and must be excluded manually from the final installer release build.
Cancer-GEMS (awaiting further development from NCI)
NetBoost
- EdgeListFileFormat (NetBoost)
Evidence Integration
MEDUSA

Not actively being developed:

GCRMA Via R CEL Loader (in \geworkbench\src\org\geworkbench\components\parsers)
Multi-t-test (OK, but need to understand when it would be used, e.g. after ANOVA, and if it is what we really want).
SMLR - Sparse Multinomial Logistic Regression - implementation by John Watkinson.
SVM Format (in \geworkbench\src\org\geworkbench\components\parsers) (left over from a John Watkinson project).
Synteny (in \geworkbench\components\alignment\src\org\geworkbench\components\alignment\client)
t-profiler
caScript

Dropped components

These components are not expected to be used again.

caBIO Pathways (support dropped by NCI, replaced by bioDBnet in Marker Annotations component)
Cancer Gene Index integration in the Marker Annotations component - support dropped by NCI.
CuteNet (GeneWays)
Column Major Format (in \geworkbench\src\org\geworkbench\components\parsers)
Frequency Threshold Filter (There is a class called AllelicFrequencyThresholdFilter in \geworkbench\components\filtering\src\org\geworkbench\components\filtering)
GeneOntology (the original component, now replaced by geneontology2/Ontologizer2.0)
Genotypic File Format (in \geworkbench\src\org\geworkbench\components\parsers\genotype)
Network Browser (was part of Reverse Engineering - would require major rewrite to revive. PathwayDecoder is module name)
Pattern Discovery Algorithm (association analysis)
Patterns (Pattern Panel) - Omit from release - Appears to have been superseded by the Sequence component.
Reverse Engineering (non-ARACNE, unpublished algorithm. PathwayDecoder is module name)
Simulation (a student project)

Note - the original "interactions" component was dropped and reimplemented as the Cellular Networks Knowledge Base. For a brief period it was called component "interactions2".
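
The switch in build.xml from excluding components by name to including them by name changes what happens to a component that nobody has listed. The sketch below contrasts the two policies. The functions are illustrative only and are not the Ant logic in build.xml; the component names are taken from the lists on this page.

```python
# Sketch contrasting exclude-by-name with include-by-name selection.
# Illustrative only; this is not the Ant logic used in build.xml.

def select_by_exclusion(all_components, excluded):
    """Old policy: ship everything except the names listed."""
    return [c for c in all_components if c not in excluded]


def select_by_inclusion(all_components, included):
    """New policy: ship only the names listed."""
    return [c for c in all_components if c in included]


source_tree = ["ANOVA", "SAM", "MEDUSA", "NetBoost"]

# Old policy: NetBoost was never added to the exclusion list, so it ships.
old_release = select_by_exclusion(source_tree, {"MEDUSA"})

# New policy: an unlisted component stays out of the release by default.
new_release = select_by_inclusion(source_tree, {"ANOVA", "SAM"})
```

Under the inclusion policy, forgetting to list a component leaves it out of the installer, which is the safer failure for components that are still under development.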

Externally supplied components

The following components originate external to the geWorkbench source tree:

MatrixReduce

Source

MatrixReduce source code was obtained from the Bussemaker lab and a modified copy saved under: adcvs.cu-genome.org:/cvs/magnet/matrixreduce_distribution. This modified copy contains Java API changes made to integrate with geWorkbench.

Compiling

MatrixReduce is compiled using the following commands:

FitModel binary is compiled manually as follows
- gcc -c -O2 -mno-cygwin -funroll-loops *.c
- gcc -mno-cygwin -static nrutil.o fncs_cmns.o fncs_seqs.o fncs_tdat.o fncs_seed.o fncs_app1.o fncs_app2.o fncs_nrcs.o fncs_topo.o fncs_mylm.o fncs_bits.o FitModel.o -o FitModel -lm (for windows and linux)
- gcc -mno-cygwin nrutil.o fncs_cmns.o fncs_seqs.o fncs_tdat.o fncs_seed.o fncs_app1.o fncs_app2.o fncs_nrcs.o fncs_topo.o fncs_mylm.o fncs_bits.o FitModel.o -o FitModelMac -lm (for Mac)

API jar: The Java API jar is created with the makefile, command "make jar".
The FitModel binary is compiled manually with gcc, with extra flags that tell it not to use Cygwin, to optimize, and to unroll loops.
FitModel.exe bundles both the NR (Numerical Recipes) and GNU libraries.

The API jar is created with the makefile under MatrixREDUCE's top directory.
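
Because the two link commands differ only in the -static flag and the output name, they could be generated from one shared object list instead of being retyped. The sketch below shows the idea. The function name and the target labels "winlinux" and "mac" are assumptions made for illustration; the flags and object files are copied from the commands above.

```python
# Sketch: assemble the FitModel link command from one shared object list.
# Function name and target labels are illustrative assumptions.

OBJECTS = [
    "nrutil.o", "fncs_cmns.o", "fncs_seqs.o", "fncs_tdat.o",
    "fncs_seed.o", "fncs_app1.o", "fncs_app2.o", "fncs_nrcs.o",
    "fncs_topo.o", "fncs_mylm.o", "fncs_bits.o", "FitModel.o",
]


def link_command(target):
    """Return the gcc link command as an argument list for one target."""
    if target == "winlinux":
        extra, output = ["-static"], "FitModel"
    elif target == "mac":
        extra, output = [], "FitModelMac"
    else:
        raise ValueError(f"unknown target: {target!r}")
    # The final argument is a plain ASCII hyphen followed by "lm".
    return ["gcc", "-mno-cygwin", *extra, *OBJECTS, "-o", output, "-lm"]
```

Joining the returned list with spaces gives a command line that can be pasted into a shell, or the list can be passed directly to subprocess.run.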

Notes

See comment on white spaces in file names/paths in Mantis : http://wiki.c2b2.columbia.edu/mantis/view.php?id=1316

Aracne.jar for MINDY

Although ARACNE is a geWorkbench component, the MINDY component uses a version of ARACNE that is externally maintained. The file aracne.jar is copied directly into the geWorkbench CVS tree.

The location of the external ARACNE code is:

The version of the external ARACNE code is:

MINDy jar file for caGrid

Source tree is kept in the geWorkbench local CVS repository.
Current version is MINDY-0.3.jar
Compile with ant dist-jar. The final jar file will be in the "dist" directory.

GeWorkbench Release 2.5