GeWorkbench Release 2.0

1 General notes on geWorkbench release 2.0.*
2 Major Code Changes in 2.0.0
3 List of changes to GUI
4 New components in release 2.0
5 Other major new features in release 2.0
6 Tutorial/Online Help chapters revised and included in release
7 List of other major changes
8 Versions of external files/components included in this release
9 geWorkbench 2.0.0 Grid Service URLs
- 9.1 External URLs
- 9.2 Internal URLs
10 geWorkbench 2.0.1 Grid Service URLs
- 10.1 External URLs
- 10.2 Internal URLs
11 References
12 geWorkbench 2.0.0 Web Service URLs
13 External Service Requirements and Connectivity
14 List of Included Components
15 Excluded and Dropped Components
- 15.1 Excluded components
- 15.2 Dropped components
16 Externally supplied components
17 Analysis components - external runtime dependencies
18 TODO Notes
19 Documentation changes

General notes on geWorkbench release 2.0.*

General notes, feature requests and FAQ page - This page was started with material from the time of release 1.7.0 and will be updated continually.

The release page for version 1.8.0 can be found here.
The release page for version 2.1.0 can be found here.

The geWorkbench Roadmap (local version) contains possible directions for future development.

caBIG has a separate geWorkbench Roadmap page that we must maintain.

Release Schedule for 2.0.0

geWorkbench 2.0.0 code freeze: May 24, 2010 (actual)
Testing concluded:
Final release target: June 4, 2010
Actual release date: June 9, 2010

Update Releases

geWorkbench 2.0.1 - June 25, 2010
geWorkbench 2.0.2 - July 16, 2010

Role Assignments

Release Manager – Kenneth Smith
Release Engineer – Thomas Garben
Tech Lead – Zhou Ji
Tester – Udo Többen, and the rest of the bunch
Test Manager – Udo Többen
Technical Writer – Mary VanGinhoven

Things to remember

Best practices for defect management - See also Aris's email of 8/20/09 on this topic.

Use Case documents - We would like to update Use Case documents as the underlying application changes. However, this has seldom been accomplished. At this time, the wiki-based tutorials often have the most up-to-date description of current functionality.

geWorkbench Roadmap page at NCICB - keep up to date with actual plans and developments - at https://cabig-kc.nci.nih.gov/Molecular/KC/index.php/GeWorkbench_Roadmap
InstallAnywhere JRE update packs: http://www.flexerasoftware.com/products/installanywhere/files-utilities.htm

Known Issues in Release 2.0.0

InstallAnywhere and Norton Internet Security Sonar - Under Windows, InstallAnywhere places a file called "Install.exe" in a folder with a path like "C:\Users\ksmith\AppData\Local\Temp\I1276186086\Windows\". This file was seen to be detected and removed by "Norton Sonar", silently terminating the install. Everything below "Temp" is removed after the installation finishes.
Welcome Screen - If the user already has a recent version of geWorkbench installed, and has dismissed the "Welcome" screen, it will not be shown when the new geWorkbench is run. This is because it looks in the same property file in the user's .geWorkbench folder.
Marker Annotations (bug #2291) - Sometimes the progress bar does not go away after all records have apparently been retrieved. After this, further retrievals may fail, and geWorkbench must be restarted.
CNKB - Running CNKB followed by dispatching a hierarchical clustering grid job can produce an error.
Color Mosaic -
- Macintosh: Sorting after t-test doesn't work.
- Macintosh: After a t-test, the color mosaic itself does not appear instantly - the array names and test and control do appear, but the mosaic itself appears only if the "display" button is toggled off and then on again (Note it should be off by default, this should be checked).
KNN and WV - Macintosh: unidentified problems lead to error message.

New Component Detail and Dependencies pages created

Annotation Dependencies - list of dependencies of particular components on particular annotation file columns.

CNKB Data - release status and available interactions for each database.

Major Code Changes in 2.0.0

Synteny (which is a discontinued component) removed from "Alignment" and made into separate component.
BLAT code deleted from "Alignment". It was poorly implemented and no longer reachable.
Sequence Alignment handles more databases (they were not shown in previous versions and would not work until the recent development); more than 100 Java files and jar files were removed; the GUI was improved.
Ontologizer 2 command-line jar file was downloaded 4/2/2010. Internal date in the jar file is 3/10/2010.
MRA and t-test were previously located in analysis component. They have now been separated out into a new component.
Updates and rationalization of browser launcher code and Jmol.
CNKB
Filters

List of changes to GUI

Changes that will require updating of tutorials, online help and system tests.

New Look and Feel - ""
Available Analyses, Normalizers and Filters now shown in pull-down menus rather than small lists. The saved parameter sets are also now in pulldown menus.
New result nodes each get unique names by appending sequential numbers to test name.
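The sequential naming of result nodes can be pictured with a small sketch. The helper name `unique_node_name`, the "-" separator and the starting index of 1 are assumptions made for illustration; they are not taken from the geWorkbench source.

```python
def unique_node_name(base, existing):
    """Return `base` with the lowest unused sequential number appended.

    Hypothetical helper: the separator and start index used by
    geWorkbench itself may differ.
    """
    n = 1
    while f"{base}-{n}" in existing:
        n += 1
    return f"{base}-{n}"


# Simulate running the same test three times in one project.
existing = set()
for _ in range(3):
    existing.add(unique_node_name("T-Test", existing))
print(sorted(existing))
```

Keeping the set of existing names separate from the naming function means the same helper could serve any result type.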

New components in release 2.0

Skyline - (PDB) - A high-throughput comparative modeling pipeline. It is used to find homology models for a protein whose structure has been experimentally determined.
Skybase - (FASTA) - SkyBase is a database that stores the homology models built by SkyLine analysis for all NESG PSI2 protein structures.
Pudge - (FASTA) - Interface to a protein structure prediction server which integrates tools used at different stages of the structural prediction process.

Other major new features in release 2.0

More than 250 "bug reports" were closed. These included many new features, improvements in the usability of numerous components, and actual bug fixes.
Java 6 - Moved from Java 5 to Java 6. geWorkbench now requires Java 6. Works on both 32 bit and 64 bit VMs (JREs).
Look and Feel - Switched to new, more modern Look and Feel. geWorkbench appearance now consistent across all platforms.
CNKB - Revamped interface to allow choice of interactome and data types.
File parsers - added
- MAGE-TAB data matrix
- GEO Soft format - added series (GSE) and curated matrix (GDS). Already had series matrix format.
Filtering - completely revamped - now works directly for all modes, allows specification of minimum % matching arrays before filtering occurs.
caBIO component updated from 4.2 to 4.3.
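The "minimum % matching arrays" setting mentioned above for the revamped Filtering component can be sketched as follows. This is only an illustration under assumed semantics: a marker is filtered out when the percentage of arrays whose value meets the filter criterion reaches the user-set minimum. The function name, the data layout and the example criterion are hypothetical, not the geWorkbench API.

```python
def markers_to_remove(data, matches, min_percent):
    """Return the markers that would be filtered out, sorted by name.

    data: dict mapping marker name -> list of values, one per array.
    matches: predicate that is True when a value meets the filter criterion.
    min_percent: minimum percentage of matching arrays needed to filter.
    """
    removed = []
    for marker, values in data.items():
        if not values:
            continue  # nothing to evaluate for this marker
        pct = 100.0 * sum(1 for v in values if matches(v)) / len(values)
        if pct >= min_percent:
            removed.append(marker)
    return sorted(removed)


data = {
    "marker_a": [5.0, 6.0, 200.0, 300.0],      # 50% of arrays below 10
    "marker_b": [1.0, 2.0, 3.0, 400.0],        # 75% of arrays below 10
    "marker_c": [150.0, 160.0, 170.0, 180.0],  # 0% of arrays below 10
}


def is_low(value):
    """Example criterion (an assumption): expression value below 10."""
    return value < 10.0


print(markers_to_remove(data, is_low, 60))
```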

Tutorial/Online Help chapters revised and included in release

Filtering - New tutorial written and ported to online help.
Normalization - New tutorial written and ported to online help.
CCM - New tutorial written and ported to online help.

List of other major changes

caArray - Improved memory usage on downloads from caArray.
CNKB - Can now return markers direct from CNKB without use of Cytoscape.
Color Mosaic - enhancements to display (bug 2147)
- toggle array names on/off
- search on array name, accession, or label
Component Configuration Manager - now can filter display list by categories: Analysis, Viewer, Normalizer, Filter
Cytoscape - Corrected mapping between gene names in Cytoscape display and markers in Marker Sets panel (now uses Entrez IDs).
Dendrogram - can now create Array subsets as well as marker subsets.
Markers and Arrays - Hover text available in Markers and Arrays phenotypes to visualize long names if needed.
Marker Annotation - search results can be saved to a text file, including relevant URLs and BioCarta pathway names.
File loading - Checking for "out of memory" errors during file loading.
GUI - in switching to new L&F, fixed many text highlighting problems that were previously seen on Macintosh only but now appeared on Windows also.
File parser menu - The file parser selection menu now shows valid file extensions for each type.
Promoter - JASPAR promoter motifs now filterable by taxon.
Sequence alignment (BLAST) - many enhancements, including
- added additional databases to match those listed at NCBI
- improved handling of results from searches containing long query sequences.

Versions of external files/components included in this release

gene_ontology.1_2.obo downloaded 5/24/2010 from geneontology.org.
Ontologizer.jar version 2.0, file released 3/10/2010, checked no further updates as of 5/24/2010. We are using the "Command line" jar file.
- Note - On 5/31/2010, the Ontologizer "Manual" version jar file (which has a GUI) was updated. However, the command line version was still not updated.
Jaspar_CORE (http://jaspar.genereg.net/) SQL files last updated on server 10/2009. (/html/DOWNLOAD/jaspar_CORE/non_redundant/all_species/sql_tables)
JMOL - component updated to JMOL 12 RC.10.

geWorkbench 2.0.0 Grid Service URLs

External URLs

Default Index Service: http://cagridnode.c2b2.columbia.edu:8080/v2.0.0/wsrf/services/DefaultIndexService
Default Dispatcher Service: http://cagridnode.c2b2.columbia.edu:8080/v2.0.0/wsrf/services/cagrid/Dispatcher

Internal URLs

These URLs are used within geWorkbench and are not resolved

SkyBase Service: http://cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/SkyBase
SkyLine Service: http://cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/SkyLine (does not work, only services on Luna work for SkyLine)
MarkUs Grid Service: http://luna.bioc.columbia.edu:8080/wsrf/services/cagrid/MarkUs

geWorkbench 2.0.1 Grid Service URLs

External URLs

Default Index Service: http://cagridnode.c2b2.columbia.edu:8080/v2.0.0/wsrf/services/DefaultIndexService
Default Dispatcher Service: http://cagridnode.c2b2.columbia.edu:8080/v2.0.0/wsrf/services/cagrid/Dispatcher

Internal URLs

SkyBase Service: http://geworkbench2.c2b2.columbia.edu:8080/wsrf/services/cagrid/SkyBase
SkyLine Service: http://luna.bioc.columbia.edu:8080/wsrf/services/cagrid/SkyLine
MarkUs Grid Service: http://luna.bioc.columbia.edu:8080/wsrf/services/cagrid/MarkUs

All others: http://geworkbench2.c2b2.columbia.edu:8080/wsrf/services/cagrid/*
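The 2.0.1 internal URL scheme above (specific hosts for SkyLine and MarkUs, a common base for everything else) can be expressed as a small lookup. This is a sketch only: the function name is hypothetical, and it assumes the "*" in the "All others" entry stands for the service name as the last path component.

```python
# Base for "All others" in the 2.0.1 internal URL list.
DEFAULT_BASE = "http://geworkbench2.c2b2.columbia.edu:8080/wsrf/services/cagrid/"
# Host that serves SkyLine and MarkUs.
LUNA_BASE = "http://luna.bioc.columbia.edu:8080/wsrf/services/cagrid/"

OVERRIDES = {
    "SkyLine": LUNA_BASE + "SkyLine",
    "MarkUs": LUNA_BASE + "MarkUs",
}


def internal_service_url(service):
    """Return the internal 2.0.1 URL for a service (hypothetical helper)."""
    return OVERRIDES.get(service, DEFAULT_BASE + service)


for name in ("SkyBase", "SkyLine", "MarkUs"):
    print(name, internal_service_url(name))
```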

References

SkyLine Citation: http://www.ncbi.nlm.nih.gov/pubmed/17154423?dopt=Abstract
- Mirković, N., Li, Z., Parnassa, A., Murray, D. Strategies for high-throughput comparative modeling: Applications to leverage analysis in structural genomics and protein family organization. Proteins. 2007 Mar 1;66(4):766-77.

Article on SkyLine and SkyBase:
- Lee Hunjoong; Li Zhaohui; Silkov Antonina; Fischer Markus; Petrey Donald; Honig Barry; Murray Diana. High-throughput computational structure-based characterization of protein families: START domains and implications for structural genomics. Journal of structural and functional genomics 2010;11(1):51-9.

Website for PDB SkyBase: http://wiki.c2b2.columbia.edu/nesg3/nesg.php
NESG SkyBase: http://156.145.102.40/nesg3/nesg.php
SkyBase tutorial: http://wiki.c2b2.columbia.edu/nesg3/help/help.html

geWorkbench 2.0.0 Web Service URLs

Pudge Start Modeling: http://bhapp.c2b2.columbia.edu/pudge/cgi-bin/pipe_int.cgi
Pudge Analyze Results: http://bhapp.c2b2.columbia.edu/pudge/cgi-bin/show_results.cgi
Pudge Citation: http://wiki.c2b2.columbia.edu/honiglab_public/index.php/Software:PUDGE_References

External Service Requirements and Connectivity

Component	Web Service	Grid Service	External availability	Platform restrictions
Markus	yes	2.0.0: internal only; 2.0.1: external	web service or grid service	no 64-bit Windows, no Mac
Pudge	yes	no	web service	Mac: 64-bit (OSX 10.6+) OK, 32-bit no. Windows: 64-bit no, 32-bit OK
SkyBase	no	yes	grid service	—
SkyLine	no	2.0.0: internal only*; 2.0.1 external	grid service	no 64-bit Windows

Markus and SkyLine are now available externally via grid service using the code in 2.0.1, but not in 2.0.0.

List of Included Components

Data Management:

Arrays/Phenotypes
Markers
Preferences
Project Panel
Session manager - no one knows what this is - probably a SOAP interface. But it is definitely needed!

File input formats:

Affy File Format
CEL File Loader
Exp. Format
FASTA Format
Genepix File Format
PDB Structure Format
Tab-delimited (RMA Express Format)

Connectivity

caArray2 - updated to support caArray 2.3.0 in release 1.8.0 (released September 2009). The caArray client jar is NOT backwards-compatible with any previous versions.

Data filters:

Filtering
Affy Detection Call Filter
Deviation Filter
Expression Threshold Filter
Genepix Filter (Two channel filter)
Genepix Flag Filter
Missing Values Filter

Normalization:

HouseKeeping Genes Normalizer
Normalization
Log2 Transformation
Marker Centering Normalizer
Mean Variance Normalizer
Missing Values (Normalizer)
Microarray Centering Normalizer
Quantile Normalizer
Threshold Normalizer

Experiment Information:

Dataset Annotation
Dataset History
Experiment Info
Version Information

Analysis/Visualization

Alignment Results
Analysis
ANOVA
ARACNe2 - adds Adaptive Partitioning algorithm and Preprocessing mode.
caBIO Pathways (this has been integrated in the Marker Annotations component)
Cancer Gene Index integration in the Marker Annotations component.
CELImageViewer
Cellular Networks Knowledge Base
Color Mosaic
Component Configuration Manager.
Cytoscape_V2_4 - updated version of Cytoscape.
Dendrogram
Expression Profiles
Expression Value Distribution
Gene Ontology Enrichment Analysis and Display
Hierarchical Clustering Analysis
genSpace collaborative framework
Image Viewer
Jmol
Marker Annotations
MarkUs - Analysis and Viewer
MRA - Master Regulator Analysis
MatrixREDUCE
Microarray Viewer
MINDy - Analysis and Viewer
Pattern Discovery
Position Histogram
Pudge?? - Analysis and Viewer (Browser) - if this is working (Kiran?) we should include. We can create a very simple online help file, essentially pointing to the Pudge documentation at the Honig site (Aris).
Promoter
Scatter Plot
Sequence
Sequence Alignment
Sequence Retriever
SOM Analysis
SOM Clusters
t Test Analysis
Tabular Microarray Viewer
Volcano Plot

GenePattern components
- PCA (GenePattern) - Analysis and Viewer
- K-nearest neighbors (GenePattern)
- SVM 3.0 (GenePattern) - Analysis and Viewer - include, we need to develop online help and tutorial (Aris).
- WV - Weighted Voting (GenePattern)

Excluded and Dropped Components

The release creation script in build.xml now explicitly includes components by name (previously it excluded components by name). The following is a list of modules known to be excluded.

Excluded components

The following components are excluded for a variety of reasons, most often due to a lack of formal requirements documentation and/or associated system test scripts. Some of them should be scheduled for inclusion in the next production release. For modules not found in the current all.xml, a path to the component is shown.

Still under development:

Cancer-GEMS (awaiting further development from NCI)
NetBoost
- EdgeListFileFormat (NetBoost)
Evidence Integration
MEDUSA

Not actively being developed:

GCRMA Via R CEL Loader (in \geworkbench\src\org\geworkbench\components\parsers)
GSEA
Multi-t-test (OK, but need to understand when it would be used, e.g. after ANOVA, and if it is what we really want).
SMLR - Sparse Multinomial Logistic Regression - implementation by John Watkinson.
SVM Format (in \geworkbench\src\org\geworkbench\components\parsers) (left over from a John Watkinson project).
Synteny (in \geworkbench\components\alignment\src\org\geworkbench\components\alignment\client)
t-profiler
caScript

Dropped components

These components are not expected to be used again.

CuteNet (GeneWays)
Column Major Format (in \geworkbench\src\org\geworkbench\components\parsers)
Frequency Threshold Filter (There is a class called AllelicFrequencyThresholdFilter in \geworkbench\components\filtering\src\org\geworkbench\components\filtering)
GeneOntology (the original component, now replaced by geneontology2/Ontologizer2.0)
Genotypic File Format (in \geworkbench\src\org\geworkbench\components\parsers\genotype)
Network Browser (was part of Reverse Engineering - would require major rewrite to revive. PathwayDecoder is module name)
Pattern Discovery Algorithm (association analysis)
Patterns (Pattern Panel) - Omit from release - Appears to have been superseded by the Sequence component.
Reverse Engineering (non-ARACNE, unpublished algorithm. PathwayDecoder is module name)
Simulation (a student project)

Note - the original "interactions" component was dropped and reimplemented as the Cellular Networks Knowledge Base. For a brief period the new component was called "interactions2".

Externally supplied components

The following components originate external to the geWorkbench source tree:

MatrixReduce

Source

MatrixReduce source code was obtained from the Bussemaker lab and a modified copy saved under: adcvs.cu-genome.org:/cvs/magnet/matrixreduce_distribution. This modified copy contains Java API changes made to integrate with geWorkbench.

Compiling

MatrixReduce is compiled using the following commands:

FitModel binary is compiled manually with gcc, with extra flags that tell it not to use Cygwin (-mno-cygwin), to optimize (-O2) and to unroll loops (-funroll-loops):
- gcc -c -O2 -mno-cygwin -funroll-loops *.c
- gcc -mno-cygwin -static nrutil.o fncs_cmns.o fncs_seqs.o fncs_tdat.o fncs_seed.o fncs_app1.o fncs_app2.o fncs_nrcs.o fncs_topo.o fncs_mylm.o fncs_bits.o FitModel.o -o FitModel -lm (for Windows and Linux)
- gcc -mno-cygwin nrutil.o fncs_cmns.o fncs_seqs.o fncs_tdat.o fncs_seed.o fncs_app1.o fncs_app2.o fncs_nrcs.o fncs_topo.o fncs_mylm.o fncs_bits.o FitModel.o -o FitModelMac -lm (for Mac)

FitModel.exe bundles both the NR (Numerical Recipes) and GNU libraries.

API jar: The Java API jar is created with the makefile under MatrixREDUCE's top directory, using the command "make jar".
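Because the FitModel link step is long and differs only slightly between platforms, it may help to generate the commands instead of retyping them. The sketch below only assembles and prints the command lines listed above; it does not run gcc. The function name and the `mac` flag are assumptions for illustration. Note that the math-library flag must be typed as plain ASCII `-lm`; a typographic dash copied from a rendered page will not be recognized by gcc.

```python
# Object files in the order given in the manual link commands.
OBJECTS = [
    "nrutil.o", "fncs_cmns.o", "fncs_seqs.o", "fncs_tdat.o",
    "fncs_seed.o", "fncs_app1.o", "fncs_app2.o", "fncs_nrcs.o",
    "fncs_topo.o", "fncs_mylm.o", "fncs_bits.o", "FitModel.o",
]

COMPILE_COMMAND = "gcc -c -O2 -mno-cygwin -funroll-loops *.c"


def link_command(mac=False):
    """Build the FitModel link command as a list of arguments.

    Hypothetical helper: the Mac variant omits -static and writes
    FitModelMac, mirroring the manual commands.
    """
    cmd = ["gcc", "-mno-cygwin"]
    if not mac:
        cmd.append("-static")
    cmd += OBJECTS
    cmd += ["-o", "FitModelMac" if mac else "FitModel", "-lm"]
    return cmd


print(COMPILE_COMMAND)
print(" ".join(link_command()))
print(" ".join(link_command(mac=True)))
```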

Notes

See comment on white spaces in file names/paths in Mantis : http://mantis.cu-genome.org/view.php?id=1316

Aracne.jar for MINDY

Although ARACNE is a geWorkbench component, the MINDY component uses a version of ARACNE that is externally maintained. The file aracne.jar is copied directly into the geWorkbench CVS tree.

The location of the external ARACNE code is:

The version of the external ARACNE code is:

Cytoscape

Any other components?

Analysis components - external runtime dependencies

component	local	external type	username/password	relay servlet	known to work outside campus
ANOVA	yes	grid	grid_default	no	?
ARACNe	yes	grid	grid_default	no	?
CNKB	no	servlet	some open data	yes	?
MINDy	yes	grid	grid_default	no	?
GenSpace	local	grid	genSpace account	no	?
Hierarchical Clustering	yes	grid	grid_default	no	?
KNN	no	GenePattern	???	no	?
MarkUs	no	grid	open	no	?
MRA	local	no	-	no	not applicable
MatrixREDUCE	local	grid	grid_default	no	?
PCA	no	GenePattern	???	no	?
PUDGE	no	web	open	no	?
SkyLine	no	grid	grid_default	no	?
SkyBase	no	grid	grid_default	no	?
SOM	yes	grid	grid_default	no	?
SVM	no	GenePattern	???	no	?
WV	no	GenePattern	???	no	?

TODO Notes

Done

Release Process
1. System testing should be done on Installer-built releases if practical. At least the installer version needs to be carefully tested early on, not only just at the end.
Release files
1. Release Notes and license were not included in 1.8.0 release. Add. - Done.
2. Release Notes - need better instructions on Java requirements. - Done. Online version on installation page even better.
3. Cardiogenomics MAS5 files were omitted from release 1.8.0. Add. - Done.
Filtering
1. Add to documentation that filtering does not respect marker set activation - it always works on all markers.
2. The reference for Quantile Normalization (Bolstad 2003) was added to the online help. It, and any other needed references, should be added to the very light normalization tutorial. (Done - new tutorial/online help written).
Analysis
1. MINDy - should gray out entirely the MINDy unconditional calculation as changing the settings has no effect. (Done). The tutorial states this but could be made more clear.
Normalization Panel - rename to "Normalization" in CCM.
Tutorials
1. Improve documentation on handling of Marker and Array sets (bug 1687) - Done post-release.
2. Make sure each file type is fully described.
3. Project Folders - the File Open list of file types is now alphabetical. If any tutorial / help page depicts this, it should be updated. Done - tutorial updated post-release.
4. Normalization - two of the components are missing online help - array based centering and mean-variance normalizer - all screenshots are bad. http://wiki.c2b2.columbia.edu/mantis/view.php?id=1948 - Done - Normalization tutorial completely rewritten and ported to online help.
Online Help
1. CNKB – full update needed - done in 2.0.1.
2. MINDy – full update done in 2.0.2.

Items deferred to a future release

Califano lab enhancements
1. wiki page needs an Evidence Integration page.
2. Califano lab is using old AMDeC website still for Bcell interactome. Should be moved to their Wiki.

Not finished for release 2.0.0

Ontologizer 2.0 - update license to include Ontologizer 2.0 BSD license terms. - done? Mention License but not terms....
MATKC
1. Update geWorkbench Roadmap periodically.
ARACNe
1. Add "pro" tips on ARACNe usage from Manjunath.
Tutorials
1. The caArray tutorial needs to be transferred to online help. It is currently under Project Panel -> Open Dataset
2. Document how missing values are detected, stored and represented.
3. Pudge tutorial needs to be more extensive.
4. Update Grid Services screenshot?
5. Note somewhere that if a subset of Markers has been activated, but is not visible (because Arrays is on top) the user may forget about the activated markers and make a mistake.
Manual
1. Update Pattern Discovery page in Manual based on revised tutorial.
2. Add section on Marker and Array sets in chapter 3?
Grid Services
1. Expose current grid services; right now we are still only exposing geWorkbench 1.5 services.
Promoter
1. Matching algorithm needs to be given a statistical basis.
2. More recent promoter set available, e.g. 14K set in Elkon paper.
3. Document how upstream/downstream indications are used on display. Where do they come from, are they used correctly?
Analysis Panel -> Analysis (Done).
1. Can we correlate the various licenses to particular components. Should list in CCM?
Properties files
1. Document that all recent versions of geWorkbench use the same properties files. However, this itself can be changed.
2. Document how the properties files work.
Sequence Retriever
1. How can we use sequence retriever outside of the context of a microarray dataset. Can we add ability to query genes by name directly, outside of microarray context?
GEO Soft parsers
1. Document valid Affy column headers: 'ABS_CALL' or 'DETECTION_CALL' and 'DETECTION_P' or 'DETECTION P-VALUE'
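As a starting point for that documentation, the accepted header names can be written down as two sets and checked against a column header row. This is a sketch, not the parser's actual code: the function name is hypothetical, and trimming whitespace and ignoring case are assumptions.

```python
# Header names quoted in the note above.
CALL_HEADERS = {"ABS_CALL", "DETECTION_CALL"}
PVALUE_HEADERS = {"DETECTION_P", "DETECTION P-VALUE"}


def find_affy_columns(headers):
    """Return (call_index, pvalue_index); an index is None if not found.

    Assumption: names are compared after trimming and upper-casing.
    """
    call_idx = None
    pval_idx = None
    for i, header in enumerate(headers):
        name = header.strip().upper()
        if name in CALL_HEADERS and call_idx is None:
            call_idx = i
        elif name in PVALUE_HEADERS and pval_idx is None:
            pval_idx = i
    return call_idx, pval_idx


print(find_affy_columns(["ID_REF", "VALUE", "ABS_CALL", "DETECTION P-VALUE"]))
```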
Known java problems:
1. EDT exception - due to background threads trying to alter GUI. See bug 2224.
Hierarchical Clustering - Euclidean distance metric. - Document that if the Euclidean metric is used, the data should be normalized first. See bug 148.
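A small worked example may help when writing this up. It shows how, without normalization, Euclidean distance is dominated by differences in scale rather than by the shape of the expression profile. The z-score used here is one possible normalization chosen for illustration; it is not necessarily what geWorkbench applies, and the profiles are made-up numbers.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def zscore(profile):
    """Center a profile on its mean and scale by its standard deviation."""
    mean = sum(profile) / len(profile)
    sd = math.sqrt(sum((x - mean) ** 2 for x in profile) / len(profile))
    if sd == 0:
        return [0.0] * len(profile)
    return [(x - mean) / sd for x in profile]


m1 = [1.0, 2.0, 3.0, 4.0]
m2 = [100.0, 200.0, 300.0, 400.0]  # same pattern as m1, 100x the scale
m3 = [4.0, 3.0, 2.0, 1.0]          # opposite pattern, same scale as m1

raw_same = euclidean(m1, m2)
raw_opposite = euclidean(m1, m3)
norm_same = euclidean(zscore(m1), zscore(m2))
norm_opposite = euclidean(zscore(m1), zscore(m3))

# Raw data: m1 looks closer to the opposite pattern than to the same pattern.
print(raw_same > raw_opposite)
# Normalized data: the same-pattern profile is now the closer one.
print(norm_same < norm_opposite)
```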
MeV - need a list of components that came from MEV code.
1. t-test
caArray/Marker Annotations - bug 1956 contains comments about different ways to set gene names in caArray. Needs to be looked at again.

Release 1.8.0 TODO notes carried over

ARACNe Grid - need to verify that server-side implementation includes Bcell-100 parameter files.
ARACNe/MINDy Need to check on migration to new Califano lab page on Wiki.
caArray - Our local caArray is at afapp1.c2b2.columbia.edu port 38080 (web interface).
Color Mosaic - All Markers and All Arrays checkboxes appear to be disabled - oh this is only for ANOVA display.
Hierarchical Clustering - When I do hierarchical clustering, the arrays are shown ordered by the array sets activated, rather than the original order of the arrays in the dataset. Need to confirm that the labels and arrays are really staying together correctly when resorted.
MatrixREDUCE shown to work on Windows but not clear if it works on Linux.

Release 1.7.0 TODO notes carried over

Add MatrixReduce data to tutorial dataset. (not done in 1.7.0)
Remove unneeded data from tutorial download. (not done in 1.7.0)
Was problem with file save corruption fixed? It affected writing out files that had been read in in EXP (matrix) format. (think so but need to verify)
Include a list of HG-U95 and HG-U133 transcription factors in tutorial data download or with distribution (see Nature Protocols paper). (not done in 1.7.0?)

For next time

ANOVA - Need to pin down exact details on algorithms - Adjusted Bonferroni, Westfall-Young, and how to explain the interpretation of the alpha value in FDR - is it the confidence in the FDR as you sometimes see mentioned? Is the reported p-value (e.g. Bonferroni) corrected or uncorrected? Check code for details.
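To frame the "corrected or uncorrected" question, the sketch below shows the standard Bonferroni correction in both of its equivalent forms: adjusting each p-value and comparing it with alpha, or leaving the p-values alone and comparing them with alpha divided by the number of tests. This is the textbook procedure with made-up p-values, not a statement of what the ANOVA component reports; that still has to be checked in the code.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


alpha = 0.05
p_raw = [0.001, 0.02, 0.04]  # illustrative uncorrected p-values
p_adj = bonferroni_adjust(p_raw)

# Form 1: compare adjusted p-values with alpha.
significant_adjusted = [p <= alpha for p in p_adj]
# Form 2: compare raw p-values with alpha / m.
significant_raw = [p <= alpha / len(p_raw) for p in p_raw]

print(p_adj)
print(significant_adjusted == significant_raw)
```

Both forms select the same markers, so the documentation mainly needs to state which number is displayed.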

Documentation changes

Changes included in release 2.0.2 Online Help

These changes were made to the Wiki and then transferred to Online Help.

MINDy - The wiki tutorial was completely rewritten with new screenshots to match all the changes (most were made in release 1.8).

Changes included in release 2.0.1 Online Help

These changes were made to the Wiki and then transferred to Online Help.

CNKB - material completely revised to reflect changes to component (multiple interactomes, choices of interaction types etc.).

Changes included in release 2.0.0 Online Help

These changes were made to the Wiki and then transferred to Online Help.

CCM – update needed. Buttons have changed, L&F and highlighting have changed. Buttons removed. Complete rewrite ported from new Tutorial.
Filtering - Complete rewrite ported from new Tutorial, based on new implementation.
Normalization - Complete rewrite ported from new Tutorial.

Release 2.0 new material - didn't get into release Online Help

MatrixREDUCE does not work for specific combinations of machines and options. This should be noted in documentation, as no solution has yet been found. The specific problems are detailed in Mantis bug #1555 "MatrixReduce cannot run":
1. MatrixReduce can run on Mac/Linux only when Parameter Topological Pattern is set to "Load from file".
2. MatrixReduce runs on PC either under "Load from file" or "Specify Pattern".
32 and 64 bit problems.
1. Pudge - bug #2136 - Pudge browser can run on 64-bit mac (mac osx 10.6), but not on 32-bit mac. Tests on macs dar1 and common1 run well. Pudge does not run on 64 bit windows. Note - need to test if it will work on 32-bit JRE on 64-bit windows.
2. Markus browser - bug #2136 - Markus browser cannot run on macs until the applet loading problem is solved. Does not run on 64 bit windows.
Online Help changes needed (from system test)
1. Promoter –
  1. update URL: The following sentence is INCORRECT: The datafile used "MATRIX_DATA.txt" can be found at http://jaspar.genereg.net/html/DOWNLOAD/mySQL/JASPAR_CORE_2008/.
  2. update screenshot of Parameters tab. Sequence tab screenshot mis-sized.
  3. Remove external links.
2. Pattern Discovery – out of date, must be replaced. (DONE in release 2.1.0)
3. Project Panel – "open dataset" needs updating for file names (tab delimited)
4. Pudge - incomplete
5. Sequence Retriever – hey what is the deal with the blue markers in the first picture?
6. T-test – external links are shown, should not be. Same for volcano plot.

Changes to Wiki tutorials subsequent to 1.8.0 release

The relevant Online Help pages will need to be updated.

Pattern Discovery tutorial completely rewritten. Switched from a DNA example back to a protein (histone) example.
MINDy - advanced params screenshot and text updated. (Done - New MINDy tutorial in 2.0.2)

Completed changes to Wiki tutorials subsequent to 1.7.0 release

The relevant Online Help pages will need to be updated.

Color Mosaic tutorial added, starting with material in User Manual.
Cytoscape
1. tutorial was updated to describe network create/destroy right-click menu commands and how clicking on an adjacency matrix in the Project Folders component recreates the network. mantis bug 1770.
2. bugs 1728, 1743 and 1752 - a description of set operations and how they may produce unexpected results due to the many-to-many relationships of markers and genes was added.
Grid Services - Added a "Services" section to each analysis component tutorial for which a grid service exists (except Pudge).
Hierarchical Clustering - A completely new tutorial on Hierarchical Clustering was written, starting from the ANOVA tutorial result.
MatrixREDUCE - tutorial was updated after the 1.7 release - may need to update online help.
SOM - The SOM entry (previously part of the Clustering entry) was completely rewritten, including detailed descriptions of the parameters taken from the online-help. The SOM example also starts with the ANOVA result.
Analysis - There is a new section in the tutorials for Analysis which has no matching Online Help chapter. It describes e.g. the way saved parameters are highlighted if matched.

Needed changes to Tutorials

Cumulative list starting with Release 1.7....

Color Mosaic
1. Don't know what Sort is supposed to do, and
2. Export does not appear to work. Does not work from ANOVA, and
3. Image Snapshot does not seem to work from main Color Mosaic but does work if displaying from ANOVA.
EVD - What is the EVD t-test used for/ how is it used? A histogram of t-test statistics? (not done in 1.7.0)
Gene Pattern components need tutorials/Online Help??:
1. Need to document server settings to use GenePattern modules. Our local GenePattern server is afdev2.c2b2.columbia.edu port 9999.
2. PCA (GenePattern) - Analysis and Viewer
3. K-nearest neighbors (GenePattern)
4. SVM 3.0 (GenePattern) - Analysis and Viewer - include, we need to develop online help and tutorial (Aris). http://wiki.c2b2.columbia.edu/mantis/view.php?id=474
5. WV - Weighted Voting (GenePattern)
Grid Services
1. Add detail to tutorial about how caGrid v1.3 uses caTransfer?
2. Verify that each component offering a grid service has documentation.
3. Find out and explain how our grid services handle multiple requests e.g. to ARACNe grid service - all run at once, in separate processes?
4. Explain exactly what is sent to grid - only selected data, or all data with a map? (not done in 1.7.0)
Hierarchical Clustering
1. Need a more top-level description of the Dendrogram component.
2. when "Average" linkage is selected, MEV uses a "weighted" average, which reduces the weights of more distant nodes. Does geWorkbench implement any such refinement?
3. MEV can give priority to markers or arrays (?) when drawing the clusters.
Marker Selection in some components - The way marker filtering is done has changed to use a built in set selection feature in the MINDy viewer, rather than using activated marker sets. See bug 1673.
Markus
1. Make sure any tutorials include the final URL.
2. tutorial is still rough. Structure should be brought more into line with others.
3. to actually run Mark-Us, one needs a grid password. What is our policy on this?
Position Histogram - There is no online documentation. How does it align sequences? (not done in 1.7.0)
Pudge - did not add a services section to Pudge tutorial because not sure if it is actually used/available.
Scatter plot - There seems to be no tutorial. Online Help exists but needs to be updated to mention the enhanced "tooltip" spot identification added in release 1.7.0.
1. http://wiki.c2b2.columbia.edu/mantis/view.php?id=1782
2. Details: A feature had been added to Scatter Plot to allow overlapping points to each display a tooltip. This did not work if many points were overlapping, or if there were too many points in the dataset being compared. If more than 100 points are being compared in the plot, the enhanced tooltip feature is turned off, and only one point will show a tooltip for a given location.
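The tooltip rule described in the details above can be summarized in a few lines, which may be useful when updating the Online Help. This is a sketch of the described behavior only: the function name, the point representation and the hit radius are assumptions, and the actual component is written in Java.

```python
# Threshold quoted in the description above.
MAX_POINTS_FOR_ENHANCED_TOOLTIP = 100


def tooltip_labels(points, x, y, radius=0.5):
    """Return the labels whose tooltips are shown at location (x, y).

    points: list of (label, px, py) tuples in the plot.
    With more than 100 points in the plot, the enhanced feature is off
    and only one point shows a tooltip for a given location.
    """
    hits = [
        label
        for label, px, py in points
        if abs(px - x) <= radius and abs(py - y) <= radius
    ]
    if len(points) > MAX_POINTS_FOR_ENHANCED_TOOLTIP:
        return hits[:1]
    return hits


few = [("m1", 1.0, 1.0), ("m2", 1.1, 0.9), ("m3", 5.0, 5.0)]
many = [(f"m{i}", 1.0, 1.0) for i in range(101)]

print(tooltip_labels(few, 1.0, 1.0))
print(tooltip_labels(many, 1.0, 1.0))
```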
Sequence (Viewer)- tutorial needed.
SOM - The following questions are outstanding on SOM tutorials:
1. Where did the statement about data for SOM needing to be normalized come from? Is it true?
2. The formal definition of SOM says dimensionality where it may mean something like "dimensionality N".
3. The online help mentions neuron and initial coordinates, but now only one set is displayed. Which is it?

Tutorial and Help change status table

This table currently just copied from release 1.8.0. Not yet updated for release 2.0.

component	Tutorial	Online-Help	in synch	further changes needed	assigned to
Analysis	yes	no	no	no	Ken
ANOVA	yes	yes	yes	no
ARACNe	yes	yes	yes	no
BLAST/Seq. Align	yes	yes	yes	no
caArray	"remote data sources (caArray)"	"Project Panel - Remote Data Source"	?
CEL imager	in Viewing a Microarray	yes	?	?
Cellular Network KB	yes	yes	yes	no
Clustering (SOM and HC)	individual	yes	no	?	Ken
Color Mosaic	yes	yes	no	yes	assign
Component Configuration Manager	yes	yes	?	?
Cytoscape	yes	yes	yes	no
Dataset Annotation	in "Project Details"	"Comments"			need to synch up names
Dataset History	in "Project Details"	"History Panel"			need to synch up names
Expression Profiles	no	yes	no	?
Expression Value Distribution	yes	yes	?	?
Experiment Information	in "Project Details"	yes	no	?
Filtering	yes	yes	?	?
Gene Ontology	no	no	-	create	Ken
Gene Pattern Components	Classification - KNN and WV	?	no	?
genSpace	no	yes	no	?
Hierarchical Clustering	yes	yes	no	yes	Ken
JMol	yes	yes	?	?
Marker Annotations	yes	yes	yes	yes	Ken
Markers/Phenotypes/Arrays	?	yes	?	yes
Mark-Us	yes	yes	?	yes	Ken
Master Regulator Analysis	yes	yes	yes	no
MatrixReduce	yes	yes	no	no
Menu
Microarray Viewer	yes - see Viewing a microarray dataset	yes	?	?
MINDy	yes	yes	?	Yes	Ken
Normalizers	yes	yes	no	yes	Aris
Online Help
Pattern Discovery	yes	yes	?	yes
Position Histogram	no	yes	?	?
Preferences	?	?
Principal Component Analysis	no	no			is this gene pattern?
Project Folders	"Projects and Data Files"	"Project Panel"	?	yes
Promoter	yes	yes	yes	no
Pudge	yes	yes	looks yes	?
Scatter Plot	no	yes	no	yes
Self Organizing Maps	yes	yes	no	no	Ken
Sequence Panel	no	no			assign
Sequence Retriever	yes	yes	?
Services (Grid)	yes	no	no	yes	Ken
t-Test	yes	yes	?	yes	Ken
Tabular Microarray Viewer	"Viewing a Microarray Dataset"	yes
