User:Bjagla
Contents
- 1 Feature requests for geWorkbench
- 1.1 General Features
- 1.2 Programmatic features
- 1.3 Menu
- 1.4 Project Folders
- 1.5 Markers
- 1.6 Arrays/Phenotypes
- 1.7 Sets
- 1.8 FASTA sequences
- 1.9 Microarray data
- 1.9.1 Simulation
- 1.9.2 Network browser
- 1.9.3 Interactions
- 1.9.4 Aracne
- 1.9.5 Tabular Microarray Viewer
- 1.9.6 Sequence Retriever
- 1.9.7 Synteny
- 1.9.8 Microarray viewer
- 1.9.9 Gene Ontology
- 1.9.10 GSEA
- 1.9.11 Reverse Engineering
- 1.9.12 Scatter Plot
- 1.9.13 caBIO pathways
- 1.9.14 Marker Annotations
- 1.9.15 Color Mosaic
- 1.9.16 Expression Profiles
- 1.9.17 Expression Value Distribution
- 1.9.18 Experiment Info
- 1.9.19 Synteny Parameters
- 1.9.20 Normalization
- 1.9.21 T-Profiler
- 1.9.22 Analysis
- 1.9.23 Associations
- 1.9.24 caScript
- 1.9.25 Dataset Annotations
- 1.9.26 Filtering
- 1.9.27 Dataset History
Feature requests for geWorkbench
General Features
- Mantis 0000480: Workflow support
Would be nice to have true workflow support for regular users (without the need for scripting). (agreed: Bernd)
- Not in Mantis:
There are problems related to memory management. geWorkbench uses too much memory for storing microarray data internally: loading nine U133 chips requires more than 512 MB.
- Mantis 0000114: Persisting configuration settings designated through the visual builder
At present, changes to the start up configuration effected during the application execution (through the Visual Builder) do not get persisted: when the application is launched the original start up configuration is used again.
It is desirable that changes introduced by the Visual Builder be persisted in the application's configuration file so that they can be "remembered" at the next launch.
- Mantis 0000115: Choosing among available application "flavors"
During application launch, a start up window (which the user can opt to hide for subsequent application invocations) should prompt the user to select one among the available application configuration "flavors". This functionality should be available not only at start up but also from within the application, through a "Preferences" type of menu option.
- Mantis 0000077: Context sensitive online help
We need to make online help available in a context-sensitive manner: from within a component, a user should be able (using F1 or by pressing a Help button) to get access to the online help associated with the component.
- Mantis 0000156: Extension of event exchange model
Entering this here as a placeholder regarding the proposed re-engineering of the event model. Some of the design suggestions that have been mentioned include:
- Making it a rule that event data are always interfaces.
- Removing the need to specify within the throwEvent() method the listener interface and the method to invoke in that interface.
- Using annotations (or some other mechanism) in order to provide a direct reference to a service provider so that methods can be invoked directly rather than through event exchanges.
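The second and third suggestions could look roughly like the following Java sketch. It is only an illustration: `Subscribe` and `EventBus` are hypothetical names, not part of the geWorkbench framework, and the dispatch is assumed to be based on the parameter type of annotated methods, so the publisher names neither a listener interface nor a method.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical marker for methods that want to receive events. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Subscribe {}

/** Hypothetical dispatcher: delivers an event to every matching annotated method. */
final class EventBus {
    private final List<Object> listeners = new ArrayList<>();

    void register(Object listener) {
        listeners.add(listener);
    }

    void publish(Object event) {
        for (Object listener : listeners) {
            for (Method m : listener.getClass().getDeclaredMethods()) {
                if (!m.isAnnotationPresent(Subscribe.class)) {
                    continue;
                }
                Class<?>[] params = m.getParameterTypes();
                // Deliver only to one-argument methods whose parameter type
                // (ideally an interface) is implemented by the event.
                if (params.length == 1 && params[0].isInstance(event)) {
                    try {
                        m.setAccessible(true);
                        m.invoke(listener, event);
                    } catch (IllegalAccessException | InvocationTargetException e) {
                        throw new IllegalStateException("Listener failed: " + m, e);
                    }
                }
            }
        }
    }
}
```

A component would annotate a method such as `@Subscribe void onText(CharSequence s)` and call `bus.register(this)`. Reflection on every publish is slow, so a real implementation would probably cache the annotated methods at registration time.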
- Mantis 0000157: Extend framework to bring to focus components that receive appropriate events
An issue with the GUI is that one needs to know which components respond to which events in order to inspect the results of some action. For example, after running the hierarchical clustering analysis, one has to explicitly select the Dendrogram tab to review the results.
The framework needs to be extended so that components can gain the focus as appropriate when they receive certain events.
- Mantis 0000173: UI Tab Display.
Tabs should be displayed in alphabetical order. It's difficult to find tabs as they are ordered currently.
If the user cannot view all the tabs without scrolling, the tab headers should become drop down values.
- Mantis 0000478: File save/load operations are memoryless in terms of last directory used.
There are many places in the app where data are being loaded from or saved to disc. When such an operation is used for the *second* time the app should remember the directory the user navigated to the *first* time. Some specific examples where this is not the case:
- Exporting an image node from within the projects folders.
- Saving a data node from within the projects folders.
- Saving a panel from the Markers component.
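A minimal Java sketch of the requested behaviour, assuming one remembered directory per kind of operation. `LastDirectoryMemory` and the operation keys are hypothetical names, not existing geWorkbench classes, and the memory here only lasts for the session.

```java
import java.io.File;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: remembers the last directory used per operation. */
final class LastDirectoryMemory {
    private final Map<String, File> lastDirs = new ConcurrentHashMap<>();
    private final File fallback;

    LastDirectoryMemory(File fallback) {
        this.fallback = fallback;
    }

    /** Directory the file dialog should start in for this operation. */
    File directoryFor(String operation) {
        return lastDirs.getOrDefault(operation, fallback);
    }

    /** Call after the user confirms a file; stores its parent directory. */
    void remember(String operation, File chosenFile) {
        File parent = chosenFile.getAbsoluteFile().getParentFile();
        if (parent != null) {
            lastDirs.put(operation, parent);
        }
    }
}
```

Before showing a dialog, the component would pass `directoryFor("exportImage")` to `JFileChooser.setCurrentDirectory`, and after approval call `remember("exportImage", chooser.getSelectedFile())`. Keeping the value across launches would need extra persistence, which is not shown.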
- Mantis 0000635: columns can be moved (applies to all tables)
When a column is moved to the first position (row names), which is not necessarily bad, the row name is exchanged with a value but the cell keeps this "button" look....
Programmatic features
- Mantis 0000400: Inconsistent t-test results based on phenotype/panel activation order
There should have been no significant genes found once the marker selection had been made but the volcano plot was erroneously displaying the genes that had been selected as being significant. This is due to an inconsistency in the way the markers() method is implemented in CSMicroarraySetView. For the time being I've routed around this method but it should probably be changed at some point.
Menu
- Mantis 0000166: main menubar controls for Project Folders Area are confusing.
The problem is caused by the differences in type and semantics between the workspace folder and project folders. For example, the "File" submenu offers "Open", and the pop-up choices are "File" or "Workspace". Select "Workspace" and you are offered a dialog box. Select "File" and you are told you must first select a project node. Or if you select "New" (an adjective) from the submenu, you get a choice of "Workspace" or "Project". Select "Workspace" and the current workspace folder is cleared of its contents. Thus New Workspace actually REMOVES the Workspace folder, whereas the submenu item "Remove" does not include "workspace" as an option. Select "Project" and a new project folder is added without replacing an existing one.
I think the whole thing could be made more obvious and logical by making the File submenu items "Workspace" and "Projects" and perhaps "Files". Then the choices under each could be actions (verbs), e.g. Open, Remove, Load, Delete, Save, etc. In other words, you consider the folder that you want to do something to, select it in the File menu, and choose the action that you want to take.
Also, the other File items ("Export" etc.) have nothing to do with the Project Folders Area and are positioned awkwardly.
File
- Mantis 0000345: File Loading Confusion
There is no choice to list all files in a directory. The file selection box should contain an option to see all the files in a directory; otherwise it is unclear which file extension goes with which loading option.
- Mantis 0000347: Prompt to Save
There is no prompt to save the workspace when the application exits. There should also be a way to exit the application from the File menu.
- Mantis 0000452: filtered datasets misrecognized on read in
If a dataset is filtered, written out, and then read in again, the "magic marker" that defined the chip type may be deleted, and the dataset can be recognized as another entirely. For example, my HG_U95 dataset, after filtering out half the markers, was recognized as HG_133_Plus2.
- Mantis 0000479: Enable the "Export" functionality for data nodes.
This functionality is supposed to be similar to the "Open file" functionality, where a number of pluggable export filters can be used to export a dataset into another format. At the very least we should support exporting microarray data into the Cluster (http://rana.lbl.gov/EisenSoftware.htm) format.
- Mantis 0000484: At the File menu, an "Exit" menu item should be provided.
Many other applications have one. Maybe an option for saving the workspace should be offered before exit.
- Mantis 0000465: Support loading Affymetrix .CEL and .CHP formatted files
geWorkbench 1.0/caWorkbench v3.0 are capable of reading in only the .txt version of Affy files. Many users (including Northwestern) have indicated it would be tremendously beneficial to support .CEL, .CHP files as well. The former contain probe level data, rather than the probeset level data which the workbench is currently designed to handle. As such, loading of .CEL data will probably have to be coupled with the immediate execution of a normalizer capable of translating probe level data to probeset level data. => CEL is already working
Edit
View
Commands
Tools
Help
- Mantis 0000466: Add "shortcuts" screen
Add a "shortcuts" screen (maybe under the "Help" main menu item?) to list available shortcuts like the very useful F12.
Project Folders
- Mantis 0000739: right click image nodes should show option to export
If there is an image in the project panel, its right-click menu should offer an additional option to export the image. At present one has to go through the File menu.
- Mantis 0000633: copy function
Since all the operations are destroying the original data I believe it could be useful to have a copy function that copies a state of a data set to a new data set.
- Mantis 0000643: ask if should really remove
I think it is good practice to ask the user if he really wants to remove an object. Accidents happen, and if you put a lot of work into one it is rather disappointing if you accidentally remove your work... (This is only meant for projects and datasets, not for images or such.)
Markers
Arrays/Phenotypes
Sets
- Mantis 0000381: Renaming a label erases its class (case, control, etc.) status.
Renaming a label erases its class (case, control, etc.) status. To reproduce, set a label's class to 'case', then rename that label.
FASTA sequences
Promoter
Sequence Panel
Position Histogram
Simulation
- Mantis 0000608: why can a simulation be performed on an image?
Simulation can be used for any object, why? Can it be at least moved to the beginning? So it is not the panel that pops up all the time first?
Network Generator
Phenotype & Optimizer Options
Interactions Display
Sequence alignment
- Mantis 0000544: fastacmd support doesn't exist
I don't know if there's a plan to implement fastacmd in geWorkbench, but if there is, would I find it in sequence alignment? What options will be available? => Don't know why we would support this??
BLAST
- related to 0000541: blastall is not complete
using local / own databases
HMM
Other
Pattern discovery
- Mantis 0000496: sequence masking needed
A major planned workflow of geWorkbench is to be able to search for patterns in sequences surrounding co-regulated genes. However, no provision is made to mask or deal with masked sequences. Many patterns found in unmasked genomic sequence can be expected to be in repeated elements of various types (or does Splash already have a way of dealing with this?). Solving this may involve enhancements to both the Pattern Discovery module and perhaps to the Splash server. The Splash server currently is believed to recognize mask characters N and #.
Searching for regulatory patterns is potentially very sensitive to the details of how it is carried out, as such patterns may be short and embedded in regions of repeated sequence. The greatest possible degree of flexibility and control may be required.
Suggestions: We will assume we can obtain from the Sequence Retriever sequence that is either unmasked or is reversibly masked, that is, with small letters denoting masked sequence and capitals denoting unmasked.
The following options could be considered:
1. If lower-case-masked sequence is available, provide the option to convert masked characters to Ns before sending to the Splash server.
2. We could host a repeatmasker server (www.repeatmasker.org). We could then mask sequence as desired, e.g. mask low complexity but leave in LINE elements. There are many considerations and subtleties discussed at repeatmasker.org.
3. The Splash server itself could be enhanced to offer simple masking, along the same lines as the BLAST programs do.
4. We should give some indication of the kinds of statistics one could expect from true conserved patterns vs those arising from repeats. Can they be distinguished?
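Option 1 is simple enough to sketch in Java. This is only an illustration: `SoftMaskConverter` is a hypothetical name, and it assumes the input is the bare sequence (no FASTA header line) with lower-case letters marking masked positions, as described above.

```java
/** Hypothetical helper: converts lower-case (soft) masking to N (hard) masking. */
final class SoftMaskConverter {
    private SoftMaskConverter() {}

    /**
     * Returns a copy of the sequence in which every lower-case letter is
     * replaced by 'N'. Upper-case letters and other characters are kept.
     */
    static String toHardMask(String sequence) {
        StringBuilder out = new StringBuilder(sequence.length());
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            out.append(Character.isLowerCase(c) ? 'N' : c);
        }
        return out.toString();
    }
}
```

Because the conversion returns a new string, the reversibly masked original stays available if the user later wants to run the search unmasked.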
Associations
caScript
Dataset annotations
data history
Experiment Info
Microarray data
The following panels/functionalities are requested specifically for Microarray data:
Simulation
Network browser
Interactions
Aracne
Tabular Microarray Viewer
- Mantis 0000627: selecting arrays and markers
It would be nice if markers and arrays could be selected/highlighted by double clicking on row / column headers
Sequence Retriever
Synteny
Microarray viewer
- related to (closed) Mantis 0000225: Microarray panel details
It would be nice to have the values (min, max) displayed next to the color legend
- Mantis 0000326: absolute visualization needs viewing controls
If a set of unnormalized Affy data is read in, for example the cardiogenomics dataset, with the tools->preferences->visualization mode set to absolute, the dynamic range of the data is so large that only a few data points appear. Some options to deal with this would be:
1. A control labeled "Display as log transform".
2. An intensity control such as in the color mosaic panel.
- Mantis 0000431: Table View does not sort properly
Sorting by most columns does not work properly (it does not order numerically or alphabetically), and attempting to sort by pValue appears to do nothing. => also true for any table
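One possible fix, sketched in Java, is a comparator that orders cells numerically when both parse as numbers and alphabetically otherwise. `CellComparator` is a hypothetical name, and the sketch assumes the table cells are available as strings; how the geWorkbench tables actually store their values is not known here.

```java
import java.util.Comparator;

/** Hypothetical comparator: numbers sort numerically and before text. */
final class CellComparator implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        Double x = parse(a);
        Double y = parse(b);
        if (x != null && y != null) {
            return Double.compare(x, y);   // both numeric, e.g. p-values
        }
        if (x != null) {
            return -1;                     // numbers come before text
        }
        if (y != null) {
            return 1;
        }
        return a.compareToIgnoreCase(b);   // both text
    }

    /** Returns the numeric value of the cell, or null if it is not a number. */
    private static Double parse(String s) {
        try {
            return Double.valueOf(s.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

With a Swing `JTable`, such a comparator could be attached per column through `TableRowSorter.setComparator`, so that a column like pValue sorts `1.0E-5` before `9` and `9` before `10`.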
Gene Ontology
- Mantis 0000623: Organization of panel is misleading
The Gene Ontology Panel is divided into three sub-panels: TreeView, TableView, P-value Trend. TreeView holds vital information for TableView and P-value: Chip set used, Reference List, selection to choose which of the above, and which class of GO terms is used. Those options/parameters should be visible in all of the three main panels since they heavily rely on them and should be moved outside of the panel structure within the GO panel!
GSEA
Reverse Engineering
Scatter Plot
caBIO pathways
Marker Annotations
Color Mosaic
- Mantis 0000477: Missing right click actions
CM should have right-click actions consistent with Expression Profiles, EVD, and Scatter Plot, which support zoom in/out, save as, and print.
=> At least have a mouse-over function that displays the chip and value for the spot
- Mantis 0000611: color slider function not obvious (probably not only in color mosaic)
The real values displayed in a heat plot have to be mapped into a color space that is usually in the range from 0 to 1. A threshold has to be defined beyond which all values are mapped to 1 in the color space. By moving the threshold lower, differences in the lower magnitudes can be visualized. This does not seem to happen in geWorkbench. I would expect everything to light up when I move the slider to one of the extremes. This does not happen.
Since I have seen this behaviour also in the other heat map representation, and it is not the obvious behaviour, it either has to be clearly documented what is happening or changed to "normal" behaviour.
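The expected mapping can be written down in a few lines of Java. This is a sketch of the behaviour described in this request, not of what geWorkbench currently does; `IntensityScale` and `toUnit` are hypothetical names, and values below zero are assumed to map to 0.

```java
/** Hypothetical helper: maps a raw value into the [0, 1] color range. */
final class IntensityScale {
    private IntensityScale() {}

    /**
     * Scales the value linearly by the threshold and clamps the result,
     * so that every value at or above the threshold maps to 1.
     */
    static double toUnit(double value, double threshold) {
        if (threshold <= 0) {
            throw new IllegalArgumentException("threshold must be positive");
        }
        double scaled = value / threshold;
        return Math.max(0.0, Math.min(1.0, scaled));
    }
}
```

With a threshold of 10 a value of 5 maps to 0.5; lowering the threshold to 2 maps the same value to 1, which is the "everything lights up" effect expected at the extreme slider position.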
Expression Profiles
- Mantis 0000524: no mouse over on expression graph (assigned to John)
Some of the components support displaying the marker name when a data point is moused-over. The Expression Profiles component does not; instead you must click on a line and view its name in the Markers panel. Mouse over would be a good enhancement.
Expression Value Distribution
Experiment Info
Synteny Parameters
Program
MUMmer
Dots
Synteny map
Genome selections
Annotations
Normalization
Housekeeping Genes Normalizer
Log2 Transformation
Marker-based centering
Mean-variance normalizer
Array-based centering
Missing value computations
Threshold Normalizer
Quantile Normalization
T-Profiler
Analysis
The following analysis functions are requested:
- ANOVA
- Mantis 0000307: Analysis should write to dataset history
When any type of analysis is done, basic facts should be written to the Dataset History log, such as whether all markers were used or an activated panel; the number of markers/arrays in the activated panel, and the array type, and a timestamp.
Fast Hierarchical clustering
- Mantis 0000148: Euclidean distance should use normalized vectors
Data should be transiently normalized
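A Java sketch of what "transiently normalized" could mean: the vectors are scaled on copies inside the distance computation, and the stored data are left untouched. The request does not say which normalization is intended; scaling to unit length is an assumption here, and `NormalizedDistance` is a hypothetical name.

```java
/** Hypothetical helper: Euclidean distance on transiently normalized vectors. */
final class NormalizedDistance {
    private NormalizedDistance() {}

    /** Returns a unit-length copy of v; an all-zero vector stays all zero. */
    static double[] unitLength(double[] v) {
        double sumSq = 0.0;
        for (double x : v) {
            sumSq += x * x;
        }
        double norm = Math.sqrt(sumSq);
        double[] out = new double[v.length];
        if (norm == 0.0) {
            return out;
        }
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    /** Euclidean distance between the normalized copies of a and b. */
    static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double[] na = unitLength(a);
        double[] nb = unitLength(b);
        double sum = 0.0;
        for (int i = 0; i < na.length; i++) {
            double d = na[i] - nb[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

Under this choice two profiles with the same shape but different magnitude, such as (1, 2, 3) and (2, 4, 6), have a distance of about zero, so clustering groups by expression pattern rather than by overall intensity.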
Matrix reduce
Svm classifier
Slmr classifier
SOM
T-test
Multi T-test
Associations
caScript
Dataset Annotations
Filtering
Deviation filter
Affy detection call filter
Expression threshold filter
Genepix flags filter
Missing value filter
2 channel threshold filter
Dataset History
- Mantis 0000300: need array info
We should be able to display information about a microarray - the chip type, the number of markers in the dataset vs in the chip definition.
- Mantis 0000476: Data set annotation missing
Issue: Users analyze data, use activated sets, and limit the arrays included, and none of these steps is captured in the annotation. The system should capture these steps for the user in a log so the researcher is clear on what the results include and how to reproduce them.