Comments on geWorkbench


Feature requests for geWorkbench

Feature requests not listed in Mantis

  • Image snapshots should have a dataset history, including all information needed to reproduce the image.
  • There are preprocessing methods implemented in geWorkbench, such as normalization, but I am missing methods for dealing with replicates, time courses, and similar problems.
  • Marker sets have visual properties associated with them. The color and shape should be visible next to the name in the Marker Sets view, so that it is obvious what the displayed results refer to. It is better to show the symbol and color as a separate image next to the name than to color the name itself, because a light color (e.g. light yellow) would make the name impossible to read.
  • Integrate JRE with distribution (to handle genepattern problems)
  • Unit tests
  • Clean up/ refactor for more coherent sources. E.g. SVM: why is it under components-clustering and not components - analysis? There are a few of these guys floating around where it is not obvious to the unknowing what is going on… Also, how does the caGrid stuff fit in there?
  • Remove unused/old code from trunk, e.g. gpmodule, synteny.
  • Version control for components
  • We need to define a standard way to deal with reading and writing files (bug 1538):
  1. Writing files
    1. If the file exists, ask whether to overwrite it.
      1. If Cancel is pressed, cancel the whole operation.
      2. If OK is pressed, write the file.
    2. If no file extension is given, ask whether to append the file extension automatically.
      1. If Cancel is pressed, cancel the whole operation.
      2. If OK is pressed, append the file extension.
      3. If No is pressed, do not append the file extension but save the file nonetheless.
    3. If the file extension differs from what is expected, ask whether to append the expected file extension automatically.
      1. If Cancel is pressed, cancel the whole operation.
      2. If OK is pressed, append the file extension.
      3. If No is pressed, do not append the file extension but save the file nonetheless.
    4. Check that the file has been written successfully. At least on Mac/Linux I "can" write to a location where I have no rights: I get no error message, and no file is written.
    5. Tell the user whether or not the file has been written successfully.
  2. Reading files
    1. Allow reading files with "non-standard" file extensions.
    2. Check that reading was successful.
    3. Tell the user what has been read.
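
The writing rules above could be sketched roughly as follows. This is only an illustration: the class SafeFileWriter, the Prompt interface and the Answer enum are hypothetical names and do not exist in geWorkbench. The dialogs are hidden behind an interface so that the decision logic can be exercised without a GUI. The sketch asks about the extension before the overwrite check, because the final file name is only known after that question has been answered.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper illustrating the proposed write protocol. */
final class SafeFileWriter {

    /** The three answers a confirmation dialog can give. */
    enum Answer { OK, NO, CANCEL }

    /** Abstraction over the dialogs, so the logic can run without a GUI. */
    interface Prompt {
        Answer confirmOverwrite(Path file);
        Answer confirmAppendExtension(Path file, String extension);
    }

    /**
     * Returns the path that was written, or null if the user cancelled.
     * Throws IOException if the file could not be written or verified,
     * so the caller can tell the user what happened in every case.
     */
    static Path write(Path target, String extension, String content, Prompt prompt)
            throws IOException {
        String name = target.getFileName().toString();

        // Extension missing or different from the expected one.
        if (!name.toLowerCase().endsWith("." + extension.toLowerCase())) {
            Answer answer = prompt.confirmAppendExtension(target, extension);
            if (answer == Answer.CANCEL) {
                return null;
            }
            if (answer == Answer.OK) {
                target = target.resolveSibling(name + "." + extension);
            }
            // Answer.NO: keep the name as typed and save nonetheless.
        }

        // Ask before overwriting; anything but OK cancels the operation.
        if (Files.exists(target) && prompt.confirmOverwrite(target) != Answer.OK) {
            return null;
        }

        byte[] bytes = content.getBytes(StandardCharsets.UTF_8);
        Files.write(target, bytes);

        // Verify the result instead of trusting that no exception was thrown.
        if (!Files.isRegularFile(target) || Files.size(target) != bytes.length) {
            throw new IOException("File was not written: " + target);
        }
        return target;
    }
}
```

A caller would show a "saved" message when a path is returned, nothing when null is returned (the user cancelled), and an error message when an IOException is caught.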

  • Do we really need the "All Markers" and "All Arrays" check boxes? 1. It is a lot of work for the programmer to understand and then implement everything that is related to them. 2. The user also needs to understand this concept.

  • Graphics export/printing should be of publication quality and standardized, and more than one format should be implemented (jpeg, tiff, png). We need a use case for this.

  • List management (union, intersection, etc.) and visualization of sets: for Venn diagrams it should be possible to select from more than one data object, or even between projects.

  • There is a gray box displayed when starting the application.

  • import projects by adding to existing live projects
  • full support for CEL files, ChIP-chip data, all Affymetrix data types/formats (cel files, cnt files, expression chp files, genotyping chp files (LOH Estimation and Genotyping Analysis), exp files, experimental result summary), chemical information (SD files), ODBC connections

  • 3D visualization of PCA projections

  • genomics visualization tools: plotting features on genomic regions with zooming functionality

  • visualization of multiple chips at the same time

  • more annotation features for graphics (Title, comments, regression lines, etc)
    • possibility to export data to text file for processing with R or Matlab/Excel...

  • better linking of table cells and selected data (highlight OR display selected data within all data)

  • being able to deal with sequences as input for numerical calculations: what is needed here are different ways to encode nucleic and amino acids. There are different matrices that can be applied but also self adapting algorithms one of which I developed during my PhD thesis...
  • different kinds of artificial neural networks.

  • the application should be much more memory efficient

  • various kinds of visualization should be included/improved. 3D visualizations, box plot, ...
  • Annotations should be visible somewhere. There should be a display that shows all/selected types of annotation/additional information about a marker or array.
=> All information that is read into the program should be visible through some interface.
Mantis 870
Currently there is a right-click function called "view Annotations", but it only displays the raw data as a tab-delimited file.

  • In the "open file menu - remote access" one has to right-click to get vital information about the data being retrieved. It is very unintuitive to hide essential information in a right-click.

  • it should be possible to clean the temp folder from within Workbench to be able to revert to the initial installation configuration
  • It should be possible to do a pairwise/multiple sequence alignment from within sequence retriever or for a set of sequences to see differences in sequences that come up for the same marker. Selecting all sequences for a specific marker, adding them to a new sequence object and then doing a sequence alignment would be OK, too.
  • search function in Help is missing

General Features

  • Mantis 958: Preferences are not retained between installs
When a new version of geWorkbench is installed, any saved user preferences are lost. This is because these settings are stored in various files within the geWorkbench distribution itself. This also touches on the known issue of some components remembering which directory they last opened a file in, and others not. Preferences and directory information should be stored in a directory specified by the user, by default the user's home directory or a subdirectory of it, e.g. .geWorkbench/. All components that open files should store the last directory used if it makes sense. This would be a lot easier if files were opened through a central file-handling service, which could store directories on behalf of individual components, or where they could all share a set working directory.
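
A central service of this kind might look roughly like the sketch below. It is an illustration only: the class LastDirectoryStore, the file name lastDirectories.properties and the component ids are hypothetical, and only the .geWorkbench/ directory in the user's home is taken from the suggestion above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Hypothetical central store for the last directory used by each component. */
final class LastDirectoryStore {
    private final Path file;
    private final Properties props = new Properties();

    /** Loads previously stored directories from settingsDir, creating it if needed. */
    LastDirectoryStore(Path settingsDir) throws IOException {
        Files.createDirectories(settingsDir);
        this.file = settingsDir.resolve("lastDirectories.properties");
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
    }

    /** Default location outside the distribution, so it survives a reinstall. */
    static LastDirectoryStore inUserHome() throws IOException {
        return new LastDirectoryStore(
                Paths.get(System.getProperty("user.home"), ".geWorkbench"));
    }

    /** Last directory used by the component, or the home directory if none is stored. */
    Path get(String componentId) {
        return Paths.get(props.getProperty(componentId, System.getProperty("user.home")));
    }

    /** Records the directory and writes the settings file immediately. */
    void remember(String componentId, Path directory) throws IOException {
        props.setProperty(componentId, directory.toAbsolutePath().toString());
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "Last directory used per component");
        }
    }
}
```

Each file chooser would call get() with its own component id before opening and remember() after the user confirms a selection. Using one shared id for all components would give the "shared working directory" variant.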
  • Mantis 0001028: Cancel print not working
1. right-click on "selection" in either Marker sets or Array/Phenotypes sets; 2. select Print; 3. cancel the page properties pop-up; => you are still led to the print page; the Cancel button is not working correctly.
  • Mantis 0001029: printer selection is not working from page setup window
a print dialog pops up with the original default printer selected, please remove the first printer selection option.
The variable components.dir should be set if not set as a command line argument. Problems occurred in MatrixReduceAnalysis.java because of this.
  • Mantis 0000991: history tree structure
The history panel should store all information needed to reproduce the results. This can be done either by saving the caScript code or by a tree-like structure that shows different levels of detail: the top node would show only the name of the component used, the second level the parameters used, and the third level the ids of all input data as a list.
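
One possible reading of the tree variant is sketched below using the standard Swing tree node class. The class HistoryTree, the grouping labels "Parameters" and "Input data", and the choice to hang both groups directly under the component node are assumptions made for illustration, not an existing geWorkbench design.

```java
import java.util.List;
import java.util.Map;
import javax.swing.tree.DefaultMutableTreeNode;

/** Hypothetical builder for one history entry shown as a tree. */
final class HistoryTree {

    /**
     * Builds a node whose label is the component name, with one child group
     * listing the parameters and one child group listing the input data ids.
     */
    static DefaultMutableTreeNode entry(String component,
                                        Map<String, String> parameters,
                                        List<String> inputIds) {
        DefaultMutableTreeNode top = new DefaultMutableTreeNode(component);

        DefaultMutableTreeNode params = new DefaultMutableTreeNode("Parameters");
        for (Map.Entry<String, String> p : parameters.entrySet()) {
            params.add(new DefaultMutableTreeNode(p.getKey() + " = " + p.getValue()));
        }

        DefaultMutableTreeNode inputs = new DefaultMutableTreeNode("Input data");
        for (String id : inputIds) {
            inputs.add(new DefaultMutableTreeNode(id));
        }

        top.add(params);
        top.add(inputs);
        return top;
    }
}
```

With the groups collapsed the user sees only the component name, and expanding them reveals the detail needed to repeat the analysis.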
  • Mantis 0000998: modifying saved settings keeps selection active
Christine, if I understand your proposed solution right, it would require modifying the Analysis Panel code by adding the necessary callback methods each time a new analysis component is developed (to accommodate setting the execution parameters for this component). I would prefer that we avoid this, since it introduces unnecessary component dependencies. An alternative approach may be to have each class that extends AbstractSaveableParameterPanel throw a property change event (or something similar) each time the user changes a parameter value, and to have the Analysis Panel register itself as a listener for such events so that it can update the selection status of a named parameter entry.
For now I would recommend that we give this fix a lower priority; we can look at it at a later time.
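
The listener-based alternative could be sketched as follows with java.beans.PropertyChangeSupport. Both classes are simplified stand-ins with hypothetical names: the real AbstractSaveableParameterPanel and Analysis Panel are not reproduced here, and the pValue parameter is just an example.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

/** Stand-in for a panel extending AbstractSaveableParameterPanel (hypothetical). */
class ParameterPanelSketch {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private double pValue = 0.05;

    void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    /** Every setter fires an event; nothing is fired if the value is unchanged. */
    void setPValue(double newValue) {
        double old = pValue;
        pValue = newValue;
        changes.firePropertyChange("pValue", Double.valueOf(old), Double.valueOf(newValue));
    }
}

/** Stand-in for the Analysis Panel (hypothetical). */
class AnalysisPanelSketch {
    private String selectedSetting = "my saved settings";

    /** Listens generically, so no per-component callback code is needed. */
    void watch(ParameterPanelSketch panel) {
        panel.addPropertyChangeListener(evt -> selectedSetting = null);
    }

    /** Name of the saved settings entry shown as selected, or null if none. */
    String getSelectedSetting() {
        return selectedSetting;
    }
}
```

The Analysis Panel only depends on the generic listener interface, so new analysis components need no changes in the panel itself.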
  • Mantis 0000999: saving parameters not working correctly
save the parameters of matrix reduce; change parameters; save again under different name; go to first parameter settings by selecting first saved item; => nothing happens, still the latest parameters are displayed.

  • Mantis 0001004: Add a "Delete Settings" button to the analysis panel

It works, but on saving the parameter set an exception is thrown: java.io.NotSerializableException. Reopening bug for cleanup.
  • Mantis 846: For every file import we need a warning message if the import didn't work and also a message what has been read in (how many markers, TF sites etc...)

  • Mantis 0000700: progress bar for saving/loading (not only the workspace)

  • Mantis 0000482: "All Markers" and "All Arrays" overrides are confusing

In order to successfully use the marker and phenotype panels, the user must create selection sets, enable those, optionally set them as case/control, and ensure that "All Markers" or "All Arrays" is not checked in their component. This seems very difficult for a user to learn, and is not a very intuitive approach. There must be a better way to manage selections.
It might be sufficient to describe the basic concepts of geWorkbench (like this one) in the documentation AND tutorial etc.
What is the status of the "Visual Builder"?
  • Mantis 0000401: Panel selection/unselection improperly affects visualization components
As it turns out (per Andrea) this behavior is an artifact of the relative color scheme (does not appear for the absolute color scheme). I think we will need to change this in the future: one would expect activated panels to have an impact on the contents of a visual component only if the visual component has indicated that it wants to take into account activated panels.
  • Somebody should go through all components and check that they all respond correctly to all events they should respond to. Quite a few bugs relate to events not being handled correctly, e.g. 330, where the sequence panel does not clear when a project is being removed.
Bugs related to system state problems: 0000612, 789, 794, 817, 874
  • Mantis (242): Unable to save more than one panel
This relates to actions (save, delete) on multiple selections in the Project Folder and the Marker and Arrays/Phenotypes panels, where only the first selected item is used.
  • Mantis 0000480: Work flow support
Would be nice to have true work flow support for regular users (without the need for scripting).
(agreed: Bernd)
  • Mantis 0000740: selections in Project folder
When switching to the Workspace or a Project from another object, the views that relate to that previous object are still available. It is impossible to know what is being calculated.
When leaving an object, the display should be reset to an empty state.
  • Not in Mantis:
There are problems related to memory management. geWorkbench uses too much memory for storing Microarray data internally. When loading 9 U133 chips one needs more than 512MB.
  • Mantis 0000114: Persisting configuration settings designated through the visual builder
At present, changes to the start up configuration effected during the application execution (through the Visual Builder) do not get persisted: when the application is launched the original start up configuration is used again.
It is desirable that changes introduced by the Visual Builder be persisted in the application's configuration file so that they can be "remembered" at the next launch.
  • Mantis 0000115: Choosing among available application "flavors"
During application launch, a start up window (which the user can opt to hide for subsequent application invocations) should prompt the user to select one among the available application configuration "flavors". This functionality should be available not only at start up but also from within the application, through a "Preferences" type of menu option.
  • Mantis 0000077: Context sensitive on-line help
We need to make on-line help available in a context-sensitive manner: from within a component, a user should be able (using F1 or by pressing a Help button) to get access to the on-line help associated with the component.
  • Mantis 0000156: Extension of event exchange model
    Entering this here as a placeholder regarding the proposed re-engineering of the event model. Some of the design suggestions that have been mentioned include:
    • Making it a rule that event data are always interfaces.
    • Removing the need to specify within the throwEvent() method the listener interface and the method to invoke in that interface.
    • Using annotations (or some other mechanism) in order to provide a direct reference to a service provider so that methods can be invoked directly rather than through event exchanges.
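
The first two suggestions could be sketched as below. This is a toy dispatcher under stated assumptions, not the geWorkbench event framework: the @Subscribe annotation, EventBusSketch, ProjectRemovedEvent and SequencePanelSketch are all hypothetical names. The third suggestion (direct references to service providers) is not covered.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Marks a method that wants to receive events of its parameter type. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Subscribe {}

/** Event data are always interfaces. */
interface ProjectRemovedEvent {
    String projectName();
}

/** Toy dispatcher: the publisher names neither a listener interface nor a method. */
final class EventBusSketch {
    private final List<Object> listeners = new ArrayList<>();

    void register(Object listener) {
        listeners.add(listener);
    }

    /** Delivers the event to every @Subscribe method whose parameter type matches. */
    void publish(Object event) {
        for (Object listener : listeners) {
            for (Method m : listener.getClass().getDeclaredMethods()) {
                if (m.isAnnotationPresent(Subscribe.class)
                        && m.getParameterCount() == 1
                        && m.getParameterTypes()[0].isInstance(event)) {
                    try {
                        m.setAccessible(true);
                        m.invoke(listener, event);
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException("Listener failed: " + m, e);
                    }
                }
            }
        }
    }
}

/** Example listener: clears itself when a project is removed. */
final class SequencePanelSketch {
    boolean cleared = false;

    @Subscribe
    void onProjectRemoved(ProjectRemovedEvent event) {
        cleared = true;
    }
}
```

A component publishes with bus.publish(event) and never mentions who listens. Matching is done on the parameter type of the annotated method.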


  • Mantis 0000157: Extend framework to bring to focus components that receive appropriate events
An issue with the GUI is that one needs to know which components respond to which events in order to inspect the results of some action. E.g., when executing the hierarchical clustering analysis and in order to review the results one needs to explicitly select the Dendrogram tab.
The framework needs to be extended so that components can gain the focus as appropriate when they receive certain events.
Tabs should be displayed in alphabetical order. It's difficult to find tabs as they are ordered currently.
If the user cannot view all the tabs without scrolling, the tab headers should become drop down values.
  • Mantis 0000478: File save/load operations are memoryless in terms of last directory used.
    There are many places in the app where data are being loaded from or saved to disk. When such an operation is used for the *second* time the app should remember the directory the user navigated to the *first* time. Some specific examples where this is not the case:
    • Exporting an image node from within the projects folders.
    • Saving a data node from within the projects folders.
    • Saving a panel from the Markers component.
  • Mantis 0000635: columns can be moved (applies to all tables)
When a column is moved (which is not necessarily bad) to the first position (row names), the row name is exchanged with a value, but the value then has this "button" look.

  • Mantis 0000547: Inaccurate Error Message

When connecting to a MAGE-OM Server the error message is "can not connect to server" even when it does connect and authenticate the user.
I changed the related code to reflect the real error.
But right now I have no caArray RMI server available to test this bug. If John can give me access to his local caArray server, I can confirm whether the bug is fixed or needs more work.
  • Mantis 0000552: remembering parameters for different sequences
I have loaded two sequences. The visual pane remembers which of (Promoter, Sequence, Position Histogram) panes is used for each sequence. But it doesn't remember the individual options within those panes for each sequence. E.g. when I select the line view for one sequence and the Full sequence for the other, it will display the last selected viewing option.
There should be some consistency here: Either you remember everything or nothing for different sequences.


Is it possible to warn the user if there is no more memory? Especially on the Mac we are using for testing (where one sometimes has to wait quite some time), I would find it helpful to know that geWorkbench has crashed and is not just waiting or doing something.
=> It seems there is nothing we can do about it, but if a child process dies due to memory problems we should be able to identify this and let the user know.
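
As a rough illustration of what a warning inside the application could look like, the sketch below reports heap usage through the standard Runtime API and turns an OutOfMemoryError from a task into a message. The class MemoryCheck and the message text are hypothetical, and catching OutOfMemoryError is not guaranteed to succeed in every situation, so this is a best-effort sketch rather than a reliable safeguard.

```java
/** Hypothetical helper for memory warnings. */
final class MemoryCheck {

    /** Fraction of the maximum heap that is currently in use. */
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    /** True if usage has reached the given threshold, e.g. 0.9 for 90 percent. */
    static boolean isLow(double threshold) {
        return usedFraction() >= threshold;
    }

    /**
     * Runs the task and returns a message for the user if it ran out of
     * memory, or null if it completed normally.
     */
    static String runGuarded(Runnable task) {
        try {
            task.run();
            return null;
        } catch (OutOfMemoryError e) {
            return "The operation ran out of memory and was stopped.";
        }
    }
}
```

A periodic check of isLow() could warn before a large dataset is loaded, while runGuarded() lets the user know that an analysis died instead of leaving the interface apparently busy.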
  • Mantis 0000620: results are not associated with a dataset
see also 523: Reverse Engineering
  1. load two Affy datasets
  2. run a GO term analysis on one of them
  3. switch to the other dataset
=> the results from the first one are still displayed

Programmatic features

  • Mantis 0000400: Inconsistent t-test results based on phenotype/panel activation order
There should have been no significant genes found once the marker selection had been made but the volcano plot was erroneously displaying the genes that had been selected as being significant. This is due to an inconsistency in the way the markers() method is implemented in CSMicroarraySetView. For the time being I've routed around this method but it should probably be changed at some point.

  • Mantis 0000509: Error when trying to create network from saved workspace

0000794: Component state is not saved for any component

Menu

  • Mantis 0000166: main menu bar controls for Project Folders Area are confusing
The problem is caused by the differences in type and semantics between the workspace folder and project folders. For example, the "File" sub menu offers "Open", and the pop-up choices are "File" or "Workspace". Select "Workspace" and you are offered a dialog box. Select "File" and you are told you must first select a project node. Or if you select "New" (an adjective) from the sub menu, you get a choice of "Workspace" or "Project". Select "Workspace" and the current workspace folder is cleared of its contents. Thus New Workspace actually REMOVES the Workspace folder, whereas the sub menu item "Remove" does not include "workspace" as an option. Select "Project" and a new project folder is added without replacing an existing one. I think the whole thing could be made more obvious and logical by making the File sub menu items "Workspace" and "Projects" and perhaps "Files". Then the choices under each could be actions (verbs), e.g. Open, Remove, Load, Delete, Save, etc. In other words, you consider the folder that you want to do something to, select it in the File menu, and choose the action that you want to take. Also, the other file items ("Export" etc.) have nothing to do with the Project Folders Area and are positioned awkwardly.


File

  • Mantis 0001054: workspace file associated with Adobe ImageReady
After saving a workspace on the Mac, the file is associated with Adobe ImageReady. It should be associated with geWorkbench and loaded when double-clicked.
  • Mantis 0001053: saving workspace in a place without rights doesn't give an error message
save a workspace as non-root in e.g. /etc => no error message is shown => no file is created => no exception is thrown.
  • Mantis 0000873/926: remote microarrays are not assigned to a chip type
When loading a microarray experiment from a remote source I am not asked what type of chip it belongs to. Also, it seems that there is no annotation assigned to markers. Then the experiment info could probably have the description from the remote source in it...
  • Mantis 666: merging data sets in project folder doesn't work
  • Mantis 0000823: Annotation parser - building gene names
  • Mantis 0000578: marker names are incorrect
  • Mantis 0000790: Implement a check on chip type selection
File loader should be rewritten; Exp files should be read in twice to verify file format and probably assign Affy chips automatically
An inherent problem for a lot of functions is that probe sets with an unknown gene name are assigned the name "---". This is, simply put, wrong. To account for that, Manju and I suggest attaching the probe id to the "---", first to indicate that no name is supplied and second to distinguish between them. Of course some of the "---" may actually belong to the same gene, but as long as this is not known any such assumption is very dangerous.
In the same breath I suggest giving the user the option of either distinguishing between probe ids with the same name or combining them. This can be accomplished by a toggle button when reading in a file and by associating the probe id with the gene name.
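
A minimal sketch of this labeling rule is given below. The class MarkerLabels, the "name (probe id)" format and the sample ids in the usage note are assumptions for illustration. Only the "---" placeholder and the toggle idea come from the suggestion above.

```java
/** Hypothetical helper that builds the display name of a marker. */
final class MarkerLabels {
    static final String UNKNOWN = "---";

    /**
     * Unknown gene names always carry the probe id so they stay distinct.
     * Known names carry it only if the user chose to distinguish probes.
     */
    static String label(String probeId, String geneName, boolean distinguishProbes) {
        String name = geneName == null ? "" : geneName.trim();
        if (name.isEmpty() || UNKNOWN.equals(name)) {
            return UNKNOWN + " (" + probeId + ")";
        }
        return distinguishProbes ? name + " (" + probeId + ")" : name;
    }
}
```

For example, label("probe_17", "---", false) gives "--- (probe_17)", while label("probe_17", "GENE_A", false) gives "GENE_A" and the same call with the toggle on gives "GENE_A (probe_17)".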
  • Mantis 0000345: File Loading Confusion
No choice to list all files in directory. The file selection box should contain an option to see all the files in a directory, otherwise unclear which file postfix goes with which loading option.
No prompt to save workspace when application exits, there should also be a way to exit the application from the file menu.
  • Mantis 0000452: filtered datasets mis-recognized on read in
If a dataset is filtered, written out, and then read in again, the "magic marker" that defined the chip type may be deleted, and the dataset can be recognized as another entirely. For example, my HG_U95 dataset, after filtering out half the markers, was recognized as HG_133_Plus2.
  • Mantis 0000479: Enable the "Export" functionality for data nodes.
This functionality is supposed to be similar to the "Open file" functionality, where a number of pluggable export filters can be used to export a dataset into another format. At the very least we should support exporting microarray data into the Cluster (http://rana.lbl.gov/EisenSoftware.htm) format.
  • Mantis 0000484: At the File menu, an "Exit" menu item should be provided.
Many other software packages have it. Maybe an option for saving the workspace should be provided before exit.
  • Mantis 0000465: Support loading Affymetrix .CEL and .CHP formatted files
geWorkbench 1.0/caWorkbench v3.0 are capable of reading in only the .txt version of Affy files. Many users (including Northwestern) have indicated it would be tremendously beneficial to support .CEL, .CHP files as well. The former contain probe level data, rather than the probeset level data which the workbench is currently designed to handle. As such, loading of .CEL data will probably have to be coupled with the immediate execution of a normalizer capable of translating probe level data to probeset level data.
=> CEL is already working
  • Mantis 0000418: Chip type selection dialog appears with dataset switch
If the application doesn't recognize the chip type of a dataset, it will ask you to select it, but it will keep asking every time you switch to the dataset, even after you have picked the type once.
  • Mantis 0000549: GUI Poorly Formatted
The GUI for connecting to a remote ca-array server is poorly laid out - Port and Protocol should be on the same line, URL should have its own line, there should be larger room for User Name and Password.
  • Mantis 0000742: directory not remembered
when reading in an annotation file (through the "other" option) the directory of the last load is not remembered.

Edit


View


Commands


Panels
  • Panels should be called sets

Tools


Visual builder

Preferences
  • Mantis 0000513: Changing visualization in preferences has no immediate effect
  • Mantis 0000964: when the path to the editor is not valid, no error message is displayed
Given a wrong path to an editor, I would expect two different error messages: one when entering a path to something that is not there (when editing the path) and one when the program cannot be started (when starting the application).



Help

  • Mantis 0000466: Add "shortcuts" screen
Add a "shortcuts" screen (maybe under the "Help" main menu item?) to list available shortcuts like the very useful F12.
  • Mantis 0000548: No documentation pointer to MAGE-OM Setup
Although there is documentation in 1.03 geWorkbench describing how to connect to a MAGE-OM server, it is not detailed, does not describe the protocols and has no pointer to the MAGE-OM documentation.
  • Mantis 0000511: Help topics not alphabetical
The main help topics in the on-line help are not in alphabetical order.


Project Folders

  • Mantis 0001025: saving dataset as ".exp" doesn't preserve phenotype sets
1. load a data set with more than one array; 2. add both arrays to a new set called "a"; 3. add array 2 to a new set called "b"; 4. create a new phenotype set collection by clicking on "New"; 5. save the data as "test.exp" on your desktop; 6. open the file again with geWorkbench; => only set "a" is retained. All information about set "b" is lost.

  • Mantis 0000730: feature request: union / intersection

being able to combine e.g. fasta sequence/merge data sets

  • Mantis 0000604: select menu-file-open-workspace

=> new workspace is not confirming deletion of old one
  • Mantis 0000739: right click image nodes should show option to export
if there is an image in the project panel there should be an additional option to export the image. Now one has to go through the File menu.

  • Mantis 0000633: copy function

Since all the operations are destroying the original data I believe it could be useful to have a copy function that copies a state of a data set to a new data set.
  • Mantis 0000643: ask if should really remove
I think it is good practice to ask the user if he really wants to remove an object. Accidents happen and if you put a lot of work into one it is rather disappointing if you accidentally remove your work... (This is only meant for projects and dataset, not for images or such)
  • Mantis 0000543/571: removing more than one project/data set only removes the first one
  • Mantis 0000571: sequence selections under Project folder
1. load two or more sequences in different project files.
2. by holding the Ctrl key, select more than one sequence.
=> two sequences are selected, but you cannot do anything with both of them together. How do you combine different sequences into one project such that you can select from the combined sequences?


Markers



Arrays/Phenotypes

  • Mantis 0000393: Inconsistent GUI element
there is no save/load functionality for arrays as there is for sets
  • Mantis 0000415: Panel Label is not prepopulated with last entry.
I performed the following steps:
- Loaded webmatrix.exp
- Added 3 arrays to the selection panel
- Selected an array > Add to Panel, typed over the "Section" label, and entered Panel A
- Then another array > Add to Panel. The Panel label text box was no longer prepopulated.
Expected Result: The Panel label should be prepopulated with the last label created (Panel A).


Sets

  • Mantis 0000381: Renaming a label erases its class (case, control, etc.) status.
To reproduce, set a label's class to 'case', then rename that label.

  • Mantis 0000451: if filter after clustering, dendrograms become invalid
  1. Load a dataset and perform a hierarchical clustering.
  2. View in the dendrogram panel.
  3. Now filter out some values so that the size of the dataset is decreased.
  4. Run hierarchical clustering again.
  5. View in the dendrogram panel. Now select the first cluster output in the project panel.
  6. Because the reduced dataset does not match the original cluster, the drawing fails with numerous bad consequences across the interface.
How to handle changes in dataset size is a design consideration that must be worked out, for example by
  1. duplicating the original dataset so that dependent datasets remain valid, or
  2. removing the dependent datasets that have become invalid. The user should be prompted before such filtering on a dataset that has dependent datasets.


FASTA sequences

Promoter Panel

  • Mantis 0000861: deleting project leaves results
  • Mantis 0000867: Saving results => not the correct line feed is used.
  • Mantis 0000841: can't load new TF sites
  • Mantis 0000859: loaded TF is not handled correctly
After loading the attached transcription factor through the "Add TF" button, the new TF entry appears in the Selected TF list (I was actually expecting it to appear in the TF Mapping list, but this is not a big deal). When I double-click on the new TF it moves (correctly) to the TF Mapping list. When I then go to the TF Mapping list and double-click on the new TF again, I get an exception and the TF disappears from both lists.

  • Mantis 0000712: Promoter Panel Logo Tab needs a header

It would be good to change the color of the currently selected TF. Now it is just a little bit darker blue. Try Black or red/ light red.
  • Mantis 0000653: position of promoter panel
To me the promoter panel is more an analysis module and should be in the lower right area (analysis area) rather than in the viewing area (upper right).
  • Mantis 0000593: manually adjustable p-value changes automatically, which should not happen, any user changeable variable should NOT be overwritten
  • Mantis 0000655: after scan displayed panel is changed
The active/displayed view of the promoter panel is changed to Sequence-Line view whenever a scan is performed.
Especially when changing parameters it can be really annoying.

  • Mantis 0000660: selection is not used for pattern discovery
=> It is a known issue and needs further discussion.
  • Mantis 0000697: progress bar color doesn't change to red if process is interrupted under MAC

Sequence Panel

  • Mantis 0000330: Sequence panel does not clear when project removed.
  • Mantis 0000575: edited sequence causes problems
  1. load attached document
  2. open sequence view
  3. edit sequence (right click on sequence in Project folder, view in editor)
  4. remove 7 nucleotides (AATAATT) around position 380 (the line should now be 70 nucleotides long, like all the others)
  5. activate the sequence (line view) in the sequence panel such that the sequence is shown below the line view
  6. click on the arrow bars
=> sequence disappears.
Also, the sequence from the modified file is not the one shown. The original sequence is shown.
  • Mantis 0000748/651: sequence line should show position/window of displayed sequence
It would be beneficial to have a small window/box or indicator that shows the position of the displayed sequence within the multiple/single line view.
(see also sequence retriever)
  • Mantis 0000650: holding down arrow for moving sequence
It would be nice to be able to scroll through the sequence not only by single-clicking on the arrows but also by holding an arrow down. The sequence should then move slowly.

Position Histogram

The Filter button does not seem to do anything. I also don't see any effect from the Avg./Peak button. Also, the on-line help states that the flex. threshold is not implemented. If these are not in use they should probably be removed or at least deactivated.
The filter button is in fact connected to a pretty sizable method called computeAllPatternStatistics() which has the following note attached:
// Assuming a binomial model, the average density should be given by the likelihood * SeqNo * size of step
// and the variance should be sqrt of that. For the time being, we assume a uniform likelihood
And eventually seems to update each pattern's p-value.
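
For reference, the quantities named in that comment can be written out as below. This is a sketch assuming a standard binomial model in which the number of trials is SeqNo times the step size. The symbols p (uniform likelihood per position), N (SeqNo), w (size of step) and X (number of occurrences) are introduced here and do not appear in the source.

```latex
\begin{align*}
n &= N\,w, \\
\mathrm{E}[X] &= n\,p = p\,N\,w, \\
\mathrm{Var}[X] &= n\,p\,(1-p) \approx p\,N\,w \quad (p \ll 1), \\
\sigma &= \sqrt{\mathrm{Var}[X]} \approx \sqrt{p\,N\,w}.
\end{align*}
```

Read this way, the "sqrt of that" in the comment corresponds to the standard deviation rather than the variance, which may be worth checking when the method is revisited.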
  • Mantis 0000732: display is not updated when parameters are changed
The way the component works now is correct but maybe not very intuitive. We should at least re-examine how it works in comparison with the other graphical tools and try to harmonize if possible.
Bernd suggests just coloring the plot button red if an action is pending but not yet executed (awaiting the plot button being pressed). Another alternative would be to change the plot button to a check-box called "Active" or something like that.
Is the plot button needed for performance reasons? Why is it actually there?

Simulation

  • Mantis 0000608: why can a simulation be performed on an image?
Simulation can be used for any object, why? Can it be at least moved to the beginning? So it is not the panel that pops up all the time first?

Network Generator

Phenotype & Optimizer Options

Interactions Display

Sequence alignment

  • Mantis 0000544: fastacmd support doesn't exist

I don't know if there's a plan to implement fastacmd in geWorkbench, but if there is, would I find it in sequence alignment? What options will be available? => Don't know why we would support this??


BLAST

Columbia server not working...

  • Mantis 0001055: sequences are no longer checked for nt or protein
I can start blastn on a protein sequence (histall.fa). This gives no error message, but also no results, and a NoSuchElement exception.
  • Mantis 0000787: subsequence blast is not working
Build in some functionality to copy part of a sequence to a new sequence or, even better, to really be able to edit a sequence and create a new sequence.
  • Mantis 0000751: up/down keys not working
This problem is seen when navigating in the target sequence area (left panel) of the results panel for BLAST. Even though the selection is moving in the result view on the left hand side (the blue selection is moving) when using the up/down keys the corresponding alignments are not shown. This only works when using the mouse
  • Mantis 0000734: naming conventions for alignment results and imported sequences
Blast results get their name from the first name listed in the actual blast output. The db field is set to the appropriate database. When importing a sequence using "Add selected sequence to project" a random number is used for generating an artificial name for that sequence. The name of the sequence is derived from the NCBI (gi name). There is no additional information in the data history on where the sequence comes from.
Needed functionality:
- Dataset history should reflect where the sequence is coming from: i.e. which blast run, which result entry
- the name of the sequence shouldn't be a random number but rather the name of the first sequence that is imported. In case there are multiple sequences with the same name, a number should be appended, separated from the original name by an underscore.
- The name of the sequence should be same as the one from the alignment result.
Right now it is impossible to trace back where a specific sequence is coming from.
  • related to 0000541: blastall is not complete
using local / own databases
  • Mantis 0000521: Need dates for BLAST databases
People need to know the creation date of the BLAST databases they are running a query on. They cannot just trust that we have the latest. It would be great if a query could be done to import this info into geWorkbench, if not, then there should be a link or button that would bring up the appropriate web page on the AMDeC website. Really it should be in the interface though.
  • Mantis 0000671: load in alignment result is not working (as I expect)
I would just remove the load button. Why would you want to do this in the first place? I would understand if this button would be on the analysis component of Blast but in the result component... ????
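For the naming and history convention requested under Mantis 0000734, a minimal sketch could look as follows. It assumes the set of names already in the project and the history log are plain Python collections; all function names, the placeholder sequence name and the history text are hypothetical.

```python
def unique_name(base, existing):
    """Return `base`, or `base_N` with the smallest N >= 1 not yet in use."""
    if base not in existing:
        return base
    n = 1
    while f"{base}_{n}" in existing:
        n += 1
    return f"{base}_{n}"

def import_sequence(base, existing, history, blast_run, result_entry):
    """Register an imported sequence and note where it came from."""
    name = unique_name(base, existing)
    existing.add(name)
    history.append(f"{name}: imported from {blast_run}, result entry {result_entry}")
    return name

names, history = set(), []
first = import_sequence("seqA", names, history, "blast run 1", 3)
second = import_sequence("seqA", names, history, "blast run 1", 7)
```

With this scheme the second import of the same name becomes "seqA_1", and each history line records the run and result entry the sequence was taken from.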

HMM

has been removed


Other

has been removed


Pattern discovery

  • What is the meaning of "Add pattern to project" ? It doesn't do anything of value, other than producing a new node in the project panel, that cannot be saved even though there is the option to do so.
  • The "all Sequences" check box is gone from the sequence view, but I guess I just missed that one...
  • Profile HMM and Grouping should be removed from the sub panels since they are not implemented yet, what about "use globus"?
  • Pattern discovery still seems very unstable. It is quite easy to crash the splash server (though not entirely reproducible). I would suggest keeping a log of what queries were executed and which one failed. This might make it possible to improve the parameter processing/validation before starting splash.
  • Mantis 0001049: BLOSUM matrices are not correct
In geWorkbench one can select BLOSUM50, 100, 150, whereas on the server there are only BLOSUM50, 62 and 62F. Also, no error is reported when a matrix is not there.
  • Mantis 0001048: load pattern doesn't create a result node
when loading a pattern file I would expect that this would create a new node. Otherwise the first file being read would be overwritten.
  • Mantis 0001045: activated component changes after execution
Run a splash calculation => the activated component changes from "Pattern Discovery" to "caSCRIPT". This is irritating; one should not be thrown out of a view because something else is executed.
  • Mantis 0001046: empty result window when no pattern found
It would be nice to have a message saying that no pattern was found in the result view. Currently a result node is created, it is empty and the status bar says done.
  • Mantis 0001047: results view doesn't change when going to parent node
Splash results are calculated based on a fasta data node, and a new result node is shown when splash is executed. When switching back to the first fasta node, the results are still shown. This can be confusing at times because they should not be directly connected.
  • Mantis 0000889: Hierarchical Pattern Discovery does not work
In the Pattern Discovery system test, test 0000003, Hierarchical, a null pointer exception occurs. Xiaoqing believes that the problem involves the server side C code. The problem may have arisen with the AXIS communications code changes recently (or not). It is connected with the way intermediate results are communicated back to the Splash client in geWorkbench.
The existing code can only be used for saving normal patterns. There is no code available to save hierarchical clustering at all.
When saving a pattern over an existing file I am not asked if I want to replace the file.
  • Mantis 0000858: pattern discovery only works on all the sequences
It is not possible to run pattern discovery on a sub set of sequences. E.g. select only the first two sequences of a fasta file and run pattern discovery. => Calculations are done on all sequences. this doesn't fit the behavior of (most) other components.
  • Mantis 0000496: sequence masking needed
A major planned work-flow of geWorkbench is to be able to search for patterns in sequences surrounding co-regulated genes. However, no provision is made to mask or deal with masked sequences. Many patterns found in unmasked genomic sequence can be expected to be in repeated elements of various types (or does Splash already have a way of dealing with this?). Solving this may involve enhancements to both the Pattern Discovery module and perhaps to the Splash server. The Splash server currently is believed to recognize mask characters N and #.
Searching for regulatory patterns is potentially very sensitive to the details of how it is carried out, as such patterns may be short and embedded in regions of repeated sequence. The greatest possible degree of flexibility and control may be required.
Suggestions:
We will assume we can obtain from the Sequence Retriever sequence that is either unmasked or is reversibly masked, that is with small letters denoting masked sequence and capitals denoting unmasked.
The following options could be considered:
  1. If lower-case-masked sequence is available, provide the option to convert masked characters to Ns before sending to the Splash server.
  2. We could host a repeatmasker server (www.repeatmasker.org). We could then mask sequence as desired, e.g. mask low complexity but leave in LINE elements. There are many considerations and subtleties discussed at repeatmasker.org.
  3. The Splash server itself could be enhanced to offer simple masking, along the same lines as the BLAST programs do.
  4. We should give some indication of the kinds of statistics one could expect from true conserved patterns vs those arising from repeats. Can they be distinguished?
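Option 1 above (converting lower-case masked characters to Ns before sending to the Splash server) can be sketched in a few lines. This is only an illustration; the function name is hypothetical and nothing here is taken from the Pattern Discovery module.

```python
def soft_mask_to_n(sequence):
    """Replace lower-case (soft-masked) letters with N.

    Upper-case letters and other characters, such as an existing N or #,
    are passed through unchanged.
    """
    return "".join("N" if ch.islower() else ch for ch in sequence)

masked = soft_mask_to_n("ACGTacgtACGT")
```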


  • Mantis 0000665: conceptional inconsistency
Pattern discovery is the only component I came across so far that has “Pattern discovery” (itself) in the analysis section. All other modules just have “Experiment Info”, etc. and in the visualization area the results. Actually I do like the idea of having “itself” still in the analysis section because this way the parameters are still stored. But it is still inconsistent.
  • Mantis 0000648: navigation keys don't work properly
There are only two ways of selecting patterns such that they are displayed in the sequence window: a single click on a pattern, or holding the Shift/Ctrl key and using the mouse. There are other ways to change the selection within the pattern panel that have no influence on the sequence panel. Some of them are: using the arrow keys, selecting more than one entry with the mouse.

Associations


caScript


Dataset annotations


data history


Experiment Info



Microarray data

The following panels/functionalities are requested specifically for Microarray data:

Simulation


Cell Viewer

  • Mantis 0000839: image not removed from view panel when project is removed
after removing project image is still there
  • Mantis 0000840: visual display not updated when project removed

Network browser


Cellular networks KB

  • Mantis 0000927: Erratic population of Activated Markers List when switching between project folder nodes.
0000961 selected markers don't change with changing project/dataset
  • Mantis 0000950: sorting by columns in activated marker list
It would be desirable to be able to sort by the columns in the activated marker list.
  • Mantis 0000955: Cytoscape visualization doesn't have arrows, shapes as described in Use case
The visualization of networks derived from "cellular Networks KB" doesn't show the following features
  1. edge thickness is always the same
  2. nodes are all squares
  3. GO and other annotation is not available
  4. edges have no arrows/directionality
  5. geneways query is not possible

Cytoscape

  • Mantis 0000954: analysis components active when working on Cytoscape node
When working with a Cytoscape component (e.g. one derived from cellular networks component) the analysis tab in the lower right is still available.

Interactions


Aracne

  • Mantis 0000956: aracne on adjacency matrix
According to the use case, Aracne can be run on Adjacency matrices (this is the case). Unfortunately I cannot check if the full calculation is done or not, but I can see that I can change all parameters. Certain parameters (hub markers, Kernel width) should be disabled when working with Adjacency matrices.
  • Mantis 0000957: out of bounds error
This is a BUG, related to running Aracne on Cytoscape output
  • Mantis 0000959: system doesn't validate parameter entries
it is possible to enter strings and numbers that are out of range for the parameters (e.g. Mutual Info.)
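For Mantis 0000959, a minimal sketch of the kind of check that is missing: parse the text entry, then reject anything that is not a number or falls outside the allowed range. The function name is hypothetical and the range 0 to 1 in the example is an assumption, not the documented range of any Aracne parameter.

```python
def parse_bounded_float(text, low, high):
    """Parse `text` as a float and require low <= value <= high.

    Raises ValueError for non-numeric input, NaN, and out-of-range values.
    """
    try:
        value = float(text)
    except ValueError:
        raise ValueError(f"not a number: {text!r}")
    if not (low <= value <= high):  # NaN fails this comparison as well
        raise ValueError(f"{value} is outside [{low}, {high}]")
    return value

threshold = parse_bounded_float("0.5", 0.0, 1.0)
```

The GUI could catch the ValueError and show its message next to the offending field instead of starting the calculation.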

Tabular Microarray Viewer

  • Mantis 0000627: selecting arrays and markers
It would be nice if markers and arrays could be selected/highlighted by double clicking on row / column headers

Sequence Retriever

  • Mantis 0000869: resetting display when selection changes
shouldn't the results be reset if the selection changes? When I add a marker to a set or remove a marker this is reflected in the list of markers in Seq. retr. but not in the retrieved results.
  • Mantis 0000817: sequence retriever not correctly threaded
When doing a longer retrieve, i.e. many sequences, changing the project/file will interrupt the process. Also, results are not remembered when switching between files/projects.
  • Mantis 0000812: select all sequences button is missing
It is not possible to select all sequences with a single button click. one has to manually select all individual sequences.
  • Mantis 0000813: additional databases
Currently it is possible to modify the chiptypeDatabaseMap.txt file to include additional databases. There are ways with MySQL to retrieve all available databases. We should make use of this and show only the latest version of any database
  • Mantis 0000743: only mouse and human sequences can be retrieved
Drosophila and all other genomes are missing...
  • Mantis 0000748: sequence line should show position/window of displayed sequence
It would be beneficial to have a small window/box or indicator that shows the position of the displayed sequence within the multiple/single line view.

Synteny

  • Mantis (104): To add option of adding a custom annotation track from file to current dot- or feature-matrix.
  • Mantis 0000530: incorrect links to UCSC genome browser
  • Mantis (158): Synteny module is not properly linked to the project.
Synteny module can't obtain data from the current project or submit the results of computation to it. Consequently those results can not be saved in a work space.
  • Mantis (342): Synteny needs cancel button
Program
MUMmer
Dots
Synteny map
Genome selections
Annotations

Microarray viewer

  • related to (closed) Mantis 0000225: Microarray panel details
It would be nice to have the values (min, max) displayed next to the color legend
  • Mantis 0000326: absolute visualization needs viewing controls
If a set of unnormalized Affy data is read in, for example the cardiogenomics dataset, with the tools->preferences->visualization mode set to absolute, the dynamic range of the data is so large that only a few data points appear. Some options to deal with this would be:
  1. a control labeled "Display as log transform"
  2. an intensity control such as in the color mosaic panel.
  • Mantis 0000431: Table View does not sort properly
Sorting by most columns does not work properly (does not order numerically or alphabetically), and attempting to sort by p-value appears to do nothing.
=> also true for any table
  • Mantis 0000550: color scale disappears when window small
If the window size is made small enough, the expression color scale disappears, leaving behind just its label. This may be common if people have small screens.
  • Mantis 600: The function of show marker is not clear. It might be useful to consider the following features/functionalities:
  1. marker positions should be remembered when scrolling through plates. Either only for the plate that they were selected on, but probably even better for all the plates
  2. Negative values should be marked in red, otherwise it is not possible to see the marker for a negative value
  3. It would be nice if there were a correlation between the markers in the microarray viewer and selected markers from the Markers panel. Right now the Marker panel updates to the marker that is double clicked on.
  4. When switching between "All Markers on and off" in the off-mode the marker selection is not shown, but when switching back to on they are shown. This should be somehow more consistent and the markers should also be visualized in the off mode
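For the table sorting problem (Mantis 0000431), one way to get the expected order is a sort key that compares numeric cells as numbers and everything else as case-insensitive text. This is a sketch under the assumption that cells arrive as strings; the function name is hypothetical and this is not how the Table View is implemented.

```python
def column_sort_key(cell):
    """Sort key: numeric cells first (by value), then text (case-insensitive)."""
    try:
        number = float(cell)
    except ValueError:
        number = None
    if number is None or number != number:  # plain text, or NaN
        return (1, 0.0, cell.lower())
    return (0, number, "")

cells = ["10", "9", "abc", "1e-3", "2", "Abd"]
ordered = sorted(cells, key=column_sort_key)
```

A plain string sort would put "10" before "2" and "9"; with the key above p-values written in scientific notation also sort by value.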

Gene Ontology

there are some issues arising that stem from converting non directed acyclic graph into one...
  • Mantis 0000827: p-value display to show gene names/probe ids on x-axis
I would suggest printing the gene name or probe id as a label for the x-axis, since that is, from what I understand, what the numbers correspond to.
  • Mantis 0000828: saving profiles is not working correctly
the file format is not readable
  • Mantis 0000825: incorrect Go term assignments
in the U95 annotation file the two additional genes only have entries for cell and not for cell component. But in the table view it is annotated as cellular_component. (it should be only cell)
  • Mantis 0000821: GO Term component does not respond to external events
It sounds like a good idea to create a results node in response to a GO Terms analysis (especially since this analysis can take a non-trivial amount of time to complete). We may have to change the GUI a bit, so that we keep track of the list of activated genes that were used for the analysis (since we cannot assume anymore that they will be the ones currently activated in the Markers component).
  • Mantis 0000623: Organization of panel is misleading
The Gene Ontology Panel is divided into three sub-panels: TreeView, TableView, P-value Trend. TreeView holds vital information for TableView and P-value: Chip set used, Reference List, selection to choose which of the above, and which class of GO terms is used. Those options/parameters should be visible in all of the three main panels since they heavily rely on them and should be moved outside of the panel structure within the GO panel!
Affy ids are used for mapping to the GO tree structures, but then the gene names are used to do the calculations. The biggest problem here is affy ids with "---" as a gene name: they are all being folded into one set.
  • Mantis: 0000617: loading reference list
When loading an external reference list, basically any file can be loaded. Manjuanth told me that only affy id's are actually being processed. We need a way to show the user what was actually understood from the file that was just read: something like the number of lines read, the number of identified affy id's, and the number of non-identified objects. Each of the non-identified objects should be displayed on the console. The help file should clearly state what the reference list file should look like.
  • Mantis 0000618/830: weird behavior - displaying reference lists
problem with large white blocks in markers area of component
  • Mantis 0000834: we need unit tests for the calculation parts in GOpanel
  1. computeHypergeometric
  2. getPValue
  3. mapNode
  4. computeCorrectedPValues
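As an illustration of what a unit test for the hypergeometric part could check (Mantis 0000834), here is a small reference implementation with a hand-computed case. It assumes the usual upper-tail definition P(X >= hits); the function name and signature are hypothetical and do not mirror computeHypergeometric or getPValue in GOpanel.

```python
from math import comb

def hypergeom_upper_tail(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement
    from `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    favorable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favorable / total

# Hand check: (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = (36 + 4) / 120
p = hypergeom_upper_tail(10, 4, 3, 2)
```

Small cases like this one can be verified by hand and then used as fixed expectations against the real implementation.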

GSEA


Reverse Engineering

Everything is going to be switched to the new version of Cytoscape, then a lot of the issues will be resolved

By default Cytoscape comes with two panels that can be selected (You will see them after resizing.) They cannot be switched by clicking on the panel heads but only by clicking on the panel names. (The other one was called sim1...)

Don't know what the check box 2nd Marker in Reverse Engineering - Profiler - Conditional tab means

Best fit should be a line not some dots.

Note that "Exp. Range From/To" only works on expression values and not on rank order plots!

  • Feature request (not in Mantis): change the "Filter" button under Conditional tab into a check box. I find it irritating that you have two states that can only be distinguished by the color of a button.
  • It is not clear what the Probability vs. Score plot is for and when there will be something displayed. Also, I am not sure what the "Mutual Information Distribution" and "p-value Distribution" check boxes do (see also Mantis 0001039).
  • Mantis 0001040: Motif Location Histogram: color of filtered values cannot be changed
see also scatter plot
  • Mantis 0001042: export genes is not working
The button "export genes" is not doing anything, there is not even an error message, but the print Genes button prints something.
  • Mantis 0001041: export file extension
when exporting graph as Jpeg (nothing else is selectable) the file extension is not automatically added if not given
  • Mantis 0001038: motif location histogram should be empty if no gene is selected
In the result gene list no gene is active, but the plot in Motif Location histogram is still visible. This should not be the case. It will also be still visible if I open a new data set and then change to Reverse Engineering.
  • Mantis 0000795: scatter plot not working
it is not plotted when the "All arrays" is not activated. If activated it is plotted. (but only the motif location plot)
One has to select microarrays in the Phenotype panel, but it is not enough to just select them; there has to be at least one named selection for the microarrays to be drawn.
one problem is related to GeneProfiler.java line 1140. Here it is evaluated if the panel size > 1 ((dataSetView.getItemPanel().panels().size() > 1)). I believe this is incorrect and doesn't account for selected markers that are not named...
  • Mantis 0000774: Motif Location Histogram does not default to all arrays display
The Motif Location Histogram display within the Reverse Engineering component does not follow the expected default display rule, whereby if no arrays are activated, then all arrays are considered activated. It only displays results if arrays are activated or the "All Arrays" checkbox is checked. It should really display all results if nothing is activated.
  • Mantis 0000799: results are not sorted by absolute value
The results in the reverse engineering panel are sorted by their numerical value. A negative 100 (-100) is displayed last whereas 100 is first. Since it is the absolute value that is interesting they should be ordered accordingly.
  • Mantis 0000802: Conditional analysis
  1. load web4.exp from https://sharepoint.cu-genome.org/c2b2/Testing/head/microarrays/Reverse%20Engineering/data
  2. select first gene from markers list
  3. analyze 2d
  4. select conditional analysis panel
  5. use 31350_at as gene1
  6. gene2 should be 31314_at
  7. select intersection
  8. hit compute
=> last panel shows three times the same entry
=> the entry seems to be wrong
=> first panel shows all entries in duplicate with the same MI
  • Mantis 0000803: wrong name for gene1 gives exception in conditional analysis
user should be notified if the selection is not valid
  • Mantis 0000680/779/795: Cytoscape - multiple menu entries for right click in image
  • Mantis 0000531: Unexpected results when using Print and Export buttons
prints too much, exports wrong things
  • Mantis 366, 472, 797: Mutual information/p-value distribution appears to be non-operational
Plot is not being drawn. This appears to be quite difficult and Manju is supposed to take care of this.
  • Mantis 0000678: Cytoscape - right click within graphics area
A right click within the graphics area gives you a lot of options, including to search Google. Why not search NCBI, Pubmed?
  • Mantis 0000679: Cytoscape - resizing nodes
nodes can be resized within Cytoscape and the size/shape is retained even when changing the model. There are two ways of getting the original shape back: 1. changing it manually or 2. by switching to a different Adjacency Matrix and back. This is a bit awkward and inconsistent.
  • Mantis 0000677: conceptual inconsistencies
Reverse engineering is located in the visualization area and not in the analysis area. To be consistent the visualization and parameter selection should be separated.
  • Mantis 0000675: Cytoscape - Project Folders
When running multiple analyses that create multiple networks, new items are created in the project folder. At the same time new entries are created in Cytoscape. This is redundant and confusing. There can (should) only be one, and it should probably be the Project folder one with all its drawbacks, just because it is more consistent. But do we then need the left panel showing all the networks?
Cytoscape comes with a whole set of menus and functionality that is not visible at first, only after resizing the window they come up.
=> this should be changed
The layout for the reverse engineering component does not properly distribute space amongst the components. Upon first viewing the plugin one of the graphs is compressed to only 10-20 pixels high.
  • Mantis 469: documentation incomplete
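For Mantis 0000799 above (results should be sorted by absolute value), the requested ordering amounts to sorting on the magnitude of the score. A minimal sketch, assuming each result is a (gene, score) pair; the names and scores below are made up.

```python
def sort_by_magnitude(rows):
    """Order (gene, score) pairs by |score|, largest first.

    Python's sort is stable, so rows with equal magnitude keep their
    original relative order.
    """
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)

results = [("a", 100.0), ("c", 5.0), ("b", -100.0), ("d", -50.0)]
ranked = sort_by_magnitude(results)
```

Here -100 ends up next to 100 at the top instead of at the very end of the list.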

Scatter Plot

  • Mantis 0000606: settings of properties are reversed when double clicking on marker
the tear-breaker!
  • Mantis 0000807: reference lines breaks at (1000,1000)
  • Mantis 0001040: Motif Location Histogram: color of filtered values cannot be changed
It is not possible to change the color of a marker from within the plot area. This would be desirable in "reverse engineering" with "filtered" values. Sometimes a bright yellow or green is chosen for this and is barely visible

caBIO pathways

  • Mantis 0000583: Pathway Results not obvious
When a pathway is retrieved, a naive user does not know to look in the caBIO Pathways tab to find the results. User should be notified somehow.
Marker Annotations and Pathway components should probably be combined, since they are inextricably linked.
  • Mantis 0000584: SVG Pathway Export
It would be great to allow easy export of the caBIO Pathway graphics. This is a great thing to give to a researcher, it looks very impressive. Failing that, a link where one could get the pathway would be nice.

Marker Annotations

  • Mantis 0000838: exporting gene list: extension is missing
When exporting genes to CVS, the type of file is suggested to be .cvs, but when saving e.g. to a file called "test", there is no ".cvs" automatically added. I would expect such behavior.
  • Mantis 0000837: links not working?
A few "Interleukin 1, alpha" and one "Interleukin 1 alpha" (last one without ",") are displayed. Clicking on any of the ones with the "," and selecting CGAP mouse will lead to a wrong web page. The same holds true for the one without "," and CGAP Human.
  • Mantis 0000585: Export of spreadsheet data
  • Mantis 0000805: additional annotations requested
  • Mantis 0000806: new component for sequence and microarray integration
Andrea has suggested adding the option of retrieving from additional sources of annotation, including Entrez, GeneCards (Weizmann), etc. For a list see mantis report
a new data-agnostic component is needed for integrating data from sequence analyses and microarray data.

See additional info
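For the missing file extension on export (Mantis 0000838, and likewise 0001041 for graph export), the expected behavior can be sketched as below. The extension is passed in as a parameter; ".csv" is used only as an example value, the function name is hypothetical, and the confirmation dialogs proposed in the feature request list are left out.

```python
def with_extension(filename, extension):
    """Append `extension` unless the name already ends with it.

    The comparison ignores case, so "test.CSV" is left alone for ".csv".
    A name with a different extension gets the expected one appended.
    """
    if filename.lower().endswith(extension.lower()):
        return filename
    return filename + extension

saved_as = with_extension("test", ".csv")
```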


Color Mosaic

  • Mantis 0000477: Missing right click actions
CM should have right click actions consistent with Expression Profiles, EVD and Scatter Plot, which support zoom in/out, save as and print.
=> at least have a mouse-over function that displays the chip and value for the spot
  • Mantis 0000611: color slider function not obvious (probably not only in color mosaic)
The real values displayed in a heat plot have to be mapped into a color space that usually ranges from 0 to 1. A threshold has to be defined beyond which all values are mapped to 1 in the color space. By moving the threshold lower, differences in the lower magnitudes can be visualized. This does not seem to happen in geWorkbench. I would expect everything to light up when I move the slider to one of the extremes. This does not happen.
Since I have seen the behavior also in the other heat map representation and this is not the obvious behavior it either has to be clearly documented what is happening or changed to "normal" behavior
  • Mantis 0000642: information about arrays is missing
Let's follow here the exact same solution used in the Dendrogram component: there is a little sticky button with a bulb icon at the bottom right corner of the Dendrogram GUI. When pressed, moving the mouse over any spot reveals the array name, marker name and the expression value for the marker.
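The slider behavior expected under Mantis 0000611 can be written down as a small mapping. This is a sketch of the "normal" behavior described there, assuming intensity is based on the magnitude of the value; the function name is hypothetical and this is not the Color Mosaic code.

```python
def color_intensity(value, threshold):
    """Map a value to a color intensity in [0, 1].

    Magnitudes at or above `threshold` saturate at 1, so lowering the
    threshold makes small differences visible.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return min(1.0, abs(value) / threshold)

half = color_intensity(5.0, 10.0)
saturated = color_intensity(5.0, 2.0)
```

With a threshold of 10 the value 5 is drawn at half intensity; moving the threshold down to 2 saturates it, which is the "everything lights up" effect at the extreme slider position.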

Expression Profiles

  • Mantis 0001027: color from marker preferences are not used
when displaying selected markers the color that can be associated with visual properties of the marker set should be used to display the lines for that marker in Expression Profiles
  • Mantis 0000524,471: no mouse over on expression graph (assigned to John)
Some of the components support displaying the marker name when a data point is moused-over. The Expression Profiles component does not, instead you must click on a line and view its name in the Markers panel. Mouse over would be a good enhancement.
  • Mantis 0000682: image snapshot is missing

Expression Value Distribution

  • Mantis 0000923: setting ranges directly in text boxes does not update "Selected genes" count
The Selected Genes count is updated if the boundary values are changed using the sliders. It is not updated if numbers are typed directly into the text boxes. At least when a text box loses focus the value should be recalculated, e.g. when you click between boxes.
The boxes now cannot be edited at all. This was not the desired change. We should still be able to edit the values in the text boxes, and after making a change, a loss of focus event should cause the "Selected genes" field to update.
  • Mantis 0000214: Double clicking on an activated array plot line does not make the array the base array
  • Mantis 0000572: domain and range axis: naming of axis confusing
  • Mantis 0000681: boundaries for selection are not all visible
This issue has already been brought up a few times. It is related to JFreeChart, the package we use for drawing charts. It is not very easy to extend JFreeChart to implement the two boundaries.

Experiment Info


Normalization


Housekeeping Genes Normalizer
  • Mantis 0001069: housekeeping normalizer: no genes selected, nothing calculated, need message
It would be good to tell the user that nothing has changed if no housekeeping gene has been selected. Currently the system behaves like everything is normal. I would like to see a pop-up warning the user that no valid markers have been selected as house keeping genes.
  • Mantis [1]: house keeping filter doesn't recognize markers correctly
31310_at is in the dataset but recognized as not being in it
  • Mantis 0001052: parameters are remembered through project datasets
the same genes are already highlighted without even having hit "Analyze"

Log2 Transformation

Marker-based centering

Mean-variance normalizer

Array-based centering

Missing value computations

Threshold Normalizer

Quantile Normalization

T-Profiler


Analysis

  • Mantis 0000872: Multiple testing correction has no impact on volcano plot and color mosaic displays (as it should).
The result of the correction is that no markers pass the post-correction threshold of 1E-3 (as implied by the new marker group with 0 markers created, called "Significant Genes(1)"). However, the Volcano Plot and the Color mosaic displays remain unchanged (same 5 markers displayed with the same, uncorrected p-values). The desired behavior (after a correction has been applied) would be for the data points in the Volcano plot and the color mosaic to be displayed using their corrected p-values (in this particular example then, no markers should have been displayed, as no corrected value passes the threshold).
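The desired behavior can be illustrated with a small sketch in which the displays select markers by corrected rather than raw p-values. The report does not say which correction was applied, so a Bonferroni correction is assumed here purely for illustration; the p-values and the number of tests are made up.

```python
def bonferroni(p_values, n_tests):
    """Bonferroni-corrected p-values, capped at 1."""
    return [min(1.0, p * n_tests) for p in p_values]

def passing(p_values, threshold):
    """Indices of markers whose p-value is at or below the threshold."""
    return [i for i, p in enumerate(p_values) if p <= threshold]

raw = [2e-4, 5e-4, 9e-4]            # made-up uncorrected p-values
corrected = bonferroni(raw, 1000)   # assumes 1000 markers were tested

shown_before = passing(raw, 1e-3)
shown_after = passing(corrected, 1e-3)
```

All three markers pass on the raw values and none pass after correction, so a display driven by the corrected values would show nothing for this input.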

The following analysis functions are requested:

  • ANOVA
  • Mantis 481: K-means is requested
  • mantis 0000307: Analysis should write to dataset history

When any type of analysis is done, basic facts should be written to the Dataset History log, such as whether all markers were used or an activated panel; the number of markers/arrays in the activated panel, and the array type, and a timestamp.
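A history line carrying the facts listed above could be built as in the following sketch. The field order, wording and function name are hypothetical, and the example values are made up.

```python
from datetime import datetime, timezone

def history_entry(analysis, used_all_markers, n_markers, n_arrays,
                  array_type, when=None):
    """Format one Dataset History line for a finished analysis."""
    when = when or datetime.now(timezone.utc)
    scope = "all markers" if used_all_markers else "activated panel"
    return (f"{when.isoformat()} {analysis}: {scope}, "
            f"{n_markers} markers, {n_arrays} arrays, array type {array_type}")

entry = history_entry("t-test", False, 120, 8, "example-chip",
                      when=datetime(2000, 1, 1, tzinfo=timezone.utc))
```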

  • classification node should be reusable and be applicable not only to the data set they were derived from but also to other projects etc.

Fast Hierarchical clustering
  • Mantis 0000847: Allow creation of array group comprising arrays in a cluster
The current implementation of the Dendrogram component permits a user (via a right click pop-up option) to grab all markers in the currently displayed cluster and add them to a marker group. A similar functionality should be added for arrays as well: another pop-up option should allow collecting all arrays in the currently displayed cluster into an array group.
  • Mantis 0000148: Euclidean distance should use normalized vectors
Data should be transiently normalized
  • Mantis: 0000485/000046: Aborting Hierarchical clustering does not really abort.
  • Mantis 0000594: Clustering on array dimension persists after used
Open a microarray dataset and run hierarchical clustering with the dimension selected as "both". After it completes, set the clustering dimension to "marker" and re-run the clustering algorithm. You will see that the dendrogram is still clustered in both dimensions. What you want here is for the dendrogram to be ordered in the original order of the arrays in the data file.
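For Mantis 0000148 above, "transiently normalized" can be read as: normalize copies of the vectors while computing the distance and leave the stored data untouched. The report does not say which normalization is meant, so scaling to unit length is assumed in this sketch; the function names are hypothetical.

```python
import math

def unit_vector(values):
    """Return a copy of `values` scaled to unit length (zero vectors unchanged)."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        return list(values)
    return [v / norm for v in values]

def normalized_euclidean(a, b):
    """Euclidean distance between unit-length copies of `a` and `b`."""
    ua, ub = unit_vector(a), unit_vector(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ua, ub)))

profile_a = [1.0, 2.0, 3.0]
profile_b = [2.0, 4.0, 6.0]
distance = normalized_euclidean(profile_a, profile_b)
```

Two profiles with the same shape but different scale end up at distance (close to) zero, and the input lists are not modified.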

Matrix reduce
  • Mantis 0000971: sequence view - scroll through selected PSAM
The sequence view shows the application of a PSAM that is shown in the upper left corner on all the sequences. The PSAM can only be selected by double clicking on a PSAM in the PSAM detail view. It would be nice to have the possibility to select a different PSAM from the sequence view
  • Mantis 0000973: Filtering for Sequence Search is not working
I tried both sequence names and sequence motifs in the filter box. Neither produced any hits, though they were there in the data. Also, if the filter is for sequence name, what is the purpose of the Threshold box? Reopening bug.
  • Mantis 0000974: svd of MxN matrix, M<N, is not implemented
This bug has been marked as having a resolution of "not fixable" until a new version of the code is received from the developer. If this is not currently fixable, this fact should be documented in the tutorial/manual/FAQ/Known Problem list. Assigning to documentation team. Bug reopened for documentation.



gene pattern
WV
  • Mantis 0000877: misleading message displayed
With the number of cross-validation folds set below 2 (i.e. 1), a message pops up saying "Control data is missing".
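A minimal sketch of a parameter check that reports the actual problem, assuming the fold count and the sample count are both available before the analysis starts. The class name, method name, and message texts are hypothetical and not taken from the geWorkbench or GenePattern sources.

```java
import java.util.Optional;

/** Hypothetical parameter check; not taken from the geWorkbench sources. */
final class FoldCheck {

    /** Returns an error message if the fold count is unusable, or empty if it is fine. */
    static Optional<String> validate(int folds, int sampleCount) {
        if (folds < 2) {
            return Optional.of(
                "Number of cross-validation folds must be at least 2 (got " + folds + ").");
        }
        if (folds > sampleCount) {
            return Optional.of(
                "Number of cross-validation folds (" + folds
                + ") exceeds the number of samples (" + sampleCount + ").");
        }
        return Optional.empty();
    }
}
```

Running such a check before the control/case data check would prevent the unrelated "Control data is missing" message from appearing.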
SVM
PCA
  • Need to check for Java 3D / OpenGL 1.2

SVM classifier
  • Mantis 474: online help missing

Slmr classifier

SOM
  • Mantis 0000685: image snapshot only captures one cluster

It would be nice to be able to capture all clusters at once as well.

  • Mantis 0000574: clarity for parameter description

I would like to suggest more descriptive parameter names. For example, the parameter Alpha could be called "Learning rate (alpha)", Rows could be named "Number of rows", and the description of Radius should state that it is only used for the Bubble neighborhood.


T-test
  • Mantis 0000516: Volcano Plot doesn't know data already log2 normalized
When a t-test has been run in the Analysis component, the results are displayed in the Volcano Plot (note that this component does not have an entry in Mantis yet). The X-axis is on a log2 scale. However, the component does not know whether the data has already been log2 transformed. I think it is essentially applying a second log2 transform: either it takes the difference of the two log2 values and then the log2 of that difference, or it log2-transforms all the data again and then subtracts. Either way, it displays a log2(log2 X) value.
The only way to deal with this is to know that the data has been log2 transformed. Either this fact has to be remembered when the transform is done in the normalizer component, or the user should be able to check a box on read-in indicating that the data is log transformed.
We can probably auto-detect the log-transformed state from the range of values present in the input file.
=> auto detection is good, but we need to be able to set it manually, too. (BJ)
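A minimal sketch combining range-based auto-detection with a manual override, as discussed above. The threshold of 32 is an assumption chosen for illustration (log2 of a 16-bit intensity is at most 16, while raw intensities usually run far higher), and all names are hypothetical rather than geWorkbench API.

```java
/** Hypothetical helper for deciding how to compute log2 fold changes. */
final class LogScaleGuess {

    /** Assumed upper bound for log2-scale values; raw intensities usually exceed it. */
    static final double MAX_PLAUSIBLE_LOG2 = 32.0;

    /** Heuristic: data looks log2 transformed if no finite value exceeds the bound. */
    static boolean looksLog2(double[] values) {
        boolean seen = false;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            if (Double.isNaN(v) || Double.isInfinite(v)) {
                continue; // ignore missing or invalid values
            }
            seen = true;
            max = Math.max(max, v);
        }
        return seen && max <= MAX_PLAUSIBLE_LOG2;
    }

    /** A non-null user setting wins; otherwise fall back to the heuristic. */
    static boolean isLog2(double[] values, Boolean userOverride) {
        return userOverride != null ? userOverride : looksLog2(values);
    }

    /** log2 fold change of two group means, without transforming twice. */
    static double log2FoldChange(double meanCase, double meanControl, boolean dataIsLog2) {
        if (dataIsLog2) {
            return meanCase - meanControl; // already on log2 scale: just subtract
        }
        return (Math.log(meanCase) - Math.log(meanControl)) / Math.log(2.0);
    }
}
```

The heuristic can misfire on unusual data (for example, raw values that were scaled to a small range), which is why the override takes precedence.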
  • Mantis 0000637: stop button is missing for T-test
  • Mantis 0000638: t-test alpha corrections
I suggest naming the tab "correction methods" and dividing its options into "alpha corrections for t-distribution" and "step-down Westfall and Young methods for permutation". There should also be an error message when a selected option combination does not make sense (e.g. minP for t-distribution).

Multi T-test
  • Mantis 0000640: multi t-test message is missing message if there are too few data
  • Mantis 0000729: multi t-test and t-test don't get the same results
A single t-test and a multi t-test give different results. With the current implementation it is not possible to get the same results from comparable settings.
Resolution: in MultiTTestAnalysis.java, change tTest.tTest to tTest.homoscedasticTTest. The results will then match those of the comparable t-test run with equal group variance, t-distribution, and just alpha correction.
Does it make sense to do a homoscedastic t-test, though? It seems that we should not assume that the between-groups variances are the same.
I would suggest giving the user a choice and documenting the different options.
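To make the difference between the two options concrete, here is a minimal sketch of both statistics written from the textbook formulas: the pooled-variance (homoscedastic) t-test and Welch's unequal-variance t-test. It is self-contained and does not use the library that MultiTTestAnalysis.java calls. The class and method names are hypothetical.

```java
/** Hypothetical illustration of pooled vs. Welch two-sample t statistics. */
final class TwoSampleT {

    static double mean(double[] x) {
        double s = 0.0;
        for (double v : x) {
            s += v;
        }
        return s / x.length;
    }

    /** Unbiased sample variance (divides by n - 1). */
    static double variance(double[] x) {
        double m = mean(x);
        double s = 0.0;
        for (double v : x) {
            s += (v - m) * (v - m);
        }
        return s / (x.length - 1);
    }

    /** Equal-variance test: returns {t, degrees of freedom}. */
    static double[] pooled(double[] a, double[] b) {
        int n1 = a.length;
        int n2 = b.length;
        double sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2);
        double t = (mean(a) - mean(b)) / Math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2));
        return new double[] {t, n1 + n2 - 2};
    }

    /** Welch test (no equal-variance assumption): returns {t, degrees of freedom}. */
    static double[] welch(double[] a, double[] b) {
        double q1 = variance(a) / a.length;
        double q2 = variance(b) / b.length;
        double t = (mean(a) - mean(b)) / Math.sqrt(q1 + q2);
        // Welch-Satterthwaite approximation of the degrees of freedom
        double df = (q1 + q2) * (q1 + q2)
            / (q1 * q1 / (a.length - 1) + q2 * q2 / (b.length - 1));
        return new double[] {t, df};
    }
}
```

When the group sizes are equal, the two t statistics coincide and only the degrees of freedom differ. With unequal sizes and unequal variances, both the statistic and the degrees of freedom can differ noticeably, which is why the choice should be exposed and documented.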

Associations


caScript


Dataset Annotations


Filtering


Deviation filter

Affy detection call filter

Expression threshold filter

Genepix flags filter

Missing value filter

2 channel threshold filter

Dataset History

We should be able to display information about a microarray - the chip type, the number of markers in the dataset vs in the chip definition.
  • Mantis 0000476: Data set annotation missing
Issue: Users analyze data, use activated sets, and limit the arrays included, but none of these steps are captured in the annotation. The system should record these steps in a log so the researcher is clear on what the results include and how to reproduce them.
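A minimal sketch of such a log, assuming each analysis step can report a short name and a description of its settings. The class, its methods, and the example entries are hypothetical and do not reflect existing geWorkbench code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical per-dataset history log. */
final class DatasetHistory {

    private final List<String> entries = new ArrayList<>();

    /** Appends one step, e.g. an analysis run or a change of activated sets. */
    void record(String step, String details) {
        entries.add(step + ": " + details);
    }

    /** Read-only view of the steps in the order they were recorded. */
    List<String> entries() {
        return Collections.unmodifiableList(entries);
    }

    /** Plain-text form, e.g. for display next to the dataset annotation. */
    String asText() {
        return String.join(System.lineSeparator(), entries);
    }
}
```

A component would call something like `history.record("t-test", "alpha=0.05, activated array sets: case, control")` each time it runs, so the annotation lists every step needed to reproduce a result.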