User:Rfriedman

Revision as of 16:16, 17 July 2006 by Rfriedman (Tutorials Comments)

Functionality Comments

Rich, add functionality comments and new feature suggestions here.

One quick initial suggestion: geWorkbench should be able to import files in the following GCG formats: sequence, multiple sequence, and RSF.

(3/23/06) A more robust counterpart of k-means clustering with statistical estimates for microarray analysis is described in the following papers:

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=12801869&query_hl=11&itool=pubmed_docsum

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=12184810&query_hl=11&itool=pubmed_docsum

3/30/06 I don't like the slider for changing arrays in the microarray window. The identity of an array is a fixed, not a variable, quantity. I suggest that a pull-down menu would be better for this.

4/7/06 I suggest asking "are you sure" when a user asks to remove a project.

4/26/06 It would be very helpful if the workbench could display an hourglass, or a watch, or a sundial or something, when it is loading or working - for example when it is loading microarray files from a remote database.

5/25/06 I just installed Version 1.03. In the Windows menu it says "geWorkbench 1.0", and at the top of the geWorkbench GUI it also says "geWorkbench 1.0". I suggest that all labels give the full workbench version.

5/25/06 The two tutorial sets should be included in the download automatically.

5/25/06 I would like to amend my recommendation of 4/26/06 to include an estimate of the time a task will take, so that people may use it more easily.

5/25/06 When I spoke to the group, Ken stated that the intensities in the microarray viewer do not correspond to an image of the chip. In that case the phrase "microarray viewer" is misleading. In fact, I am not sure what the intensities and spacing in the microarray viewer correspond to.

5/25/06 I think that "Get bioassays" is a poor command on 2 grounds: 1. I am not used to "bioassays" being used in place of "arrays" or "array data". 2. We are obtaining a list, rather than loading the bioassays into the program. What I think we mean then is "list arrays".

Additionally, it is not clear in what format the arrays are being loaded (CEL, normalized probeset intensities, etc.).

5/25/06 Some indication that work is in progress should be given while the arrays are being loaded.

5/25/06 I suggest that a dummy new source be made available so that users can learn how to access a remote source, and I suggest that instructions for posting a remote source be made available. Doing these things will make it easier for users to use the workbench in collaborative projects.

5/30/06 The terms "marker" and "phenotypes" are not optimal. In the microarray world we use "probesets" (Affymetrix) or "probes" (glass slides) instead of "markers". "Arrays" is much more informative than "phenotypes" because there can be several arrays for a phenotype, or arrays can represent different patients rather than a phenotype, or arrays can correspond to points in a time series. Also, you might want to reserve "phenotype" for instantiations that have precise definitions in a controlled vocabulary.

5/31/06 With respect to the tabular microarray view: there is also a "probe number" for Affy chips (1, 2, 3, ...) based upon its position in a sort. It would be useful to have a column for that. It would also be useful to have separate, searchable columns for the following 3 items: 1. Probe ID. 2. Gene name. 3. Gene definition.

(If it sounds as if I am thinking of Excel here - I am).

5/31/06. I strongly recommend that there be a way to reverse filtering, by a global undo command or some other means, so that the user may try different filters.
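To illustrate the kind of undo I have in mind, here is a minimal Python sketch that keeps each filtered state on a stack so the previous one can be restored. The class name FilterHistory and its methods are hypothetical and are not taken from geWorkbench; this is only one possible way to do it.

```python
class FilterHistory:
    """Keeps every filtered state so that the last filter can be undone (hypothetical helper)."""

    def __init__(self, data):
        # The first entry is the unfiltered data and is never discarded.
        self._states = [list(data)]

    @property
    def current(self):
        return self._states[-1]

    def apply(self, predicate):
        """Apply a filter: keep only the values for which predicate(value) is true."""
        self._states.append([v for v in self.current if predicate(v)])

    def undo(self):
        """Go back one filter; does nothing if no filter has been applied."""
        if len(self._states) > 1:
            self._states.pop()


history = FilterHistory([1, 5, 10, 50])
history.apply(lambda v: v >= 5)   # try one filter
history.undo()                    # back to the unfiltered data
history.apply(lambda v: v < 50)   # try a different filter
print(history.current)
```

Storing whole copies is the simplest approach; for large datasets one would more likely store only the indices that were removed.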

6/14/06 Inclusion in the announcements mailing list should be made an integral part of the downloading process.

6/14/06 The "expression threshold filter" instructions should be clearer. Stating "Filter values inside range" is ambiguous in that it is not clear whether those values are left after filtering or removed by filtering (I believe that the latter is the case). I suggest the language be changed to "remove values inside range" or "filter out values inside range".
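The two possible readings can be made concrete with a short Python sketch. The function names and the example intensities are my own, chosen for illustration, and I am not asserting how geWorkbench treats a filtered value internally (for example, dropped versus marked as missing).

```python
def remove_inside_range(values, low, high):
    """Reading 1: values with low <= v <= high are filtered OUT; the rest are kept."""
    return [v for v in values if not (low <= v <= high)]


def keep_inside_range(values, low, high):
    """Reading 2: only values with low <= v <= high are kept; the rest are removed."""
    return [v for v in values if low <= v <= high]


intensities = [5.0, 50.0, 500.0, 5000.0]
print(remove_inside_range(intensities, 10.0, 1000.0))  # [5.0, 5000.0]
print(keep_inside_range(intensities, 10.0, 1000.0))    # [50.0, 500.0]
```

The same range gives complementary results under the two readings, which is why the label needs to say which one is meant.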

6/29/06 I recently did some hierarchical clustering using Cluster 3.0. Instead of simply filtering by absolute M value, it also enables the user to retain genes that are larger than a given M value in a user-specifiable number of experiments. It also offers the following options:

1. % present >= X of chips (this only works if you use present/absent thresholds rather than statistical noise).

2. SD (gene vector) >= X, to remove genes with insufficient variability.

3. Max-Min >= X, another variability filter.

I can see why someone might want to use 2 or 3.
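As a rough sketch of the filters described in the 6/29/06 comment, the Python below applies each one to a gene's row of values, with None standing for an absent call. The function names, thresholds, and example data are assumptions of mine for illustration; in particular, I use the sample standard deviation here, and Cluster 3.0's exact definitions may differ.

```python
from statistics import stdev


def passes_abs_count(row, min_abs, min_count):
    """True if at least min_count values have an absolute value >= min_abs."""
    return sum(1 for v in row if v is not None and abs(v) >= min_abs) >= min_count


def passes_percent_present(row, min_percent):
    """True if at least min_percent of the arrays have a value (None = absent)."""
    present = sum(1 for v in row if v is not None)
    return 100.0 * present / len(row) >= min_percent


def passes_sd(row, min_sd):
    """True if the sample standard deviation of the present values is >= min_sd."""
    vals = [v for v in row if v is not None]
    return len(vals) >= 2 and stdev(vals) >= min_sd


def passes_max_min(row, min_spread):
    """True if (max - min) of the present values is >= min_spread."""
    vals = [v for v in row if v is not None]
    return bool(vals) and max(vals) - min(vals) >= min_spread


# Hypothetical log ratios for three genes across four arrays.
genes = {
    "geneA": [0.1, -0.2, 0.0, 0.1],    # present everywhere, but nearly flat
    "geneB": [2.5, -3.0, 0.2, 2.8],    # present everywhere and variable
    "geneC": [None, None, 1.5, None],  # absent on most arrays
}

kept = [
    name for name, row in genes.items()
    if passes_percent_present(row, 80)
    and passes_sd(row, 1.0)
    and passes_max_min(row, 2.0)
]
print(kept)  # ['geneB']
```

In this example geneA is dropped by either variability filter and geneC by the % present filter, which is the behavior I would hope to get from options 2 and 3.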

7/17/06 Tabular microarray format - The column widths in the tabular microarray format should be sufficient to accommodate the whole title of the chip.

7/17/06 The color mosaic only makes sense if the data already has a log2 or other variance-stabilization transformation. As is, an unsuspecting user can look at untransformed values, and this can be confusing. Furthermore, heatmaps make the most sense for log ratio comparisons versus a standard.
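To show why the transformation matters, here is a minimal Python sketch of log2 ratios versus a standard. The function name, the example intensities, and the floor of 1.0 (used only to avoid taking the log of zero or a negative number) are my own assumptions, not anything geWorkbench does.

```python
import math


def log2_ratios(sample, reference, floor=1.0):
    """Return log2(sample / reference) for each probeset.

    Values below `floor` are raised to `floor` first, an assumed safeguard
    against zero or negative intensities.
    """
    return [
        math.log2(max(s, floor) / max(r, floor))
        for s, r in zip(sample, reference)
    ]


reference = [100.0, 200.0, 4000.0]  # hypothetical intensities on the standard
sample = [400.0, 200.0, 1000.0]     # hypothetical intensities on the array of interest

print(log2_ratios(sample, reference))  # [2.0, 0.0, -2.0]
```

A four-fold increase and a four-fold decrease come out as +2 and -2, symmetric about zero, whereas the untransformed values 400 and 1000 give no hint of that symmetry on a color scale.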

Tutorials Comments

Tutorials comments go here.

The initial download should come with all of the datasets in the tutorial (the cardio set was missing when I installed) OR the tutorial should show where these can be downloaded.

3/30/06: Some mention of what the microarray viewer does should be included in the manual - i.e., that it shows a raw image of the chip.

3/30/06: What it means to merge microarray files should be stated more explicitly.

4/07/06 That the chip recognition message is only shown once should be stated. Alternatively maybe it should be shown each time - but not require an okay button.

4/10/06 How to save a merged Affy dataset so that one may open it again should be described more clearly. The following points (courtesy of Ken) should be mentioned (and illustrated).

1. The set should be saved with an exp suffix.

2. The set can be reopened with the filter set to "Affymetrix Matrix file".

4/24/06 The tutorial's comments on opening a remote site are misleading. It should state:

1. "Go" is clicked to get the list of microarray experiments.

2. "Get Bioassays" is necessary for getting a list of the arrays in the experiment, not for retrieval.

3. "Open" will retrieve the selected bioassays.

I found this very hard to use, and it required correspondence with Ken and a visit from Xiaoqing in order to learn to use it.

4/26/06 I suggest that the tutorial not mention adding a new site for remote download until such sites are commonly available. Otherwise it just raises questions for the reader.

5/25/06 I suggest that the tutorial pages state to which version of geWorkbench they apply. This is implicit in the label of the window that appears in the screenshot, but it should also be on the web page that the user loads.

5/25/06 I suggest that the tutorial pages be downloadable as a PDF file.

5/25/06 I suggest that there be a public mailing list where users can be notified of updates.

5/25/06 I suggest that the tutorial discuss what the intensities and layout on the microarray viewer slide correspond to.

5/30/06 Designating a group of arrays as a "case" causes the thumbtack to be colored red. However, designating a group of arrays as the "control" does not change the color of the thumbtack. I suggest that the color of the control thumbtack be changed to green to distinguish it from a group whose nature has not been designated. Also, the designation "case" is used in clinical and epidemiological research. The corresponding term in laboratory research is "experiment".

6/13/06 It should be explained that the microarray viewer image is in probeset order, split across each row, and is not an actual image of the slide.
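As I understand that layout, it amounts to taking the intensities in probeset order and wrapping them into rows of a fixed width. A minimal Python sketch follows; the function name, row width, and values are hypothetical and only illustrate the idea, not the viewer's actual code.

```python
def layout_rows(intensities, row_width):
    """Split a probeset-ordered list into consecutive rows of row_width values."""
    return [
        intensities[i:i + row_width]
        for i in range(0, len(intensities), row_width)
    ]


# Seven hypothetical probeset intensities, already in probeset order.
values = [10, 20, 30, 40, 50, 60, 70]
for row in layout_rows(values, 3):
    print(row)
# [10, 20, 30]
# [40, 50, 60]
# [70]
```

Under this layout a probeset's position in the picture depends only on its place in the sort, not on where its probes sit on the physical chip.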

6/14/06 Examples of each of the different filter options should be given in the tutorial.

7/17/06 I believe that you are doing the person learning to use geWorkbench a disservice by showing the heat map instructions in the tutorial before you have shown log transformation (or at least I am assuming that there is no log transformation, because some of the numbers are so high). Heat maps are most useful relative to a standard, and hence this should be presented as part of a didactic example in which a log2 ratio against a standard is used.

7/17/06 I suggest that the instructions in the tutorial for using the scatter plot graph be more detailed and step-by-step.