Difference between revisions of "Array Sets"

(Common Principles of Operation of Marker and Array Subsets)
(Array/Phenotype Sets (Context))
 
(120 intermediate revisions by 2 users not shown)
Line 10: Line 10:
 
Sets of microarrays can be used to group arrays in a meaningful fashion for statistical analysis.  For example, two such phenotypes might be the diseased and normal states of a tissue from which samples have been taken.  geWorkbench uses the terms "Case" and "Control" to categorize these, but in a biological setting the equivalent would be "Experimental" vs "Control".
 
Sets of microarrays can be used to group arrays in a meaningful fashion for statistical analysis.  For example, two such phenotypes might be the diseased and normal states of a tissue from which samples have been taken.  geWorkbench uses the terms "Case" and "Control" to categorize these, but in a biological setting the equivalent would be "Experimental" vs "Control".
  
This chapter discusses the use of sets of microarrays.  Please see the chapter [[Data_Subsets_-_Markers | Data Subsets - Markers]] for a discussion of the use of Marker sets.
+
This chapter discusses the use of sets of microarrays.  Please see the chapter [[Marker_Sets]] for a discussion of the use of Marker sets.
  
The figure below shows the Arrays/Phenotypes component located below the Project Folders component in geWorkbench.  The Markers component is located in the same space, under a  separate tab.
+
The figure below shows the Arrays/Phenotypes component located below the Workspace in geWorkbench.  The [[Marker_Sets| Markers]] component is located in the same space, under the adjacent tab.
  
  
[[Image:T_Arrays_basic_withProj.png]]
+
[[Image:Arrays_Sets_Bcell-100.png]]
  
 
==Common Principles of Operation of Marker and Array Sets==
 
==Common Principles of Operation of Marker and Array Sets==
Line 21: Line 21:
 
Rather than using all arrays or all markers in a data set for a particular analysis or visualization, the user may wish to restrict those used to only some subset.
 
Rather than using all arrays or all markers in a data set for a particular analysis or visualization, the user may wish to restrict those used to only some subset.
  
===Activating Subsets of Markers and Arrays===
+
===Activating Sets of Markers and Arrays===
In the Markers and Arrays components, subsets of markers and arrays can be defined by the user, and also are created as the outcome of some analyses.  Beside each such subset in the graphical interface is a checkbox.  Marking this box "checked" '''activates''' the subset.
+
In the [[Marker_Sets|Markers]] and the Arrays components, sets of markers and arrays can be defined by the user.  Such sets are also created as the outcome of various analysis methods.  Adjacent to each set in the graphical interface is a checkbox.  Checking this box '''activates''' the set.
  
* '''Activating''' a subset restricts many geWorkbench components to using as input only the markers or arrays that are in such activated subsets.
+
* '''Activating''' a set restricts many geWorkbench components to using as input only the markers or arrays that are in one or more such activated sets.
  
* '''Marker Subsets'''
+
* '''Marker Sets'''
** If no Marker subset is active, all Markers are used.
+
** If no marker set is active, all markers are used.
** If at least one Marker subset is activated, affected components will only use markers in activated sets.
+
** If at least one marker set is activated, affected components will only use markers that are in activated sets.
* '''Array Subsets'''
+
* '''Array Sets'''
** If no Array subset is active, all Arrays are used.
+
** If no Array set is active, all Arrays are used.
** If at least one Array subset is activated, affected components will only use arrays in activated sets.
+
** If at least one Array set is activated, affected components will only use arrays that are in activated sets.
  
==Controls==
+
* '''Selection''' set - this is a special, default set.  One is present in both the [[Marker_Sets|Markers]] component and the Arrays component.  The "Selection" set has the following properties:
 +
** Double-clicking on a marker or array entry in the upper pane list will add that entry to the default "Selection" set.  Double-clicking the same entry again will remove it from the default set.
 +
** The "Selection" set cannot be deleted.
  
===Upper Pane===
+
===Number of members displayed===
 +
The number of members in a set is given inside square brackets just to the right of the set name.
  
[[Image:T_Arrays_Upper_RightClickMenu.png]]
+
 
 +
 
 +
==Upper Pane Controls==
 +
 
 +
[[Image:Arrays_Upper_RightClickMenu.png]]
  
 
The list in the upper pane of the Arrays component shows the arrays loaded in the current data set.
 
The list in the upper pane of the Arrays component shows the arrays loaded in the current data set.
Line 43: Line 50:
  
 
The upper pane of the Arrays/Phenotypes component has the following controls:
 
The upper pane of the Arrays/Phenotypes component has the following controls:
* '''Search''' text field - Search for arrays by typing in a name or portion of a name. As one types, the first array matching the entry so far will be highlighted.  In some cases however, the Find Next button must be pushed to find a match.  If the typed entry matches no arrays, it will be displayed in red.
 
* '''Find Next''' button - find the next array matching the typed entry.
 
  
* [[Image:T_Arrays_lightbulb.png]] '''Light Bulb''' icon - when activated [[Image:T_Markers_lightbulb_active.png]], enables "hover text" display of array names in this component.
+
===Search===
 +
Search for arrays by typing an array name or portion of a name into the search box. As each character is typed, the list is dynamically updated to contain only names containing matches to the query so far.  The match can occur anywhere in the name.
 +
 
 +
For example, here "n" has been entered, and only the five arrays containing "N" in their name are displayed.  The search is case-insensitive.
 +
 
 +
 
 +
[[Image:Arrays_Search_example.png]]
 +
 
 +
===Light Bulb icon===
 +
[[Image:T_Arrays_lightbulb.png]] - when activated [[Image:T_Markers_lightbulb_active.png]], enables "hover text" display of array names in this component.
 +
 
 +
===Double-click action===
 +
* Double-clicking with the mouse on an array name will add that array to the default "Selection" set. 
 +
* Double-clicking on the same entry again will remove it from the "Selection" set. 
 +
* More generally, for any array in the "Selection" set, double-clicking on its entry in the upper list will remove it from the set.
 +
 
 +
 
 +
[[Image:Arrays_ArrayDoubleClick.png]]
  
 +
===Right-click menu===
 
Selecting one or more arrays in the list and then right-clicking gives the following choices in a pop-up menu:
 
Selecting one or more arrays in the list and then right-clicking gives the following choices in a pop-up menu:
* '''Add to Set''' - Add the selected arrays to a new or existing subset.
 
* '''Clear Selection''' - unhighlights the selected arrays.
 
  
===Lower Pane===
+
====Add to Set====
 +
Selected arrays can be added to a new or to an existing set.
 +
 
 +
A shortcut for adding arrays to an existing set is covered in the next section.
 +
 
 +
When "Add to Set" is clicked, a dialog box will appear asking for the name of the set.  Enter the name of a new or existing set.
 +
 
 +
As an example, we show adding arrays to two sets based on their phenotype.
 +
 
 +
(In this example we will start with the same data files that were used in an example in the [[Local_Data_Files#Example:_Loading_microarray_data_files_-_local | Local Data Files]] tutorial.  Load the ten individual MAS5 data files as shown there in the section "Loading microarray data files - local".  Be sure to check the "merge" option).
 +
 
 +
First, we select and label arrays which contain samples from the congestive cardiomyopathy disease state:
 +
 
 +
1. In the Arrays/Phenotypes component, select the six arrays beginning with '''JB-ccmp''', which represent the samples from the  congestive cardiomyopathy disease state.
 +
 
 +
[[Image:Arrays_AddToSet.png]]
 +
 
 +
 
 +
2. Right-click,  select '''Add to Set'''.  In the dialog box, you can enter the name of either an existing set, or of a new set to be created.
 +
 
 +
3. Enter the new subset name "CCMP" in the input box and click OK.
 +
 
 +
 
 +
[[Image:T_Arrays_SetLabel.png]]
 +
 
 +
 
 +
4. Next, add the arrays beginning with JB-n to a new set with name "Normal" ('' repeat steps 2 & 3 ''):
 +
 
 +
The Array/Phenotype Sets component will now show the two sets added.  Note that the number of arrays in each set is shown in square brackets to the right of the set name.
 +
 
 +
 
 +
[[Image:Array_ArraySets.png]]
 +
 
 +
====Add to Set (Existing)====
 +
 
 +
You can add arrays to an existing set by either
 +
* entering the name of the existing set in the popup dialog box, or
 +
* before right-clicking on the selected arrays, first highlight the desired array set in the lower list.  Its name will then be pre-entered into the Add to Set dialog.
 +
 
 +
 
 +
[[Image:Arrays_AddToSelectedSubset.png]]
 +
 
 +
====Clear Selection====
 +
Clear the contents of the default "Selection" array set.
 +
 
 +
==Lower Pane Controls==
  
  
 
The lower pane of the Arrays/Phenotypes component has the following controls:
 
The lower pane of the Arrays/Phenotypes component has the following controls:
* '''Array/Phenotype Sets''' menu - Select which named set of arrays to display. Each can contain a different arrangement of arrays into subsets.
 
* '''New''' button - Create a new array set.
 
  
If you right-click on a subset, a menu with the following choices appears:
 
  
[[Image:T_Arrays_rightclick_menu.png]]
+
===Array/Phenotype Sets (Context)===
* '''Rename''' - Rename the selected subset.
+
 
* '''Copy''' - Make a copy of the selected subset.
+
 
* '''Activate''' - Activate the selected subset.  This can also be done directly by checking the check box before its entry.
+
[[Image:Arrays_Create_New_Context.png]]
* '''Deactivate''' - Deactivate the selected subset.  This can also be done directly by unchecking the check box before its entry.
+
 
* '''Delete''' - Delete the selected subset.
+
 
* '''Combine''' - Combine the selected subsets into a new subset.  Methods are:
+
 
** Union - add all arrays from all selected sets.
+
* '''Pulldown menu''' - A list of array set contexts. Each context can contain different sets of arrays. All components will make use of the currently selected context.  If an analysis returns an array set as a result, it is returned to the current array set context.  Different contexts can be used to organize arrays at different levels of detail, or for different types of analyses.
** Intersection  - add arrays that are in each selected set.
+
* '''New''' - This button will create a new, empty array set context.
** XOR - add arrays that are in only one of the selected sets.
+
 
* '''Print''' - Print the selected subset of arrays.
+
* '''Note''' - The [[File_Formats#Affymetrix_File_Matrix_Format_.28geWorkbench.29| Affymetrix Matrix File]] microarray data file format, native to geWorkbench, supports multiple such contexts of array sets being defined and saved.
* '''Visual Properties''' - Allows the color and shape of points representing arrays in graphical components to be chosen, e.g. in the Scatter Plot.
+
 
* '''Classification''' - Allows for the selected subset to be assigned a classification, chosen from: '''Case''', '''Control''', '''Test''' and '''Ignore'''.
+
 
* '''Save''' - save the chosen set of arrays as a simple list (CSV format, one array per line) to a file on disk.
+
A popup will ask for a name for the new context:
 +
 
 +
 
  
==Examples==
+
[[Image:Arrays_New_List_Popup.png]]
  
In this tutorial we will start with the same data files that were used in [[Tutorial_-_Local_Data_Files | Tutorial - Local Data Files]].  Load the ten individual MAS5 data files as shown there in the section "Loading microarray data files - local".
 
  
 +
The new context will appear in the menu, with only the default "Selection" set as a member.
  
===Add an array to the default set by double-clicking===
 
  
 +
[[Image:Arrays_New_List_Created.png]]
  
[[Image:T_Arrays_ArrayDoubleClick.png]]
+
===Load Set===
 +
The Load Set button will bring up a dialog to load a file containing a list of arrays to a new set.  The file should have one array name per line in the first column.  The format of the file is Comma Separated Values.
  
===Removing an array from a subset===
+
Sets are loaded into the currently visible list in the Arrays component.  A new list can be created with the "New" button.
  
One or more arrays can be removed from a set by highlighting them and the right-clicking.  A menu will appear with option "Remove from Set".
+
If the file contains only one column, the new set is given the same name as the file it was read from.
  
[[Image:T_Arrays_RemoveFromSet.png]]
+
The second column, if there is one, is used to represent Array Set names.  Thus one file can contain arrays that will be read in to different sets, depending on the names in the second column.  This can be convenient to import array names and phenotypes from a spreadsheet file.
 +
 
 +
If the file contains this second column, then sets will be created using these names.
 +
 
 +
Any further columns after the second will be ignored.
 +
 
 +
===Right-click menu===
 +
If you right-click on a set, a menu with the following choices appears:
 +
 
 +
[[Image:Arrays_rightclick_menu.png]]
 +
====Rename====
 +
Rename the selected set.
 +
 
 +
====Copy====
 +
Make a copy of the selected set.
 +
 
 +
====Activate====
 +
Activate the selected set (see explanation [[Array_Sets#Activating_Sets_of_Markers_and_Arrays | above]]).  This will set the check-box in front of the set name to checked.
 +
 
 +
Activating a set can also be done directly by checking the check box before its name.
 +
 
 +
 
 +
[[Image:T_Arrays_ActivateSets.png]]
  
  
===Assigning arrays to an new or existing subset===
 
  
We will place the new subsets of arrays in the "Default" set, however you can create a new set by pushing the '''New''' button on '''Array/Phenotype Sets''' at lower left.
 
  
First, we will select and label arrays which contain samples from the congestive cardiomyopathy disease state:
+
====Deactivate====
 +
Deactivate the selected set (see explanation [[Array_Sets#Activating_Sets_of_Markers_and_Arrays | above]]).  Deactivating a set can also be done directly by unchecking the check box before its name.
  
1. In the Arrays/Phenotypes component, select the six arrays beginning with '''JB-ccmp''', which represent the samples from the  congestive cardiomyopathy disease state.
+
====Delete====
 +
Delete the selected set.
  
[[Image:T_Arrays_AddToSet.png]]
+
====Combine====
 +
Combine the selected sets into a new set.  Methods are:
 +
* '''Union''' - Include all arrays that appear in one or more of the selected sets.
 +
* '''Intersection ''' - Include only arrays that are present in each of the selected sets.
 +
* '''XOR''' - Include arrays that are present in one and only one of the selected sets.  Note that this usage differs from the logic gate XOR function.
  
 +
====Print====
 +
Print the names of the arrays contained in the selected set(s) to a printer.
  
2. Right-click,  select '''Add to Set'''. In the dialog box, you can enter the name of either an existing subset, or of a new subset to be created.
+
====Visual Properties====
 +
Change the color and shape of points representing arrays in graphical components, e.g. in the Scatter Plot.  
  
 +
Selecting the "Visual Properties" menu item, here selected for the array set "GC B cell",
  
3. Enter the new subset name "CCMP" in the input box and click OK.
 
  
 +
[[Image:Arrays_Change_visual_properties_item.png]]
  
[[Image:T_Arrays_SetLabel.png]]
 
  
 +
causes a properties editor to appear. In it, the shape and color of points representing arrays in various geWorkbench graphical components can be globally altered.
  
4. Next, similarly label the arrays beginning with JB-n as "Normal" ('' repeat steps 2 & 3 ''):
+
[[Image:Arrays_Change_visual_properties.png]]
  
The Array/Phenotype Sets component will now show the two subsets added:
 
  
 +
The color chooser:
  
[[Image:T_Array_ArraySets.png]]
+
[[Image:Arrays_Change_visual_properties_choose_color.png]]
  
  
===Adding arrays to an existing subset - shortcut===
+
The shape chooser:
  
If you wish to add additional arrays to an existing subset, you can avoid having to type in its name again in the dialog box by first selecting the target subset in the lower pane. Then right-click on a selection of arrays above and select "Add to Set" from the pop-up menu. The name of the existing subset will appear in the "Add to Set" dialog.  
+
[[Image:Arrays_Change_visual_properties_choose_shape.png]]
  
  
[[Image:T_Arrays_AddToSelectedSubset.png]]
+
Chooser showing selection of a green "plus" shape to represent the array set.
  
===Manipulating array subsets===
+
[[Image:Arrays_Change_visual_properties_green_plus.png]]
  
  
Right-clicking on an array subset produces a menu with actions that can be applied to it, as already described in the Controls section.  A few will be demonstrated in more detail in the following sections.
+
After the visual properties of the "GC B-cell" set have been altered, we can view their appearance in, for example, the Scatter Plot component.  Here we have activated both the "GC B-cell" set and the "non-GC B-cell" set.  The former now uses the green plus signs, whereas the latter uses a system-assigned default color and shape.
  
 +
[[Image:Arrays_Change_visual_properties_scatter_plot.png|{{ImageMaxWidth}}]]
  
[[Image:T_Arrays_SubsetRightClickMenu.png]]
+
====Classification====
 +
For statistical tests such as the t-test, sets can be classified using several preset labels, e.g. "Case" and "Control". 
  
===Activating subsets===
+
The color of the thumbtack icon in front of a set name indicates the sets classification.  The color scheme is shown at the bottom of the Arrays component, and is repeated in the following list.
  
The check boxes next to the subset name can be checked to indicate that a subset of arrays is "Active".  Various analysis and visualization components can be set to only use/display activated arrays or markers.
+
The full list of classifications is:
 +
* '''Case''' - (Red) used in analyses such as t-test to differentiate the experimental group from the control group.
 +
* '''Control''' - (No color) the default classification.
 +
* '''Test''' - (Green) used by the validation step of several classification algorithms, such as KNN (Broad).
 +
* '''Ignore''' - (Gray)
  
  
[[Image:T_Arrays_ActivateSets.png]]
+
A classification can be made directly by left-clicking on the "thumbtack" icon in front of a set's name.  The list of available classifications will be offered.
  
===Classifying a subset===
 
  
For statistical tests such as the t-test, Case and Control sets can be specified.
+
[[Image:Arrays_SetClassification_thumbtack_menu.png]]
  
# Left-click on the thumb-tack icon in front of the phenotype name. 
 
# Select Case to specify the disease arrays as the "Case".  The remaining "Normal" arrays are by default labeled control.
 
  
 +
In the figure below, a red thumbtack indicates the arrays have been specified as "Case".
  
[[Image:T_Arrays_SetCase.png]]
 
  
 +
[[Image:Arrays_CaseSet.png]]
  
A red thumbtack indicates the arrays have been specified as "Case".
 
  
  
[[Image:T_Arrays_CaseSet.png]]
+
The set classification can also be reached from a right-click menu on the set you wish to classify.
  
===Creating a new set to contain subsets===
 
  
Pushing the "New" button will bring up a dialog box in which the name of a new set can be entered.  
+
[[Image:Arrays_SetClassification_long_menu.png]]
  
 +
====Save====
 +
Save the chosen set of arrays as a simple list (CSV format, one array per line) to a file on disk. 
  
[[Image:T_Arrays_NewArraySet.png]]
+
If more than one array set is highlighted, two choices are offered:
 +
* '''Merge into one set''' - The arrays in all highlighted sets are merged into a single list and written out to a file.  A file browser window appears that allows the user to specify the location and name of the new file.
 +
* '''Save as multiple sets''' - Each highlighted array set will be saved to a separate file, using the set name as the new file name.  A file browser will appear which will allow the user to specify where to save the new files.
  
  
In turn, once the new set is created, a new collection of subsets can be created within it.
+
[[Image:Arrays_Save_Multiple.png]]
  
 +
===Removing an array from a set===
  
[[Image:T_Arrays_NewArraySetCreated.png]]
+
One or more arrays can be removed from a set by highlighting them and then right-clicking. A menu will appear with option "Remove from Set". 
  
  
 +
[[Image:T_Arrays_RemoveFromSet.png]]
  
===Example of working with multiple array sets===
+
==Working with multiple lists of array sets==
  
There can be different groupings of the same arrays in the Arrays/Phenotypes and Marker components.  Here we show how there are several different set groupings which are predefined in the example data file "BCell-100.exp".  After loading this file into geWorkbench as type "Affymetrix File Matrix", the following listed sets can be seen in the Arrays/Phenotypes group pulldown menu.   
+
There can be different groupings of the same arrays in the Arrays/Phenotypes components.  The example data file "BCell-100.exp" predefines 3 such lists of sets.  After loading this file of type "Affymetrix File Matrix" into geWorkbench, the following lists of sets can be seen in the Arrays/Phenotypes group pulldown menu.   
 
* Default
 
* Default
 
* Class
 
* Class
* Source- short
+
* Source - short
 
* Source - detailed
 
* Source - detailed
  
Each such set can contain a different arrangement of the arrays into subsets.
+
Each such list can contain a different arrangement of the arrays into subsets.
  
[[Image:T_Arrays_Groups_choose.png]]
+
[[Image:Arrays_ListOfSets_choose.png]]
  
  
If we choose the set called "Class", the following subsets of arrays are displayed:
+
If we choose the list called "Class", the following sets of arrays are displayed:
  
[[Image:T_Arrays_Groups_Class.png]]
+
[[Image:Arrays_Set_Class.png]]
  
  
If instead we choose the set "Source - short", a different division into subsets of the same arrays is seen:
+
If instead we choose the list "Source - short", a different division into subsets of the same arrays is seen:
  
[[Image:T_Arrays_Groups_CellLine.png]]
+
[[Image:Arrays_Set_Short.png]]

Latest revision as of 14:56, 23 April 2014




Overview of Marker and Array Sets

The Markers/Arrays component, located at lower left in the geWorkbench graphical interface, allows the user to define and use sets of arrays and markers for a number of purposes.

As used in geWorkbench, the term "marker" includes genes, probes/probesets, and individual sequences, depending on the type of data loaded. Sets of markers can be returned by various analysis routines. For example, the t-test returns a list of markers showing significant differential expression, and after hierarchical clustering, the markers in a subtree of the resulting dendrogram can be saved to a list.

Sets of microarrays can be used to group arrays in a meaningful fashion for statistical analysis. For example, two such phenotypes might be the diseased and normal states of a tissue from which samples have been taken. geWorkbench uses the terms "Case" and "Control" to categorize these, but in a biological setting the equivalent would be "Experimental" vs "Control".

This chapter discusses the use of sets of microarrays. Please see the chapter Marker Sets for a discussion of the use of marker sets.

The figure below shows the Arrays/Phenotypes component located below the Workspace in geWorkbench. The Markers component is located in the same space, under the adjacent tab.


Arrays Sets Bcell-100.png

Common Principles of Operation of Marker and Array Sets

Rather than using all arrays or all markers in a data set for a particular analysis or visualization, the user may wish to restrict those used to only some subset.

Activating Sets of Markers and Arrays

In the Markers and the Arrays components, sets of markers and arrays can be defined by the user. Such sets are also created as the outcome of various analysis methods. Adjacent to each set in the graphical interface is a checkbox. Checking this box activates the set.

  • Activating a set restricts many geWorkbench components to using as input only the markers or arrays that are in one or more such activated sets.
  • Marker Sets
    • If no marker set is active, all markers are used.
    • If at least one marker set is activated, affected components will only use markers that are in activated sets.
  • Array Sets
    • If no Array set is active, all Arrays are used.
    • If at least one Array set is activated, affected components will only use arrays that are in activated sets.
  • Selection set - this is a special, default set. One is present in both the Markers component and the Arrays component. The "Selection" set has the following properties:
    • Double-clicking on a marker or array entry in the upper pane list will add that entry to the default "Selection" set. Double-clicking the same entry again will remove it from the default set.
    • The "Selection" set cannot be deleted.
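The activation rule above can be sketched in a few lines of Python. This is only an illustration of the stated behavior, not geWorkbench code: the function name arrays_in_use, the dictionary layout, and the array names (modeled on the JB-ccmp and JB-n prefixes used later in this chapter) are all assumptions made for the example.

```python
def arrays_in_use(all_arrays, sets, active):
    """Return the arrays an affected component would use (illustrative only).

    all_arrays: list of array names in the data set
    sets:       dict mapping a set name to the array names it contains
    active:     names of the sets whose checkbox is checked
    """
    if not active:
        # No set is activated: every array is used.
        return list(all_arrays)
    # At least one set is activated: use arrays found in any activated set.
    allowed = set()
    for name in active:
        allowed.update(sets[name])
    # Keep the order in which the arrays appear in the data set.
    return [a for a in all_arrays if a in allowed]


# Hypothetical array names and sets, for illustration only.
all_arrays = ["JB-ccmp_1", "JB-ccmp_2", "JB-n_1", "JB-n_2"]
sets = {"CCMP": ["JB-ccmp_1", "JB-ccmp_2"], "Normal": ["JB-n_1", "JB-n_2"]}

print(arrays_in_use(all_arrays, sets, active=[]))        # all four arrays
print(arrays_in_use(all_arrays, sets, active=["CCMP"]))  # only the CCMP arrays
```

The same rule applies to marker sets, with markers in place of arrays.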

Number of members displayed

The number of members in a set is given inside square brackets just to the right of the set name.


Upper Pane Controls

Arrays Upper RightClickMenu.png

The list in the upper pane of the Arrays component shows the arrays loaded in the current data set.


The upper pane of the Arrays/Phenotypes component has the following controls:

Search

Search for arrays by typing an array name or portion of a name into the search box. As each character is typed, the list is dynamically updated to contain only names containing matches to the query so far. The match can occur anywhere in the name.

For example, here "n" has been entered, and only the five arrays containing "N" in their name are displayed. The search is case-insensitive.
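The matching behavior described here (case-insensitive, match anywhere in the name) can be sketched as a simple substring filter. The function name filter_arrays and the sample names are hypothetical; how geWorkbench implements the search internally is not stated in this chapter.

```python
def filter_arrays(names, query):
    """Keep only names that contain the query, ignoring case (illustrative only)."""
    q = query.lower()
    # The match may occur anywhere in the name, so a substring test is used.
    return [n for n in names if q in n.lower()]


# Hypothetical array names, for illustration only.
names = ["JB-ccmp_1", "JB-n_1", "Normal-2"]
print(filter_arrays(names, "n"))
```

Under this sketch an empty query matches every name, so the full list is shown before anything is typed.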


Arrays Search example.png

Light Bulb icon

The Light Bulb icon (T Arrays lightbulb.png), when activated (T Markers lightbulb active.png), enables "hover text" display of array names in this component.

Double-click action

  • Double-clicking with the mouse on an array name will add that array to the default "Selection" set.
  • Double-clicking on the same entry again will remove it from the "Selection" set.
  • More generally, for any array in the "Selection" set, double-clicking on its entry in the upper list will remove it from the set.
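The double-click behavior amounts to toggling membership in the "Selection" set. A minimal sketch, assuming the set is held as a plain Python list and using a hypothetical function name toggle_selection:

```python
def toggle_selection(selection, array_name):
    """Add the array if it is absent, remove it if it is present (illustrative only)."""
    if array_name in selection:
        selection.remove(array_name)
    else:
        selection.append(array_name)
    return selection


selection = []
toggle_selection(selection, "JB-n_1")   # first double-click adds the array
toggle_selection(selection, "JB-n_1")   # second double-click removes it again
print(selection)
```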


Arrays ArrayDoubleClick.png

Right-click menu

Selecting one or more arrays in the list and then right-clicking gives the following choices in a pop-up menu:

Add to Set

Selected arrays can be added to a new or to an existing set.

A shortcut for adding arrays to an existing set is covered in the next section.

When "Add to Set" is clicked, a dialog box will appear asking for the name of the set. Enter the name of a new or existing set.

As an example, we show adding arrays to two sets based on their phenotype.

(In this example we will start with the same data files that were used in an example in the Local Data Files tutorial. Load the ten individual MAS5 data files as shown there in the section "Loading microarray data files - local". Be sure to check the "merge" option.)

First, we select and label arrays which contain samples from the congestive cardiomyopathy disease state:

1. In the Arrays/Phenotypes component, select the six arrays beginning with JB-ccmp, which represent the samples from the congestive cardiomyopathy disease state.

Arrays AddToSet.png


2. Right-click, select Add to Set. In the dialog box, you can enter the name of either an existing set, or of a new set to be created.

3. Enter the new set name "CCMP" in the input box and click OK.


T Arrays SetLabel.png


4. Next, add the arrays beginning with JB-n to a new set named "Normal" (repeat steps 2 and 3).

The Array/Phenotype Sets component will now show the two sets added. Note that the number of arrays in each set is shown in square brackets to the right of the set name.


Array ArraySets.png

Add to Set (Existing)

You can add arrays to an existing set by either

  • entering the name of the existing set in the popup dialog box, or
  • highlighting the desired array set in the lower list before right-clicking on the selected arrays. Its name will then be pre-entered into the Add to Set dialog.


Arrays AddToSelectedSubset.png

Clear Selection

Clear the contents of the default "Selection" array set.

Lower Pane Controls

The lower pane of the Arrays/Phenotypes component has the following controls:


Array/Phenotype Sets (Context)

Arrays Create New Context.png


  • Pulldown menu - A list of array set contexts. Each context can contain different sets of arrays. All components will make use of the currently selected context. If an analysis returns an array set as a result, it is returned to the current array set context. Different contexts can be used to organize arrays at different levels of detail, or for different types of analyses.
  • New - This button will create a new, empty array set context.
  • Note - The Affymetrix Matrix File microarray data file format, native to geWorkbench, supports multiple such contexts of array sets being defined and saved.


A popup will ask for a name for the new context:


Arrays New List Popup.png


The new context will appear in the menu, with only the default "Selection" set as a member.


Arrays New List Created.png

Load Set

The Load Set button will bring up a dialog to load a file containing a list of arrays to a new set. The file should have one array name per line in the first column. The format of the file is Comma Separated Values.

Sets are loaded into the currently visible list in the Arrays component. A new list can be created with the "New" button.

If the file contains only one column, the new set is given the same name as the file it was read from.

The second column, if there is one, is used to represent Array Set names. Thus one file can contain arrays that will be read into different sets, depending on the names in the second column. This is a convenient way to import array names and phenotypes from a spreadsheet file.

If the file contains this second column, then sets will be created using these names.

Any further columns after the second will be ignored.
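The column rules above can be illustrated with a small parser. This is a sketch under stated assumptions, not the geWorkbench loader: the function name load_sets is hypothetical, the set name for one-column rows is taken to be the file name exactly as passed in (whether geWorkbench keeps the extension is not stated here), and falling back to the file name for individual rows that lack a second column is also an assumption.

```python
import csv
import io


def load_sets(text, file_name):
    """Group array names from CSV text into named sets (illustrative only)."""
    sets = {}
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines and rows with an empty first column.
        if not row or not row[0].strip():
            continue
        array_name = row[0].strip()
        if len(row) > 1 and row[1].strip():
            # Second column present: it names the set for this array.
            set_name = row[1].strip()
        else:
            # Assumption: rows without a second column go into a set
            # named after the file.
            set_name = file_name
        # Columns after the second are ignored.
        sets.setdefault(set_name, []).append(array_name)
    return sets


# Hypothetical file contents, for illustration only.
two_columns = "JB-ccmp_1,CCMP\nJB-n_1,Normal,ignored\nJB-ccmp_2,CCMP\n"
print(load_sets(two_columns, "phenotypes.csv"))

one_column = "JB-n_1\nJB-n_2\n"
print(load_sets(one_column, "normal.csv"))
```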

Right-click menu

If you right-click on a set, a menu with the following choices appears:

Arrays rightclick menu.png

Rename

Rename the selected set.

Copy

Make a copy of the selected set.

Activate

Activate the selected set (see explanation above). This will set the check-box in front of the set name to checked.

Activating a set can also be done directly by checking the check box before its name.


T Arrays ActivateSets.png



Deactivate

Deactivate the selected set (see explanation above). Deactivating a set can also be done directly by unchecking the check box before its name.

Delete

Delete the selected set.

Combine

Combine the selected sets into a new set. Methods are:

  • Union - Include all arrays that appear in one or more of the selected sets.
  • Intersection - Include only arrays that are present in each of the selected sets.
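  • XOR - Include arrays that are present in one and only one of the selected sets. Note that when more than two sets are selected, this differs from repeatedly applying the logic gate XOR function, which would also include arrays present in any odd number of the sets.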
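
The three methods can be sketched in Python as follows. This is an illustration of the definitions above, not the geWorkbench implementation; the function name `combine` and the array names are hypothetical.

```python
from collections import Counter


def combine(sets, method):
    """Combine a list of Python sets of array names."""
    if method == "union":
        # Arrays that appear in one or more sets.
        return set().union(*sets)
    if method == "intersection":
        # Arrays that appear in every set.
        return set.intersection(*sets) if sets else set()
    if method == "xor":
        # Arrays that appear in exactly one set.
        counts = Counter(name for s in sets for name in s)
        return {name for name, n in counts.items() if n == 1}
    raise ValueError("unknown method: " + method)


a = {"A1", "A2", "A3"}
b = {"A2", "A3", "A4"}
c = {"A3", "A4", "A5"}
exactly_one = combine([a, b, c], "xor")  # A1 and A5 only
chained_xor = a ^ b ^ c                  # also contains A3, which is in all three
```

With three sets, the "exactly one" rule leaves out `A3`, while chaining Python's symmetric-difference operator `^` keeps it, because `A3` occurs an odd number of times.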

Print

Print the names of the arrays contained in the selected set(s) to a printer.

Visual Properties

Change the color and shape of points representing arrays in graphical components, e.g. in the Scatter Plot.

Selecting the "Visual Properties" menu item, here selected for the array set "GC B cell",


Arrays Change visual properties item.png


causes a properties editor to appear. In it, the shape and color of points representing arrays in various geWorkbench graphical components can be globally altered.

Arrays Change visual properties.png


The color chooser:

Arrays Change visual properties choose color.png


The shape chooser:

Arrays Change visual properties choose shape.png


Chooser showing selection of a green "plus" shape to represent the array set.

Arrays Change visual properties green plus.png


After the visual properties of the "GC B-cell" set have been altered, we can view their appearance in, for example, the Scatter Plot component. Here we have activated both the "GC B-cell" set and the "non-GC B-cell" set. The former now uses the green plus signs, whereas the latter uses a system-assigned default color and shape.

Arrays Change visual properties scatter plot.png

Classification

For statistical tests such as the t-test, sets can be classified using several preset labels, e.g. "Case" and "Control".

The color of the thumbtack icon in front of a set name indicates the set's classification. The color scheme is shown at the bottom of the Arrays component, and is repeated in the following list.

The full list of classifications is:

  • Case - (Red) used in analyses such as t-test to differentiate the experimental group from the control group.
  • Control - (No color) the default classification.
  • Test - (Green) used by the validation step of several classification algorithms, such as KNN (Broad).
  • Ignore - (Gray)


A classification can be made directly by left-clicking on the "thumbtack" icon in front of a set's name. The list of available classifications will be offered.


Arrays SetClassification thumbtack menu.png


In the figure below, a red thumbtack indicates the arrays have been specified as "Case".


Arrays CaseSet.png


The set classification can also be reached from a right-click menu on the set you wish to classify.


Arrays SetClassification long menu.png

Save

Save the chosen set of arrays as a simple list (CSV format, one array per line) to a file on disk.

If more than one array set is highlighted, two choices are offered:

  • Merge into one set - The arrays in all highlighted sets are merged into a single list and written out to a file. A file browser window appears that allows the user to specify the location and name of the new file.
  • Save as multiple sets - Each highlighted array set will be saved to a separate file, using the set name as the new file name. A file browser will appear which will allow the user to specify where to save the new files.


Arrays Save Multiple.png
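
The difference between the two choices can be sketched in Python. This is an illustration only, not geWorkbench code: the function names are hypothetical, the `.csv` file extension is an assumption, and so is the removal of duplicate array names when merging.

```python
import csv
from pathlib import Path


def save_merged(array_sets, out_path):
    """Write the arrays of all sets to one file, one array name per line."""
    merged = []
    for names in array_sets.values():
        for name in names:
            if name not in merged:  # assumption: duplicates are written once
                merged.append(name)
    with open(out_path, "w", newline="") as fh:
        csv.writer(fh).writerows([name] for name in merged)


def save_separately(array_sets, out_dir):
    """Write each set to its own file, named after the set."""
    paths = []
    for set_name, names in array_sets.items():
        path = Path(out_dir) / (set_name + ".csv")  # extension is an assumption
        with open(path, "w", newline="") as fh:
            csv.writer(fh).writerows([name] for name in names)
        paths.append(path)
    return paths
```

Files written this way have a single column, so they follow the one-array-per-line layout that the Load Set button expects.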

Removing an array from a set

One or more arrays can be removed from a set by highlighting them and then right-clicking. A menu will appear with the option "Remove from Set".


T Arrays RemoveFromSet.png

Working with multiple lists of array sets

There can be different groupings of the same arrays in the Arrays/Phenotypes component. The example data file "BCell-100.exp" predefines 3 such lists of sets, which appear alongside the "Default" list. After loading this file, of type "Affymetrix File Matrix", into geWorkbench, the following lists of sets can be seen in the Arrays/Phenotypes group pulldown menu.

  • Default
  • Class
  • Source - short
  • Source - detailed

Each such list can contain a different arrangement of the arrays into subsets.

Arrays ListOfSets choose.png


If we choose the list called "Class", the following sets of arrays are displayed:

Arrays Set Class.png


If instead we choose the list "Source - short", a different division into subsets of the same arrays is seen:

Arrays Set Short.png
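
One way to picture the relationship between lists (contexts), sets and arrays is as a nested mapping. The Python sketch below is a conceptual model only, not the geWorkbench data structure; the class name, method names, set names and array names are hypothetical.

```python
class ArraySetContexts:
    """Conceptual model: context name -> set name -> list of array names."""

    def __init__(self):
        self.contexts = {}
        self.current = None

    def new_context(self, name):
        # A new context starts with only the default "Selection" set.
        self.contexts[name] = {"Selection": []}

    def select(self, name):
        # Corresponds to choosing an entry in the pulldown menu.
        self.current = name

    def add_set(self, set_name, arrays):
        # Sets are always added to the currently selected context.
        self.contexts[self.current][set_name] = list(arrays)


model = ArraySetContexts()
model.new_context("Class")
model.new_context("Source - short")
model.select("Class")
model.add_set("Group X", ["A1", "A2"])
model.select("Source - short")
model.add_set("Source 1", ["A1"])
model.add_set("Source 2", ["A2"])
```

The same arrays `A1` and `A2` are grouped into one set under "Class" and into two sets under "Source - short", which mirrors how each list holds its own arrangement of the arrays.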