T-test

Revision as of 13:39, 14 March 2011 by Smith (Classification)




Overview

A t-test analysis can be used to identify markers with statistically significant differential expression between two groups of microarrays. In geWorkbench, these groups are specified as the "Case" and "Control" sets.

There are several steps to setting up a t-test analysis in geWorkbench.

  1. At least two sets of arrays must be available in the Arrays component.
  2. The array sets to be used in the analysis must be "activated" by checking the box adjacent to their names in the Arrays component.
  3. One or more activated array sets must be designated "Case", and the others "Control" (which is the default classification).
  4. The t-test parameters must be set.


After the t-test is run, the results are displayed graphically, and all markers meeting the significance threshold are placed into a new Marker Set called "Significant Genes".

Preparation

Obtain the file "BCell-100.exp", which is contained in the data/public_data directory of the geWorkbench distribution, or can be directly downloaded from the tutorial data download area.


For tips on loading data files, see Local Data Files and Projects.

In this example, we apply two normalization steps to the data set.

  1. Threshold Normalizer - set a minimum value of 1. Any value less than 1 will be set to 1.
  2. Log2 Transformation Normalizer - Log2 transform the data.

For an actual data analysis, you should apply data normalization steps appropriate to your own data and analysis design.
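
The two normalization steps used in this example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not geWorkbench code; the function names and the expression values are made up for the example.

```python
import math

def threshold_normalize(values, minimum=1.0):
    """Raise every value below `minimum` up to `minimum`."""
    return [max(v, minimum) for v in values]

def log2_transform(values):
    """Log2-transform the values; all of them must be positive."""
    return [math.log2(v) for v in values]

# Hypothetical raw expression values for one marker across five arrays.
raw = [0.0, 0.5, 1.0, 8.0, 1024.0]

# Thresholding first guarantees that the log2 step never sees a value <= 0.
normalized = log2_transform(threshold_normalize(raw))
print(normalized)  # [0.0, 0.0, 0.0, 3.0, 10.0]
```

Applying the threshold before the log transformation matters: values of zero or below have no logarithm, and after thresholding at 1 the smallest possible transformed value is 0.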

t-Test Parameters

P-value

The p-value can be estimated in one of two ways: (1) from the t-distribution (the default), or (2) by permutation.
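
To make the permutation option more concrete, here is a minimal sketch of a permutation-based p-value for a single marker. It is not the geWorkbench implementation: it assumes a Welch t-statistic as the test statistic, a two-sided test, and random reshuffling of the Case/Control labels, and all function names, parameters and data values are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch t-statistic; assumes each group has at least two
    arrays and that neither group is constant."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def permutation_p_value(case, control, n_permutations=1000, seed=0):
    """Fraction of random label shuffles whose |t| is at least as
    large as the |t| observed for the real Case/Control labels."""
    rng = random.Random(seed)
    observed = abs(welch_t(case, control))
    pooled = list(case) + list(control)
    n_case = len(case)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # The first n_case shuffled values play the role of "Case".
        if abs(welch_t(pooled[:n_case], pooled[n_case:])) >= observed:
            hits += 1
    return hits / n_permutations

# Made-up log2 expression values for one marker.
case = [8.1, 7.9, 8.4, 8.0, 8.3]
control = [5.2, 5.0, 5.5, 4.9, 5.3]
p = permutation_p_value(case, control)
print(p)
```

Because the two groups in this example are clearly separated, very few shuffles reach the observed statistic and the estimated p-value is small. The resolution of the estimate is limited by the number of permutations, which is one reason this option is slower than using the t-distribution.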


T-test Pvalue params.png


Alpha corrections

For multiple testing (alpha) correction, the following options are offered:

  1. No correction
  2. Standard Bonferroni correction
  3. Adjusted (step-down) Bonferroni correction
  4. Additional methods are available if the p-value is being estimated by permutation:
    1. minP
    2. maxT
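
The difference between the standard and the step-down Bonferroni corrections can be shown with a short sketch. It assumes that the adjusted correction behaves like Holm's step-down procedure, which may differ in detail from what geWorkbench does; the function names and p-values are invented for illustration, and the permutation-based minP and maxT methods are not covered.

```python
def bonferroni_significant(p_values, alpha=0.01):
    """Standard Bonferroni: every p-value is compared with alpha / n."""
    n = len(p_values)
    return [p <= alpha / n for p in p_values]

def step_down_bonferroni_significant(p_values, alpha=0.01):
    """Holm-style step-down: the k-th smallest p-value (k = 0, 1, ...)
    is compared with alpha / (n - k); testing stops at the first failure."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    significant = [False] * n
    for k, i in enumerate(order):
        if p_values[i] > alpha / (n - k):
            break  # this and all larger p-values are not significant
        significant[i] = True
    return significant

# Hypothetical p-values for five markers.
p_values = [0.0004, 0.0022, 0.0200, 0.0030, 0.6000]

standard = bonferroni_significant(p_values)
step_down = step_down_bonferroni_significant(p_values)
print(standard)   # [True, False, False, False, False]
print(step_down)  # [True, True, False, True, False]
```

With five markers and alpha = 0.01, the standard correction requires p <= 0.002 for every marker, whereas the step-down version relaxes the threshold to 0.0025, 0.0033, and so on as markers are accepted. Every marker that passes the standard correction also passes the step-down one.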


T-test alpha corrections.png

Degrees of Freedom

Group variances can be declared as:

  1. Unequal (Welch approximation; the default)
  2. Equal
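
The two choices affect both the standard error and the degrees of freedom used to look up the p-value. The sketch below uses the textbook formulas for the Welch and pooled-variance t-tests; it is not taken from the geWorkbench source, and the function names and data are made up.

```python
from math import sqrt
from statistics import mean, variance

def welch_t_test(case, control):
    """Unequal variances: Welch t-statistic with
    Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(case), len(control)
    v1, v2 = variance(case) / n1, variance(control) / n2
    t = (mean(case) - mean(control)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def pooled_t_test(case, control):
    """Equal variances: pooled-variance t-statistic with n1 + n2 - 2
    degrees of freedom."""
    n1, n2 = len(case), len(control)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(case)
                  + (n2 - 1) * variance(control)) / df
    t = (mean(case) - mean(control)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

# Made-up log2 expression values; the control group is much more variable.
case = [8.1, 7.9, 8.4, 8.0, 8.3]
control = [5.2, 4.1, 6.5, 3.9, 6.3]

t_welch, df_welch = welch_t_test(case, control)
t_pooled, df_pooled = pooled_t_test(case, control)
print(t_welch, df_welch)
print(t_pooled, df_pooled)
```

When the two groups have the same number of arrays, as here, both methods give the same t-statistic, but the Welch degrees of freedom are smaller when the variances differ, which makes the resulting p-value more conservative. Converting t and the degrees of freedom into a p-value requires the t-distribution, which is not part of the Python standard library.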


T-test degrees of freedom.png

Array Classification

The t-test in geWorkbench requires that at least two sets of arrays be "activated"; only activated sets are considered. In addition, at least one such set must be designated as "Case", and at least one other as "Control" (which is the default classification). Note that more than one set of arrays can be marked as "Case" or as "Control".

Array set classification is covered in the Arrays/Phenotypes chapter. However, for convenience, the steps are illustrated here.

The desired sets of arrays should be activated in the Arrays/Phenotypes component. This is done by checking the boxes by the desired Sets.

T-test Set activation BCell.png


The classification can be made directly by left-clicking on the "thumb-tack" icon adjacent to an array set name.

T-test Set classification left click Bcell.png


The array classification can also be set by right-clicking on the desired array set and selecting "Classification":


T-test Set classification right click Bcell.png


Using either method, the desired array set can be classified as "Case":


T-test Set selection BCell.png


The thumbtack icon next to an Array Set classified as "Case" is colored red.

Set Analysis Parameters

  • From the Analysis Panel, select T-Test Analysis.
  • Various parameters can be adjusted as desired. Here we will use the Standard Bonferroni method, which is the most conservative of the corrections offered.
  • Alpha-corrections tab: Standard Bonferroni.

T t-test bonferroni BCELL webm qldm.png


  • P-Value Parameters tab: p-values based on t-distribution. Note that the default alpha (critical p-value) is set to 0.01.

T t-test p-values.png


  • Degree of Freedom tab: Welch approximation - unequal group variances.

T t-test dof.png


After all the parameters have been set, click Analyze. The results will be returned in three locations: the Projects Folder, the Markers component, and the Visualization area.

t-Test Results

The result is placed into the Projects Folder as a child of the microarray dataset that was analyzed.

T t-test ProjectFolders result.png


The results are displayed by default using the Volcano Plot visualizer.

T t-test volcano BCELL webm qldm.png


The adjacent tab provides a Color Mosaic showing all of the arrays and the p-value calculated for each marker. It can also display annotations for each marker.

T t-test colormosaic BCELL webm qldm.png


The markers that met the significance threshold are placed into a new Marker Set labeled "Significant Genes" in the Markers component. The number of markers in the set is also shown.

T t-test Markers BCELL result.png
