GSEA
Overview
Gene Set Enrichment Analysis (Subramanian et al., 2005)
Prerequisites
- The "GSEA Analysis" and "GSEA Browser" components must be loaded in the Component Configuration Manager.
- An expression dataset must be loaded in the Workspace.
- Exactly two array sets must be activated in the Arrays component. They do not need to be marked "Case" or "Control"; such markings have no effect. These two sets define the classes used to calculate a measure of differential expression, from which the rank order of genes is derived.
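The way two activated array sets lead to a ranked gene list can be sketched as follows. This is a minimal illustration, not the code the GSEA module runs: it assumes the Pearson metric is computed as the correlation between each gene's expression profile and a 1/0 class template, and the names `pearson`, `rank_genes`, and the gene and array labels are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_genes(expression, class_a, class_b):
    """Rank genes by correlation with a class template (assumed metric).

    expression: gene -> {array name -> expression value}
    class_a, class_b: the array names in the two activated array sets
    Returns a list of (gene, score) pairs in descending score order.
    """
    arrays = list(class_a) + list(class_b)
    template = [1.0] * len(class_a) + [0.0] * len(class_b)
    scores = {
        gene: pearson([values[a] for a in arrays], template)
        for gene, values in expression.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four arrays, two per class.
expression = {
    "GENE_UP": {"a1": 9.0, "a2": 8.0, "b1": 2.0, "b2": 1.0},
    "GENE_FLAT": {"a1": 5.0, "a2": 5.0, "b1": 5.0, "b2": 5.0},
    "GENE_DOWN": {"a1": 1.0, "a2": 2.0, "b1": 8.0, "b2": 9.0},
}
ranked = rank_genes(expression, ["a1", "a2"], ["b1", "b2"])
```

Genes that are higher in the first array set end up at the top of the list and genes that are higher in the second set at the bottom, which is why the Case/Control marking itself is not needed.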
Parameters
Required Parameters
- select gene set database - Choose a gene set database from the GSEA website.
- upload gene set database - Upload a gene set database file (.gmt, .gmx, or .grp) if your gene set is not listed as a choice for the gene set database parameter.
- collapse probe sets - Select yes to have GSEA collapse the probe sets for each gene in the expression dataset into a single vector, identified by the gene symbol.
- select chip platform - Choose the annotation ("Chip") file that matches the expression dataset loaded in the Workspace.
- upload chip platform - Upload a chip file if your chip is not listed as a choice for the chip platform parameter.
- permutation type - Type of permutation to perform.
  - phenotype - Permute arrays between the two phenotype classes (preferred).
  - gene set - Choose random gene sets of the same size as the one being tested.
- number of permutations - Number of permutations to perform.
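When preparing a file for the "upload gene set database" parameter, it can help to check that it parses as expected. The sketch below assumes the usual .gmt layout, in which each line is tab-delimited and holds a gene set name, a description, and then the member genes. The function name `parse_gmt` and the sample set names are hypothetical; this is not part of geWorkbench.

```python
def parse_gmt(text):
    """Parse .gmt-formatted text into {gene set name: [gene symbols]}.

    Assumed layout per line: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    Lines with fewer than three fields (including blank lines) are skipped.
    """
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        name, _description, *genes = fields
        # Drop empty fields left by trailing tabs.
        gene_sets[name] = [g for g in genes if g]
    return gene_sets


# Hypothetical two-line gene set database.
sample = "SET_A\tfirst example set\tG1\tG2\nSET_B\tna\tG3\n"
gene_sets = parse_gmt(sample)
```

To check a file on disk, read it with `open(path).read()` and pass the result to `parse_gmt`.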
Basic Parameters
- scoring scheme - The statistic used to score hits (gene set members) and misses (non-members).
  - classic
  - weighted
  - weighted_p2
  - weighted_p1.5
- metric for ranking genes - Class separation metric. Gene markers are ranked using this metric to produce the gene list.
  - Cosine
  - Euclidean
  - Manhattan
  - Pearson
- min gene set size - Gene sets smaller than this are excluded from the analysis.
- max gene set size - Gene sets larger than this are excluded from the analysis.
- gene list ordering mode - Direction in which the gene list should be ordered.
  - descending
  - ascending
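The scoring scheme options can be read as different exponents in the running-sum statistic described by Subramanian et al. (2005). The following is a minimal sketch under that assumption, not the module's implementation: the mapping of option names to exponents (classic = 0, weighted = 1, weighted_p1.5 = 1.5, weighted_p2 = 2) and the names `SCORING_EXPONENT` and `enrichment_score` are assumptions made for illustration.

```python
# Assumed mapping from scoring scheme option to the exponent p.
SCORING_EXPONENT = {
    "classic": 0.0,
    "weighted": 1.0,
    "weighted_p1.5": 1.5,
    "weighted_p2": 2.0,
}


def enrichment_score(ranked, gene_set, scheme="weighted"):
    """Return the running-sum deviation from zero with the largest magnitude.

    ranked: list of (gene, score) pairs, already in gene list order
    gene_set: iterable of gene symbols in the set being tested
    """
    p = SCORING_EXPONENT[scheme]
    members = set(gene_set)
    # Each hit is weighted by |score|^p; with p = 0 every hit counts equally.
    hit_weights = [abs(s) ** p if g in members else 0.0 for g, s in ranked]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for g, _ in ranked if g not in members)
    if total_hit == 0 or n_miss == 0:
        return 0.0
    running = 0.0
    best = 0.0
    for (gene, _), weight in zip(ranked, hit_weights):
        if gene in members:
            running += weight / total_hit  # step up on a hit
        else:
            running -= 1.0 / n_miss        # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list, descending order.
ranked_example = [("A", 3.0), ("B", 2.0), ("C", 1.0),
                  ("D", -1.0), ("E", -2.0), ("F", -3.0)]
es_top = enrichment_score(ranked_example, {"A", "B"}, scheme="classic")
es_bottom = enrichment_score(ranked_example, {"E", "F"}, scheme="classic")
```

A set concentrated at the top of the list yields a positive score and a set concentrated at the bottom a negative one. The min and max gene set size parameters would be applied before this step, by dropping sets whose size falls outside the allowed range.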
Advanced Parameters
- collapse mode - Collapsing mode for probe sets with more than one match.
  - max probe
  - median of probes
- normalization mode - Normalization to apply.
  - none
  - meandiv
- randomization mode - Type of phenotype randomization (does not apply to gene set permutations).
  - no balance
  - equalize and balance
- omit features with no symbol match - If there is no known gene symbol match for a probe set, omit it from the collapsed dataset.
  - yes
  - no
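The collapse mode and "omit features with no symbol match" options can be illustrated with a short sketch. It assumes that probe sets mapping to the same gene symbol are combined array by array, taking either the maximum or the median; the module's exact rule may differ. The function `collapse_probes` and the probe and gene names are hypothetical.

```python
from statistics import median


def collapse_probes(probe_values, probe_to_symbol, mode="max probe",
                    omit_unmatched=True):
    """Collapse probe-level rows into one row per gene symbol.

    probe_values: probe id -> list of expression values, one per array
    probe_to_symbol: probe id -> gene symbol (probes may be missing)
    mode: "max probe" or "median of probes"
    omit_unmatched: drop probes without a symbol; otherwise keep them
        under their probe id
    """
    if mode == "max probe":
        combine = max
    elif mode == "median of probes":
        combine = median
    else:
        raise ValueError(f"unknown collapse mode: {mode}")

    grouped = {}
    for probe, values in probe_values.items():
        symbol = probe_to_symbol.get(probe)
        if symbol is None:
            if omit_unmatched:
                continue
            symbol = probe
        grouped.setdefault(symbol, []).append(values)

    # zip(*rows) walks the arrays column by column across a gene's probes.
    return {
        symbol: [combine(column) for column in zip(*rows)]
        for symbol, rows in grouped.items()
    }


# Hypothetical data: two probes for GENE1, one probe with no symbol.
probe_values = {"p1": [1.0, 5.0], "p2": [3.0, 2.0], "p3": [4.0, 4.0]}
probe_to_symbol = {"p1": "GENE1", "p2": "GENE1"}
collapsed_max = collapse_probes(probe_values, probe_to_symbol)
collapsed_median = collapse_probes(probe_values, probe_to_symbol,
                                   mode="median of probes")
```

With `omit_unmatched=False`, the unmatched probe `p3` is kept in the result under its own identifier instead of being dropped.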
GenePattern Server Settings
You can run the analysis on any running GenePattern server, provided it has the required module installed. The "GenePattern Server Settings" tab has the following fields:
- Protocol - http
- Host - Enter the URL of an available GenePattern server.
- Port - Enter the port number of the GenePattern server you are using.
- Username - Enter a GenePattern username.
- Password - Enter the GenePattern password for the username, if required.
Results
Technical Note
The GSEA components are found in the "gpmodule_v3_0" package in the geWorkbench component source tree.
References
- Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 102(43):15545-50. PubMed 16199517
- GenePattern modules documentation.
- GSEA v14 online documentation.
- Guide to interpreting GSEA Results