Difference between revisions of "User:Smith"

 
Design and outline of tutorials for geWorkbench
  
Tutorial Design considerations -
1. Probably best not to use detailed section numbers, since we cannot autoupdate them in this wiki. Instead, rely on links?
2. Each section should list example data files needed, and these should be part of distribution.
 
  
  
Outline for tutorials
        2.1 Before You Begin
 
        2.2 Getting Started
 
              Is caWorkbench downloaded and installed?  Link to download and installation
 
              Important concepts:
 
                  Use of activated phenotype and marker panels throughout application.
 
                      - if no panels are activated, the "Activated Arrays" and "Activated Markers" check boxes should have no effect.
 
                      - if gene or phenotype panels are activated, then these check boxes should control what is used or displayed-
 
                        -- if one of the boxes is checked, only activated markers or arrays will be used.
 
                        -- if the box is not checked, then ('''in most cases - are there any exceptions?''') the gene or phenotype panels will be ignored and all arrays or markers will be used.
 
                      Note that there is a new "plot" button that is available only when a gene panel is active.
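
The check-box rule described above can be sketched in a few lines of Python. This is only an illustration of the intended behaviour, not geWorkbench code: the function name `items_in_use` and its parameters are hypothetical, and any exceptions to the "box not checked" case are not modelled.

```python
def items_in_use(all_items, activated_panels, use_activated_only):
    """Return the markers (or arrays) a component should work on.

    all_items          -- every marker or array in the dataset, in order
    activated_panels   -- list of sets, one per activated panel (hypothetical)
    use_activated_only -- state of the "Activated Markers"/"Activated Arrays" box
    """
    # No activated panels: the check box has no effect, everything is used.
    if not activated_panels:
        return list(all_items)
    # Box not checked: panels are ignored and everything is used.
    if not use_activated_only:
        return list(all_items)
    # Box checked: keep only items that belong to some activated panel.
    active = set().union(*activated_panels)
    return [item for item in all_items if item in active]
```

The same function would serve for both markers and arrays, since the rule in the notes is stated identically for the two check boxes.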
 
              The  menu bar - point out that some commands are available both from the menu bar and by right-clicking on a dataset....
 
  
        2.3 Loading Data
              2.3.1 File types supported
 
                    Expression
 
                        Affymetrix MAS5/GCOS (text files output by Affymetrix software)
 
                        Affymetrix File Matrix (.exp)(a geWorkbench defined format)
 
                        RMAExpress Processed File
 
                        GenePix
 
                        Note - the type "Normalized no-confidence expression matrix" has switched the phenotype and gene labels - don't use until fixed.
 
                    Genotypic
 
                        Genotypic data files - is this working?
 
                    Sequence
 
                        Fasta
 
                    Pattern Detection
 
                        Pattern Files
 
  
              2.3.2 Loading MAS5/GCOS type files
                        Use the 10 cardiomyopathy files from Harvard.
 
                        What happens the first time a new chip-type is loaded - how long does it take, what is happening, what internal files are being built?
 
              2.3.3 Merging loaded data
 
 
            [  '''These examples not really needed.....'''
 
              2.3.4 Loading matrix format files
 
                        Include webmatrix2000G?, webmatrix4000G? and webmatrix.exp?
 
                    Note - explain matrix format in an appendix
 
              2.3.5 Other file types supported: loading RMAExpress files
 
                    Must generate an example RMAExpress file, start with harvard cardio files?
 
            ]
 
        2.4 Working with Marker and Phenotype Panels
 
                    Use the cardiomyopathy dataset created in 2.3
 
                2.4.1 Creating Phenotype Panels
 
                2.4.2 Assigning Case/Control status
 
                2.4.3 Activating a phenotype panel
 
                2.4.4 Creating Gene/Marker Panels
 
                2.4.5 Activating a gene/marker panel
 
        2.5 Saving data files
 
                Use the cardiomyopathy dataset annotated in 2.4
 
                2.5.1 Save to matrix file
 
               
 
                 
 
  
          2.6 Visualize Gene Expression
                Microarray Panel
 
                  Point out intensity and array sliders, color key and array name.
 
                Color Mosaic
 
                  Point out that the mosaic is only displayed once the "Display" button is pushed.
 
                  Point out intensity, accession, gene height and width controls.
 
                  ??Explain whether remaining controls work or not: Pat,Abs,Ratio.???
 
                Expression Profiles
 
                  - displays expression level against array number. Each marker is a separate color line.
 
                Expression Value Distribution
 
                  - for a single array, plots expression value against marker number.
 
          2.7 Filter and Normalize Data
 
                2.7.1 Normalize
 
                2.7.2 Filter
 
          2.8 Clustering Gene Expression Data
 
          2.9 Differential Expression
 
                2.9.1 T Test
 
          2.10 Regulatory Network
 
          2.11 Integrated Annotation Information
 
          2.12 Enrichment Analysis
 
          2.13 Sequence Analysis
 
          2.14 Pattern Discovery
 
          2.15 Promoter Analysis
 
  
  
==Tutorial: Hierarchical Clustering==
 
  
===Preliminary Filtering and Normalization===
  
  
The file "webmatrix.exp" contains results from 100 Affymetrix HG-U95 chips containing B-cell samples from numerous different disease states (phenotypes). 12600 markers are represented. To prepare this dataset for clustering we will filter and normalize the data. The steps shown are just an example of how filtering and normalization can be used, and each dataset should be handled according to the type of analysis being undertaken and its goals.
  
For this dataset, we performed the following steps:
 
  
1. Applied '''Expression Threshold Filter''' to remove very low expression values in the range 0-20.
  
2. Applied the '''Missing Values Filter''' with a maximum number of missing values per marker of 2. (Deletes markers with more than 2 missing values).  This reduced the number of markers to 6327.
 
  
3. Performed '''Quantile Normalization''' using '''Averaging Method''' of '''Mean Marker Profile'''.
 
  
4. Applied the '''Deviation Filter''' with Deviation Bound of 20 and '''Missing Values''' set to '''Marker Average'''.
  
5. Applied the '''Missing Values Filter''' as in (2), which further reduced the number of markers to 6270.
  
The resulting dataset was named '''webmatrix_fn.exp''' and is available for download.
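
For readers who want to see the logic of the five steps outside the application, here is a rough Python sketch on a tiny made-up matrix (rows are markers, columns are arrays, `None` is a missing value). It is not geWorkbench's implementation, and several points are assumptions: that the threshold filter marks in-range values as missing, that "Mean Marker Profile" and "Marker Average" mean filling a missing value with that marker's mean, that the deviation is the sample standard deviation, and that the deviation filter flags low-variation markers as missing so the final missing-values filter removes them. All function names are hypothetical.

```python
from statistics import mean, stdev

def threshold_filter(rows, low=0.0, high=20.0):
    # Step 1 (assumed): values inside [low, high] become missing.
    return [[None if v is not None and low <= v <= high else v for v in row]
            for row in rows]

def missing_values_filter(rows, max_missing=2):
    # Steps 2 and 5: drop markers with more than max_missing missing values.
    return [row for row in rows if sum(v is None for v in row) <= max_missing]

def fill_with_marker_mean(rows):
    # Assumed meaning of "Marker Average": use the marker's own mean.
    filled = []
    for row in rows:
        present = [v for v in row if v is not None]
        avg = mean(present) if present else 0.0
        filled.append([avg if v is None else v for v in row])
    return filled

def quantile_normalize(rows):
    # Step 3: give every array (column) the same value distribution.
    if not rows:
        return []
    filled = fill_with_marker_mean(rows)
    columns = [[row[j] for row in filled] for j in range(len(filled[0]))]
    # Mean of the k-th smallest value across all arrays, for each rank k.
    rank_means = [mean(vals) for vals in zip(*(sorted(c) for c in columns))]
    out = [list(row) for row in rows]
    for j, col in enumerate(columns):
        # Ties are broken by row order; no tie averaging in this sketch.
        order = sorted(range(len(col)), key=col.__getitem__)
        for rank, i in enumerate(order):
            if rows[i][j] is not None:  # missing values stay missing
                out[i][j] = rank_means[rank]
    return out

def deviation_filter(rows, bound=20.0):
    # Step 4 (assumed): markers varying less than `bound` are flagged missing.
    out = []
    for row, filled in zip(rows, fill_with_marker_mean(rows)):
        low_variation = len(filled) < 2 or stdev(filled) < bound
        out.append([None] * len(row) if low_variation else list(row))
    return out

def prepare(rows):
    rows = threshold_filter(rows)
    rows = missing_values_filter(rows)
    rows = quantile_normalize(rows)
    rows = deviation_filter(rows)
    return missing_values_filter(rows)

toy = [
    [100, 200, 300, 400],
    [5, 10, 15, 500],      # three values fall in 0-20, so step 2 drops it
    [50, 51, 52, 53],      # nearly flat, so steps 4 and 5 drop it
    [400, 300, 200, 100],
]
result = prepare(toy)
```

On the toy matrix two of the four markers survive. The real dataset should of course be processed with the geWorkbench filters themselves, whose exact treatment of ties and missing values may differ from this sketch.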
 
 
 
 
Hierarchical Clustering
 
 
 
 
 
 
 
[[Image:T_Analysis_FHC.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_Dendrogram_Clusters.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_Dendrogram_SelectCluster.png.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_Dendrogram_ClusterDetailAdd.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_GenePanel_ClusterTree.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_ProjectFolder_HierarchClust.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_MarkerAnnotations_ClusterTree.png.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_CGAP_Page_for_NME1.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_caBIO_Pathways_h_ndkDynamin.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_SeqeunceRetriever_ClusterTree.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_ProjectFolder_ClusterSeqs.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_PatternDiscovery_Run.png]]
 
 
 
'''separator'''
 
 
 
[[Image:T_PatternDiscovery_Result.png]]
 
 
 
 
 
 
 
 
 
'''separator'''
 
 
 
[[Image:T_ProjectFolder_PatternDiscovery.png]]
 
 
 
 
 
 
 
'''separator'''
 

Latest revision as of 13:11, 6 August 2013

Resources:

http://geworkbench.org = http://wiki.c2b2.columbia.edu/workbench

http://wiki.c2b2.columbia.edu/workbook/index.php/Genomics_Workbook

https://sharepoint.c2b2.columbia.edu/c2b2/default.aspx

http://wiki.c2b2.columbia.edu/mantis/

http://wiki.c2b2.columbia.edu/mantis/view_all_bug_page.php

http://wiki.c2b2.columbia.edu/mantis/login_page.php

http://wiki.c2b2.columbia.edu/isrce/index.php/MARINa,_IDEA,_CUPID_Grid_Service_Implementation


http://gforge.nci.nih.gov

http://gforge.nci.nih.gov/projects/geworkbench

http://wiki.c2b2.columbia.edu/informatics/ same as (http://helpdesk.cu-genome.org/informatics/)


ICTVdb


http://wiki.c2b2.columbia.edu/ictvdb/

nonpublic documents:

adcvs.cu-genome.org:/cvs/magnet