Workspace


Revision as of 14:30, 24 February 2010



Outline

In this tutorial, you will learn how to:

  • Create a new Project.
  • Rename a project and/or project node.
  • Remove a project and/or project node.
  • Save project files that you have created.



Workspaces and Projects

In the Project Folders component there is a top-level object called a workspace. The workspace can contain one or more separate projects, and each project can contain opened data files and analysis results. An analogy might be that a workspace is like a drawer in a filing cabinet, and projects are individual folders in that drawer. Projects allow data to be grouped, for example by experiment. A project can contain many different types of data, for example microarray data, FASTA sequence files and graphical images. The workspace as a whole, with all its projects and data nodes, can be saved and restored. However, only one workspace can be open at one time.
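The containment described above (one workspace, holding projects, holding data nodes) can be pictured with a minimal sketch. The class names, fields, and file names below are hypothetical and chosen only for illustration; they are not geWorkbench's own classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataNode:
    """An opened data file or an analysis result (hypothetical model)."""
    name: str
    kind: str  # e.g. "microarray", "fasta", "image"


@dataclass
class Project:
    """A folder-like grouping of data nodes, for example one per experiment."""
    name: str
    nodes: List[DataNode] = field(default_factory=list)


@dataclass
class Workspace:
    """The single top-level container; it holds one or more projects."""
    projects: List[Project] = field(default_factory=list)

    def new_project(self, name: str) -> Project:
        # Create an empty project and register it in this workspace.
        project = Project(name)
        self.projects.append(project)
        return project


# One workspace, one project, two data nodes of different types.
workspace = Workspace()
experiment = workspace.new_project("Experiment 1")
experiment.nodes.append(DataNode("arrays.txt", "microarray"))
experiment.nodes.append(DataNode("promoters.fasta", "fasta"))
```

In this picture, saving the workspace would mean serializing the whole tree at once, which matches the statement that the workspace is saved and restored as a unit.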


Supported data formats

  • Microarray
    • Affymetrix MAS5/GCOS files - produced by the Affymetrix data analysis programs.
    • Affymetrix File Matrix - a spreadsheet-type multi-experiment format; this is the native file type created by geWorkbench from merged datasets. There are two data columns per array: the first contains the signal value, and the second contains either a p-value or an Affymetrix Present/Marginal/Absent call. The header format for this file is complex.
    • Tab-delimited text (RMAExpress file or GEO series matrix) - A simple columnar file format. geWorkbench can read files in this format produced by RMAExpress and in the GEO series matrix format. They differ slightly in the headers.
    • Genepix .GPR files - Produced by a popular analysis program for two-color microarrays.
    • Affymetrix CEL files - these files of probe level data can be viewed graphically in geWorkbench but not used directly for analysis.
  • Other
    • FASTA files - DNA or amino-acid sequence files in FASTA format.
    • PDB files - protein 3-dimensional structure files can be viewed in the JMol Viewer in geWorkbench.
    • NetBoost Edge List - used by a component still under development.
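The two-columns-per-array layout of the Affymetrix File Matrix format can be illustrated with a small sketch. It assumes that the header lines and the leading marker-identifier column have already been stripped, so that only the alternating data columns of one row remain; the function name and the sample row are hypothetical, and this is not a parser for the actual file format.

```python
def split_array_columns(values):
    """Pair up alternating (signal, detection) columns for one marker row.

    `values` is a list of strings holding only the data columns of one row.
    """
    if len(values) % 2 != 0:
        raise ValueError("expected two data columns per array")
    pairs = []
    for i in range(0, len(values), 2):
        signal = float(values[i])
        # The second column is kept as text: it may be a p-value or a
        # detection call such as "P", "M" or "A".
        detection = values[i + 1]
        pairs.append((signal, detection))
    return pairs


# A made-up row covering three arrays.
row = "120.5\tP\t98.1\tA\t33.0\tM".split("\t")
arrays = split_array_columns(row)
```

Keeping the second column as text avoids guessing whether a given file stores p-values or calls.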

Microarray data and merging datasets

When working with microarray data, all data to be analyzed must be present within one data node in a project. If the data exist as multiple files, each containing the results from a single array, they must be merged into a single node before they can be used. geWorkbench can perform this merging step either when the data is read in or later as a separate step. Once merged, the dataset can be saved to disk; it will be saved in the geWorkbench matrix file format.

Data merging will be covered in the local and remote data tutorials.
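Conceptually, merging combines several single-array results into one marker-by-array table. The sketch below illustrates that idea only; it is not how geWorkbench implements merging. The function name and the sample data are hypothetical, and the requirement that all arrays share the same marker set is an assumption made for this example.

```python
def merge_arrays(arrays):
    """Merge per-array {marker: value} dicts into one marker-by-array table.

    `arrays` maps an array name to that array's {marker_id: signal} dict.
    Returns (array_names, table), where table[marker] lists one value per
    array in the same order as array_names.
    """
    names = list(arrays)
    if not names:
        return [], {}
    markers = sorted(arrays[names[0]])
    for name in names[1:]:
        # Assumption for this sketch: every array covers the same markers.
        if set(arrays[name]) != set(markers):
            raise ValueError(f"array {name!r} has a different marker set")
    table = {m: [arrays[n][m] for n in names] for m in markers}
    return names, table


# Two made-up single-array results.
single_arrays = {
    "array_1": {"probe_a": 10.0, "probe_b": 20.0},
    "array_2": {"probe_a": 11.5, "probe_b": 19.0},
}
names, merged = merge_arrays(single_arrays)
```

Each row of the merged table then corresponds to one marker and each column to one array, which is the shape a single data node needs for analysis.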

Tutorial: Working with Projects

Creating a new project

All data must belong to a project. Right-click on the Workspace entry in the Project Folders window at upper left to create a new project.

T NewProject.png


Renaming a project

1. Right-click on the Project folder.

2. Select Rename.


T ProjectFolder RenameProject.png


3. In the pop-up screen, rename your project.

4. Click on the OK button.



Renaming a project data node

1. Right-click on a Project Folder data node.

2. Select Rename.

T RenameNode.png


3. In the pop-up screen, rename your data node.

T ProjectFolder RenameDataset2.png


4. Click on the OK button.


Removing a project

1. Right-click on the Project folder.

2. Select Remove.


Removing a project data node

1. Right-click on the data node.

2. Select Remove.


Saving a data node to a file

Saving a data node is, among other things, how you create a file in the matrix multi-experiment format used by geWorkbench from a merged dataset.

1. Right-click on the data node that you want to save.

2. Click Save.

T NodeOptionsMenu.png


A standard file Save dialog will appear.

3. Choose a location.

4. Enter a name.

5. Click on the Save button.