Annotation Dependencies

Revision as of 16:02, 18 June 2010 by Smith

geWorkbench can currently read only annotation files that follow the Affymetrix annotation file format. The table below lists the column headers that individual geWorkbench components require.

  • All components require "Probe Set ID".
Component | Columns required
CNKB | Entrez Gene
Gene Ontology | Gene Ontology Biological Process, Gene Ontology Cellular Component, Gene Ontology Molecular Function
Marker Annotations | (not listed)
Sequence Retrieval (EBI) | (not listed)
Sequence Retrieval (Santa Cruz) | (not listed)
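
The requirements in the table can be expressed as a small lookup, which may be handy for checking an annotation file before loading it. The following is a minimal Python sketch and is not part of geWorkbench; the names `REQUIRED_COLUMNS`, `ALWAYS_REQUIRED`, and `missing_columns` are hypothetical. Components with no columns listed in the table are checked only for "Probe Set ID".

```python
# Hypothetical helper, not geWorkbench code.
# Column requirements copied from the table on this page. Components
# with no columns listed (Marker Annotations, both Sequence Retrieval
# components) are deliberately left out of this mapping.
REQUIRED_COLUMNS = {
    "CNKB": ["Entrez Gene"],
    "Gene Ontology": [
        "Gene Ontology Biological Process",
        "Gene Ontology Cellular Component",
        "Gene Ontology Molecular Function",
    ],
}

# Every component requires this column.
ALWAYS_REQUIRED = ["Probe Set ID"]


def missing_columns(component, headers):
    """Return the required column names that are absent from headers.

    The result keeps the order in which the requirements are declared.
    Unknown components are checked only against ALWAYS_REQUIRED.
    """
    needed = ALWAYS_REQUIRED + REQUIRED_COLUMNS.get(component, [])
    present = set(headers)
    return [name for name in needed if name not in present]
```

For example, `missing_columns("CNKB", ["Probe Set ID", "Gene Symbol"])` reports that "Entrez Gene" is missing, while an empty result means the file carries every column listed for that component.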


All Affymetrix annotation file headers:

  • Probe Set ID
  • GeneChip Array
  • Species Scientific Name
  • Annotation Date
  • Sequence Type
  • Sequence Source
  • Transcript ID(Array Design)
  • Target Description
  • Representative Public ID
  • Archival UniGene Cluster
  • UniGene ID
  • Genome Version
  • Alignments
  • Gene Title
  • Gene Symbol
  • Chromosomal Location
  • Unigene Cluster Type
  • Ensembl
  • Entrez Gene
  • SwissProt
  • EC
  • OMIM
  • RefSeq Protein ID
  • RefSeq Transcript ID
  • FlyBase
  • AGI
  • WormBase
  • MGI Name
  • RGD Name
  • SGD accession number
  • Gene Ontology Biological Process
  • Gene Ontology Cellular Component
  • Gene Ontology Molecular Function
  • Pathway
  • InterPro
  • Trans Membrane
  • QTL
  • Annotation Description
  • Annotation Transcript Cluster
  • Transcript Assignments
  • Annotation Notes
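
To see which of these headers a given file actually contains, the header row can be read directly. The sketch below is an illustration in Python, not geWorkbench's own parser. It assumes the file is comma-separated and that any metadata lines before the header row start with "#"; both are assumptions that should be checked against the actual file. The function name `read_headers` and the sample values are hypothetical.

```python
import csv
import io


def read_headers(lines):
    """Return the column headers from the first non-blank, non-comment line.

    `lines` is any iterable of text lines, such as an open file.
    Lines starting with "#" are treated as metadata and skipped
    (an assumption about the file layout, not a documented rule).
    """
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    reader = csv.reader(data_lines)
    return next(reader, [])


# Made-up sample content for illustration only.
sample = io.StringIO(
    "#%example-key=example-value\n"
    '"Probe Set ID","Gene Symbol","Entrez Gene"\n'
    '"probe_1","GENE1","12345"\n'
)
headers = read_headers(sample)
```

With a real file, the same function can be called as `read_headers(open(path, newline=""))`, and the returned list compared against the header names above.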