Difference between revisions of "User:Smith"

 
(21 intermediate revisions by the same user not shown)
Line 1: Line 1:
==Tutorial: Promoter Analysis==
+
==Resources:==
  
   
+
http://geworkbench.org =
 +
http://wiki.c2b2.columbia.edu/workbench
  
==Tutorial: Regulatory Network Reverse Engineering==
+
http://wiki.c2b2.columbia.edu/workbook/index.php/Genomics_Workbook
  
 +
https://sharepoint.c2b2.columbia.edu/c2b2/default.aspx
  
==Tutorial: Integrated Annotation Information ???==
+
http://wiki.c2b2.columbia.edu/mantis/
  
 +
http://wiki.c2b2.columbia.edu/mantis/view_all_bug_page.php
  
==Tutorial: Enrichment Analysis/GO Term component==
+
http://wiki.c2b2.columbia.edu/mantis/login_page.php
  
 +
http://wiki.c2b2.columbia.edu/isrce/index.php/MARINa,_IDEA,_CUPID_Grid_Service_Implementation
  
==Tutorial: Sequence Analysis - BLAST==
 
  
Will provide some scenarios for when you might want to run BLAST queries in the context of geWorkbench - e.g. you have found an interesting marker, retrieved its gene, and want to see what it is related to (though we can't do by-gene queries in the sequence retriever right now).
+
http://gforge.nci.nih.gov
  
Use the file "NM_024426-Wilms.fasta" provided in the tutorial data directory. This is a nucleotide sequence file.  There is a second file which contains the corresponding protein sequence, "NP_077744-Wilms.fasta".
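Before loading a FASTA file, it can help to confirm whether it holds nucleotide or protein sequence. The following is a minimal Python sketch, not part of geWorkbench: the function names `read_fasta` and `looks_like_nucleotide` are hypothetical, the nucleotide test is only a rough heuristic, and the example record is made up rather than taken from "NM_024426-Wilms.fasta".

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} mapping."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequence may span several lines
    return records


def looks_like_nucleotide(seq: str) -> bool:
    """Rough heuristic: True when the sequence uses only A, C, G, T, U or N."""
    return bool(seq) and set(seq.upper()) <= set("ACGTUN")


# Made-up record for illustration only (an assumption, not real tutorial data).
example = [">example_nucleotide made-up record", "ATGGGCTCC", "GACGTGCGG"]
records = read_fasta(example)
```

To check a real file, pass an open file handle instead of the list, e.g. `read_fasta(open("NM_024426-Wilms.fasta"))`.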
+
http://gforge.nci.nih.gov/projects/geworkbench
  
Provide a little background info about Wilms' tumor. (It was chosen at random.)
+
http://wiki.c2b2.columbia.edu/informatics/
 +
same as
 +
(http://helpdesk.cu-genome.org/informatics/)
  
Go to Sequence Alignment.
 
  
Select BLAST.
+
ICTVdb
  
Note that the subsequence displays the length of the longest sequence selected (here there is only one).  It can be used to select a portion of the sequence to use for the query (this probably wouldn't make much sense if more than one sequence is selected for the query).
 
  
Select a program.  Since this is a nucleotide query, we want to select a nucleotide query program such as blastn.
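The choice of program follows from the type of the query and the type of the database. A minimal Python sketch of that rule for the five standard BLAST programs is shown here. The function name `choose_program` and its arguments are hypothetical, and which of these programs geWorkbench actually offers is not confirmed by this tutorial.

```python
# Standard BLAST programs keyed by (query type, database type).
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",   # nucleotide query vs nucleotide database
    ("protein", "protein"): "blastp",         # protein query vs protein database
    ("nucleotide", "protein"): "blastx",      # translated nucleotide query vs protein database
    ("protein", "nucleotide"): "tblastn",     # protein query vs translated nucleotide database
}


def choose_program(query_type: str, database_type: str, translate_both: bool = False) -> str:
    """Return the BLAST program name for a query/database type pair.

    translate_both=True selects tblastx, which compares a translated
    nucleotide query against a translated nucleotide database.
    """
    key = (query_type, database_type)
    if translate_both:
        if key != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and a nucleotide database")
        return "tblastx"
    if key not in BLAST_PROGRAMS:
        raise ValueError(f"unsupported combination: {key}")
    return BLAST_PROGRAMS[key]
```

For the nucleotide file used in this tutorial, `choose_program("nucleotide", "nucleotide")` gives blastn, matching the selection above.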
 
  
Provide a one-line description of each of the different BLAST algorithms. We have this info on the AMDeC website.
+
http://wiki.c2b2.columbia.edu/ictvdb/
  
Now that the program has been selected, note that the appropriate databases are displayed (need to verify this for all algorithms).  Here we will try ncbi/nt - the complete non-redundant nucleotide database.
+
nonpublic documents:
  
Go to the advanced options tab. Make sure the matrix "dna.mat" is selected.  Change the Expect value to 0.01.  We will leave the box checked to use PFP (Paracel Filtering Package) filtering for repeated sequence elements.
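The Expect value acts as a cutoff: only hits whose E-value is at or below the threshold are reported. The effect of setting it to 0.01 can be sketched in Python as follows. The function name `filter_by_expect` and the hit list are hypothetical illustrations, not geWorkbench output.

```python
from typing import List, Tuple


def filter_by_expect(hits: List[Tuple[str, float]], max_expect: float = 0.01) -> List[Tuple[str, float]]:
    """Keep only (target, e_value) pairs whose E-value is at or below the cutoff."""
    return [(target, e_value) for target, e_value in hits if e_value <= max_expect]


# Made-up hits for illustration only.
example_hits = [("target_a", 1e-50), ("target_b", 0.005), ("target_c", 0.5)]
kept = filter_by_expect(example_hits)
```

With these made-up values, `target_c` is dropped because 0.5 is above the 0.01 cutoff.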
+
adcvs.cu-genome.org:/cvs/magnet
 
 
In the Service tab, select Columbia.
 
 
 
Note the text field at the bottom, which shows that one sequence has been selected.  If you have a FASTA file with multiple sequences, you can select the ones you want in the Markers component and activate this selection, letting you search on a subset.  Or, you can search on all the sequences in a file (all markers checkbox, or also by default if no subselection made?  find out).
 
 
 
You can check the server status by hitting the "Refresh" button.  For the Columbia machine, this can give you an idea of how busy it is.
 
 
 
Press the curved arrow submit button.  Observe the progress bar "Blast is running".
 
 
 
When the results return, they are placed in the Project Folders as a child of the sequence they correspond to.  Note that you can mouse over the result set to see how many sequences are in it.  In this case, I found 160.
 
 
 
 
 
In the BLAST results viewer, you can examine the alignments.  (Note that this component is not working right, so there is no point taking a screenshot yet.)  Note that this sequence hits many other target sequences.  Each different target hit is listed on a line in the results table.  A sequence can hit one target sequence in several different places; each is listed as a separate subentry under that target.  Note that there is a bug associated with the information displayed when there are such subhits.
 
 
 
Here you can select sequences to add back to the main project by checking "include" and then pressing "Add selected sequences to project".  You can also add just the aligned parts by hitting the appropriate button.  Note that there is a bug here: it adds the worst, not the best, aligned subsequence for each target.
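The intended behaviour, keeping the best aligned subsequence for each target, can be sketched in Python as follows. This is an illustration under stated assumptions, not the geWorkbench implementation: the `SubHit` class, its fields, and the use of bit score as the measure of "best" are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class SubHit:
    """One aligned region of a target sequence (hypothetical structure)."""
    target: str
    start: int
    end: int
    bit_score: float


def best_subhit_per_target(subhits: Iterable[SubHit]) -> Dict[str, SubHit]:
    """Return, for each target, the subhit with the highest bit score."""
    best: Dict[str, SubHit] = {}
    for hit in subhits:
        current = best.get(hit.target)
        # Replace the stored subhit only when the new one scores strictly higher.
        if current is None or hit.bit_score > current.bit_score:
            best[hit.target] = hit
    return best


# Made-up subhits for illustration only.
example_subhits = [
    SubHit("target_a", 10, 120, 95.0),
    SubHit("target_a", 300, 340, 40.5),
    SubHit("target_b", 1, 80, 60.0),
]
best = best_subhit_per_target(example_subhits)
```

Selecting the minimum instead of the maximum in the comparison would reproduce the behaviour described in the bug above.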
 
 
 
The Load button allows you to load an external BLAST file in HTML format into the viewer.
 

Latest revision as of 13:11, 6 August 2013

Resources:

http://geworkbench.org = http://wiki.c2b2.columbia.edu/workbench

http://wiki.c2b2.columbia.edu/workbook/index.php/Genomics_Workbook

https://sharepoint.c2b2.columbia.edu/c2b2/default.aspx

http://wiki.c2b2.columbia.edu/mantis/

http://wiki.c2b2.columbia.edu/mantis/view_all_bug_page.php

http://wiki.c2b2.columbia.edu/mantis/login_page.php

http://wiki.c2b2.columbia.edu/isrce/index.php/MARINa,_IDEA,_CUPID_Grid_Service_Implementation


http://gforge.nci.nih.gov

http://gforge.nci.nih.gov/projects/geworkbench

http://wiki.c2b2.columbia.edu/informatics/ same as (http://helpdesk.cu-genome.org/informatics/)


ICTVdb


http://wiki.c2b2.columbia.edu/ictvdb/

nonpublic documents:

adcvs.cu-genome.org:/cvs/magnet