Difference between revisions of "SAEC exec sum"

(Data reformatting)
Line 2: Line 2:
 
"[+]" denotes hidden additional information. Clicking on the "+" shows that information. "-" denotes available mouse-over information.
 
"[+]" denotes hidden additional information. Clicking on the "+" shows that information. "-" denotes available mouse-over information.
 
== Data reformatting ==
 
== Data reformatting ==
The original Illumina data that came in four comma-separated files were divided up by subject and stored in separate files <span class="toggleblock" title="located on ~/SJS/Genotypes">[<font>-</font>]</span>
+
The original Illumina data, which came in four comma-separated files <span class="toggleblock" title="csv_files">[<font>+</font><font style="display:none;">–</font>]</span>, were divided up by subject and stored in separate files <span class="toggleblock" title="located on ~/SJS/Genotypes">[<font>*</font><font style="display:none;">*</font>]</span>.
4 subject duplicates in Illumina records were removed <span class="toggleblock" title="removed_Illumina_Subjects">[<font>+</font><font style="display:none;">–</font>]</span>
+
4 subject duplicates in Illumina records were removed based on their call rate<span class="toggleblock" title="removed_Illumina_Subjects">[<font>+</font><font style="display:none;">–</font>]</span>.
  
 +
<div id="csv_files" class="hiddenblock">
 +
PGx40001_12278-DNA.csv
 +
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12276-DNA.csv
 +
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12277-DNA.csv
 +
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12914-DNA.csv
 +
</div>
  
 
<div id="removed_Illumina_Subjects" class="hiddenblock">
 
<div id="removed_Illumina_Subjects" class="hiddenblock">
 
WG0012277-DNA_A10_2948_A10 and WG0012277-DNA_F04_2948_F04.
 
WG0012277-DNA_A10_2948_A10 and WG0012277-DNA_F04_2948_F04.
 
</div>
 
</div>
 +
 
== PLINK input data ==
 
== PLINK input data ==
  

Revision as of 15:37, 18 January 2008

SAEC Executive summary of data preparation

"[+]" denotes hidden additional information. Clicking on the "+" shows that information. "-" denotes available mouse-over information.

Data reformatting

The original Illumina data, which came in four comma-separated files [+], were divided up by subject and stored in separate files [**]. Four subject duplicates in the Illumina records were removed based on their call rate [+].

PGx40001_12278-DNA.csv
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12276-DNA.csv
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12277-DNA.csv
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12914-DNA.csv

WG0012277-DNA_A10_2948_A10 and WG0012277-DNA_F04_2948_F04.
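
The reformatting described above (split the reports by sample, then drop duplicated subjects by call rate) could be sketched roughly as follows. This is only an illustration, not the script that was actually used. The column names `Sample ID`, `Subject` and `Genotype`, the `--` no-call marker, and the rule of keeping the duplicate with the highest call rate are all assumptions; the real Illumina report layout and the exact removal criterion are not given on this page.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; the real Illumina report headers may differ.
SAMPLE_COL = "Sample ID"
SUBJECT_COL = "Subject"
GENOTYPE_COL = "Genotype"
NO_CALL = "--"  # assumed marker for a missing genotype call


def split_by_sample(csv_text):
    """Group the rows of one genotype report by sample ID."""
    rows_by_sample = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows_by_sample[row[SAMPLE_COL]].append(row)
    return dict(rows_by_sample)


def call_rate(rows):
    """Fraction of rows that carry a non-missing genotype call."""
    called = sum(1 for r in rows if r[GENOTYPE_COL] != NO_CALL)
    return called / len(rows)


def drop_duplicate_samples(rows_by_sample):
    """For each subject keep only the sample with the highest call rate
    (assumed rule; the page only says 'based on their call rate')."""
    best = {}  # subject -> (sample ID, call rate)
    for sample, rows in rows_by_sample.items():
        subject = rows[0][SUBJECT_COL]
        rate = call_rate(rows)
        if subject not in best or rate > best[subject][1]:
            best[subject] = (sample, rate)
    kept = {sample for sample, _ in best.values()}
    return {s: r for s, r in rows_by_sample.items() if s in kept}
```

In use, each of the four CSV files would be read as text, passed through `split_by_sample`, and the merged result filtered with `drop_duplicate_samples` before writing one file per remaining sample.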

PLINK input data

removing SNPs without founder genotype

Individuals removed

SNPs removed