Difference between revisions of "SAEC exec sum"
Latest revision as of 15:45, 18 January 2008
SAEC Executive summary of data preparation
"[+]" denotes hidden additional information. Clicking on the "+" shows that information. "[*]" denotes available mouse-over information.
Data reformatting
The original Illumina data, which came in four comma-separated files [+], were divided up by subject and stored in separate files [*] (located on ~/SJS/Genotypes). Four subject duplicates in the Illumina records were removed based on their call rate [+].
PGx40001_12278-DNA.csv PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12276-DNA.csv PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12277-DNA.csv PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12914-DNA.csv
WG0012277-DNA_A10_2948_A10 and WG0012277-DNA_F04_2948_F04.
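The duplicate-removal step can be illustrated with a minimal sketch. It assumes that, among duplicate samples of the same subject, the sample with the highest call rate is kept; the page only says duplicates were removed "based on their call rate", so the direction of the rule, the missing-call code "--", and all function and variable names here are assumptions for illustration, not the procedure actually used.

```python
from collections import defaultdict


def call_rate(genotypes, missing="--"):
    """Fraction of SNPs with a non-missing genotype call.

    The missing-call code "--" is an assumption for this sketch.
    """
    if not genotypes:
        return 0.0
    called = sum(1 for g in genotypes if g != missing)
    return called / len(genotypes)


def drop_duplicates_by_call_rate(samples, subject_of):
    """Keep one sample per subject (assumed rule: highest call rate wins).

    samples:    dict mapping sample id -> list of genotype strings
    subject_of: dict mapping sample id -> subject id
    Returns (kept, removed) as sorted lists of sample ids.
    """
    by_subject = defaultdict(list)
    for sample_id in samples:
        by_subject[subject_of[sample_id]].append(sample_id)

    kept, removed = [], []
    for ids in by_subject.values():
        # Rank the samples of one subject from best to worst call rate.
        ranked = sorted(ids, key=lambda s: call_rate(samples[s]), reverse=True)
        kept.append(ranked[0])
        removed.extend(ranked[1:])
    return sorted(kept), sorted(removed)


# Hypothetical toy data: two samples of subj1, one sample of subj2.
samples = {
    "s1_a": ["AA", "AB", "--", "BB"],
    "s1_b": ["AA", "AB", "BB", "BB"],
    "s2": ["AA", "--", "--", "BB"],
}
subject_of = {"s1_a": "subj1", "s1_b": "subj1", "s2": "subj2"}
kept, removed = drop_duplicates_by_call_rate(samples, subject_of)
```

In this toy example the lower-call-rate duplicate of subj1 ends up in the removed list, while subj2, which has no duplicate, is kept regardless of its call rate.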
PLINK input data
info on reformatting
Removing SNPs without founder genotype
link to file
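As a rough illustration of this filter, the sketch below flags SNPs for which no founder has a genotype call. It assumes PLINK PED conventions (a founder has paternal and maternal IDs both "0", and a missing genotype is coded "0 0"); the in-memory data layout and the function name are hypothetical and do not describe how the filter was actually run.

```python
def snps_without_founder_genotype(pedigree, genotypes, n_snps, missing="0 0"):
    """Return indices of SNPs that have no genotype call in any founder.

    pedigree:  dict mapping individual id -> (paternal id, maternal id)
    genotypes: dict mapping individual id -> list of genotype strings,
               one entry per SNP, in the same SNP order for everyone
    n_snps:    number of SNPs per individual
    """
    # Founders are individuals with both parents unknown ("0").
    founders = [
        ind for ind, (pat, mat) in pedigree.items()
        if pat == "0" and mat == "0"
    ]
    flagged = []
    for j in range(n_snps):
        # Note: with no founders at all, every SNP is flagged.
        if all(genotypes[f][j] == missing for f in founders):
            flagged.append(j)
    return flagged


# Hypothetical trio: two founders and one child.
pedigree = {"dad": ("0", "0"), "mom": ("0", "0"), "kid": ("dad", "mom")}
genotypes = {
    "dad": ["A A", "0 0", "A G"],
    "mom": ["A G", "0 0", "0 0"],
    "kid": ["A A", "A G", "0 0"],
}
flagged = snps_without_founder_genotype(pedigree, genotypes, n_snps=3)
```

Here only the second SNP (index 1) is flagged: both founders are missing it, even though the child has a call. The third SNP is kept because one founder is genotyped.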
Following the removal of these obvious "outliers", a number of analyses were performed (see results page) that identified SNPs and subjects with inconsistencies. Those are listed below:
Individuals removed
Removed because of concordance and ethnic inconsistencies (link to results).
SNPs removed
SNPs were removed based on more than one criterion. Which ones? Where are the results?