SAEC exec sum
SAEC Executive summary of data preparation
Data reformatting
The original Illumina data came in four comma-separated files:
PGx40001_12278-DNA.csv
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12276-DNA.csv
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12277-DNA.csv
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12914-DNA.csv
These were divided up by subject and stored in separate files, one per subject. Four subject duplicates in the Illumina records were removed based on their call rate.
WG0012277-DNA_A10_2948_A10 and WG0012277-DNA_F04_2948_F04.
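The per-subject split and the call-rate comparison between duplicates can be sketched roughly as follows. This is only an illustration, not the script that was actually used. The row keys (`sample_id`, `allele1`, `allele2`), the no-call symbol `-`, and the rule "keep the duplicate with the highest call rate" are all assumptions made for the example.

```python
from collections import defaultdict

NO_CALL = "-"  # assumed symbol for a missing allele call


def split_by_sample(rows):
    """Group genotype rows by sample so each subject can be written to its own file."""
    by_sample = defaultdict(list)
    for row in rows:
        by_sample[row["sample_id"]].append(row)
    return dict(by_sample)


def call_rates(rows):
    """Return {sample_id: fraction of SNPs for which both alleles were called}."""
    called = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        sample = row["sample_id"]
        total[sample] += 1
        if row["allele1"] != NO_CALL and row["allele2"] != NO_CALL:
            called[sample] += 1
    return {sample: called[sample] / total[sample] for sample in total}


def samples_to_remove(rates, duplicate_groups):
    """For each group of sample IDs belonging to one subject, keep the
    sample with the highest call rate and mark the others for removal
    (assumed rule; ties keep the first sample listed)."""
    removed = set()
    for group in duplicate_groups:
        best = max(group, key=lambda sample: rates[sample])
        removed.update(sample for sample in group if sample != best)
    return removed
```

Which sample IDs belong to the same subject has to come from the subject records; the sketch takes that grouping as an input and does not try to infer it.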
PLINK input data
info on reformatting
removing SNPs without founder genotype
link to file
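As a rough illustration of the founder-genotype filter, the sketch below flags SNPs for which no founder has a called genotype. It follows the PLINK PED conventions that a founder has both parental IDs set to "0" and that a missing allele is coded "0" by default. The in-memory layout (`father`, `mother`, `genotypes`) is hypothetical and chosen only for the example.

```python
MISSING = ("0", "0")  # PLINK default coding for a missing genotype


def is_founder(father_id, mother_id):
    """PED convention: both parental IDs equal to "0" marks a founder."""
    return father_id == "0" and mother_id == "0"


def snps_without_founder_genotype(samples, snp_names):
    """Return the SNP names for which every founder genotype is missing.

    samples: list of dicts with keys 'father', 'mother' and 'genotypes',
    where 'genotypes' is a list of (allele1, allele2) tuples aligned
    with snp_names. If there are no founders at all, every SNP is returned.
    """
    founders = [s for s in samples if is_founder(s["father"], s["mother"])]
    flagged = []
    for index, name in enumerate(snp_names):
        if all(s["genotypes"][index] == MISSING for s in founders):
            flagged.append(name)
    return flagged
```

For example, with one founder typed at the first SNP only and one non-founder typed at both, only the second SNP is flagged.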
After the removal of these obvious "outliers", a number of analyses were performed (see results page) that identified SNPs and subjects with inconsistencies. These are listed below:
Individuals removed
Removed because of concordance and ethnic inconsistencies (link to results).
SNPs removed
SNPs were removed based on more than one criterion. Which ones? Where are the results?