SAEC exec sum

SAEC Executive summary of data preparation

"[+]" denotes hidden additional information. Clicking on the "+" shows that information. "[*]" denotes available mouse-over information.

Data reformatting

The original Illumina data, which came in four comma-separated files [+], were divided up by subject and stored in separate files [**]. Four duplicate subjects in the Illumina records were removed based on their call rate [+].

PGx40001_12278-DNA.csv
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12276-DNA.csv
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12277-DNA.csv
PGx40001_GSK_SJS_B137_28Aug2007_Genotype_Report_12914-DNA.csv

WG0012277-DNA_A10_2948_A10 and WG0012277-DNA_F04_2948_F04.
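
The splitting and duplicate-removal steps described in this section might look roughly like the sketch below. It is not the script that was actually used. The column names (`Sample ID`, `Allele1`, `Allele2`), the missing-call symbol `-`, the demo sample IDs, and the rule "keep the duplicate with the higher call rate" are all assumptions made for illustration; the real genotype reports may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature genotype report; the column names are assumed.
DEMO_CSV = """SNP Name,Sample ID,Allele1,Allele2
rs1,S1,A,A
rs2,S1,-,-
rs1,S1_dup,A,A
rs2,S1_dup,A,G
rs1,S2,G,G
rs2,S2,A,G
"""


def split_by_subject(rows, subject_key="Sample ID"):
    """Group genotype rows by subject identifier."""
    by_subject = defaultdict(list)
    for row in rows:
        by_subject[row[subject_key]].append(row)
    return dict(by_subject)


def call_rate(rows, allele_keys=("Allele1", "Allele2"), missing="-"):
    """Fraction of rows in which both alleles are called."""
    if not rows:
        return 0.0
    called = sum(1 for r in rows if all(r[k] != missing for k in allele_keys))
    return called / len(rows)


def drop_lower_call_rate(by_subject, duplicate_pairs):
    """For each pair of duplicate sample IDs, drop the one with the lower
    call rate (assumed rule; on a tie the second ID of the pair is dropped)."""
    kept = dict(by_subject)
    for a, b in duplicate_pairs:
        loser = a if call_rate(by_subject[a]) < call_rate(by_subject[b]) else b
        kept.pop(loser, None)
    return kept


rows = list(csv.DictReader(io.StringIO(DEMO_CSV)))
groups = split_by_subject(rows)
kept = drop_lower_call_rate(groups, [("S1", "S1_dup")])
```

In practice each entry of `kept` would then be written to its own per-subject file; that step is left out here because the output layout is not described on this page.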

PLINK input data

info on reformatting

removing SNPs without founder genotype

link to file
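
As a rough illustration of "removing SNPs without founder genotype", the sketch below scans PED-style lines and reports the SNPs for which no founder has a called genotype. It assumes the standard PED layout (family ID, individual ID, paternal ID, maternal ID, sex, phenotype, then two allele columns per SNP), with `0` as the missing allele and founders identified by paternal and maternal IDs both being `0`. The function names and demo data are hypothetical, and this is not PLINK's own implementation.

```python
def is_founder(fields):
    """A founder has paternal ID (column 3) and maternal ID (column 4) set to '0'."""
    return fields[2] == "0" and fields[3] == "0"


def snps_without_founder_genotype(ped_lines, missing="0"):
    """Return 0-based indices of SNPs with no called genotype in any founder."""
    rows = [line.split() for line in ped_lines if line.strip()]
    if not rows:
        return []
    n_snps = (len(rows[0]) - 6) // 2  # six leading columns, two alleles per SNP
    founders = [f for f in rows if is_founder(f)]
    flagged = []
    for i in range(n_snps):
        a = 6 + 2 * i
        # A genotype counts as missing if either allele is the missing symbol.
        if all(f[a] == missing or f[a + 1] == missing for f in founders):
            flagged.append(i)
    return flagged


# Hypothetical trio: two founders and one child, three SNPs.
DEMO_PED = [
    "F1 dad 0 0 1 1  A A  0 0  G G",
    "F1 mom 0 0 2 1  A G  0 0  0 0",
    "F1 kid dad mom 1 2  A A  C C  G G",
]

flagged = snps_without_founder_genotype(DEMO_PED)
```

Here the second SNP is flagged: the child is genotyped, but neither founder is. The flagged indices would then be mapped to SNP names via the accompanying MAP file before exclusion.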

After these obvious "outliers" were removed, a number of analyses were performed (see results page) that identified SNPs and subjects with inconsistencies. Those are listed below:

Individuals removed

Removed because of concordance and ethnic inconsistencies (link to results).

SNPs removed

SNPs were removed based on more than one criterion. Which ones? Where are the results?