CNKB version 2.0 design notes
The CNKB servlet implements a request/response programming model. When a client sends a query request to the servlet, the servlet's doPost() method is called. It constructs a query result based on the client's request and sends it back to the client.
In CNKB v1.0, the interaction query is keyed on the Entrez Gene ID. As a result, interactions whose participants do not come from the "Entrez Gene" data source are never included in the query result.
In CNKB v2.0, the query is implemented as follows to support "Entrez Gene", "uniprot" and other data sources.
CNKB client:
Send query request along with "interactome", "version", "geneId" and "geneName".
CNKB Servlet:
Receive query request and perform the following:
Step1: Get the list of interaction IDs from the database based on entrezId, context and version. The participants of these interactions must come from the "Entrez Gene" data source. An example SQL statement:
SELECT i.id FROM physical_entity pe, db_source ds, interaction_participant ip, interaction i, interaction_interactome_version iiv WHERE ip.participant_id = pe.id AND ip.interaction_id = i.id AND i.id = iiv.interaction_id AND pe.accession_db = ds.id AND ds.name = 'Entrez Gene' AND iiv.interactome_version_id = 1 AND pe.primary_accession = '12345';
Step2: Get the list of interaction IDs from the database based on geneName, context and version. The participants of these interactions must not come from the "Entrez Gene" data source. An example SQL statement:
SELECT i.id FROM physical_entity pe, db_source ds, interaction_participant ip, interaction i, interaction_interactome_version iiv WHERE ip.participant_id = pe.id AND ip.interaction_id = i.id AND i.id = iiv.interaction_id AND pe.accession_db = ds.id AND ds.name <> 'Entrez Gene' AND iiv.interactome_version_id = 1 AND pe.gene_symbol = 'ABCD';
Step3: Get the related interaction rows for the interaction IDs returned by Step1 and Step2. An example SQL statement:
SELECT pe.primary_accession AS primary_accession, ds.name AS accession_db, pe.gene_symbol AS gene_symbol, i.id AS interaction_id, i.confidence_value AS confidence_value, it.name AS interaction_type FROM physical_entity pe, interaction_participant ip, interaction i, interaction_type it, db_source ds WHERE pe.id = ip.participant_id AND pe.accession_db = ds.id AND ip.interaction_id = i.id AND i.interaction_type = it.id AND i.id IN (3456, 7897, 5056, 5567) ORDER BY i.id;
Step4: Send all related interaction rows to the CNKB client. Each row includes: primary_accession, accession_db, gene_symbol, interaction_id, confidence_value, interaction_type.
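The servlet-side steps above can be sketched in Java as follows. This is only an illustration, not the actual servlet code: the class and method names (CnkbQuerySketch, idsByEntrezIdSql, idsByGeneSymbolSql, mergeIds, rowsSql) are hypothetical, and using "?" placeholders instead of literal values is an assumption made here so that the statements can be run through a JDBC PreparedStatement. The sketch builds the Step1 and Step2 statements from a shared join, merges the two ID lists without duplicates, and generates the IN list for Step3.

```java
import java.util.Collections;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper; names are illustrative and not taken from the CNKB code base.
final class CnkbQuerySketch {
    private CnkbQuerySketch() {}

    // Joins shared by Step1 and Step2. The interactome version id is the first parameter.
    private static final String ID_QUERY_BASE =
        "SELECT i.id FROM physical_entity pe, db_source ds, interaction_participant ip, "
      + "interaction i, interaction_interactome_version iiv "
      + "WHERE ip.participant_id = pe.id AND ip.interaction_id = i.id "
      + "AND i.id = iiv.interaction_id AND pe.accession_db = ds.id "
      + "AND iiv.interactome_version_id = ? ";

    // Step1: participants from "Entrez Gene", matched on primary_accession (second parameter).
    static String idsByEntrezIdSql() {
        return ID_QUERY_BASE + "AND ds.name = 'Entrez Gene' AND pe.primary_accession = ?";
    }

    // Step2: participants from any other source, matched on gene_symbol (second parameter).
    static String idsByGeneSymbolSql() {
        return ID_QUERY_BASE + "AND ds.name <> 'Entrez Gene' AND pe.gene_symbol = ?";
    }

    // Union of the Step1 and Step2 results, sorted and without duplicates.
    static SortedSet<Integer> mergeIds(List<Integer> step1Ids, List<Integer> step2Ids) {
        SortedSet<Integer> merged = new TreeSet<>(step1Ids);
        merged.addAll(step2Ids);
        return merged;
    }

    // Step3: one "?" placeholder per interaction id in the IN list.
    static String rowsSql(int idCount) {
        if (idCount <= 0) {
            throw new IllegalArgumentException("no interaction ids to look up");
        }
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "SELECT pe.primary_accession AS primary_accession, ds.name AS accession_db, "
             + "pe.gene_symbol AS gene_symbol, i.id AS interaction_id, "
             + "i.confidence_value AS confidence_value, it.name AS interaction_type "
             + "FROM physical_entity pe, interaction_participant ip, interaction i, "
             + "interaction_type it, db_source ds "
             + "WHERE pe.id = ip.participant_id AND pe.accession_db = ds.id "
             + "AND ip.interaction_id = i.id AND i.interaction_type = it.id "
             + "AND i.id IN (" + placeholders + ") ORDER BY i.id";
    }
}
```

With JDBC, the parameters would be bound in order with PreparedStatement.setInt and PreparedStatement.setString before executing each statement. If both ID lists are empty, Step3 can be skipped and an empty result returned, which is why rowsSql rejects a count of zero.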
CNKB client:
Receive query result.
If the interaction participant comes from the "Entrez Gene" data source, the application treats primary_accession as the "entrezId" and searches the selected microarray for it.
If the interaction participant comes from the "Uniprot" data source, the application treats primary_accession as the "swissprot id" and searches the selected microarray based on primary_accession and gene name.
If the interaction participant comes from any other data source, the application uses only the gene name (if it exists) to search the selected microarray.
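The three client-side cases above amount to a dispatch on the accession_db column of each returned row. The sketch below shows one way to express that rule in Java. The class, enum and method names are hypothetical, and two details are assumptions not stated in these notes: the data source name is compared case-insensitively, and a row from another source that has no gene name is reported as NONE (nothing to search with).

```java
import java.util.Locale;

// Hypothetical client-side helper; names are illustrative only.
final class ParticipantLookupSketch {
    private ParticipantLookupSketch() {}

    enum Strategy {
        ENTREZ_ID,                    // primary_accession is an Entrez ID
        SWISSPROT_ID_AND_GENE_NAME,   // primary_accession is a SwissProt ID; gene name is also used
        GENE_NAME_ONLY,               // only the gene name is usable
        NONE                          // other source and no gene name available
    }

    // Chooses how to search the selected microarray for one result row.
    static Strategy strategyFor(String accessionDb, String geneSymbol) {
        // Assumption: source names are matched ignoring case and surrounding whitespace.
        String db = accessionDb == null ? "" : accessionDb.trim().toLowerCase(Locale.ROOT);
        if (db.equals("entrez gene")) {
            return Strategy.ENTREZ_ID;
        }
        if (db.equals("uniprot")) {
            return Strategy.SWISSPROT_ID_AND_GENE_NAME;
        }
        boolean hasGeneName = geneSymbol != null && !geneSymbol.trim().isEmpty();
        return hasGeneName ? Strategy.GENE_NAME_ONLY : Strategy.NONE;
    }
}
```

Keeping the decision in one function means a new data source only needs one additional branch, while the microarray search code stays unchanged.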