Synthetic Five-Gene Network Inference (DREAM2, Challenge 3)
This archival page describes the challenge exactly as it was presented to the participants. Go to the main DREAM2 Challenge 3 page to download data, view team rankings, cite this work, etc.
Synopsis
A synthetic-biology network consisting of 5 interacting genes was created and transfected into an in-vivo model organism. The challenge consists of predicting the connectivity of the five-gene network from in-vivo measurements.
Datasets
Two datasets were generated using qPCR (Dataset FiveGeneNet1) and gene expression arrays (Dataset FiveGeneNet2), from which the five-gene network could in principle be inferred independently. The predictions from each of these datasets will be evaluated independently.
Dataset FiveGeneNet1
File name: FiveGene_qPCR.xls
Description: This dataset contains two time series for the 5 genes of the network. In each of the time series, the organism was treated with the same initial perturbation, and cell cultures were collected at different times for qPCR measurements. The two time series correspond to samples taken at regular intervals for 3 hr (time series qPCR_A) and for 5 hr (time series qPCR_B).
Submission Information: Predictions for dataset FiveGeneNet1 can be submitted in one or more of the following categories: UNDIRECTED-UNSIGNED, UNDIRECTED-SIGNED, DIRECTED-UNSIGNED, DIRECTED-SIGNED.
For UNSIGNED submissions:
- Submit network predictions for dataset FiveGeneNet1 in one or both of the following categories: UNDIRECTED-UNSIGNED, DIRECTED-UNSIGNED. Submit a ranked list of gene pairs, ordered according to the confidence you assign to your prediction that a pair is connected, from the most reliable (first row) to the least reliable (last row) prediction. Use the following 3 tab-separated column format as in the example below:
- gene_A \tab gene_B \tab XYZ
- If the category is DIRECTED, the gene in the first column regulates the gene in the second column. (If both gene_A regulates gene_B and gene_B regulates gene_A, then both lines should be included.) If the category is UNDIRECTED, the order of the genes is irrelevant. XYZ is a connectivity score between 0 and 1 that indicates the confidence level you assign to the prediction that a pair is connected. (E.g., XYZ = 1 if the pair is deemed to be connected with highest confidence and XYZ = 0 if the pair is deemed not to interact.) All pairs omitted from the list will be considered to appear randomly ordered at the end of the list with XYZ = 0. Save the file as text, and name it:
- TeamName_Category_FiveGene_qPCR.txt
- where TeamName is the name of the team with which you registered for the challenge, and category can be one of the following types of predictions: UNDIRECTED-UNSIGNED, DIRECTED-UNSIGNED.
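The three-column format and file naming rule above can be sketched in Python as follows. This is only an illustration, not an official tool: the gene names G1 to G3, the team name, and both function names are hypothetical, and merging the two orderings of an undirected pair by keeping the higher score is an assumption rather than something the challenge prescribes.

```python
def format_ranked_list(scores, directed):
    """Return submission text: one 'gene_A<TAB>gene_B<TAB>XYZ' row per pair.

    scores maps (gene_a, gene_b) -> confidence in [0, 1].
    Rows are ordered from the most to the least reliable prediction.
    """
    merged = {}
    for (a, b), s in scores.items():
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score for {a}, {b} must be between 0 and 1")
        # For UNDIRECTED categories the gene order is irrelevant, so both
        # orderings are folded into one row (assumption: keep the higher score).
        key = (a, b) if directed else tuple(sorted((a, b)))
        merged[key] = max(s, merged.get(key, 0.0))
    # sorted() is stable, so pairs with equal scores keep their input order.
    ranked = sorted(merged.items(), key=lambda kv: -kv[1])
    return "".join(f"{a}\t{b}\t{s:g}\n" for (a, b), s in ranked)


def submission_filename(team, category, suffix="FiveGene_qPCR"):
    """Build a name of the form TeamName_Category_FiveGene_qPCR.txt."""
    return f"{team}_{category}_{suffix}.txt"


# Hypothetical example data.
example_scores = {("G1", "G2"): 0.9, ("G2", "G1"): 0.4, ("G3", "G1"): 0.7}
directed_text = format_ranked_list(example_scores, directed=True)
undirected_text = format_ranked_list(example_scores, directed=False)
example_name = submission_filename("MyTeam", "DIRECTED-UNSIGNED")
```

Pairs left out of the list need no special handling here, since omitted pairs are treated as having XYZ = 0.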
For SIGNED submissions:
- Submit one network prediction for excitatory connections and one for inhibitory connections for dataset FiveGeneNet1 in one or both of the following categories: UNDIRECTED-SIGNED, DIRECTED-SIGNED.
- For EXCITATORY connections:
- Submit a ranked list of gene pairs, ordered according to the confidence you assign to your prediction that a pair is connected with an excitatory connection, from the most reliable (first row) to the least reliable (last row) prediction. Use the following 3 tab-separated column format as in the example below:
- gene_A \tab gene_B \tab XYZ
- If the category is DIRECTED, the gene in the first column regulates the gene in the second column. (If both gene_A regulates gene_B and gene_B regulates gene_A, then both lines should be included.) If the category is UNDIRECTED, the order of the genes is irrelevant. XYZ is a connectivity score between 0 and 1 that indicates the confidence level you assign to the prediction that a pair is connected with an excitatory connection. (E.g., XYZ = 1 if one element of the pair is deemed to activate the other element with the highest confidence and XYZ = 0 if the pair is deemed to be disconnected, or deemed to interact with an inhibitory connection.) All pairs omitted from the list will be considered to appear randomly ordered at the end of the list with XYZ = 0. Save the file as text, and name it:
- TeamName_Category_EXCITATORY_FiveGene_qPCR.txt
- where TeamName is the name of the team with which you registered for the challenge, and category can be one of the following types of predictions: UNDIRECTED-SIGNED, DIRECTED-SIGNED.
- For INHIBITORY connections:
- Submit a ranked list of gene pairs, ordered according to the confidence you assign to your prediction that a pair is connected with an inhibitory connection, from the most reliable (first row) to the least reliable (last row) prediction. Use the following 3 tab-separated column format as in the example below:
- gene_A \tab gene_B \tab XYZ
- If the category is DIRECTED, the gene in the first column regulates the gene in the second column. (If both gene_A regulates gene_B and gene_B regulates gene_A, then both lines should be included.) If the category is UNDIRECTED, the order of the genes is irrelevant. XYZ is a connectivity score between 0 and 1 that indicates the confidence level you assign to the prediction that a pair is connected with an inhibitory connection. (E.g., XYZ = 1 if one element of the pair is deemed to inhibit the other element with the highest confidence and XYZ = 0 if the pair is deemed to be disconnected, or deemed to interact with an excitatory connection.) All pairs omitted from the list will be considered to appear randomly ordered at the end of the list with XYZ = 0. Save the file as text, and name it:
- TeamName_Category_INHIBITORY_FiveGene_qPCR.txt
- where TeamName is the name of the team with which you registered for the challenge, and category can be one of the following types of predictions: UNDIRECTED-SIGNED, DIRECTED-SIGNED.
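A team whose inference method produces a single signed weight per pair would need to split it into the two lists described above. The sketch below shows one way to do that, assuming (hypothetically) that weights lie in [-1, 1], that positive means excitatory and negative means inhibitory, and that the magnitude can serve as the XYZ score. None of these conventions comes from the challenge itself, and the gene names and the function name are placeholders.

```python
def split_signed(signed_scores):
    """Split signed weights into excitatory and inhibitory ranked lists.

    signed_scores maps (gene_a, gene_b) -> weight in [-1, 1] (assumed
    convention: > 0 excitatory, < 0 inhibitory, 0 means no prediction).
    Each returned row is (gene_a, gene_b, xyz) with xyz in (0, 1].
    """
    excitatory, inhibitory = [], []
    for (a, b), w in signed_scores.items():
        if w > 0:
            excitatory.append((a, b, w))
        elif w < 0:
            inhibitory.append((a, b, -w))
        # A weight of 0 is simply omitted; omitted pairs count as XYZ = 0.
    # Most reliable prediction first; ties keep their input order.
    excitatory.sort(key=lambda row: -row[2])
    inhibitory.sort(key=lambda row: -row[2])
    return excitatory, inhibitory


# Hypothetical example data.
example_signed = {
    ("G1", "G2"): 0.8,
    ("G2", "G3"): -0.6,
    ("G3", "G1"): 0.3,
    ("G4", "G5"): 0.0,
}
exc_rows, inh_rows = split_signed(example_signed)
```

Each list would then be written to its own EXCITATORY or INHIBITORY file in the three-column format.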
Scoring Metrics: We will score the results using the area under the precision versus recall curve for the whole set of predictions. For the first k predictions (ranked by score, and for predictions with the same score, taken in the order they were submitted in the prediction files), precision is defined as the ratio of correct predictions to k, and recall is the proportion of correct predictions out of all the possible true connections (with the appropriate sign, if the category is SIGNED). Other metrics such as precision at 1%, 10%, 50%, and 80% recall, and the area under the ROC curve will also be evaluated.
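The definitions of precision and recall at the first k predictions can be illustrated with a short Python sketch. The page does not say how the area under the curve is integrated, so the trapezoidal rule used here is an assumption, as are the function names and the example gene pairs. The sketch also ignores omitted pairs, which the challenge treats as randomly ordered at the end of the list.

```python
def precision_recall_points(ranked_pairs, true_pairs):
    """Return one (recall, precision) point for each cutoff k = 1..n.

    ranked_pairs is already in submission order (ties keep file order).
    true_pairs is the set of real connections (assumed non-empty).
    """
    true_set = set(true_pairs)
    points = []
    correct = 0
    for k, pair in enumerate(ranked_pairs, start=1):
        if pair in true_set:
            correct += 1
        points.append((correct / len(true_set), correct / k))
    return points


def area_under_pr(points):
    """Trapezoidal area under the curve (assumed integration scheme).

    The curve is extended back to recall 0 at the first point's precision.
    """
    if not points:
        return 0.0
    area = 0.0
    prev_recall, prev_precision = 0.0, points[0][1]
    for recall, precision in points:
        area += (recall - prev_recall) * (precision + prev_precision) / 2.0
        prev_recall, prev_precision = recall, precision
    return area


# Hypothetical example: four ranked predictions, two true connections.
example_ranked = [("G1", "G2"), ("G1", "G3"), ("G2", "G3"), ("G3", "G4")]
example_truth = {("G1", "G2"), ("G2", "G3")}
example_points = precision_recall_points(example_ranked, example_truth)
example_area = area_under_pr(example_points)
```

For a SIGNED category, the same computation would be applied with the true pairs restricted to connections of the appropriate sign.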
Dataset FiveGeneNet2
File name: FiveGene_chip.xls
Description: This dataset contains two time series corresponding to two different treatments. 588 genes from the original Affymetrix microarray data were selected, which include the 5 genes in the synthetic network plus genes known in the literature to be regulated by some of these 5 genes. The 5-gene network, which is a subnet of the bigger network, is oscillating with the cell cycle.
Submission Information: Predictions for dataset FiveGeneNet2 can be submitted in one or more of the following categories: UNDIRECTED-UNSIGNED, UNDIRECTED-SIGNED, DIRECTED-UNSIGNED, DIRECTED-SIGNED.
For UNSIGNED submissions:
- Submit network predictions for dataset FiveGeneNet2 in one or both of the following categories: UNDIRECTED-UNSIGNED, DIRECTED-UNSIGNED. Submit a ranked list of gene pairs, ordered according to the confidence you assign to your prediction that a pair is connected, from the most reliable (first row) to the least reliable (last row) prediction. Use the following 3 tab-separated column format as in the example below:
- gene_A \tab gene_B \tab XYZ
- If the category is DIRECTED, the gene in the first column regulates the gene in the second column. (If both gene_A regulates gene_B and gene_B regulates gene_A, then both lines should be included.) If the category is UNDIRECTED, the order of the genes is irrelevant. XYZ is a connectivity score between 0 and 1 that indicates the confidence level you assign to the prediction that a pair is connected. (E.g., XYZ = 1 if the pair is deemed to be connected with highest confidence and XYZ = 0 if the pair is deemed not to interact.) All pairs omitted from the list will be considered to appear randomly ordered at the end of the list with XYZ = 0. Save the file as text, and name it:
- TeamName_Category_FiveGene_chip.txt
- where TeamName is the name of the team with which you registered for the challenge, and category can be one of the following types of predictions: UNDIRECTED-UNSIGNED, DIRECTED-UNSIGNED.
For SIGNED submissions:
- Submit one network prediction for excitatory connections and one for inhibitory connections for dataset FiveGeneNet2 in one or both of the following categories: UNDIRECTED-SIGNED, DIRECTED-SIGNED.
- For EXCITATORY connections:
- Submit a ranked list of gene pairs, ordered according to the confidence you assign to your prediction that a pair is connected with an excitatory connection, from the most reliable (first row) to the least reliable (last row) prediction. Use the following 3 tab-separated column format as in the example below:
- gene_A \tab gene_B \tab XYZ
- If the category is DIRECTED, the gene in the first column regulates the gene in the second column. (If both gene_A regulates gene_B and gene_B regulates gene_A, then both lines should be included.) If the category is UNDIRECTED, the order of the genes is irrelevant. XYZ is a connectivity score between 0 and 1 that indicates the confidence level you assign to the prediction that a pair is connected with an excitatory connection. (E.g., XYZ = 1 if one element of the pair is deemed to activate the other element with the highest confidence and XYZ = 0 if the pair is deemed to be disconnected, or deemed to interact with an inhibitory connection.) All pairs omitted from the list will be considered to appear randomly ordered at the end of the list with XYZ = 0. Save the file as text, and name it:
- TeamName_Category_EXCITATORY_FiveGene_chip.txt
- where TeamName is the name of the team with which you registered for the challenge, and category can be one of the following types of predictions: UNDIRECTED-SIGNED, DIRECTED-SIGNED.
- For INHIBITORY connections:
- Submit a ranked list of gene pairs, ordered according to the confidence you assign to your prediction that a pair is connected with an inhibitory connection, from the most reliable (first row) to the least reliable (last row) prediction. Use the following 3 tab-separated column format as in the example below:
- gene_A \tab gene_B \tab XYZ
- If the category is DIRECTED, the gene in the first column regulates the gene in the second column. (If both gene_A regulates gene_B and gene_B regulates gene_A, then both lines should be included.) If the category is UNDIRECTED, the order of the genes is irrelevant. XYZ is a connectivity score between 0 and 1 that indicates the confidence level you assign to the prediction that a pair is connected with an inhibitory connection. (E.g., XYZ = 1 if one element of the pair is deemed to inhibit the other element with the highest confidence and XYZ = 0 if the pair is deemed to be disconnected, or deemed to interact with an excitatory connection.) All pairs omitted from the list will be considered to appear randomly ordered at the end of the list with XYZ = 0. Save the file as text, and name it:
- TeamName_Category_INHIBITORY_FiveGene_chip.txt
- where TeamName is the name of the team with which you registered for the challenge, and category can be one of the following types of predictions: UNDIRECTED-SIGNED, DIRECTED-SIGNED.
Scoring Metrics: Out of all the predicted pairs of genes, we will select those pairs in which both genes belong to the five-gene network. The score will then be computed as explained for Dataset FiveGeneNet1.
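The selection step described above amounts to a simple filter, sketched here in Python. The gene identifiers and the function name are placeholders; the actual names of the five network genes are not given on this page.

```python
def restrict_to_network(predictions, network_genes):
    """Keep only rows whose two genes both belong to the network.

    predictions is a list of (gene_a, gene_b, xyz) rows in ranked order.
    The relative order of the surviving rows is preserved.
    """
    genes = set(network_genes)
    return [(a, b, s) for a, b, s in predictions if a in genes and b in genes]


# Hypothetical example: G1..G5 stand in for the five network genes,
# X7 and X9 for other genes on the array.
example_network = ["G1", "G2", "G3", "G4", "G5"]
example_predictions = [
    ("G1", "X7", 0.95),
    ("G2", "G4", 0.80),
    ("X9", "X7", 0.60),
    ("G5", "G1", 0.40),
]
example_kept = restrict_to_network(example_predictions, example_network)
```

Because the filter preserves order, the remaining rows can be scored directly as a ranked list.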
