DREAM2 In Silico Network Challenge (DREAM2, Challenge 4)


This archival page describes the challenge exactly as it was presented to the participants. Go to the main DREAM2 Challenge 4 page to download data, view team rankings, cite this work, etc.

Synopsis

Three in-silico networks were created and endowed with dynamics that simulate biological interactions. The challenge consists of predicting the connectivity and some of the properties of one or more of these three networks.

Datasets

Three datasets, named InSilico1, InSilico2 and InSilico3, were generated using simulations of biological interactions, as described below. The data from InSilico1 and InSilico2 correspond to mRNA levels of gene networks with qualitatively different topologies. InSilico3 corresponds to a full biochemical network, including metabolites, proteins and mRNA concentrations.




Dataset InSilico1

This dataset was produced from a gene network with 50 genes, in which the rate of mRNA synthesis for each gene is affected by the mRNA levels of other genes.




Dataset InSilico2

The structure of the dataset and the submission information for InSilico2 are similar to those of the InSilico1 dataset. However, the topology of the InSilico2 network is qualitatively different from the topology of the InSilico1 network. The InSilico2 dataset was produced from a gene network with 50 genes, in which the rate of mRNA synthesis for each gene is affected by the mRNA levels of other genes.




Dataset InSilico3

The InSilico3 dataset was produced from a full in-silico biochemical network that includes 24 metabolites, 23 proteins and 20 genes. The network has transcription, translation, some signaling, and metabolism. Variables are named Mxx for metabolites, Pyy for proteins (more specifically, protein forms), and Gzz for mRNA (where xx, yy and zz are numbers between 1 and 24, 23 and 20, respectively).

Notation: Besides binary reactions, dataset InSilico3 contains biochemical reactions that involve 3 or more molecular species. To represent these reactions as ordinary networks, we need a convention that expresses them in terms of binary interactions. If we write the interaction "A represses B" or "A consumes B" as A ---| B, and the interaction "A activates B" as A ==> B, then the three-species reaction x ---> y, catalyzed by enzyme E, is written as the following set of binary interactions:

x ==> y
E ==> y
E ---| x

If the reaction between x and y is reversible, x <--> y, then the interaction

y ==> x

is also present. Similarly, a reaction of the form w + x ---> y + z, catalyzed by enzyme E, should be written as

w ==> y
x ==> y
w ==> z
x ==> z
E ---| w
E ---| x
E ==> y
E ==> z
w ---| x
x ---| w

for the forward reaction. If the reverse reaction is also present in the system, then the following lines should be added:

y ==> w
y ==> x
z ==> w
z ==> x
y ---| z
z ---| y
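
The two worked examples above suggest a general expansion rule, sketched below in Python. This is an illustrative reading of the notation, not code supplied by the challenge organizers: the function name `expand_reaction` and its parameters are hypothetical, and extending the rule to reactions with other numbers of substrates or products is an assumption that simply extrapolates from the two examples.

```python
from itertools import permutations

ACTIVATES = "==>"
REPRESSES = "---|"  # also used for "consumes"


def expand_reaction(substrates, products, enzyme=None, reversible=False):
    """Expand one reaction into (source, symbol, target) binary interactions.

    Hypothetical helper: it generalizes the two examples in the text and
    reproduces them exactly for 1 -> 1 and 2 -> 2 reactions.
    """
    edges = []
    # Forward reaction: every substrate activates every product.
    for s in substrates:
        for p in products:
            edges.append((s, ACTIVATES, p))
    if enzyme is not None:
        # The enzyme consumes each substrate and activates each product.
        for s in substrates:
            edges.append((enzyme, REPRESSES, s))
        for p in products:
            edges.append((enzyme, ACTIVATES, p))
    # Co-substrates consume each other.
    for a, b in permutations(substrates, 2):
        edges.append((a, REPRESSES, b))
    if reversible:
        # Reverse reaction: products activate substrates and consume each other.
        for p in products:
            for s in substrates:
                edges.append((p, ACTIVATES, s))
        for a, b in permutations(products, 2):
            edges.append((a, REPRESSES, b))
    return edges


for source, symbol, target in expand_reaction(["w", "x"], ["y", "z"], enzyme="E"):
    print(source, symbol, target)
```

Note that, following the examples, the reverse reaction adds no further enzyme interactions.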




Submission Information

Predictions for datasets InSilico1, InSilico2 and InSilico3 can be submitted in one or more of the following categories: UNDIRECTED-UNSIGNED, UNDIRECTED-SIGNED, DIRECTED-UNSIGNED, DIRECTED-SIGNED.

For UNSIGNED submissions:

Submit network predictions for the corresponding dataset in one or both of the following categories: UNDIRECTED-UNSIGNED, DIRECTED-UNSIGNED. Submit a ranked list of pairs of molecular species ordered according to the confidence you assign to your prediction that a pair is connected, from the most reliable (first row) to the least reliable (last row) prediction. Use a three-column, tab-separated format, as in the example below:
A \tab B \tab XYZ
where A and B are mRNA species in datasets InSilico1 and InSilico2, but can independently be a metabolite, a protein or an mRNA in dataset InSilico3.


If the category is DIRECTED, the molecular species in the first column causes a change in the molecular species in the second column. (If both A regulates B and B regulates A, then both lines should be included.) If the category is UNDIRECTED, the order of the molecular species is irrelevant. XYZ is a connectivity score between 0 and 1 that indicates the confidence level you assign to the prediction that a pair is connected. (E.g., XYZ = 1 if the pair is deemed to be connected with highest confidence and XYZ = 0 if the pair is deemed not to interact.) All pairs omitted from the list will be considered to appear randomly ordered at the end of the list with XYZ = 0. Save the file as text, and name it:
TeamName_Category_Dataset.txt
where TeamName is the name of the team with which you registered for the challenge, Category can be one of the following types of predictions: UNDIRECTED-UNSIGNED, DIRECTED-UNSIGNED, and Dataset can be InSilico1, InSilico2 or InSilico3.
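
As an illustration of the format described above, the following Python sketch builds the contents and file name of an UNSIGNED submission. The function names, the team name `MyTeam`, and the example scores are all hypothetical; the number formatting of XYZ is also an assumption, since the text only requires a value between 0 and 1.

```python
UNSIGNED_CATEGORIES = {"UNDIRECTED-UNSIGNED", "DIRECTED-UNSIGNED"}
DATASETS = {"InSilico1", "InSilico2", "InSilico3"}


def format_unsigned_submission(scores):
    """Return the submission text for a dict {(A, B): confidence}.

    Rows are ordered from the most to the least reliable prediction.
    sorted() is stable, so pairs with equal scores keep their input order.
    """
    for pair, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {pair} is outside [0, 1]: {score}")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return "".join(f"{a}\t{b}\t{score:.6g}\n" for (a, b), score in ranked)


def submission_filename(team, category, dataset):
    """Build TeamName_Category_Dataset.txt, checking the allowed values."""
    if category not in UNSIGNED_CATEGORIES:
        raise ValueError(f"unknown unsigned category: {category}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    return f"{team}_{category}_{dataset}.txt"


# Hypothetical predictions for a DIRECTED-UNSIGNED submission on InSilico1.
example_scores = {("G1", "G2"): 0.25, ("G3", "G1"): 0.9}
print(submission_filename("MyTeam", "DIRECTED-UNSIGNED", "InSilico1"))
print(format_unsigned_submission(example_scores), end="")
```

Pairs left out of the dictionary are simply not written, which matches the rule that omitted pairs are treated as having XYZ = 0.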

For SIGNED submissions:

Submit one network prediction for excitatory connections and one for inhibitory connections for the corresponding dataset in one or both of the following categories: UNDIRECTED-SIGNED, DIRECTED-SIGNED.
For EXCITATORY connections:
Submit a ranked list of pairs of molecular species, ordered according to the confidence you assign to your prediction that a pair is connected with an excitatory connection, from the most reliable (first row) to the least reliable (last row) prediction. Use a three-column, tab-separated format, as in the example below:
A \tab B \tab XYZ
If the category is DIRECTED, the molecular species in the first column causes a change in the molecular species in the second column. (If both A regulates B and B regulates A, then both lines should be included.) If the category is UNDIRECTED, the order of the molecular species is irrelevant. XYZ is a connectivity score between 0 and 1 that indicates the confidence level you assign to the prediction that a pair is connected with an excitatory connection. (E.g., XYZ = 1 if one element of the pair is deemed to activate the other element with the highest confidence and XYZ = 0 if the pair is deemed to be disconnected, or deemed to interact with an inhibitory connection.) All pairs omitted from the list will be considered to appear randomly ordered at the end of the list with XYZ = 0. Save the file as text, and name it:
TeamName_Category_EXCITATORY_Dataset.txt
where TeamName is the name of the team with which you registered for the challenge, Category can be one of the following types of predictions: UNDIRECTED-SIGNED, DIRECTED-SIGNED, and Dataset can be InSilico1, InSilico2 or InSilico3.
For INHIBITORY connections:
Submit a ranked list of pairs of molecular species, ordered according to the confidence you assign to your prediction that a pair is connected with an inhibitory connection, from the most reliable (first row) to the least reliable (last row) prediction. Use a three-column, tab-separated format, as in the example below:
A \tab B \tab XYZ
If the category is DIRECTED, the molecular species in the first column causes a change in the molecular species in the second column. (If both A regulates B and B regulates A, then both lines should be included.) If the category is UNDIRECTED, the order of the molecular species is irrelevant. XYZ is a connectivity score between 0 and 1 that indicates the confidence level you assign to the prediction that a pair is connected with an inhibitory connection. (E.g., XYZ = 1 if one element of the pair is deemed to inhibit the other element with the highest confidence and XYZ = 0 if the pair is deemed to be disconnected, or deemed to interact with an excitatory connection.) All pairs omitted from the list will be considered to appear randomly ordered at the end of the list with XYZ = 0. Save the file as text, and name it:
TeamName_Category_INHIBITORY_Dataset.txt
where TeamName is the name of the team with which you registered for the challenge, Category can be one of the following types of predictions: UNDIRECTED-SIGNED, DIRECTED-SIGNED, and Dataset can be InSilico1, InSilico2 or InSilico3.
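
A SIGNED submission therefore consists of two files per category and dataset. The Python sketch below shows one possible way to derive both score lists from a single signed estimate. The representation of a prediction as a value in [-1, 1] (positive for excitatory, negative for inhibitory) is an assumption made for illustration only, as are the function names and the team name `MyTeam`.

```python
def split_signed(signed_scores):
    """Split {(A, B): value in [-1, 1]} into excitatory and inhibitory scores.

    Assumed convention (not from the challenge text): a positive value is the
    confidence in an excitatory connection, a negative value the confidence in
    an inhibitory one. A pair absent from a list counts as XYZ = 0 there.
    """
    excitatory = {}
    inhibitory = {}
    for pair, value in signed_scores.items():
        if not -1.0 <= value <= 1.0:
            raise ValueError(f"value for {pair} is outside [-1, 1]: {value}")
        if value > 0:
            excitatory[pair] = value
        elif value < 0:
            inhibitory[pair] = -value
    return excitatory, inhibitory


def signed_filenames(team, category, dataset):
    """Return the EXCITATORY and INHIBITORY file names for one submission."""
    return (
        f"{team}_{category}_EXCITATORY_{dataset}.txt",
        f"{team}_{category}_INHIBITORY_{dataset}.txt",
    )


# Hypothetical signed estimates for three directed pairs.
estimates = {("G1", "G2"): 0.8, ("G2", "G3"): -0.6, ("G3", "G1"): 0.0}
excitatory, inhibitory = split_signed(estimates)
print(signed_filenames("MyTeam", "DIRECTED-SIGNED", "InSilico2"))
print(excitatory, inhibitory)
```

Each of the two dictionaries can then be ranked and written in the same three-column format used for UNSIGNED submissions.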

Scoring Metrics

We will score the results using the area under the precision versus recall curve for the whole set of predictions. For the first k predictions (ranked by score, with predictions of equal score taken in the order in which they appear in the prediction file), precision is defined as the fraction of those k predictions that are correct, and recall is the fraction of all true connections (with the appropriate sign, if the category is SIGNED) that are recovered among them. Other metrics, such as precision at 1%, 10%, 50%, and 80% recall and the area under the ROC curve, will also be evaluated.
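
The definitions of precision and recall above can be sketched in Python as follows. This is not the organizers' scoring code: the function names are hypothetical, the area is computed with a simple step-wise sum (the text does not say how the curve is integrated), and the random placement of omitted pairs at the end of the list is not modeled. Pairs are compared as ordered tuples, which suits the DIRECTED categories; an UNDIRECTED evaluation would first need to normalize the order within each pair.

```python
def precision_recall_curve(ranked_pairs, true_pairs):
    """Return precision and recall after each of the first k predictions.

    ranked_pairs: list of (A, B) tuples, most reliable prediction first.
    true_pairs:   set of (A, B) tuples that are real connections.
    """
    total_true = len(true_pairs)
    if total_true == 0:
        raise ValueError("the gold standard contains no connections")
    precision, recall = [], []
    correct = 0
    for k, pair in enumerate(ranked_pairs, start=1):
        if pair in true_pairs:
            correct += 1
        precision.append(correct / k)           # correct predictions among first k
        recall.append(correct / total_true)     # true connections recovered so far
    return precision, recall


def area_under_pr(precision, recall):
    """Step-wise area: sum of precision times the increase in recall."""
    area = 0.0
    previous_recall = 0.0
    for p, r in zip(precision, recall):
        area += p * (r - previous_recall)
        previous_recall = r
    return area


# Hypothetical ranked predictions and gold standard.
ranked = [("G1", "G2"), ("G2", "G3"), ("G3", "G1")]
truth = {("G1", "G2"), ("G3", "G1")}
precision, recall = precision_recall_curve(ranked, truth)
print(precision, recall, area_under_pr(precision, recall))
```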

Retrieved from "http://wiki.c2b2.columbia.edu/dream/index.php/D2c4full"

This page has been accessed 2,834 times. This page was last modified 02:24, 16 June 2009.
