The DREAM Project: Discussion
DREAM 6 Discussion link
Please post any comments on the new site (http://www.the-dream-project.org/forum/29). Thanks.
DREAM 6 Challenge
Hi, Has the DREAM 6 Challenge been posted? This website indicates that the challenge would be posted on July 1.
RE: DREAM 6 Challenge
Working on it as we speak. Hopefully tonight.
Annotations for DREAM5 challenge 4 network 3 microarray conditions
I have been looking back at the DREAM5 challenge 4 dataset and am very interested in seeing the annotations for the microarray conditions. For example, for the E. coli and yeast networks (networks 3 and 4) it would be very interesting to see what experimental conditions (strains, perturbations, etc.) were used. Thanks in advance!
RE: Annotations for DREAM5 challenge 4 network 3 microarray conditions
Hi Alex, I'll look into it.
Ensemble of results
Many of the people at my institute are very interested in the ensemble of all teams' predictions as presented during the DREAM talks. Are there any plans to make these results public?
RE: Ensemble of results
Sam: Yes. We are planning to release the team predictions at the latest by the end of March. Stay tuned!
'Probe Evaluation Pearson' and '8mers Evaluation Correlation', challenge 2
Challenge 2. Daniel James, Wellcome Trust Sanger Institute, Dec 2, 2010, 12:09 pm. On the results page for challenge 2 the columns for 'Probe Evaluation Pearson' and '8mers Evaluation Correlation' appear to be the same for all teams. Should this be the case? Please look at the teams with aggregate ranks of 10 and 2: the numbers are slightly different. For the rest of the teams the values are the same.
DREAM going forward
There was a great discussion session yesterday. I did not get to provide my comments because I had to leave early. Today I talked to Boby Prill, who encouraged me to voice them on this forum. So here I am. As an experimentalist, what motivates me to come to the RECOMB meetings two years in a row is that I would like to see real community efforts such as DREAM eventually translate into something applicable to the biology world. DREAM has been in existence for five years and will be going forward at least until its mission is accomplished. By real-world application, I mean not only research and development in computational methodology, but also guidelines or standard protocols that can be easily adopted or followed by non-specialists in the computational/systems biology field. Would it be good or bad for DREAM and the computational/systems biology community at large to devote some effort to making guidelines/protocols for at least some common biological applications? We know similar efforts such as MIAME (Minimum Information About a Microarray Experiment) have been successful historically. We might succeed as well if we can work together on this one. One question you might have is timing. Is it too early to do this now? Shall we wait a couple more years? According to my own observation, there is an urgent demand in the biological world for such guidelines and/or protocols. Meanwhile, computational methods that have shown consistently good performance are already out there and have been explored nearly to the point of exhaustion. So, my personal view is that this is the right time. I wrote the above while attending the 2010 Joint RECOMB. And I look forward to active debates as we move to DREAM 6 and beyond.
RE: DREAM going forward
I agree. As a start I think that the DREAM data should be internationally considered as benchmarks, so that any new paper describing a novel method for reverse engineering should be expected to have been demonstrated on some of the DREAM data. Whenever I review a paper on a novel reverse engineering method that has not been demonstrated on a DREAM dataset, I will raise this as a major issue. Maybe some of the editors of relevant journals could be contacted and made aware of the existence of these benchmarks so that they can require authors to comply with these standards, just like editors require authors to provide their data in MIAME-compliant form.
Challenge 4 evaluation script
Would it be possible to post the evaluation script used to assess network inference methods for challenge 4? We would like to be able to evaluate our method and evaluations thereof. Thanks!
RE: Challenge 4 evaluation script
Alex, the evaluation scripts for Challenge 4 are already there for your use (go to the ranking page). Thanks. Gustavo
Challenge 4
I think that it would be interesting to also see, if it's not much trouble, the results after isolating (a) purely biological E. coli and yeast and (b) purely in silico as in the last two years. Specifically, what would have been the ranking of the "in silico winner" if only the two biological databases existed in the challenge? And what would have been the ranking of the "biological winner" if only the in silico database existed? Based on the numbers in the table, I would think that the winners of one case would not do very well in the other. Therefore, I think it would also be interesting if the future validation experiments in S. aureus included interaction predictions that were among the top in one case but not so good in the other, rather than just overall top predictions.
RE: Challenge 4
I think it is a very interesting question that will help to dissect the usefulness of in-silico networks to inform real-life systems. I'll see what can be done about it.
Challenge 2 - bonus round
In the bonus round, we have the task to predict the names of the TFs using "official Mouse Genome Informatics (MGI) website symbols". However, we did not find a list of TFs and corresponding PWMs at this website. Alternatively, using TRANSFAC, ... as done for instance in the AMADEUS paper does not return symbols from the "official Mouse Genome Informatics (MGI) website". Could anybody please give a hint? Thanks a lot.
RE: Challenge 2 - bonus round
Hi Jens, The official symbols on MGI are essentially just gene names. We chose MGI symbols arbitrarily to deal with the fact that most genes have multiple names. The bonus round is intended to be challenging, and the majority of TFs in the dataset do not have known sequence specificities. One hint I can give you is that if one of your TF predictions had a sequence that looks something like "TGGTCA", you might be inclined to guess that it is a nuclear receptor, and then possibly narrow it down to a short list of candidates based on known specificities from other species. Matt
Challenge 1, bonus round
Are there any requirements as to the peptide lengths in the bonus round, challenge 1?
RE: Challenge 1, bonus round
As stated in the Challenge description: Each column should look like S \tab XYZ where S is a peptide sequence of length 15 and XYZ can be any of the letters H, M and L. There are other restrictions imposed to discourage a conservative strategy of generating novel peptides in this bonus round. Please see the Bonus Round section of the challenge description.
Clarification for Challenge 1
In response to a question posted by Chen Yanover: Two non-standard letters show up in the sequences of the data in Challenge 1: X and Z. X means an unknown amino acid, and Z stands for citrulline. (In a previous posting we incorrectly stated that X stands for citrulline.) Good luck! Gustavo
Dream5, challenge 2, bonus round
Many transcription factors have similar binding sites, and even identical ones. It is theoretically impossible in some cases to identify which TF a site belongs to. For example, homologous genes usually have the same binding domain; another example is a group of proteins that operate as a single complex. Obviously, in both these cases there isn't a single TF that recognizes the binding-site motif, but rather several such TFs. My question is: In the bonus round, do you accept several TF names (i.e., if, say, E2f1 and E2f2 have the same binding motif, are both considered correct answers)? Also, should we assume that the TFs are unique (i.e., each TF is represented at most once in the data)? Sincerely, Yaron
RE: Dream5, challenge 2, bonus round
Thanks for the reply. May I suggest a different scoring method? If the name of the TF is correct, and only its serial number is different (e.g., E2f2 vs. E2f1), then a score of, say, 0.9 should be given. In addition, 3 TF names should be allowed in each entry, and if one of them is correct, then a full point should be scored; likewise, if one of them is the correct name but with the wrong serial number, then the score should be 0.9. For example, ATF/CREB is a family of TFs that bind to the same motif. ATF itself is a class of TFs, consisting of multiple genes (ATF1, ATF2, ...); likewise, CREB is a class of genes with various subtypes. I don't think it's theoretically possible to identify the specific gene based on the binding motif, since they all bind to the same sequences. So, what should the entry contain in this case? I think ATF and/or CREB are correct answers! Therefore, allowing the contest participants to provide several TFs, and giving 0.9 for a correct TF name with a wrong serial number, is a good compromise. In the above example, if the participant's entry contains "ATF1, ATF2, CREB", and the real answer is ATF2, then a full point would be awarded; whereas if the answer is ATF3 or CREB3, then 0.9 would be given; and if the correct answer is, say, E2f2, then obviously no points would be awarded.
RE: Dream5, challenge 2, bonus round
Yaron, thanks for the good points you raise. The problem with allowing many transcription factors as a possible solution is that if you added all known TFs, one of them would be correct. We expect that you will pick one TF (after all, there was one TF that was used in the assay). In view of this, if your entry contains k TFs, it will be scored as 1/k if one of the TFs was the actual one, or 0 if none of them was the actual one. We hope you agree with this solution to your question.
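The 1/k rule described in this reply can be sketched in a few lines of Python. This is only an illustration, not the organizers' evaluation code: the function name `score_entry` is hypothetical, and the case-insensitive comparison and removal of duplicate names are assumptions that the reply does not specify.

```python
def score_entry(predicted_tfs, actual_tf):
    """Score one bonus-round entry under the 1/k rule.

    Returns 1/k if the actual TF is among the k distinct submitted names,
    and 0 otherwise. Case-insensitive matching and de-duplication are
    assumptions made for this sketch.
    """
    names = {name.strip().lower() for name in predicted_tfs if name.strip()}
    if not names:
        return 0.0
    if actual_tf.strip().lower() in names:
        return 1.0 / len(names)
    return 0.0


if __name__ == "__main__":
    # Three names submitted, one of them correct: the entry scores 1/3.
    print(score_entry(["ATF1", "ATF2", "CREB"], "ATF2"))
    # A single correct name scores a full point.
    print(score_entry(["E2f1"], "E2f1"))
```

Under this rule, listing more candidates never increases the expected score unless the extra names are at least as likely as the first one, which is the incentive to pick a single TF.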
DREAM5: Challenge 3, SysGenB
Thanks for posting the 941 genotype probe IDs. I am not a soybean expert, but I cannot find those markers anywhere (Google, NetAffx, NCBI, SoyBase). For every genetic marker, I can find a probeset that has a matching substring. This leads me to believe that each genotyping marker is associated with a probeset (and hence a gene). Is this right, or is there a more formal source of annotation?
RE: DREAM5: Challenge 3, SysGenB
Ok, it is correct that each (or most) genotype marker is associated with a gene. If you look here you can find out which gene: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL4592 Madhu, sorry, this is the best I can do: I don't have the precise locations of the markers.
RE: DREAM5: Challenge 3, SysGenB
We had the same problem. To us it seemed like these are probes from the Affymetrix soybean genome GeneChip. Using that we got some annotations but not what we were looking for. We would have liked to have the locations of these SFPs. Any help?
RE: DREAM5: Challenge 3, SysGenB
Sorry for the delay, Hugues. The answer is yes, because all (or most of) the markers are SFPs (Single Feature Polymorphisms), occurring inside genes and identified from the Affy chip data. I will get back to you soon about a more formal source of annotation.
DREAM5 Challenge4 Network2
Hello, everyone! While working on Challenge4 Network2, I came across a question that I hope to discuss with you. The "Chip Features" description of Challenge4 says: "Microarrays that are from the same experiment were done in the exact same experimental setting (same experimenter, strain, growth medium, growth phase, etc)." Does this imply that, if we quantize the expression data, different rows that came from the same experimental setting should have the same number of quantization levels and the same values? Thanks for your time~~ :)
RE: DREAM5 Challenge4 Network2
Hi, I think it probably depends on how you do the quantization and how you use the data for inference. I haven't explored quantization of the data, so unfortunately I don't have any advice for you. Maybe someone else does? Best, Daniel
DREAM5, Challenge 2
Hi, it seems that the data has not been normalized. Hence, we followed the procedure described in the RankMotif++ paper. After normalization, we tried to identify positives (again following RankMotif++). Unfortunately, we did not get any positives for some data sets. One extreme example is the GMEB2 array ME. What should we do? How are those cases handled during evaluation?
RE: DREAM5, Challenge 2
The RankMotif method identifies its “positive” set of probes as those with intensities greater than the median across all intensities + 4 standard deviations. As you mention, this value is > 66,000 for the GMEB2 ME array, which is greater than the intensity of any single probe (and hence there are no positive probes). I would argue again that this is simply a problem with the RankMotif method itself. For example, 4 standard deviations is clearly an arbitrary cutoff, and a better method might attempt to optimize this value for a given array. There is indeed useful information contained on this array: for example, if you take a naive approach and calculate the median score of all 8-mers across all the probes, align the top 100 8-mers, and build a position weight matrix, you get a motif with several informative positions. Our evaluations will only be based upon comparing your submitted probe intensity predictions to the actual values, so there is no need for a corresponding “positive” set (although a subset of our evaluations will only consider a subset of the brightest spots).
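The two calculations mentioned in this reply, the "median + 4 standard deviations" cutoff and the naive median score per 8-mer, can be sketched as follows. This is a minimal illustration and not the RankMotif++ implementation: the function names are hypothetical, the population standard deviation is an assumption (the reply does not say which estimator is used), and reverse complements are ignored for brevity.

```python
import statistics
from collections import defaultdict


def positive_probes(intensities, n_sd=4.0):
    """Return (cutoff, indices of probes above the cutoff).

    The cutoff is median + n_sd * standard deviation. Using the population
    standard deviation is an assumption of this sketch.
    """
    cutoff = statistics.median(intensities) + n_sd * statistics.pstdev(intensities)
    return cutoff, [i for i, x in enumerate(intensities) if x > cutoff]


def median_kmer_scores(probes, intensities, k=8):
    """Median intensity of all probes containing each k-mer.

    Each probe contributes at most once per k-mer; reverse complements are
    not merged in this simplified version.
    """
    by_kmer = defaultdict(list)
    for seq, value in zip(probes, intensities):
        for kmer in {seq[i:i + k] for i in range(len(seq) - k + 1)}:
            by_kmer[kmer].append(value)
    return {kmer: statistics.median(values) for kmer, values in by_kmer.items()}


if __name__ == "__main__":
    # One very bright spot inflates the standard deviation so much that the
    # cutoff ends up above every probe, leaving an empty positive set.
    toy = [100] * 9 + [1000]
    cutoff, positives = positive_probes(toy)
    print(cutoff, positives)
```

The toy array shows how the empty positive set described above can arise: a heavy-tailed intensity distribution pushes the cutoff beyond the brightest probe. Ranking the output of `median_kmer_scores` would be the starting point for the naive top-100 8-mer alignment.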
DREAM5, Challenge 2
Hi, could you provide the predictions from the 8-Mer (cf. RankMotif++ paper) as a reference for the 20 training data sets? This would be great.
RE: DREAM5, Challenge 2
Hello, unfortunately we do not have any results from RankMotif++ calculated for this challenge. The major motivation behind creating this challenge is that we do not believe that the solution implemented by RankMotif (or any other available method) is optimal. The purpose of the training data is to provide a “practice” set for you to tune your own 8-mer prediction algorithms. Results from RankMotif might be incorporated into our evaluations for the purpose of comparison, but no part of the RankMotif algorithm will be used for evaluation purposes.
D5C4: Time series with genetic perturbations
Please note the following difference in how time series with genetic perturbations were simulated in-silico (network 1) and performed in most real experiments (networks 2-4). For the simulated time series, genetic perturbations (deletions and overexpressions) were applied at t=0, i.e., the first time point (t=0) still corresponds to the unperturbed state. In contrast, in most real experiments the genetic perturbations have already been done beforehand using genetic manipulations. In this case, the gene would already be deleted/overexpressed at t=0 (in these experiments, there is typically in addition also a drug perturbation, the goal being to show how the genetically manipulated organism reacts to the drug). Thanks to Alberto de la Fuente for bringing this point to our attention.
D5C4 - questions about the data
I have a few questions about the time series in the real data: (1) For a time series where no perturbation/gene deletion/etc is given, should we assume that some "multifactorial" perturbation has taken place, like in the in silico data? (2) For the time series with a perturbation - some of the series have the perturbation already marked at time t=0; others say "NA" at t=0 and then specify a perturbation at the next time. Is this a difference in notation, or does it signify that the perturbation is sometimes applied before time t=0, and sometimes immediately after time t=0? (Example: Experiments 21 and 22 in Network 2). (3) Can we assume that the time units of the different experiments are all on the same scale? Thank you!
RE: D5C4 - questions about the data
(1) The types of experiments in the compendia are very diverse, because a large part was compiled from the literature and was done by different experimenters in different labs. Time series without (explicit) perturbation are often a control for an experiment with a drug or genetic perturbation (e.g., experiment 47 of network 3 has two time series, one with and one without perturbation P13). Even though no perturbation was applied explicitly, gene expression may vary over time for reasons specific to the experimental setting (e.g., adaptation to the growth medium, nutrient depletion, waste accumulation, etc). Assuming some "multifactorial" perturbation seems reasonable to me, but there may be other approaches and the best model will certainly depend on the specific inference method used. (2) The two notations are equivalent. In both cases, the measurement at t=0 is still the unperturbed state. (3) Time units of different experiments all have the same scale.
DREAM5 Challenge 4: The first network is the in-silico one
For your information, we have added the following sentence to the description of the DREAM5 network inference challenge: "Network 1 is the in-silico network, Networks 2-4 correspond to the real microarray compendia obtained from microorganisms."
Phenotype names
The training data does not include marker names, probe IDs or phenotype names. Is this intended? How will we know which phenotype is which for our predictions table? Not having probe or transcript IDs means that we cannot use functional information as prior knowledge. No marker name, no QTL mapping. Is this the idea or was it an unintended omission? Thanks in advance
RE: Phenotype names
We'll get you the answer soon. Thanks!
RE: Phenotype names
I forgot to mention that I am referring to Challenge 3 B.
DREAM 5 Challenge 2
In both the training and prediction data sets, there is a single probe sequence that seems out of place: CCCGCAGTCAAGCAAAAAAAAATAAAAGTCACCTGTGTGAAATTGTTATCCGCTCTTTTT It contains the linker sequence, but that sequence is followed by four additional T's. Can we assume that this is incorrect, and if so can we get the correct sequence for this particular probe?
RE: DREAM 5 Challenge 2
Hello Michael, There are three probes that have this "odd" appearance. The others are CTGTACAGCGTTCCCTGTGTGAAATTGTTATCCGCTCTTTTTTTTTTTTTTTTTTTTTTT and CTTGCGACATGTCCCTGTGTGAAATTGTTATCCGCTCTTTTTTTTTTTTTTTTTTTTTTT. They take on this form because they represent the end of the de Bruijn sequence used to design the probes. In the example you asked about, the de Bruijn sequence ends with the sequence CCCGCAGTCAAGCAAAAAAAAATAAAAGTCA, which is followed by the primer sequence CCTGTGTGAAATTGTTATCCGCTCT and then padded with Ts at the end (TTTTT). You will notice that these probes are flagged as "bad" for all arrays; we recommend just ignoring them. Matt
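The probe layout described in this reply (variable de Bruijn part, then the primer, then T padding) can be checked with a short script. This is a sketch only: the primer string is the one quoted in the reply, while the function names `split_probe` and `is_t_padded` are hypothetical, and the rule "everything after the primer consists of Ts" is an assumption used to spot probes of this kind.

```python
PRIMER = "CCTGTGTGAAATTGTTATCCGCTCT"  # primer sequence quoted in the reply above


def split_probe(probe, primer=PRIMER):
    """Split a probe into (variable part, primer, trailing part).

    Uses the first occurrence of the primer; raises ValueError if the
    primer is not present.
    """
    start = probe.find(primer)
    if start == -1:
        raise ValueError("primer not found in probe")
    return probe[:start], primer, probe[start + len(primer):]


def is_t_padded(probe, primer=PRIMER):
    """True if the probe has a non-empty tail after the primer made only of Ts."""
    _, _, tail = split_probe(probe, primer)
    return len(tail) > 0 and set(tail) == {"T"}


if __name__ == "__main__":
    probe = "CTGTACAGCGTTC" + PRIMER + "T" * 22
    variable, primer, tail = split_probe(probe)
    print(variable, len(tail), is_t_padded(probe))
```

Since these probes are flagged as "bad" on all arrays, a check like this is only useful for confirming which rows to ignore, not for recovering any signal from them.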
Challenge 4
"Thus, the lists probably contain incorrectly annotated/predicted TFs that are in truth not regulatory proteins. The scoring, however, will be done on the known transcription factors". Consider a situation where a, b, c are in the list of potential TFs, a and b are in {known}, and c is in {incorrect}. a is a TF regulating b. So a prediction c->b is always evaluated as incorrect even though it might be due to a PPI c->a? This won't be a problem in network 4, as there can never be effects originating from c, right?
RE: Challenge 4
Yes, c->b would be evaluated as a "false positive". More generally, for in vivo networks the "negative" set (absent interactions) of our gold standard is not perfect: there are certainly many unknown transcriptional interactions and other types of regulatory interactions that are incorrectly labeled as negatives. Thus, "false positives" according to our gold standards are not necessarily incorrect predictions: they may correspond to newly discovered regulatory interactions, for instance. The assumption is that the gold standards are comprehensive enough to give a meaningful ranking of the inference methods, despite the imperfect negative sets. As you say, for the in silico network the problem doesn't arise, because there are no incorrectly labeled edges in the gold standard. (Protein-protein interactions are not explicitly modeled; only transcriptional regulatory interactions are part of the in silico network.)
DREAM 1 challenge
To clarify, can any external data be used to predict peptide recognition and binding? For example, external databases of known recognized peptides, etc....
RE: DREAM 1 challenge
Daniel, as stated in the challenge description: "Any publicly accessible information available for studying protein-protein-interactions as well as any approach enabling the determination of rule sets for predicting peptide-antibody affinities might be applied." So please use whatever is available to you.
Updates to DREAM 5 Challenge 2
1) Microarray flags (i.e. bad spots - dust, scratches) were omitted from the originally posted data (files DREAM5_PBM_Data_TrainingSet.txt and DREAM5_PBM_Data_Needed_For_Predictions.txt). Data downloaded prior to June 8, 2010 did not contain a "Flag" column. Data files with the flag column were posted on June 8, 2010. Typically much less than 1% of spots are flagged, but since these spots are suspect and may have aberrantly high or low intensity, they should be masked in training data. Flagged spots will also not be considered in the evaluations. 2) Among the supplementary files are the microarray grid layout maps (files in DREAM5_PBM_Data_GridFiles.zip). Minor changes were incorporated to the grid layout maps after June 8, 2010. [Either one or two lines in each file (out of ~40,000) were slightly changed with respect to the data posted prior to June 8, so that the probe sequences match up better with the sequences in the other files.] Files downloaded prior to June 8, 2010 do not contain these mild updates in the layout maps. Please forward this e-mail to your other team participants.
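Masking flagged spots before training, as recommended in this update, might look like the sketch below. Only the existence of a "Flag" column comes from the announcement; the tab-delimited layout, the other column names (`Sequence`, `Signal`) and the convention that a flag value of 1 marks a bad spot are assumptions made for illustration, so check them against the actual files.

```python
import csv
import io


def load_unflagged(handle, flag_column="Flag", bad_value="1"):
    """Read a tab-delimited table and keep only rows that are not flagged.

    The delimiter and the meaning of bad_value are assumptions of this
    sketch, not documented properties of the DREAM5 files.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if row[flag_column].strip() != bad_value]


if __name__ == "__main__":
    # Tiny in-memory stand-in for a data file with hypothetical columns.
    text = "Sequence\tSignal\tFlag\nACGTACGT\t1500\t0\nTTTTTTTT\t12\t1\n"
    rows = load_unflagged(io.StringIO(text))
    print([row["Sequence"] for row in rows])
```

For a real file, the same function could be called on `open(path, newline="")`; filtering at load time keeps suspect intensities out of every later step.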
dream5 challenge #1
In the dream5 challenge #1 data, 'X' is present in the sequence once in the training set and twice in the test set. Does this indicate that the amino acid at this position is unknown?
RE: dream5 challenge #1
How about the more frequent Z?
RE: dream5 challenge #1
Aalt: the X in the peptides stands for an additional amino acid that usually is not encoded in the human genome. X stands for citrulline. Thanks for the question, and thanks to Hans-Juergen Thiesen for the answer. Gustavo
Survey Results
For the peptide recognition challenge, the survey said that people participating in the survey would get to see the results. Is there a timeline to release the results to survey participants? Thanks!
GeneNetWeaver (GNW) version 2.0
Dear DREAM challenge participants, We would like to announce that GeneNetWeaver (GNW) version 2.0, the tool that we used to generate the DREAM4 in-silico challenges, has been released open-source on: http://gnw.sourceforge.net GNW can be used, for example, to generate additional datasets for the networks of the challenge, or entirely new benchmarks similar to those of the challenge. We didn't have time to implement all requested features in this version. We plan to release a version 2.1 in February that will give the user more flexibility in defining different types of perturbations to be applied to the networks. We are also working on a version 3.0 that implements the network-motif analysis that we presented at the DREAM4 conference. Best, Daniel Marbach
Self Interaction
Hi, how should we treat the self-interaction of each gene (In Silico Network Challenge)? (G1 G1 = 0 or G1 G1 = 1?) I would be very grateful for your help.
RE: Self Interaction
Xinyi, we do not evaluate self-interactions, so in the actual prediction we do not consider G1 G1 or GN GN. (It is neither 0 nor 1, as it is not evaluated.) Hope this helps. Gustavo
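Because self-interactions are not evaluated, one simple option is to drop them from a prediction list before submitting. The sketch below assumes predictions are held as (source, target, confidence) tuples; that in-memory representation and the function name `drop_self_edges` are hypothetical.

```python
def drop_self_edges(edges):
    """Remove edges whose source and target are the same gene."""
    return [(src, dst, conf) for src, dst, conf in edges if src != dst]


if __name__ == "__main__":
    predictions = [("G1", "G2", 0.9), ("G1", "G1", 0.8), ("G3", "G1", 0.4)]
    print(drop_self_edges(predictions))
```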
Evaluation Scripts
When will the evaluation scripts be posted? I am most interested in the details of the p-value calculation for DREAM4 Challenge 1. If the scripts themselves aren't ready, then would it be possible to have those details posted in this discussion. Thanks.
RE: Evaluation Scripts
Colin, check for the evaluation scripts in http://wiki.c2b2.columbia.edu/dream/data/scripts/DREAM4/ (or go to team ranking --> Click here for the results from DREAM4 --> Download Evaluation Scripts). The way the p-value was computed was different for the different subchallenges. For the kinase and PDZ subchallenges we used a null model corresponding to a matrix with random entries whose columns normalize to 1. In the SH3 challenge, we used a similar scheme, but with matrices with different numbers of columns, as the absence of an anchor point made it necessary to use a sliding window. Please write to me separately, and I will send you the presentation that we used in the conference to explain these null models. Best regards, Gustavo
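A null matrix of the kind described here, random entries with each column normalized to sum to 1, can be generated as in the sketch below. This is not the organizers' script: drawing the entries uniformly from [0, 1) is an assumption (the reply only says "random entries"), and the function name and the 20-row shape used in the example are illustrative.

```python
import random


def random_column_normalized_matrix(n_rows, n_cols, rng=None):
    """Return an n_rows x n_cols matrix (list of rows) whose columns sum to 1.

    Entries are drawn uniformly at random and then divided by their column
    total; the uniform distribution is an assumption of this sketch.
    """
    rng = rng or random.Random()
    columns = []
    for _ in range(n_cols):
        raw = [rng.random() for _ in range(n_rows)]
        total = sum(raw)
        columns.append([x / total for x in raw])
    # Transpose the list of columns into a list of rows.
    return [[columns[c][r] for c in range(n_cols)] for r in range(n_rows)]


if __name__ == "__main__":
    # Example: 20 rows (one per amino acid) and 10 positions.
    null_matrix = random_column_normalized_matrix(20, 10, random.Random(0))
    print(sum(row[0] for row in null_matrix))
```

Scoring many such matrices with the same metric used for submissions would give an empirical null distribution from which a p-value can be read off.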
RE: DREAM4 Challenge #1
Brian, We'll solve those problems tomorrow. Gustavo
RE: DREAM4 Challenge #1
I am still unable to access the gold standards. When attempting to do so, I arrive at a page ( http://wiki.c2b2.columbia.edu/dream/data/gold-standards/DREAM4/ ) that states I need to register to access the DREAM2 or DREAM3 data, but there is no option to download the DREAM4 gold standards. Thanks, Brian
RE: DREAM4 Challenge #1
Gold standards will be uploaded later today.
Results for bonus challenges for in-silico networks(dual knockouts)
When attempting to view results for this challenge, I get a message: No Files to Display.
RE: Results for bonus challenges for in-silico networks(dual knockouts)
Brian, As the dual knockout experiments were optional, we didn't analyze those yet. But we will soon, hopefully by the end of the week. Also, the gold standards will be there soon as well. Gustavo
DREAM results are ready
Go to http://wiki.c2b2.columbia.edu/dream/results/DREAM4/ and sign in with your team name to see where your team ranked, or just explore the results anonymously. Have fun.
RE: where are the results posted?
We'll post very shortly.
Dream participation
I successfully uploaded the insilico size 10 data. Is there additional confirmation that I indeed participate with my submission? Thanks, Robert
RE: Dream participation
Robert, Your five size 10 network predictions, your five size 10 dual knockouts and the survey have all been successfully uploaded. Thanks, Gustavo
Bonus round submission problem
When attempting to submit bonus round (double knockout) predictions for a 100 gene network I get the following error message: The entries in the third column, indicating your confidence level in a prediction, must be numeric. Entry 0.5079057 detected. The 10 gene network bonus round predictions went through fine.
RE: Bonus round submission problem
If anyone else is encountering this problem, the solution is to make sure none of your numbers are in exponential format (e.g., 0.1e-07) in order to pass the filter.
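In Python, very small floats are printed in exponential notation by default, so fixed-point formatting has to be requested explicitly when writing a prediction file. The helper below is a minimal sketch; the function name and the choice of 10 decimal places are arbitrary assumptions, and the exact precision accepted by the upload filter is not stated in this thread.

```python
def format_confidence(value, decimals=10):
    """Format a number in plain fixed-point notation (never exponential)."""
    return f"{value:.{decimals}f}"


if __name__ == "__main__":
    small = 1e-08
    print(str(small))                # default conversion uses an exponent
    print(format_confidence(small))  # fixed-point text with no exponent
```

Values smaller than the chosen precision are written as zero, so pick `decimals` large enough to preserve the ranking of the predictions.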
problem uploading - challenge 2
Hello, I'm trying to upload my predictions for the second challenge. I succeed with the "InSilico Size10". For the "InSilico Size100" however, I get an error message that tells me that I should see the cause of the error in red font at the top of the page but there is no such message. For the "Multifactorial" subchallenge, I get a "proxy error" but then the file seems to be uploaded. Thanks for your help, Vân Anh
RE: problem uploading - challenge 2
Thanks for your message. You uploaded only 3 of the predictions for the multifactorial subchallenge. It seems that they uploaded in spite of the error message. Please upload the remaining two. We will check the cause of your difficulties in uploading. Thanks! Gustavo
Problems in uploading DREAM4_Funkoverload_InSilico_Size10_1_dualknockouts
It seems that the upload tool requires data between 0 and 1 in the predictions for the bonus round in Challenge 2, since whenever I try to submit the predictions for the double knockouts, it does not accept my file. I don't understand the upper limit, since in my interpretation it could very well be that the maximum is higher than 1 if the network is perturbed (and I also get some values higher than 1). Or do I misinterpret the challenge?
RE: Problems in uploading DREAM4_Funkoverload_InSilico_Size10_1_dualknockouts
Nils, sorry for the confusion. In the challenge description we explain that: "In all cases, the data corresponds to noisy measurements of mRNA levels, which have been normalized such that the maximum normalized gene expression value in the datasets of a given network is one." Please normalize by the maximum of your gene expression predictions to have values between 0 and 1. Let me know if you have any further concerns.
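The normalization requested here amounts to dividing every predicted value for a network by the largest predicted value. The sketch below assumes the predictions for one network are held as a list of rows of non-negative numbers; the function name `normalize_by_max` and that data layout are illustrative assumptions.

```python
def normalize_by_max(rows):
    """Scale all values so that the largest one becomes exactly 1.

    Assumes non-negative expression predictions; raises ValueError if the
    maximum is not positive, since division would then be meaningless.
    """
    peak = max(value for row in rows for value in row)
    if peak <= 0:
        raise ValueError("maximum prediction must be positive")
    return [[value / peak for value in row] for row in rows]


if __name__ == "__main__":
    predictions = [[0.2, 1.5], [0.75, 0.0]]
    print(normalize_by_max(predictions))
```

Taking the maximum over all rows of a network, rather than per row, follows the quoted description, in which the maximum across the datasets of a given network is one.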
Problems with upload
Hello! I am trying to upload the PSSMs for the PRD challenge and I keep on getting the error message: teamName SCB_lab status FAILED Any suggestions why that is or what I can do to fix it? My team name is SCB_lab. Thanks!
RE: Problems with upload
Please send me your prediction at gustavo_AT_us.ibm.com. I am not sure why it is not uploading. We'll take it from there.
InSilico_Size100_Multifactorial sub-challenge
For the data in the InSilico_Size100_Multifactorial sub-challenge would the submitted networks be evaluated as 'undirected unsigned'? Causal inference using this type of data alone is not possible (I believe....) in contrast to what is provided in other sub-challenges.
RE: InSilico_Size100_Multifactorial sub-challenge
We agree with Alberto. Due to the type of data provided in the InSilico_Size100_Multifactorial sub-challenge, the undirected network predictions should be evaluated as well.
RE: InSilico_Size100_Multifactorial sub-challenge
Gustavo, no no, (I see you are getting way too many messages on the last day before the deadline!) ;) my question concerns Challenge 2: InSilico_Size100_Multifactorial sub-challenge, not Challenge 3: SignalingNetworkChallenge, where indeed there are inhibitors and proteins. In Challenge 2: InSilico_Size100_Multifactorial sub-challenge only gene expression steady state levels are provided and the (multifactorial) perturbations are unspecified. Lacking that information would make it (almost) impossible to decide edge directions; one can truly only find associations between gene expression levels. Anyway, if you and the other organizers of Challenge 2 disagree with this, I would only request to ALSO evaluate the submission as an undirected network in addition to the directed one. Thanks! :)
RE: InSilico_Size100_Multifactorial sub-challenge
Alberto, I believe you can, to some extent anyway, predict direction, given that there are some interventions (inhibitors) that, when they affect the source, should affect the downstream protein, but not vice versa. The challenge was conceived as a directed, signed challenge. If you have A --> B and B --> A, then that interaction will count as 2 edges. However, as we don't have the gold standard, the best we will be able to do is to determine the smallest network consistent with your predictions, regardless of direction. But because your prediction is dependent on whether you have an activation or an inhibition, the resulting "new" edge (with direction and sign) will affect your prediction. Let's see what we can learn collectively about this challenge. Thanks! Gustavo
Dream 4 Challenge 1
By specifying that each column in the position weight matrix should sum to 1, isn't it also specified that each column has equal weight in predicting binding? I would like to submit matrices that give more weight to one column with respect to another column. Maybe the sum of all elements in the matrix can be fixed to 1 for normalization? Thanks
RE: Dream 4 Challenge 1
Hi Subu. Unfortunately, we can't consider different weightings between columns because the challenge is set. However, we are really interested to see if a more detailed model will do better than the simple PWM, so let us know how your model does compared to the simple PWM when the results of the challenge are made available.
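For readers unfamiliar with the constraint discussed in this thread, here is a minimal sketch of a position weight matrix whose columns each sum to 1. The dictionary-per-column layout and the counts are made up for illustration and are not the challenge's submission format.

```python
def normalize_columns(counts):
    """Turn per-position base counts into frequencies that sum to 1 per column."""
    pwm = []
    for column in counts:
        total = sum(column.values())
        if total <= 0:
            raise ValueError("each column needs at least one count")
        pwm.append({base: n / total for base, n in column.items()})
    return pwm


# Hypothetical counts for a two-position motif.
counts = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 2, "C": 2, "G": 2, "T": 2},
]
pwm = normalize_columns(counts)
```

Because every column is rescaled independently, the matrix cannot express that one position matters more than another, which is the limitation raised in the question.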
Multifactorial experiment
Hi there, Does anyone know how to simulate multifactorial experiment datasets using the GNW software? Or alternatively, does anyone know where it would be possible to find a simulated dataset including the true topology? Thanks in advance!
RE: Multifactorial experiment
The multifactorial experiments are a new feature in GNW version 2.0, which will be released only after the submission deadline for the challenges. We chose not to provide a test dataset with the true topology (I see the interest of having such a test dataset, but we want to avoid that teams "tune" their method to details of the benchmarks using a test dataset that would not be available in a real biological application). Best, Daniel
submission deadline
Hi, I would like to know when exactly the submission deadline is on October 15th: at midnight on October 14th or midnight on October 15th? Cheers, constanze
RE: submission deadline
Constanze, it's at midnight of Oct 15th.
Abstract submission
Hi, I was just wondering whether it's possible to submit more than one abstract for oral/poster presentation during the registration procedure. Thanks! Best regards, Basilio
RE: Abstract submission
Basilio, Yes, you can do it, but each person can only present one accepted abstract. Gustavo
conversion factor training and test data
Dear DREAM4 Signaling Network Challenge organizers, I have a question about the following: "(d) Comparability between training and test sets. The lysate concentration used for the measurements of the training data set (contained in the file SignalingNetworkChallenge_TrainingData.csv) was different from the lysate concentration used for the test data set. Therefore, even for the same phosphoprotein and under the same conditions, the measurement in the training and test data sets could be different. This is why we give the value of the measurement at t=0, as these values could, in principle, be different from the values at t=0 for similar conditions in the training set. Therefore, the predictions at t=30 min have to take into account the baseline value at t=0 of the test set rather than equivalent measurements in the training set." You state 'similar' and 'equivalent' experiments, but all the experiments in the training data are different from those in the test data. Why not simply provide at least one identical experiment in both lysate concentrations to calculate the conversion factor of the baseline levels, or just give the conversion factor? It would be ironic if this little issue spoiled someone's predictions....
RE: conversion factor training and test data
Thanks very much for your reply! In other words, we don't need to rescale anything, we can just ignore the above issue and submit the 'raw' predictions based on the model obtained from the training data?
RE: conversion factor training and test data
Dear Alberto, Thanks for your question. You are right in that having experimental data for both data sets under the same conditions is important to evaluate the predictions of the test data given the training data. We do have a subset of the experiments done together with the test data set under the same conditions as (a subset of) the training data set. We will use this data to re-scale the results at the time of the evaluation of the submissions.
RE: conversion factor training and test data
Alberto. Sorry for the slow response. Your question is far from being stupid. It's a truly good one. The brains are boiling thinking how to address your problem. Honestly, this is one of the weak points of this challenge, and of the ability to do true quantitative predictions with this kind of data. We have been trying to dig for more data where the conditions were the same, and we hopefully will give you an answer today. But believe me, we have been thinking. Thanks
RE: conversion factor training and test data
Still no answer....if this is a stupid question I would like to know...that way at least I know I have missed the point....
Submission
I just submitted my solution to the DREAM4 InSilico Size 100 Multifactorial challenge, all the 5 files, but the submission page does not give any indication about whether the files submitted are correct, it does not give any feedback. I would like to know if my submission is fine, my team name is NET2009. Thanks.
RE: Submission
Your predictions don't appear to have uploaded. Please try again.
DREAM challenge 4/Dream3
We are wondering how you determined the noise level to be around 300, as many measurements exhibit values far below this threshold with very high reproducibility. It seems like this threshold varies for the different antibodies in use, rather than being determined by the general detection method.
RE: DREAM challenge 4/Dream3
You are right. You can go below 300. We chose 300 to be on the safe side. This threshold varies for the different antibodies in use and the natural abundance of phosphorylated protein in the untreated state (which can vary significantly among cell types). Does this answer your question?
Questions about perturbations
I just wanted to know if a perturbation is time varying or time invariant. E.g., for multifactorial perturbations, the same perturbation has been added to all genes for each experiment, right? For each time course, is the perturbation at time 0 the same as the one at time 100? Thanks.
RE: Questions about perturbations
The perturbations are time invariant. For multifactorial perturbations (as well as time-series), each gene may be perturbed by a different amount, but the perturbation is constant in time for a given experiment. Best, Daniel
perturbation experiments
I was wondering if there is a way to generate some perturbation time series as given in the DREAM4 challenge using GeneNetWeaver 1.2, for instance via the GeneNetWeaver console. Or should I dive into the source code? Regards, Robert
RE: perturbation experiments
Time-series perturbation experiments (as well as multifactorial steady-state perturbation experiments) as provided for the DREAM4 challenge can't be generated with the currently available version of GNW (1.2). We implemented this newly for DREAM4 in GNW version 2.0, which will only be released after the submission deadline for the challenge. We don't release this version before the submission deadline because we want to avoid that participants "tune" their inference methods to details of the benchmarks that wouldn't be known in a real biological application. Best, Daniel
Another Correction to the Signaling Network Challenge Data (DREAM4, Challenge 3)
A fellow participant identified inconsistencies in the format of the TEST file (DREAM4_TeamName_SignalingNetworkPredictions_Test.csv). (Thanks again, Tarmo.) Some row labels were incorrect. Participants were emailed an updated set of data files. The correct data files are now posted on the download page (http://wiki.c2b2.columbia.edu/dream/data/DREAM4/). We are very sorry for the inconvenience. Best of luck to all the teams.
Hints for the In Silico Challenge (DREAM4, Challenge 2)
A participant asked a few questions about the nature of the in silico data that we answered. Just so everyone is on equal footing, here is a transcript.
>> 1. Are the initial points for all of the time series >> (*timeseries.tsv) the same steady state, up to noise? In principle yes. The perturbation is applied at t=0, but did not yet have an effect. Thus, the initial points of each time series are independent samples of the wild-type steady state. >> 2. Are the last time points (t=1000) the same as the initial state, >> up to noise? Not necessarily, because the time from t=500 when the perturbation is removed until t=1000 may not be sufficient for the networks to completely go back to the initial steady state. Also, in exceptional cases the network could go to a different attractor after the perturbation. We did not analyze whether this actually occurs, so this possible issue should be considered part of the challenge. >> 3. More generally, the description of the data leads one to assume >> that both the wildtype and the single deletion networks have a >> unique steady state. Is this assumption correct? We generated the data using numerical simulations, without a detailed analysis of the different attractors that the in-silico gene networks have. Some networks probably have several steady states. The wild-type steady state was defined arbitrarily as one of possibly several steady states of the network. The steady states for the genetic perturbations (knockouts and knockdowns) are those that the network converged to from this wild-type steady state after applying the genetic perturbation. Note that the networks typically converge to a steady state, but as in a biological experiment, there is no absolute guarantee. In exceptional cases, there may be oscillations in the in silico networks. Again, this possible issue is part of the challenge. > 4. One more question: in the time series experiments, are there 5 > distinct perturbation effects (chemicals), or a single perturbation > at 5 different concentration levels? Each time series corresponds to a distinct perturbation.
Formatting Correction to one of the files in DREAM4 Challenge3
Dear DREAM4 participants, We were informed by one of the DREAM4 participants that there was an inconsistency in the training data set of the signaling network challenge. (Thanks Tarmo). If you haven't received my mail with the right file earlier today, please download the corrected files: DREAM4_SignalingNetworkPrediction_Data.zip, and note that the file SignalingNetworkChallenge_TrainingData.csv was changed due to a formatting error. The other files were fine. Sorry for the inconvenience. I hope you are having some fun with the challenges this year. Please feel free to give us any feedback you might have. Gustavo
GNW can NOT open tsv/dot/gml file
Hello! I downloaded GNW 1.2 from http://lis.epfl.ch/?content=research/projects/EvolutionOfAnalogNetworks/ReverseEngineeringGeneRegulatoryNetworks/DREAMChallenges.php. When I use GNW to open a tsv/dot/gml file, it complains: opening file failed, Unhandled exception. Does anyone have any idea about this problem? Thank you!
RE: GNW can NOT open tsv/dot/gml file
Thanks to Thomas Schaffter, the problems are fixed. Please download the new version of GNW (1.2.2) and report any persisting issues on sourceforge. Thanks for bringing this problem to our attention!
RE: GNW can NOT open tsv/dot/gml file
So far everything worked like a DREAM, and now problems... We think it's due to the recent Java update. Thomas Schaffter did some tests. Linux i386 (Java 1.6.0_14 up-to-date): works. Mac OS X 10.5.7 (Java 1.5.0_19 up-to-date): works, except for "out of memory" problems for large networks. Windows XP SP3 (Java 1.6.0_13 up-to-date): unhandled exception. We will fix these problems asap, and let you know. In the meantime, you may try using a previous version of java, and let us know the result. Irina, your "unsupported format" error may be due to another problem, please file a support request (https://sourceforge.net/projects/gnw/support) mentioning your platform, java version, and console output. Daniel
RE: GNW can NOT open tsv/dot/gml file
I have the same trouble. To verify that I use a good format, I opened GeneNetWeaver, created a subnetwork and then saved the subnetwork description in tsv, dot and gml formats. After that, I tried to open these files. GeneNetWeaver cannot open these files! It gave the message "Opening file failed. Unsupported format"
RE: GNW can NOT open tsv/dot/gml file
PS: We do have a user manual, though I'm not sure whether it would be helpful in this case (http://gnw.sf.net/manual/gnw-user-manual.pdf).
RE: GNW can NOT open tsv/dot/gml file
Hi, I need more information to find out what the problem is in your case (usually it should work). As mentioned in the error message, more information is given in the logs (click on "Console" in GNW to see them). By the way, the best way to get help with GNW is to use the tools provided by sourceforge for this purpose (e.g., support requests and bug tracking system): https://sourceforge.net/projects/gnw/support. Could you make a support request with a copy of your console output? This may be helpful for other users with the same problem. Thanks, Daniel
RE: GNW can NOT open tsv/dot/gml file
I had the same problem. You can check the author's site to see whether he has a manual or user guide.
DREAM EVALUATION SCRIPT
Hi all, I do not know why, when I use either the DREAM2 or DREAM3 evaluation scripts, I do not get the perfect ROC curve. This happens when I use the best test file (which contains all the correct edges in the gold file). I do not know why I could not get the perfect ROC curve, which should be like curve A in this figure: http://kjronline.org/abstract/journal_figure.asp?img=v5n1011fig2.jpg&no=354&desc=desc5 Regards
RE: DREAM EVALUATION SCRIPT
Hi Fadhl, (You meant tpr = 0.01, 0.02, 0.03..., right?) I believe the confusion arises because we are thinking of a slightly different way to plot the ROC. We are thinking of fpr as a function of tpr, while the usual way is to think of tpr as a function of fpr. In non-pathological cases there is no confusion, as the fpr vs tpr plot is the reflection of the usual plot about the y=x line, with the first point at (0,0) and the last point at (1,1). In your case, which is a little pathological, it matters how we plot. So think of the plot as fpr vs tpr, and you will have a straight line at fpr=0 for tpr from 0 to 1. That's why in line 168 of the code the AUROC is computed as 1 - area of the fpr vs tpr ROC curve, to give the AUROC under the usual representation of tpr vs fpr. Note that if you choose to represent the ROC as fpr vs tpr, the optimal AUROC will be 0, and not 1. The reason we did the calculation in this way is that it facilitated some of our computations. I hope this clarifies your confusion. Best regards, Gustavo
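The relationship described in this reply can be checked with a small sketch. This is not the code of the DREAM evaluation script; the trapezoid helper and the sample points are assumptions made for illustration only.

```python
def trapezoid_area(x, y):
    """Area under the piecewise-linear curve y(x), by the trapezoid rule."""
    area = 0.0
    for i in range(1, len(x)):
        area += (x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2.0
    return area


# An ordinary curve that starts at (0,0) and ends at (1,1).
tpr = [0.0, 0.5, 0.5, 1.0, 1.0]
fpr = [0.0, 0.0, 0.5, 0.5, 1.0]
usual_auroc = trapezoid_area(fpr, tpr)               # tpr as a function of fpr
complement_auroc = 1.0 - trapezoid_area(tpr, fpr)    # 1 - area of fpr vs tpr

# The pathological case from this thread: only true positives are listed,
# so fpr stays at 0 while tpr climbs to 1.
tpr_only_pos = [0.25, 0.5, 0.75, 1.0]
fpr_only_pos = [0.0, 0.0, 0.0, 0.0]
usual_only_pos = trapezoid_area(fpr_only_pos, tpr_only_pos)
complement_only_pos = 1.0 - trapezoid_area(tpr_only_pos, fpr_only_pos)
```

For the ordinary curve both conventions give 0.75. For the vertical-line case, integrating tpr over fpr gives 0 because fpr never moves, while the complement form gives 1, which matches the explanation of why the perfect prediction still scores a perfect AUROC.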
RE: DREAM EVALUATION SCRIPT
The problem is like this: assume you want to produce the perfect ROC curve, so you compare two gold files, one of them with the whole space P+N and the other with just the P examples. The DREAM2 and DREAM3 scripts produce the ROC curve as a vertical line at x=0, which is not the perfect ROC curve. When you trace the code you will find that the false positive rate is fpr = 0 0 0 0 and the true positive rate is tpr = 1 1 1 1 1 1 at all k, so ROC curve = plot(fpr,tpr) will generate a vertical line. I have no explanation for this, as the code produces a perfect PR curve. I appreciate any help.
RE: DREAM EVALUATION SCRIPT
I have verified that the script called DREAM3_Challenge4_Evaluation.m available here http://wiki.c2b2.columbia.edu/dream/data/scripts/DREAM3/ works as it is supposed to. It does not actually show the ROC and PR plots. If you would like to see them, you can add the following code at the end of the script.
figure(1)
subplot(2,2,1)
plot(fpr,tpr)
title('ROC')
xlabel('FPR')
ylabel('TPR')
subplot(2,2,2)
plot(rec,prec)
title('P-R')
xlabel('Recall')
ylabel('Precision')
When is deadline for Dream4 challenge?
Hello, our team would like to participate in the DREAM4 Challenge (problem 2). When is the deadline for submission?
RE: When is deadline for Dream4 challenge?
The deadlines for submission are now posted on the DREAM home page: http://wiki.c2b2.columbia.edu/dream/index.php/The_DREAM_Project
Prediction submission for InSilico_Size100_Multifactorial
I would like to submit a solution for the InSilico_Size100_Multifactorial problem. How do I submit the solution files?
RE: Prediction submission for InSilico_Size100_Multifactorial
You are fast ;) The submission is not yet open. Last year, submission opened only a few weeks before the deadline. You will be notified with the instructions for submission. Best, Daniel
zeros
The data files in the dream3 InSilico challenge contain some zero concentrations. Are these real zeros, or maybe missing values? Thanks Bert Kappen
RE: zeros
These are real zeros, there are no missing values. In DREAM3, we added normal noise to the simulated data. If a concentration became negative due to the additive noise, we set it to zero. Note that in DREAM4, we use a more realistic model of the noise. Best, Daniel
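The DREAM3 noise handling described above (additive normal noise, with negative results set to zero) can be sketched as follows. The noise level, the seed, and the sample concentrations are assumptions for illustration; the actual generator settings are not given in this thread.

```python
import random


def add_noise_and_clip(concentrations, sigma, rng):
    """Add zero-mean Gaussian noise, then set any negative result to zero."""
    return [max(0.0, c + rng.gauss(0.0, sigma)) for c in concentrations]


rng = random.Random(0)             # fixed seed so the sketch is repeatable
clean = [0.0, 0.01, 0.5, 1.0]      # hypothetical noise-free concentrations
noisy = add_noise_and_clip(clean, sigma=0.05, rng=rng)
```

Values that start close to zero are the ones most likely to be clipped, which is why exact zeros show up in the data files without being missing values.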
Gene Five data
I would like to use and reference the two gene-five networks used in the DREAM2 challenge for a publication, do you know who is the author and how to contact so I can ask permission?
RE: Gene Five data
The reference is here: http://wiki.c2b2.columbia.edu/dream/index.php/D2c3
DREAM 2 - In-Silico-Network Challenges. Description
I want to use this data for my research. It would be very helpful to know the names and functional descriptions of the metabolites, proteins and mRNAs that were used to generate these 3 data sets. Please guide me on how to get them. Thank you,
RE: DREAM 2 - In-Silico-Network Challenges. Description
This information should be on the homepage, correct? Your Friend, -Lazar
DREAM2 InSilico networks
I would like to use and reference the two InSilico networks used in the DREAM2 challenge for a publication, do you know who is the author and how to contact so I can ask permission?
RE: DREAM2 InSilico networks
Scott, the author is Pedro Mendes, and you can use it as he made it public through our web site. Also cite the DREAM project, to give us a little visibility. :-) Thanks! Gustavo
challenge results
Will the challenge results and team scores be available some time before the DREAM3 conference, or will they be available only around the time of the conference? Thanks..
RE: challenge results
Roger, We will announce them this week.
Dream2 submitted paper
Hi! Have the best DREAM2 challenge predictors submitted any journal papers? Regards
RE: Dream2 submitted paper
Yes, the best predictor challenges will be published in a volume of the Annals of the NY Academy of Sciences. It should appear before the end of the year.
the Dream_Evaluation_Script.m divide by zero
When I use the Dream_Evaluation_Script.m to test my network against the DREAM2 gold standard for the InSilico challenge, I get a divide-by-zero error. It seems the test file should be the same size as the gold file. Is there a way to accommodate this?
RE: the Dream_Evaluation_Script.m divide by zero
Please send me an e-mail to gustavo@us.ibm.com and I will send you the debugged version. Sorry for the belated response.
Insilico challenge: How to get one Network Topology from three files?
What should I do if the network generated from the heterozygous data is different from the networks generated from the null-mutants and trajectories files? How can I get one network topology from three files? Regards
RE: Insilico challenge: How to get one Network Topology from three files?
Well, I guess that depends on your algorithm; indeed that is part of the algorithm. For example, you can decide to use the union (the rationale being that each different perturbation unravels new connections). Otherwise, use the intersection of the connection sets, if you believe that the perturbations tickle all connections at the same time...
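The two combination rules mentioned in this reply can be sketched with plain sets of directed edges. The gene names and edge lists below are made up for illustration; how each per-file network is inferred is left to the participant's own method.

```python
# Hypothetical edge sets inferred separately from each of the three files.
heterozygous_edges = {("G1", "G2"), ("G2", "G3")}
null_mutant_edges = {("G1", "G2"), ("G3", "G4")}
trajectory_edges = {("G1", "G2"), ("G2", "G3"), ("G4", "G1")}

edge_sets = [heterozygous_edges, null_mutant_edges, trajectory_edges]

# Union: keep an edge if any data set supports it.
union_network = set().union(*edge_sets)

# Intersection: keep an edge only if every data set supports it.
intersection_network = set.intersection(*edge_sets)
```

The union gives a larger and more permissive network, while the intersection keeps only the edges every perturbation type agrees on, so the choice trades recall against precision.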
where to submit the predictions
I can't find the information. Should we submit the results directly to the organizer?
RE: where to submit the predictions
We will open the site on September 15. (Maybe earlier, if we can test everything before that.) We'll let you know. Gustavo
more questions on in silico networks
I have two related questions about the in-silico networks challenge. 1) Are there any external regulatory inputs in these networks, or can we assume that each network is self-contained? 2) What is the nature of the perturbations in the trajectory datasets? The data description says that the 50 gene networks were designed to have the same set of perturbations as in DREAM2. Were those perturbations revealed after DREAM2? From what I can tell, the main difference between trajectories is the initial concentration of all the genes in the network. Thanks..
RE: more questions on in silico networks
1) There are no external inputs, the networks are self-contained. 2) The time series data shows how the networks recover from external perturbations. Trajectories were simulated by integrating the networks from a different randomly perturbed initial condition for every time series (only the initial conditions change, we assumed the network structure and parameters remain constant). Good luck! ;)
in silico networks: Ecoli heterozygous?
Am I crazy, or does it make no sense to have heterozygous data for the in silico Ecoli networks? Bacteria are haploid! or at best partial diploids, sometimes.. so what are those datasets doing in challenge 4?
RE: in silico networks: Ecoli heterozygous?
Roger, I guess it's the "freedom" given by the InSilico world. You are right of course that Ecoli is haploid. But the only reason why we call that data Ecoli is because it is a subnetwork with a topology of connections borrowed from the Ecoli GRN. We wanted to keep, for comparison to last year's challenge 4, a set of perturbations that was similar to those of DREAM2. So this year we abused notation, and applied the term heterozygous mutant (which should be read: transcription rate for that gene is half the wild type transcription rate, implementable with siRNA, e.g.) even to the networks with topology borrowed from Ecoli. Sorry for the confusion. We'll add a disclaimer in the description of the challenge. Gustavo
Specialization
Can a registered team address a subset of the four Dream3 challenges? Or must a team field answers to all four in order to qualify for evaluation? Is the assessment based on performance across all challenges? Thank you.
RE: Specialization
Albert, Your team can address any subset of the challenges independently. You can participate in only one challenge if you so wish, and you will be evaluated on that challenge only. Gustavo
DREAM2 Data for Benchmark
We registered for the in-silico challenge of DREAM3 this year, and we need some data for benchmarking purposes to validate our ideas. However, we are not told how to generate the data. DREAM2 data would be a good choice, but we can only download the golden data, not the observation data. I wonder if the organizers can make DREAM2 data freely available to DREAM3 participants. Otherwise it is **totally unfair** for new players in DREAM3, compared to those who participated in both DREAM2 and DREAM3, because the latter can use DREAM2 data to validate their methods.
RE: DREAM2 Data for Benchmark
Thank you, Gustavo !
RE: DREAM2 Data for Benchmark
Yes, I tried, but failed. I registered for DREAM2 with a new account, but when I click the "DREAM2: Proceed to data." link, it leads to a page without any data available. I tried different internet browsers and different computers but all failed. Can you send me this data by email? It will be greatly appreciated.
RE: DREAM2 Data for Benchmark
Xuebing, Sure, you should be able to download DREAM2 data. Can't you? Try registering for DREAM2 to download. If you already did this and it didn't work, I'll make sure that you can download the data. Gustavo
DREAM3 Challenges
Nice challenges, very nice challenges! Something seems to be wrong with INFO.txt from challenge 1.
RE: DREAM3 Challenges
Alberto, this challenge is yet tba. Will get to it in the next few days.
participate Dream3?
I did not find any details on how to participate in the DREAM3 challenge. We just established a new prediction pipeline for transcription factor binding sites with some fresh ideas. We are able to validate our predictions in vivo/in vitro. Now we are looking for a challenge :)
RE: participate Dream3?
Daniela, you can now see the DREAM3 data from: http://wiki.c2b2.columbia.edu/dream/index.php/The_DREAM3_Challenges Have fun, Gustavo
RE: participate Dream3?
This year the DREAM Challenge has been combined with the RECOMB Satellite on Regulatory Genomics and Systems Biology. Please find the necessary information here: http://compbio.mit.edu/recombsat/
RE: DREAM 3 Information
We will be posting the challenges probably on June 15th. Stay tuned.
request for comment: Critical Assessment of Mutant Prediction (hypothetical challenge)
Hello all. I am re-posting this as a separate message because my earlier "reply" got buried. We (my colleague Harold Smith and I) would like to get feedback on the idea of a "Critical Assessment of Mutant Prediction" (CAMP), a community experiment in which prediction teams would use systems-biology models to infer the identity of unknown mutations, by using experimental data on the effects of mutations (ideally, data at a variety of levels). As a start, we are planning a feasibility project based on microarray characterization of yeast knockouts. An experimental team would characterize gene expression patterns in a set of yeast knockout mutants, then community prediction teams would be challenged to infer the identities of the knockouts. The CAMP idea (in general) and the pilot project are described in a brief presentation (http://www.molevol.org/camel/projects/CAMP/camp_pilot_rfc.pdf). Let us know what you think. How can we make this work? Is there some entirely different way to do this? Please email a response or (better) keep the discussion here on the DREAM web site (we asked the DREAM organizers and they encouraged us to do so).
RE: request for comment: Critical Assessment of Mutant Prediction (hypothetical challenge)
In addition to the knock-out perturbations you could also consider a gene-expression study of over-expression perturbations: Mnaimneh S, Davierwala AP, Haynes J, Moffat J, Peng WT, Zhang W, Yang X, Pootoolal J, Chua G, Lopez A, Trochesset M, Morse D, Krogan NJ, Hiley SL, Li Z, Morris Q, Grigull J, Mitsakakis N, Roberts CJ, Greenblatt JF, Boone C, Kaiser CA, Andrews BJ, Hughes TR. Exploration of essential gene functions via titratable promoter alleles. Cell. 2004 Jul 9;118(1):31-44
RE: request for comment: Critical Assessment of Mutant Prediction (hypothetical challenge)
Thanks, Neil Clarke, for your comments. The available data (that we know about) consist of profiles for 276 deletion mutants chosen by Hughes, et al 2000. This paper shows that, even when using only a single growth condition, most mutants can be distinguished from wild-type at loci other than the deleted locus, which means that, in principle, our proposed CAMP pilot experiment could work. The complete citation and the data for the Hughes paper are here: http://www.rii.com/publications/2000/cell_hughes.html . The set of 276 knockouts represents less than 5 % of yeast genes, so it won't be hard to avoid them.
RE: request for comment: Critical Assessment of Mutant Prediction (hypothetical challenge)
It is a good idea, though you'll have to be careful to avoid genes for which deletion expression data is already available. We expect to be generating some related data soon that I expect we'll provide as a test set for DREAM3. This will probably take the form of expression data from three strains deleted in a single transcription factor. The challenge would be to predict the effect of double (or perhaps triple) deletions of these genes.
RE: request for comment: Critical Assessment of Mutant Prediction (hypothetical challenge)
Arlin and Harold, this is a really nice idea! I am looking forward to seeing this pilot project turn into a competition like DREAM. There are already approaches that make explicit use of network models to infer identities of perturbations (e.g. Chemogenomic profiling on a genome-wide scale using reverse-engineered gene networks, Diego di Bernardo, et al. (2005) Nature Biotechnology; 23, 377 – 383) and this CAMP would thus be a nice extension of the DREAM competition: inferring networks is one thing, being able to predict something with them is another. This could be nicely tested in the CAMP. How can we make this work? I don't know....make some training data available (to formulate a network model) and then the knock out data to see how well the network models predict them. Or let the teams find the data themselves to build a model to use in the predictions. Again, nice project, I hope you get a lot of feedback and get it going soon!
What should we predict?
(1) I agree with Pablo Verdes that the primary goal of reverse engineering is identifying the system, and not predicting data. From a black-box model that only predicts data we cannot learn much about the functioning of a biological system. Furthermore, not all reverse engineering methods construct a predictive model for the data. (2) It has been argued that since a large class of networks may be consistent with the data, reverse engineering methods should not be compared on a single network prediction. I agree, but I believe that the current format (a ranked list of link predictions) is a very good one, precisely for this reason. This list is *not* a network, and in a sense it is even wrong to interpret it as a network. Instead, the list gives for every link the estimated confidence level (e.g., the posterior probability given the data and the prior knowledge) that it is present in the gold standard network. For example, we obtained this list by combining the information contained within an ensemble of different networks that fit the data well. (3) Methods should be compared on several gold standard networks to compare their performance (at least for the in silico benchmarks, where many gold standards can easily be generated).
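The idea in point (2), a ranked list of links with confidence levels obtained from an ensemble of networks, can be sketched as below. The post does not say how the ensemble was combined, so using the fraction of networks that contain each edge is an assumption, and the edge names are hypothetical.

```python
from collections import Counter


def rank_edges(ensemble):
    """Score each edge by the fraction of ensemble networks that contain it."""
    counts = Counter(edge for network in ensemble for edge in network)
    n = len(ensemble)
    scored = [(edge, count / n) for edge, count in counts.items()]
    # Highest confidence first; ties broken by edge name for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical ensemble of four networks that all fit the data well.
ensemble = [
    {("A", "B"), ("B", "C")},
    {("A", "B")},
    {("A", "B"), ("C", "A")},
    {("A", "B"), ("B", "C")},
]
ranked = rank_edges(ensemble)
```

The result is a list of (edge, confidence) pairs rather than a single network, which is the distinction the post draws.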
RE: What should we predict?
Hello all. I would like to share some ideas on "what should we predict?", and get your feedback. We (my colleague Harold Smith and I) imagine a Critical Assessment of Mutant Prediction (CAMP) experiment in which prediction teams would use systems-biology models to infer the identity of unknown mutations, by using experimental data on the effects of mutations (ideally, phenotypic data at a variety of levels). To get this project started, we are planning a feasibility project based on microarray characterization of yeast knockouts. An experimental team would characterize gene expression patterns in a set of yeast knockout mutants, then community prediction teams would be challenged to infer the identities of the knockouts. The CAMP idea (in general) and the pilot project are described in a brief presentation, which I hope you will read: http://www.molevol.org/camel/projects/CAMP/camp_pilot_rfc.pdf Let us know what you think. How should we design the pilot project so that it's neither too easy nor too hard? Please email us or (better) keep the discussion here on the DREAM web site (we asked the DREAM organizers and they encouraged us to do so).
RE: What should we predict?
Dear Neil, let's imagine for a moment the following hypothetical scenario: (1) gene X influences both Y and Z, but there is no causal, mechanistic biological coupling between Y and Z: Y <- X -> Z; (2) Y is some pretty intricate nonlinear function of X: Y=F(X); and (3) let's further assume that the X -> Y and X -> Z interactions are of a similar character, so that Z is related to X through the same function F but, say, with double intensity: Z=2*F(X). In a "predicting-data" framework, the strongest link of the inferred network will (incorrectly) connect Y and Z. The reason is that, within this network, Y is the best possible predictor of Z (and vice versa), in particular if the model family is not flexible enough to approximate the intricate function F. Notice that if we make an empirical, data-driven model on the correct skeleton Y <- X -> Z, this model will not be optimal in a mean squared prediction error sense; a model connecting Y and Z will perform better. However, if we wanted to control the levels of Z, the "predicting-data" framework would be incorrectly directing our efforts towards affecting Y instead of X. In my opinion, in this case as well as in all the artificially generated ones, the gold-standard network should be the one that actually generated the data. It may not be the best predictive model now, but it will certainly be so in the future, when we try to actively manipulate the system.
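The scenario above can be reproduced numerically. The sketch below assumes a particular nonlinear F (a sine plus a quadratic term), a small amount of measurement noise, and a model family restricted to straight lines; all three are illustrative choices that do not come from the post.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def linear_mse(xs, ys):
    """Mean squared error of the best straight-line predictor of ys from xs."""
    slope, intercept = fit_line(xs, ys)
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def F(x):
    # Hypothetical "intricate" nonlinear response; any strongly nonlinear F would do.
    return math.sin(3.0 * x) + x * x


rng = random.Random(0)  # fixed seed so the run is reproducible
X = [rng.uniform(-2.0, 2.0) for _ in range(500)]
Y = [F(x) + rng.gauss(0.0, 0.05) for x in X]        # Y <- X
Z = [2.0 * F(x) + rng.gauss(0.0, 0.05) for x in X]  # Z <- X, double intensity

mse_from_x = linear_mse(X, Z)  # model on the true skeleton X -> Z
mse_from_y = linear_mse(Y, Z)  # model on the spurious link Y -> Z
print("MSE predicting Z from X:", mse_from_x)
print("MSE predicting Z from Y:", mse_from_y)
```

With a model family this inflexible, the spurious Y-Z link predicts Z far better than the true X-Z link, even though only X causally influences Z.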
RE: What should we predict?
Daniel, Pablo: While I agree that the goal is to "identify the system", I just don't see any other way to know how well you have done that except by predicting data. If it's true that "not all reverse engineering methods construct a predictive model for the data", then the methods need to be modified so that they do make predictive models for data. Otherwise it's just data fitting. Maybe it would be helpful to the discussion if you or others could give explicit examples of what you would consider to be a legitimate procedure for constructing a "gold standard network". My concern is that DREAM could end up testing our ability to think the same way, rather than whether we can explain reality (data). But I'd like to hear of counter-examples that people consider to be well-defined gold-standard networks that are free of model bias.
Too many categories?
I agree with Alberto de la Fuente that the number of categories in the challenge should be reduced. (1) The biological significance of undirected signed predictions is not clear to me. For example, if gene A activates gene B and B inhibits A, should the corresponding undirected link be positive or negative? Also, I am not familiar with any method that produces undirected signed predictions as primary output. Thus, I suggest removing the undirected signed categories in the future. (2) I agree very much with Pablo Verdes that decomposing directed signed predictions into two separate categories, one for inhibitory and one for excitatory links, is confusing. If you do signed predictions, it is very unlikely that you focus just on excitatory or inhibitory interactions, and indeed the two categories have the same participants. The methods I am familiar with treat positive and negative interactions "symmetrically", i.e., a method would not be expected to perform very well on positive but badly on negative interactions or vice versa. In my opinion, it thus makes sense to rank the methods in a single category "directed signed predictions" instead of two separate categories "directed signed excitatory" and "directed signed inhibitory". As submission format I suggest a single ranked list of predictions, qualified with +/-. Predictions could be scored using a multi-class generalization of the AUC. (3) In my opinion, the category directed-unsigned can also be removed. This year, all the teams that participated in this category also participated in the directed-signed categories, except one team. It seems predictions in this category were mainly obtained by removing the sign from the directed signed predictions. In summary, I'm in favor of having only two categories: undirected-unsigned and directed-signed.
RE: Too many categories?
About point 1: Algorithms using correlation coefficients will produce undirected signed predictions as primary output. Interestingly, most submissions to the undirected categories came from algorithms that primarily identify directed networks, which were then made undirected. About point 3: Directed-unsigned should not be removed. This is an important category, since it evaluates how well algorithms are able to identify the structure of the network. Finding weight-signs is a next step, quantifying weights the next, etc. Still, I agree with you that most algorithms identify directed weighted networks, and that this provides information on the directed signed structure. How do we judge algorithms that perform weakly in directed signed categories, but very well after throwing away signs and directions?
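The remark on point 1 can be made concrete with a small sketch. It assumes plain Pearson correlation between expression profiles; the gene names and values are made up, and the function `undirected_signed_ranking` is a hypothetical helper, not a method used by any DREAM team.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def undirected_signed_ranking(expression):
    """One signed, undirected link per gene pair, ranked by |correlation|."""
    genes = sorted(expression)
    links = []
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = pearson(expression[gene_a], expression[gene_b])
            links.append((gene_a, gene_b, "+" if r >= 0 else "-", abs(r)))
    return sorted(links, key=lambda link: -link[3])


# Made-up expression profiles over four conditions.
expression = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.1, 5.9, 8.0],
    "C": [4.0, 3.1, 1.9, 1.0],
}

links = undirected_signed_ranking(expression)
for gene_a, gene_b, sign, strength in links:
    print(gene_a, gene_b, sign, round(strength, 3))
```

Because correlation is symmetric, each pair appears once with a sign but no direction, which is exactly the undirected signed format.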
need to use "gold standard" knowledge to evaluate methodologies
First, thanks to Gustavo for setting this up. Right after this discussion forum was set up, I wrote a draft set of comments but never got around to finishing it. I'll have to do that. In the meantime, I'm posting here my response to the email that was sent recently asking us to remember that information on the gold standards is the IP of the folks who provided the data for the Challenges. Let me say at the outset that I do appreciate the work involved - and the potential risk to publishing priority - that is incurred by those who put together the Challenges. Sincere thanks for that. I do think, though, that it is absolutely necessary for the continued success of DREAM that the gold standards be divulged to the predictors, preferably before the meeting but certainly before publication. The following is the main text of the email response I sent earlier today: In order to figure out what worked - and, more importantly, what failed - we *have* to be able to use the list of "gold standards". Fortunately, I *do* know which of the 200 genes we were given were considered true positives. I was able to extract that information from the precision-recall and ROC curves provided to us by the organizers, based on our prediction. This knowledge of the "gold standard" set is absolutely essential to analyzing what worked and what didn't. Those of you who were in NY may remember that I used that knowledge to show that we would have been much better off if we had only used our expression data analysis. We hurt ourselves considerably trying to include gene ontologies, predicted binding sites, ARACNe, publicly available ChIP data, etc. Without knowing what the gold standard set was, I would not have been able to figure this out. I would have gotten up and said that we did all these different things, and you (and I) would probably have come to the conclusion that we did something smart by incorporating these different terms.
In fact, that's the wrong conclusion, but the only way we know that it's wrong is because I was able to figure out what was considered the gold standard set. The talk would have been almost meaningless without this - and the same goes for the paper that we are writing. I have no intention of identifying the gold standard genes in my paper. There wouldn't be any point, anyway - it doesn't matter to the analysis what the gene names are. However, I *do* need to do analyses that rely on knowing which of the genes are in the gold standard set. I honestly don't see how these analyses could possibly infringe upon the intellectual property of those providing the Challenge set, or affect in any way their ability to publish or patent. However, if anyone disagrees with this, I would welcome further discussion before we get much further in the publication process.
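How a gold standard can leak through a ROC curve is easy to sketch. The example assumes the curve has one point per ranked prediction and no tied scores; the label list is invented, and the post does not say exactly how the extraction was done.

```python
def roc_points(labels):
    """ROC points obtained by walking down a ranked list of 0/1 labels."""
    positives = sum(labels)
    negatives = len(labels) - positives
    true_pos = false_pos = 0
    points = [(0.0, 0.0)]
    for is_positive in labels:
        if is_positive:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def labels_from_roc(points):
    """A vertical step is a true positive; a horizontal step is a negative."""
    return [
        1 if y_next > y_prev else 0
        for (_, y_prev), (_, y_next) in zip(points, points[1:])
    ]


# Invented gold-standard labels, in the order of one team's ranked predictions.
hidden_labels = [1, 1, 0, 1, 0, 0, 1, 0]
curve = roc_points(hidden_labels)
recovered = labels_from_roc(curve)
print(recovered == hidden_labels)
```

Each prediction moves the curve either up or to the right, so a team that knows its own ranking can read the label of every submitted item off the curve it is given.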
RE: need to use
Of course I agree completely with Neil's point that, for a project like this to work optimally, the Gold Standards should be available as soon as possible to the public.
RE: need to use
Neil, why would you want to 'reverse engineer' the gold standard from the ROC curves? If there is a web-tool to use for your evaluations you don't need to bother...and the owners of the Gold Standard can be happy.
RE: need to use
Yes, it is an interesting idea to provide a web site that evaluates alternative predictions. But if people are so concerned about inappropriate use of the gold standard, I can't see them being reassured by this solution. Unless I misunderstand the idea, it should be as easy for me to learn the gold standard list from such a site as it was to infer the list from the ROC and PR curves that were provided to us based on our predictions. I've said it before and I'll say it again: unless DREAM convinces the providers of the data that it is fair use for the predictors to use the gold standard list in assessing what worked and what didn't, there is very little point to the DREAM exercise. Imagine the analogy to CASP and protein structure prediction. What if predictors were not allowed to see the structures they were trying to predict, either at the CASP meeting itself, or subsequently while preparing their papers for publication? What would they be able to say that would be of interest to anyone? If that were the way CASP were run, I'm pretty sure we never would have gotten beyond CASP 1. Have all crystallographers and NMR spectroscopists been willing to make coordinates available to hundreds of CASP predictors in advance of their own publication? No. But many have been willing, and without them the CASP experiments could not be conducted.
RE: need to use
Good idea Alberto. I'll see if it can be implemented.
RE: need to use
A possible solution would be for the organizers to set up a 'web-tool' that allows people to evaluate their results to Challenge 1. This way no-one needs to actually know the Challenge 1 Gold Standard, but is still able to evaluate alternative trials.
RE: need to use
Neil and all, Here is a conundrum: 1) We all agree that the DREAM exercise would be most useful with full disclosure of the gold standards. 2) However, in order to obtain data for the DREAM challenges we need to ensure that the data owners feel comfortable enough sharing that information before publication. In the case of the BCL6 target challenge, the fact that Neil Clarke could figure out which genes were the gold standards from the PR and ROC curves allowed him to reach richer conclusions than he could have obtained without that information. (Those are the conclusions that make DREAM useful.) I believe that Neil should be able to use that information provided that he doesn't disclose the true positives and negatives before publication by Andrea and collaborators. I agree with Neil that providing the target identity to facilitate analyses, with the commitment of the participants not to disclose the actual targets, would not affect in any way the data owners' ability to publish. But the data owners will have the last word on this.
DREAM 3?
Unfortunately I just found out about DREAM 2 and so was unable to participate. Are there any plans for a DREAM 3 next year? Is there a mailing list to receive notification of future events?
RE: DREAM 3?
David, I will add you to the mailing list. No problem. We are in the planning phase of DREAM3. I look forward to your participation.
Food for Thought from DREAM2
At the end of DREAM2 there were a number of good ideas that people suggested. Here are the few of them that I could capture. If you made the comment, please feel free to elaborate on it, as in some cases I could only capture the general idea, but none of the details. 1) Make the data (used in the challenges) comparable across years. 2) Invite the pharmaceutical industry to participate more actively in the reverse engineering challenges. 3) Have some ongoing benchmark in the website. 4) Create a protocol for the comparison of reverse engineering methods. 5) It is unclear whether the networks underlying the data can be inferred from the available data. This is the issue of model identifiability. There is also the issue of model distinguishability: there can be more than one model consistent with the data. Which one is the "right" one? 6) Related to the previous item, there is the question of what is it that we should infer? The "network" that we are trying to reverse engineer should not be what we are predicting. Maybe the emphasis should be on predicting experimental measurements. In other words, instead of predicting networks, we should be predicting data. 7) There was a suggestion to add the methods that were used in the network prediction challenges on the web. 8) Some people think that it is necessary to clearly say who the participating teams are, not just the best performers. I would like some feedback on this, as there are pros and cons in both cases. 9) Some people felt that we were ignoring the last 30 years of molecular biology in the design of the challenges. The question is how to make predictions in addition to what is known. 10) At the end of DREAM2 there was a 100% agreement that a DREAM3 network inference challenge would be useful. Do you agree? These are the ideas and suggestions that were made at the conclusion of DREAM2. Please feel free to comment on any of these or on anything else.
RE: Food for Thought from DREAM2
Daniel, I must agree with you. The DREAM competition is first and foremost a great learning experience. And we could learn from the highest ranked as well as the lowest ranked (for example, my colleagues and I learned a lot by trying to understand why we performed so badly in Challenges 1 and 5, while we performed very well in Challenges 3 and 4). But of course, given the competition spirit of this project, not all teams will be happy to be publicly announced :) But personally I am very curious about all methods used, also the bottom ranked ones.
RE: Food for Thought from DREAM2
Alberto de la Fuente wrote: "In a marathon only the top runners are announced. No-one seems to care about who finished 467th place ;)". Even though the declared goal of organizing a marathon is not to scientifically compare the performance of the runners and understand, for example, why runner A performs well in one condition, but runner B performs better in another condition, the complete ranking with the identities and times of all participants is actually published ;) (http://www.nycmarathon.org/results/index.php) :) As a compromise, one could only publish the identities of the top 50% or so (of course ideally with an abstract or paper describing the method). In my opinion, the goal is to compare and analyze the performance of different methods, and not just to declare a "winner" that performed better than some other undisclosed methods, on an undisclosed gold standard, for unknown reasons ;)
RE: Food for Thought from DREAM2
My comments on the 10 issues raised by Gustavo Stolovitzky 1) Make the data (used in the challenges) comparable across years. I don’t agree with this. As time goes by datasets become different: new experimental technologies will produce data that couldn’t be produced before. We should move along and focus on up-to-date datasets even if it leads to challenges which are not comparable to those of other years. For example, there is one important issue that has not been addressed at all by the challenges and I think for the DREAM3 competition it should be emphasized: Gene Networks consist of thousands of genes but usually only tens to hundreds of experimental observations are available, several orders of magnitude less than the number of genes. It will be very important to evaluate how different methods perform at different amounts of data. Ideally there will be a challenge involving simulated data on a network of about 5000 genes with a range of experimental data sizes obtained through, for example, single gene perturbations (like in the ‘in-silico’ datasets of the DREAM2 challenge). The datasets would contain, for example, 50, 500, 1000, 2500 and 5000 observations. There will be methods that perform very well on the larger datasets, but it is unlikely that such datasets will appear in the near future. The main goal of such a challenge is to find out how many observations really are necessary and which methods to use on which amounts of data. 2) Invite the pharmaceutical industry to participate more actively in the reverse engineering challenges. Certainly a good idea. In that case there should be a challenge that has immediate relevance to the pharmaceutical industry, for example a challenge with the goal to rank disease related genes based on a network approach like the work of Diego di Bernardo in his Nature Biotechnology paper: Chemogenomic profiling on a genome-wide scale using reverse-engineered gene networks Diego di Bernardo, et al.
(2005) Nature Biotechnology; 23, 377 – 383 as applied in a recent paper by Ayla Ergün for ranking cancer related genes: Ergün A, et al. (2007) A network biology approach to prostate cancer. Mol Syst Biol 3: 82 3) Have some ongoing benchmark in the website. A very good idea. 4) Create a protocol for the comparison of reverse engineering methods. I think the DREAM organizers did a great job in setting a standard on how to compare methods based on significance of the AUCs. Such a general protocol must be based on these or similar ideas. 5) It is unclear whether the networks underlying the data can be inferred from the available data. This is the issue of model identifiability. There is also the issue of model distinguishability: there can be more than one model consistent with the data. Which one is the "right" one? There is a large body of theory about ‘model equivalence’ showing that given data it is often impossible to infer a single model, but rather a ‘class’ of indistinguishable models. Given a particular network, the whole equivalence class can be obtained. To be able to give a score of goodness to a method one has to consider the whole class rather than the ‘true’ network topology alone. For Directed Acyclic networks obtaining the class is somewhat trivial, but for Directed Cyclic networks, which are of course far more relevant in this case, not really (see for example: http://ftp.andrew.cmu.edu/pub/phil/thomas/equivprf.ps). 6) Related to the previous item, there is the question of what is it that we should infer? The "network" that we are trying to reverse engineer should not be what we are predicting. Maybe the emphasis should be on predicting experimental measurements. In other words, instead of predicting networks, we should be predicting data. This is indeed another very valid way of evaluating a network model. Rather than precisely reflecting the ‘true’ underlying network, a model must be able to predict data that has not been used to infer the model.
This also refers to what I wrote under issue 2: a network model could be used to rank disease related genes without necessarily reflecting the ‘true’ network topology precisely. 7) There was a suggestion to add the methods that were used in the network prediction challenges on the web. I agree and am willing to provide descriptions and/or software implementations of the methods I used, so that they can be used by others. 8) Some people think that it is necessary to clearly say who the participating teams are, not just the best performers. I would like some feedback on this, as there are pros and cons in both cases. It makes more sense to announce only the best performers. In a marathon only the top runners are announced. No-one seems to care about who finished 467th place ;) 9) Some people felt that we were ignoring the last 30 years of molecular biology in the design of the challenges. The question is how to make predictions in addition to what is known. I don’t really understand this concern... so, no comments. 10) At the end of DREAM2 there was a 100% agreement that a DREAM3 network inference challenge would be useful. Do you agree? Yes!
RE: Food for Thought from DREAM2
Pablo Verdes wrote: "On a comment by Alberto de la Fuente: I believe that decomposing the analysis in directed-undirected and signed-unsigned is a good approach. The benefits of decomposing the study of signed predictions in excitatory and inhibitory, though, are less clear to me." Ok, maybe it is a good idea to have these categories. In any case it should be acknowledged that inferring an 'undirected' network is only 'half' the job of inferring 'directed' networks. I agree with Pablo that there should be one overall signed category rather than decomposing into excitatory and inhibitory.
RE: Food for Thought from DREAM2
4) I agree. In this respect, I also agree with the comment by Barbara Di Camillo on the possibility of computing some statistics on the performance results by considering several networks. This would apply, in particular, to small networks like the one in Challenge 3, where it's not very clear how a single mistake affects the resulting scores. 6) I believe that the biologically relevant question is not whether we can predict data, but whether we can infer the underlying network under realistic scenarios, and if not, under which conditions. In my opinion, we shouldn't predict data. 7-8) I agree with 7 but disagree with 8. Since this is a learning experience, it would be positive for the methods to be briefly described. On the contrary, I don't see an important advantage in disclosing the participants' identity. On a comment by Alberto de la Fuente: I believe that decomposing the analysis in directed-undirected and signed-unsigned is a good approach. The benefits of decomposing the study of signed predictions in excitatory and inhibitory, though, are less clear to me.
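The concern in point 4 about small networks can be quantified with a rough sketch. It assumes the score is the area under the ROC curve of a ranked list with no ties; the list sizes (6 and 200 candidate links) are arbitrary illustrations, not the actual sizes of any DREAM challenge.

```python
def auc(labels):
    """Area under the ROC curve for a ranked list of 0/1 labels (best first, no ties)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    correct_pairs = 0
    negatives_below = negatives
    for label in labels:
        if label:
            # This positive outranks every negative that has not appeared yet.
            correct_pairs += negatives_below
        else:
            negatives_below -= 1
    return correct_pairs / (positives * negatives)


# Small network: 2 true links among 6 candidates, one adjacent swap at the boundary.
small_perfect = [1, 1, 0, 0, 0, 0]
small_one_mistake = [1, 0, 1, 0, 0, 0]

# Larger network: 20 true links among 200 candidates, same kind of single swap.
large_one_mistake = [1] * 19 + [0, 1] + [0] * 179

small_auc = auc(small_one_mistake)
large_auc = auc(large_one_mistake)
print(auc(small_perfect), small_auc, large_auc)
```

The same single mistake costs 0.125 AUC in the six-link list but less than 0.001 in the 200-link list, which is one reason for averaging scores over several small networks.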
RE: Food for Thought from DREAM2
My comments on points 9 and 10: 9. Maybe some of the challenges can be designed such that the network is partially known. The challenge is to incorporate a priori knowledge in the prediction. This might not be directly related to a real existing problem, but at least, knowledge in this direction can help in integrating reverse engineering and the existing body of knowledge. 10. I agree.
Welcome to the DREAM discussion forum
Welcome to the Discussion Page of the DREAM project. This forum is intended to discuss anything related to Reverse Engineering in biological systems. Please feel free to comment on recent papers, interesting conferences, the recent or future DREAM conference, suggest gold-standards for future DREAM challenges or anything else you feel is appropriate for this forum. We look forward to lively and intellectually provocative conversations.
RE: Welcome to the DREAM discussion forum
The training data does not include the names of the markers, the probe ID or phenotype names. Is this expected? How will we know that the phenotype is that the predictions of the table? No ID or transcripts probe means that we can not use functional information as prior knowledge. No name tag, no QTL mapping. It is the idea or was it an oversight? Thanks in advance
RE: Welcome to the DREAM discussion forum
I completely agree with the issues raised and that there should be a DREAM3 challenge. I think, though, that the number of categories should be reduced. Gene Networks are directed networks, so the goal is to infer directed networks; there is no need for an undirected category. I agree that from certain types of data one can only infer undirected networks, but with perturbation data, time-series and knowledge of which genes are TFs (like in the DREAM2 challenges), causal inference is possible.
RE: Welcome to the DREAM discussion forum
Point 5: Maybe it is important to score the Reverse Engineering algorithms taking into account which network interactions are identifiable. A poor score could reflect a poor identifiability of the system given the available data rather than the method performance. This point is also related to the problem of using data able to stimulate different parts of the system under study. Point 9: The simulator we developed and presented at DREAM2 can be used to some extent to analyze the performance of Reverse Engineering approaches that include a priori knowledge in the process of learning the network. In particular, it can be used to test reverse engineering methods where, besides expression data, information on the network topology or regulatory interaction is partially available, either in terms of transcription factors known to regulate certain genes or in terms of proteins known to interact to regulate transcription. I am available to help prepare the new challenges. Moreover, I think that it is important to evaluate the average performance of Reverse Engineering algorithms (not just the performance on a single network) and this is certainly possible in the context of simulated data. Point 10: I think that a DREAM3 network inference challenge would be useful.

catoo - scc, Oct 17, 2011 - 5:55 am
still say it is only a problem with the method RankMotif itself. For example, four standard deviations is clearly an arbitrary cutoff, and a better method would be to try to optimize the value of a given matrix. In fact, there is useful information in this table