Daniel Marbach - Swiss Federal Institute of Technology Lausanne - Feb 16, 2008 - 8:52 am
I agree with Alberto de la Fuente that the number of categories in the challenge should be reduced.
(1) The biological significance of undirected signed predictions is not clear to me. For example, if gene A activates gene B and B inhibits A, should the corresponding undirected link be positive or negative? Also, I am not familiar with any method that produces undirected signed predictions as its primary output. Thus, I suggest removing the undirected signed categories in the future.
(2) I agree very much with Pablo Verdes that decomposing directed signed predictions into two separate categories, one for inhibitory and one for excitatory links, is confusing. If you do signed predictions, it is very unlikely that you focus just on excitatory or inhibitory interactions, and indeed the two categories have the same participants. The methods I am familiar with treat positive and negative interactions "symmetrically", i.e., a method would not be expected to perform very well on positive but badly on negative interactions, or vice versa. In my opinion, it thus makes sense to rank the methods in a single category "directed signed predictions" instead of two separate categories "directed signed excitatory" and "directed signed inhibitory". As a submission format I suggest a single ranked list of predictions, qualified with +/-. Predictions could be scored using a multi-class generalization of the AUC.
(3) In my opinion, the directed-unsigned category can also be removed. This year, all but one of the teams that participated in this category also participated in the directed-signed categories. It seems that predictions in this category were mainly obtained by removing the sign from the directed signed predictions.
In summary, I'm in favor of having only two categories: undirected-unsigned and directed-signed.
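To make the scoring idea in point (2) concrete, here is a minimal sketch of one possible multi-class generalization of the AUC for a single signed ranked list. It is only an illustration under stated assumptions, not the challenge's scoring procedure: each directed edge is assumed to carry a signed confidence score (positive for excitatory, negative for inhibitory, 0 if not predicted), the gold standard is assumed to label every candidate edge as '+', '-', or '0', and the three pairwise AUCs are simply averaged. The function names `pairwise_auc` and `signed_multiclass_auc` are hypothetical.

```python
from itertools import product


def pairwise_auc(high, low):
    """Probability that a random score from `high` exceeds a random score
    from `low`, counting ties as 0.5 (the rank-sum form of the AUC)."""
    if not high or not low:
        raise ValueError("both classes need at least one example")
    wins = sum(
        1.0 if h > l else 0.5 if h == l else 0.0
        for h, l in product(high, low)
    )
    return wins / (len(high) * len(low))


def signed_multiclass_auc(predictions, truth):
    """Average of three pairwise AUCs for signed directed predictions.

    predictions: dict (regulator, target) -> signed score (assumed format);
                 edges missing from the dict are scored 0.
    truth:       dict (regulator, target) -> '+', '-', or '0' for every
                 candidate edge (assumed gold-standard format).
    """
    by_class = {"+": [], "-": [], "0": []}
    for edge, label in truth.items():
        by_class[label].append(predictions.get(edge, 0.0))
    pos, neg, none = by_class["+"], by_class["-"], by_class["0"]

    # Excitatory edges should score above non-edges.
    auc_pos_vs_none = pairwise_auc(pos, none)
    # Inhibitory edges should score below non-edges, so negate both sides.
    auc_neg_vs_none = pairwise_auc([-s for s in neg], [-s for s in none])
    # Excitatory edges should score above inhibitory edges.
    auc_pos_vs_neg = pairwise_auc(pos, neg)

    return (auc_pos_vs_none + auc_neg_vs_none + auc_pos_vs_neg) / 3.0


# Toy example: A activates B, B inhibits A, no other edges exist.
truth = {("A", "B"): "+", ("B", "A"): "-", ("A", "C"): "0", ("C", "A"): "0"}
perfect = {("A", "B"): 0.9, ("B", "A"): -0.8, ("A", "C"): 0.1}
flipped = {("A", "B"): -0.9, ("B", "A"): 0.8, ("A", "C"): 0.1}

score_perfect = signed_multiclass_auc(perfect, truth)
score_flipped = signed_multiclass_auc(flipped, truth)
```

With this construction, a method that finds the right edges but assigns the wrong signs is penalized in all three pairwise comparisons, which is the behaviour a single "directed signed" category would need. Other generalizations, such as weighting the pairwise terms by class size, would be equally defensible.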
Too many categories?