Project Lead(s): Izhar Wallach
Issue
Pesticides are extensively used in developed and developing countries to increase crop yield.
However, pesticide use results in an estimated 26 million poisoning cases and 220,000 deaths annually, so accurate testing is required to measure pesticide safety.
Solution
Crematoria proposed screening pesticide molecules against human biological targets in silico to ensure pesticide safety.
Previous work by Crematoria has shown that computational models can be effective in predicting adverse drug reactions in humans.
They have developed a technology that uses novel statistical methods to extract patterns governing molecular activity from massive bioactivity data sets and tested the software in over 102 disease-related systems, covering diverse protein categories such as kinases, proteases, nuclear receptors, GPCRs and ion channels.
This project was an attempt to extend this work into the agrochemical domain, by curating a panel of toxicity-related targets from humans and other species.
The first step was an extended and detailed assessment of the predictive performance of the technology; the team increased the number of disease targets over which they evaluated the predictions from 13 to 102.
Outcome
Results showed that the method was not sufficient to identify pesticide molecules that are safe with respect to human biological targets, and the team ran out of time to complete the project.
They did show that their predictive technology was more accurate than existing, state-of-the-art computational approaches. Furthermore, the variance in their results was smaller.
These best-in-class results derive from the novel use of binding data, which is measured in solution, rather than the X-ray co-crystal structures that are typically used.
However, when trying to identify potential off-target effects, the error rate for any single target is multiplied by the number of targets against which the performance of a molecule was being analyzed.
Given even a moderate false-negative or false-positive rate, making predictions over a set of one hundred targets would mean either that some effects are consistently missed or that every trial sets off spurious warnings.
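The compounding of per-target error across a panel can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the example error rates are hypothetical, not figures from the project, and the second function assumes that errors on different targets are independent, which the project did not state.

```python
def expected_errors(per_target_rate: float, n_targets: int) -> float:
    """Expected number of erroneous calls per molecule.

    This is the 'error rate multiplied by the number of targets'
    reading: a linear estimate that needs no independence assumption.
    """
    return per_target_rate * n_targets


def prob_at_least_one_error(per_target_rate: float, n_targets: int) -> float:
    """Probability that a molecule triggers at least one erroneous call.

    ASSUMPTION: errors on different targets are independent.
    """
    return 1.0 - (1.0 - per_target_rate) ** n_targets


if __name__ == "__main__":
    n_targets = 100  # panel size of roughly one hundred targets
    # Hypothetical per-target error rates, chosen only for illustration.
    for rate in (0.01, 0.05, 0.10):
        print(
            f"per-target rate {rate:.2f}: "
            f"expected errors {expected_errors(rate, n_targets):.1f}, "
            f"P(at least one) {prob_at_least_one_error(rate, n_targets):.3f}"
        )
```

Under these assumptions, a hypothetical 5% per-target false-positive rate over one hundred targets gives about five spurious warnings per molecule, and the chance of at least one exceeds 99%. This is why even a moderate per-target error rate makes panel-wide screening unreliable.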
While accuracy improved with more data and longer training times, the performance of the software was already approaching an asymptote, and the quantity of training data and computation required to achieve the desired accuracy scaled exponentially.
Further development of the technology was halted to identify limitations imposed by the choice of machine learning model.
A total of $1,000,000 was received from a combination of federal, provincial and private grants for the project.
The researchers are currently working with a major pharmaceutical company to assess the ability of the software to recapitulate their experimental assay results, and assess its applicability to their research pipeline.