Model building

The main goal of this task was to build a QSPR model on a real-life data set (20,000 chemical structures). The work covered three steps: measuring the cytotoxicity, preparing the database from the experimental data, and finding the best QSAR/QSPR model for the measured cytotoxicity values.
All tests were carried out from the user's perspective and were run as workflows, so that any test case can be repeated. Test results and model building procedures, including detailed test conditions, were recorded. The system was tested from many different aspects, with particular attention to workflow control and automation; workflows were prepared for all tasks. The parameters of the preparation steps for model building, as well as those of the model definition itself, were optimised. The experimental data set was investigated, and the chemical structures and the corresponding descriptors were analysed. Data and descriptor subsets were selected to simplify the model building procedure.
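The report does not state how the descriptor subsets were chosen. As an illustration only, the sketch below shows one common pre-filtering approach: dropping near-constant descriptors and descriptors that are highly correlated with one already kept. The function names, the descriptor names and both thresholds are hypothetical and are not taken from the OpenMolGRID system.

```python
from math import sqrt


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_descriptors(table, min_variance=1e-8, max_abs_corr=0.95):
    """Return descriptor names to keep, in input order.

    table maps a descriptor name to its values over all compounds.
    Both thresholds are illustrative assumptions, not values from the report.
    """
    kept = []
    for name, values in table.items():
        if variance(values) < min_variance:
            continue  # near-constant descriptor carries no information
        if any(abs(pearson(values, table[k])) > max_abs_corr for k in kept):
            continue  # redundant with a descriptor that is already kept
        kept.append(name)
    return kept


# Hypothetical toy descriptor table for five compounds.
descriptors = {
    "mol_weight": [180.2, 151.2, 206.3, 194.2, 122.1],
    "mol_weight_x2": [360.4, 302.4, 412.6, 388.4, 244.2],  # duplicate information
    "n_rings": [1, 1, 1, 1, 1],                            # constant
    "logp": [1.2, 0.5, 3.5, -0.1, 1.9],
}
selected = select_descriptors(descriptors)
print(selected)
```

With this toy table, the constant and the duplicated descriptor are removed and the other two are kept. On a data set of 20,000 structures the pairwise correlation check would normally be done with a numerical library rather than pure Python loops.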
The most important overall result of this test phase was that the model building procedure of the OpenMolGRID system worked properly.
During this test phase, a large number of QSAR/QSPR models were developed for the experimental cytotoxicity values, and the results were compared with non-linear models built outside the system. The statistical parameters of the models were not excellent, but the models were still usable in the next test phase, prediction. The model built on approximately 20,000 compounds enabled us to test the predictive power of the OpenMolGRID system and to use it in molecular engineering scenarios.
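The report does not list which statistical parameters were used to judge the models. Two quantities commonly reported for QSAR/QSPR models are the squared correlation coefficient R² of the fit and the leave-one-out cross-validated q². The sketch below computes both for a single-descriptor linear model, assuming ordinary least squares; the function names and the toy data are hypothetical and do not come from the project.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def r_squared(observed, predicted):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of observed."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def loo_q_squared(x, y):
    """Leave-one-out q2: each point is predicted by a model fitted without it."""
    predictions = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(slope * x[i] + intercept)
    return r_squared(y, predictions)


# Hypothetical toy data: one descriptor and one measured property.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
measured = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]

slope, intercept = fit_line(descriptor, measured)
fitted = [slope * v + intercept for v in descriptor]
r2 = r_squared(measured, fitted)
q2 = loo_q_squared(descriptor, measured)
print(f"R2 = {r2:.3f}, q2 = {q2:.3f}")
```

For a least-squares model, q² never exceeds R². A large gap between the two suggests overfitting, whereas a model with modest R² and a similar q² can still be usable for prediction.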

