Data Management focuses on the location, extraction, transformation (e.g. descriptor calculation), and integration of globally distributed data sets describing environmental, toxicological, chemical, biological, and pharmacological information. Its architecture is described.
As an initial step, we will integrate data from two underlying molecular or chemical repositories:
1. National Toxicology Program (NTP)
The key to understanding the OpenMolGRID data warehouse is that it is not primarily intended as a general repository for querying molecular information, although that will be a useful by-product. Its main purpose is to provide pre-computed data that improves the efficiency and effectiveness of subsequent automatic and semi-automatic data analysis and data mining operations. By caching the results of frequently used, computationally expensive data processing tasks, it offers value to all current OpenMolGRID users and to the much wider community expected in the future.
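The caching idea can be illustrated with a minimal sketch. It is not the OpenMolGRID implementation: the class `DescriptorCache`, the function `toy_descriptor`, and the identifier `mol-001` are hypothetical names chosen for illustration, and an in-memory dictionary stands in for the warehouse. The point is only that an expensive calculation runs once and later requests are served from the stored result.

```python
from typing import Callable, Dict, Tuple


class DescriptorCache:
    """Hypothetical in-memory stand-in for a store of pre-computed descriptors."""

    def __init__(self) -> None:
        # Maps (structure id, descriptor name) to the stored value.
        self._store: Dict[Tuple[str, str], float] = {}
        # Counts how often the expensive function actually ran.
        self.computations = 0

    def get(self, structure_id: str, descriptor: str,
            compute: Callable[[str], float]) -> float:
        key = (structure_id, descriptor)
        if key not in self._store:
            # Cache miss: run the calculation once and keep the result.
            self._store[key] = compute(structure_id)
            self.computations += 1
        return self._store[key]


def toy_descriptor(structure_id: str) -> float:
    # Placeholder for an expensive calculation; not a real chemical descriptor.
    return float(len(structure_id))


cache = DescriptorCache()
first = cache.get("mol-001", "toy_length", toy_descriptor)   # computed
second = cache.get("mol-001", "toy_length", toy_descriptor)  # served from cache
```

In this sketch the second request does not call `toy_descriptor` again, which is the saving that pre-computed data is meant to give downstream analysis and data mining steps.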
Further information can be found in the downloads section.