Chemography: the art of navigating in chemical space. Academic Article

start page

  • 157

end page

  • 166

abstract

  • Combinatorial chemistry needs focused molecular diversity applied to the druglike chemical space (drugspace). A drugspace map can be obtained by systematically applying the same conventions when examining the chemical space, in a manner similar to the Mercator convention in geography: Rules are equivalent to dimensions (e.g., longitude and latitude), while structures are equivalent to objects (e.g., cities and countries). Selected rules include size, lipophilicity, polarizability, charge, flexibility, rigidity, and hydrogen bond capacity. For these, extreme values were set, e.g., maximum molecular weight 1500, calculated negative logarithm of the octanol/water partition between -10 and 20, and up to 30 nonterminal rotatable bonds. Only S, N, O, P, and halogens were considered as elements besides C and H. Selected objects include a set of "satellite" structures and a set of representative drugs ("core" structures). Satellites, intentionally placed outside drugspace, have extreme values in one or several of the desired properties, while containing druglike chemical fragments. ChemGPS (chemical global positioning system) is a tool that combines these predefined rules and objects to provide a global drugspace map. The ChemGPS drugspace map coordinates are t-scores extracted via principal component analysis (PCA) from 72 descriptors that evaluate the above-mentioned rules on a total set of 423 satellite and core structures. Global ChemGPS scores describe well the latent structures extracted with PCA for a set of 8599 monocarboxylates, a set of 45 heteroaromatic compounds, and for 87 alpha-amino acids. ChemGPS positions novel structures in drugspace via PCA-score prediction, providing a unique mapping device for the druglike chemical space. ChemGPS scores are comparable across a large number of chemicals and do not change as new structures are predicted, making this tool a well-suited reference system for comparing multiple libraries and for keeping track of previously explored regions of the chemical space.
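
The PCA-score prediction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the descriptors or the software used, so the data below are random stand-ins (30 structures by 4 descriptors rather than 423 by 72), and the function names `fit_reference_pca` and `predict_scores` are hypothetical. The point it shows is that a PCA model is fitted once on a fixed reference set, and new structures are then projected with the frozen scaling and loadings, so existing coordinates do not move.

```python
import math
import random


def fit_reference_pca(X, n_components=2, n_iter=500, seed=0):
    """Fit PCA on a fixed reference set (rows: structures, columns: descriptors).

    Returns the frozen scaling parameters and loadings that define the map.
    Components are found by power iteration with deflation (an illustrative
    choice; the abstract does not say which PCA algorithm was used).
    """
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    # Sample standard deviation; fall back to 1.0 for constant descriptors.
    std = [math.sqrt(sum((row[j] - mean[j]) ** 2 for row in X) / (n - 1)) or 1.0
           for j in range(p)]
    Z = [[(row[j] - mean[j]) / std[j] for j in range(p)] for row in X]
    # Correlation matrix of the autoscaled descriptors.
    C = [[sum(z[a] * z[b] for z in Z) / (n - 1) for b in range(p)]
         for a in range(p)]

    rng = random.Random(seed)
    loadings = []
    for _component in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for _step in range(n_iter):  # power iteration toward leading eigenvector
            w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigval = sum(v[a] * sum(C[a][b] * v[b] for b in range(p))
                     for a in range(p))
        loadings.append(v)
        # Deflate so the next pass finds the following component.
        C = [[C[a][b] - eigval * v[a] * v[b] for b in range(p)]
             for a in range(p)]
    return {"mean": mean, "std": std, "loadings": loadings}


def predict_scores(model, X_new):
    """Project structures onto the frozen map; the model is never refitted."""
    scores = []
    for row in X_new:
        z = [(x - m) / s for x, m, s in zip(row, model["mean"], model["std"])]
        scores.append([sum(zj * lj for zj, lj in zip(z, load))
                       for load in model["loadings"]])
    return scores


# Synthetic stand-in data (assumption: real use would supply computed descriptors).
data_rng = random.Random(42)
reference = [[data_rng.gauss(0, 1) for _ in range(4)] for _ in range(30)]
model = fit_reference_pca(reference)

ref_scores_before = predict_scores(model, reference)
new_library = [[data_rng.gauss(0, 2) for _ in range(4)] for _ in range(5)]
new_scores = predict_scores(model, new_library)
ref_scores_after = predict_scores(model, reference)

# Reference coordinates are identical before and after scoring the new library.
print(ref_scores_before == ref_scores_after)
```

Because the new library is only projected and never used to refit the model, scores from separate libraries share one coordinate system and can be compared directly, which is the property the abstract attributes to ChemGPS.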

PubMed Identifier

  • 11300855

volume

  • 3

number

  • 2