abstract
- Four different ligand-based virtual screening scenarios are studied: (1) prioritizing compounds for subsequent high-throughput screening (HTS); (2) selecting a predefined (small) number of potentially active compounds from a large chemical database; (3) assessing the probability that a given structure will exhibit a given activity; (4) selecting the most active structure(s) for a biological assay. Each of the four scenarios is exemplified by performing retrospective ligand-based virtual screening for eight different biological targets using two large databases--MDDR and WOMBAT. A comparison between the chemical spaces covered by these two databases is presented. The performance of two techniques for ligand--based virtual screening--similarity search with subsequent data fusion (SSDF) and novelty detection with Self-Organizing Maps (ndSOM) is investigated. Three different structure representations--2,048-dimensional Daylight fingerprints, topological autocorrelation weighted by atomic physicochemical properties (sigma electronegativity, polarizability, partial charge, and identity) and radial distribution functions weighted by the same atomic physicochemical properties--are compared. Both methods were found applicable in scenario one. The similarity search was found to perform slightly better in scenario two while the SOM novelty detection is preferred in scenario three. No method/descriptor combination achieved significant success in scenario four.