Ligand-based virtual screening by novelty detection with self-organizing maps.

abstract

We describe a novel method for ligand-based virtual screening that uses Self-Organizing Maps (SOMs) as a novelty detection device. Novelty detection (or one-class classification) is the attempt to identify patterns that do not belong to the space covered by a given data set. In ligand-based virtual screening, chemical structures perceived as novel lie outside the known activity space and can therefore be discarded from further investigation. In this context, a "novel structure" is a compound that is unlikely to share the activity of the query structures; compounds not perceived as "novel" are suspected to share that activity. Various databases now contain active structures, but access to compounds that have been found inactive in a biological assay is limited. This work addresses that problem via novelty detection, which does not require proven inactive compounds. The structures are described by spatial autocorrelation functions weighted by atomic physicochemical properties. Different methods for selecting a subset of targets from a larger set are discussed. A comparison with similarity search based on Daylight fingerprints followed by data fusion is presented; the two methods complement each other to a large extent. In a retrospective screening of the WOMBAT database, novelty detection with SOM gave enrichment factors between 105 and 462 when the 100 top-ranked structures were considered, an improvement of between 25% and 100% over the similarity search based on Daylight fingerprints. Novelty detection with SOM is applicable (1) to improve the retrieval of potentially active compounds, also in concert with other virtual screening methods; (2) as a library design tool for discarding a large number of compounds that are unlikely to possess a given biological activity; and (3) for selecting a small number of potentially active compounds from a large data set.
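The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the map size, the training schedule, or the novelty criterion, so everything below is an assumption. Here a small SOM is trained only on descriptor vectors of known actives, and a candidate is flagged as "novel" when its distance to the best-matching neuron exceeds a percentile of the training distances (a hypothetical threshold rule). Random vectors stand in for the autocorrelation descriptors, and the function names (train_som, quantization_error, is_novel, enrichment_factor) are illustrative. The enrichment factor uses the common definition: hit rate among the top-ranked compounds divided by the hit rate in the whole set.

```python
import numpy as np


def train_som(data, rows=4, cols=4, epochs=30, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    # Initialise each neuron from a randomly chosen training sample.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    # Grid coordinates of each neuron, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5     # shrinking neighbourhood radius
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def quantization_error(weights, samples):
    """Distance from each sample to its best-matching neuron."""
    d = np.linalg.norm(samples[:, None, :] - weights[None, :, :], axis=2)
    return d.min(axis=1)


def is_novel(weights, train, candidates, percentile=95.0):
    """Assumed rule: novel if farther from the map than most training samples."""
    threshold = np.percentile(quantization_error(weights, train), percentile)
    return quantization_error(weights, candidates) > threshold


def enrichment_factor(ranked_is_active, top_n):
    """Hit rate in the top_n ranked compounds divided by the overall hit rate."""
    ranked = np.asarray(ranked_is_active, dtype=bool)
    return ranked[:top_n].mean() / ranked.mean()


# Synthetic stand-ins for descriptor vectors (not real chemical data).
rng = np.random.default_rng(1)
actives = rng.normal(0.0, 0.1, size=(60, 5))   # known actives used for training
far = rng.normal(5.0, 0.1, size=(10, 5))       # candidates far from the actives
som = train_som(actives)
flags = is_novel(som, actives, far)            # True marks a candidate to discard
```

In a screening setting, candidates flagged True would be discarded, and the remaining ones could be ranked by quantization_error before computing enrichment_factor on the top of the list.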

authors

published in

Journal of Chemical Information and Modeling

keywords

Enzyme Inhibitors
Enzymes
Ligands
Molecular Structure
Time Factors

Digital Object Identifier (DOI)

https://doi.org/10.1021/ci700040r

PubMed ID

17854167

start page

2044

end page

2062

volume

47

number

6
