abstract
- Over the last few years, many articles have been published in an attempt to provide performance benchmarks for virtual screening tools. While this research has yielded useful insights, the myriad variables controlling such studies place significant limits on the interpretability of the results. Here we investigate the effects of these variables, including variation in calculation setup, the choice of target, the selection of active/decoy sets (with particular emphasis on the effect of analogue bias) and the interpretation of enrichment data. In addition, we discuss the optimization of the publicly available DUD benchmark sets through the removal of analogue bias, as well as their augmentation with large, diverse data sets collated using WOMBAT.