abstract
- The barrier word method of identifying nominal phrases in text, applied with a very long barrier word list, was evaluated on two different sets of text. In a sample of 10 paragraphs from the Medical Knowledge Self-Assessment Program of the American College of Physicians, nominal phrases made up 66% of the total chunks isolated. Some 500,000 chunks were isolated from Principles and Practice of Oncology (PPO); 38% of these chunk occurrences were of chunks that matched 10,000 concept names in Meta-1.4, the most recent version of the UMLS Metathesaurus. Fifty paragraphs from PPO were then chosen at random, and the co-occurrences of concepts in those paragraphs were reviewed. In 42 of the 50 paragraphs, unique or infrequently occurring co-occurrences closely described the major thrust of the paragraph.
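As a rough illustration of the chunking step described above, the following Python sketch splits text at barrier words and punctuation and treats each remaining run of words as a candidate nominal phrase. It is a minimal sketch under stated assumptions, not the implementation evaluated here: the short `BARRIER_WORDS` set, the tokenizing pattern, and the function names `barrier_chunks` and `matched_fraction` are all hypothetical, and the exact-string comparison against concept names is only a stand-in for whatever matching against Meta-1.4 was actually used.

```python
import re

# Hypothetical, very short barrier word list for illustration only.
# The list evaluated in the abstract is described as very long and is
# not reproduced here.
BARRIER_WORDS = {
    "a", "an", "the", "of", "in", "on", "with", "for", "to", "by",
    "and", "or", "is", "are", "was", "were", "be", "may", "can",
    "should", "which", "that",
}


def barrier_chunks(text, barriers=BARRIER_WORDS):
    """Return the runs of consecutive non-barrier words found in text."""
    chunks, current = [], []
    # A token is either a word (letters, digits, hyphens, apostrophes)
    # or a single punctuation character.
    pattern = r"[A-Za-z0-9][A-Za-z0-9'-]*|[^\sA-Za-z0-9]"
    for token in re.findall(pattern, text):
        word = token.lower()
        if word[0].isalnum() and word not in barriers:
            current.append(word)
        elif current:
            # A barrier word or punctuation mark closes the current chunk.
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks


def matched_fraction(chunks, concept_names):
    """Fraction of chunk occurrences that exactly match a concept name.

    Assumption: case-insensitive exact string match, used here only as
    a placeholder for real Metathesaurus lookup.
    """
    names = {name.lower() for name in concept_names}
    if not chunks:
        return 0.0
    return sum(chunk in names for chunk in chunks) / len(chunks)


sentence = ("Treatment of small cell lung cancer with combination "
            "chemotherapy is effective.")
chunks = barrier_chunks(sentence)
print(chunks)
print(matched_fraction(
    chunks, ["Small cell lung cancer", "Combination chemotherapy"]))
```

In this made-up sentence, two of the four chunks match the toy concept list, which mirrors in miniature the kind of chunk-occurrence match rate reported above. Not every chunk is a nominal phrase ("effective" is not), which is why the yield of nominal phrases is reported as a percentage of chunks rather than assumed to be complete.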