Entry Date:
September 9, 2003

Medical Text Analysis


For better or worse, most clinical data accessible to computer processing is still in the form of unstructured natural language. Great strides have been made in formalizing the content of medical descriptions, yet apart from billing data and (in many places) lab results and pharmacy orders, very little is actually stored in formal vocabularies such as SNOMED or ICD-9. Instead, doctors' and nurses' notes, test reports of all kinds, referral documents, discharge summaries, plans, and most other documents on which clinical care is based are still written as "free text".

To enable general-purpose language processing tools to manipulate medical text, we must augment their typically non-technical vocabularies with a large medical lexicon. This paper presents a heuristic method for translating lexical information from one lexicon to another, and applies it to import lexical definitions of about 200,000 word senses from the UMLS SPECIALIST Lexicon into the lexicon of the Link Grammar Parser.
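
The summary above does not spell out the heuristic itself. Purely as an illustration, one plausible form of such a lexicon-to-lexicon transfer is sketched below: a word found only in the source lexicon borrows the target-lexicon entry most often held by words that appear in both lexicons and share its source description. The function name, the feature tuples, and the entry labels are hypothetical and do not reflect the actual SPECIALIST or Link Grammar formats.

```python
from collections import Counter, defaultdict


def transfer_entries(source, target):
    """Borrow target-lexicon entries for words found only in the source lexicon.

    source: dict mapping word -> source-lexicon description (any hashable value)
    target: dict mapping word -> target-lexicon entry
    Returns a dict mapping each source-only word to a borrowed target entry,
    when at least one shared word has the same source description.
    """
    # For each source description, tally the target entries of words
    # that occur in both lexicons.
    votes = defaultdict(Counter)
    for word, description in source.items():
        if word in target:
            votes[description][target[word]] += 1

    borrowed = {}
    for word, description in source.items():
        if word not in target and description in votes:
            # Take the target entry most common among words with this description.
            borrowed[word] = votes[description].most_common(1)[0][0]
    return borrowed


# Hypothetical toy lexicons (illustrative data only).
source_lexicon = {
    "heart": ("noun", "count"),
    "kidney": ("noun", "count"),
    "nephron": ("noun", "count"),
    "run": ("verb", "intransitive"),
    "metastasize": ("verb", "intransitive"),
    "stat": ("adverb",),
}
target_lexicon = {
    "heart": "N-count",
    "kidney": "N-count",
    "run": "V-intransitive",
}

new_entries = transfer_entries(source_lexicon, target_lexicon)
```

In this toy run, "nephron" and "metastasize" receive entries from words that share their source descriptions, while "stat" is left unassigned because no word in both lexicons has its description. A real transfer would need a fallback for such words.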