Entry Date:
May 31, 2013

Capturing Patient-Provider Encounter Through Text, Speech, and Dialogue Processing


The project aims to create a system that captures the primary medical data mentioned during an encounter between a health care provider and a patient. We use speech-to-text technology to create an approximate transcript of both sides of the conversation, apply natural language processing and machine learning methods to extract relevant clinical content from the transcript, organize that content according to medical conventions, and display it to both provider and patient so that they can correct mistakes made in the process. We are applying the system in the Pediatric Environmental Health Clinic at Children’s Hospital Boston.
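The flow from transcript to organized draft record can be illustrated with a minimal sketch. This is not the project's implementation: the speech recognition step is replaced by a hand-written transcript, the natural language processing and machine learning methods are replaced by a simple keyword lookup, and every name here (Utterance, Fact, LEXICON, extract_facts, organize, render) is hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # "provider" or "patient"
    text: str     # approximate text, as a speech recognizer might produce


@dataclass
class Fact:
    category: str
    term: str
    speaker: str


# Hypothetical toy lexicon standing in for real clinical extraction methods.
LEXICON = {
    "symptom": ["cough", "wheezing", "headache"],
    "exposure": ["lead paint", "mold", "pesticide"],
}


def extract_facts(transcript):
    """Find lexicon terms in each utterance (whole-word, case-insensitive)."""
    facts = []
    for utt in transcript:
        lowered = utt.text.lower()
        for category, terms in LEXICON.items():
            for term in terms:
                if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                    facts.append(Fact(category, term, utt.speaker))
    return facts


def organize(facts):
    """Group facts by category, keeping the first mention of each term."""
    record = defaultdict(list)
    for fact in facts:
        if all(existing.term != fact.term for existing in record[fact.category]):
            record[fact.category].append(fact)
    return dict(record)


def render(record):
    """Produce a plain-text draft that provider and patient could review."""
    lines = []
    for category in sorted(record):
        terms = ", ".join(fact.term for fact in record[category])
        lines.append(f"{category}: {terms}")
    return "\n".join(lines)


# A hand-written stand-in for the approximate transcript of an encounter.
transcript = [
    Utterance("provider", "Has she had any cough or wheezing at night?"),
    Utterance("patient", "Yes, the wheezing started after we found mold in the basement."),
]

record = organize(extract_facts(transcript))
print(render(record))
```

In a real system each stage would be far more involved, and the review step would let both parties correct or complete the draft rather than only read it.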

Complete and accurate collection of clinical data in the course of health care is a long-standing goal that has not been achieved either by manual record-keeping or through electronic record systems. This proposed project addresses the problem from the beginning of the clinical process, by aiming to improve the capture of relevant medical facts during the face-to-face interaction between a patient and provider. Instead of relying on the provider’s fallible memory to record facts after the visit, the proposed system will “listen” to the conversation, use automatic speech recognition to produce an (imperfect) record of what was said, and apply a variety of text analysis and extraction methods to create a draft record of the encounter. Further, it will provide an interface that should permit patients and providers to examine the facts that were recorded and to correct and complete them, also using speech as the primary interface.

The project’s aims are to develop and integrate the components needed to accomplish this goal, to create a testbed in collaboration with researchers at the environmental health clinic of a children’s hospital in which experiments can guide system development and assess progress, and to conduct a series of evaluations against specific objectives. First, the research will characterize the ability of the speech recognition, information extraction, and information organization components to process the target conversations. Second, it will evaluate the hypothesis that this system can collect a more complete and accurate record than what is routinely collected. Subsequently, it will evaluate the time taken by clinicians to use the system, the extent to which the system is seen to disrupt the patient-provider encounter, the ability of patients to use the system to make additions and corrections to their records, and the subjective response of both patients and providers to use of the system.

Success in this effort should lead to better clinical care based on more complete and accurate data. Clinical data are also becoming an important resource for translational medicine research, where improved data are highly valuable.