6. Providing evidence: personalized context-sensitive summarization and question answering
The need to link evidence to patients’ records was stated in the 1977 assessment of computer-based medical information systems, undertaken because of increased concern over the quality and rising costs of medical care [103]. The assessment concluded that the quality and cost concerns could be addressed by medical information systems that supply physicians with information and incorporate valid findings of medical research [103]. The results of medical research might soon become directly available through querying clinical research databases; to date, however, findings of medical research are found primarily in the literature. Following the 1977 report, medical informatics research focused on understanding physicians’ information needs and enabling physicians’ access to the published results of clinical studies. This research provides a solid foundation for NLP aimed at satisfying physicians’ desiderata. The most desired features include comprehensive, specific bottom-line recommendations that anticipate and directly answer clinical questions, rapid access, current information, and evidence-based rationale for recommendations [104].
6.1. Clinical data and evidence summarization for clinicians
Unlike the comparatively better researched summarization and visualization of structured clinical data [105–108], summarization of clinical narrative is an evolving area of research. Afantenos et al. surveyed the potential of summarization technology in the medical domain [109]. Van Vleck et al. identified the information in the medical record that physicians consider relevant to summarizing a patient’s medical history. The following categories were identified as necessary to capturing a patient’s history: Labs and Tests, Problem and Treatment, History, Findings, Allergies, Meds, Plan, and Identifying Info [110]. Meng et al. approached generation of clinical notes as an extractive summarization problem [111]. In this approach, sentences containing patient information that needs to be repeated are extracted based on their rhetorical categories, which are determined using semantic patterns. This extraction method compares favorably to the baseline extraction method (the position of a sentence in the note) on a test set of 162 sentences in urological clinical notes [111]. Cao et al. summarized patients’ discharge summaries into problem lists [70].
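The contrast between category-based extraction and the position baseline described for Meng et al. can be illustrated with a minimal sketch. The patterns, category names, and function names below are hypothetical and chosen only for illustration; they are not the semantic patterns or categories used in [111].

```python
import re

# Hypothetical semantic patterns keyed by rhetorical category (illustration only).
CATEGORY_PATTERNS = {
    "history": re.compile(r"\b(history of|status post|previously)\b", re.IGNORECASE),
    "plan": re.compile(r"\b(follow[- ]up|will (?:continue|start|return))\b", re.IGNORECASE),
}

# Assumption: only sentences in these categories need to be repeated in a new note.
REPEATED_CATEGORIES = ("history",)


def split_sentences(note):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def extract_by_category(note):
    """Keep sentences whose text matches a pattern of a category to be repeated."""
    selected = []
    for sentence in split_sentences(note):
        if any(CATEGORY_PATTERNS[c].search(sentence) for c in REPEATED_CATEGORIES):
            selected.append(sentence)
    return selected


def extract_by_position(note, k=2):
    """Position baseline: keep the first k sentences of the note."""
    return split_sentences(note)[:k]
```

In this sketch the position baseline ignores content entirely, which is why a pattern-based selector can outperform it when the information to be repeated appears late in a note.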
The PERSIVAL project (a prototype system, not currently in use) summarized medical scientific publications [112,113]. The summarization module of the PERSIVAL system generated summaries tailored for physicians and patients. Summaries generated for a physician contained information relevant to a specific patient’s record. Each publication was represented using a set of templates. Templates were then clustered into semantically related units in order to generate a summary [112,113].
Based on the semantic abstraction paradigm, Fiszman et al. are developing a summarization system that relies on SemRep for semantic interpretation of the biomedical literature. The system condenses SemRep predications and presents them in graphical format [114]. Future work may show whether this method holds promise for summarization and visual presentation of clinical notes.
6.2. Clinical data and evidence summarization for patients
The online access to personal health and medical records and the overwhelming amount of health-related information available to patients (alternatively called health care consumers and lay users) pose many interesting questions. Hardcastle and Hallet studied which text segments of a patient record require explanation before being released to patients and what types of explanation are appropriate [115]. Elhadad and Sutaria presented an unsupervised method for building a lexicon of semantically equivalent pairs of technical and lay medical terms [116].
Ahlfeldt et al. surveyed issues related to communicating technical medical terms in everyday language for patients and generating patient-friendly texts [117]. The survey presents research on alleviating the lack of understanding of clinical documents caused by medical terminology. This research includes generation of patient vocabularies and matching those vocabularies and problem lists with standard terminologies; generation of terminological resources, corpora and annotation tools; development of natural consumer language generation systems; and customization of patient education materials [117]. Green presents the design of a discourse generator that plans the content and organization of lay-oriented genetic counseling documents to assist in drafting letters that summarize the results for patients [118].
6.3. Clinical question answering
One of the principal purposes of CDS is answering questions [14]. Questions occurring in clinical situations could pertain to “information on particular patients; data on health and sickness within the local population; medical knowledge; local information on doctors available for referral; information on local social influences and expectations; and information on scientific, political, legal, social, management, and ethical changes affecting both how medicine is practiced and how doctors interact with individual patients” [119]. Some questions do not need NLP and can be answered directly by a known resource. For example, the NLM Go Local service (which connects users to health services in their local communities and directs users of the Go Local sites to MedlinePlus health information) was established to answer logistics questions by providing access to local information. Questions about particular patients are currently answered by manually browsing or searching the EHR. Answering these questions can be facilitated by summarization (which requires NLP if information is extracted from free-text fields) and visualization tools [105–108]. Facilitating access to medical knowledge by providing answers to clinical questions is an area of active NLP research [120]. The goal of clinical question answering systems is to answer medical knowledge questions with short action items supported by strong evidence.
Jacquemart and Zweigenbaum studied the feasibility of answering students’ questions in the domain of oral pathology using Web resources. Questions involving pathology, procedures, treatments, examinations, indications, diagnosis and anatomy were used to develop eight broad semantic models comprising 66 different syntactico-semantic patterns representing the questions. The triple-based model ([concept]–(relation)–[concept]) combined with which, why, and does modalities accounted for a vast majority of questions. The formally represented questions were used to query 10 different search engines. Search results were checked manually to find a passage answering the question in a consistent context [121].
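A minimal sketch of the triple-based question model may help make the representation concrete. It covers only the does modality; the regular expression, the relation vocabulary, the class and function names, and the query format are all hypothetical and are not the patterns or search-engine queries used in [121].

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QuestionTriple:
    """A question represented as modality + [concept]-(relation)-[concept]."""
    modality: str   # "which", "why", or "does"
    subject: str
    relation: str
    obj: str

    def to_query(self) -> str:
        # Hypothetical keyword query: quoted concepts around the relation word.
        return f'"{self.subject}" {self.relation} "{self.obj}"'


# Hypothetical syntactico-semantic pattern for "does" questions (illustration only).
DOES_PATTERN = re.compile(
    r"^does\s+(?P<subject>.+?)\s+(?P<relation>cause|treat|indicate)\s+(?P<obj>.+?)\??$",
    re.IGNORECASE,
)


def parse_does_question(question: str) -> Optional[QuestionTriple]:
    """Map a 'Does X <relation> Y?' question to a triple, or None if no match."""
    match = DOES_PATTERN.match(question.strip())
    if match is None:
        return None
    return QuestionTriple(
        "does",
        match["subject"].lower(),
        match["relation"].lower(),
        match["obj"].lower(),
    )
```

For example, under these assumptions `parse_does_question("Does lichen planus cause oral ulcers?")` yields a triple whose `to_query()` string could be submitted to a search engine, after which the returned passages would still need manual checking as in the study.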
The [concept]–(relation)–[concept] triples generated by SemRep can be used to generate conceptual condensates that summarize a set of documents [114], or answer specific questions, for example, finding the best pharmacotherapy for a given disease [65]. Within the EpoCare project, the same question type is answered by using an SVM to classify MEDLINE abstract sentences as containing an outcome (answer) or not and extracting the high-ranking sentences [122]. The CQA-1.0 system also implements an Evidence Based Medicine (EBM)-inspired approach to outcome extraction [120]. In addition to extracting outcomes from individual MEDLINE abstracts to answer a wide range of questions, the CQA-1.0 system aggregates answers to questions about the best drug therapy into 5–6 drug classes generated based on the individual pharmaceutical treatments extracted from each abstract. Each class is supported by the strongest patient-oriented outcome pertaining to each drug in the class. The EpoCare and CQA-1.0 systems rely on the Patient-Intervention-Comparison-Outcome (PICO) framework developed to help clinicians formulate clinical questions [99]. The MedQA system answers definitional questions by integrating information retrieval, extraction, and summarization techniques to automatically generate paragraph-level text [123].
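The aggregation step described for drug-therapy questions (grouping extracted treatments into classes and supporting each drug with its strongest outcome) can be sketched roughly as follows. This is an illustration under stated assumptions, not the CQA-1.0 implementation: the `strength` score, the fixed drug-to-class lookup, and all names are hypothetical, whereas the system described in [120] generates classes from the extracted treatments themselves.

```python
from collections import defaultdict
from typing import NamedTuple


class ExtractedOutcome(NamedTuple):
    drug: str
    sentence: str     # outcome sentence extracted from an abstract
    strength: float   # hypothetical evidence-strength score; higher is stronger


# Hypothetical static lookup used only to keep the sketch self-contained.
DRUG_CLASS = {
    "amoxicillin": "penicillins",
    "azithromycin": "macrolides",
    "erythromycin": "macrolides",
}


def aggregate_by_class(outcomes, drug_class=DRUG_CLASS):
    """Group outcomes by drug class, keeping only the strongest outcome per drug."""
    best = {}
    for outcome in outcomes:
        if outcome.drug not in drug_class:
            continue  # drugs without a known class are skipped in this sketch
        current = best.get(outcome.drug)
        if current is None or outcome.strength > current.strength:
            best[outcome.drug] = outcome
    classes = defaultdict(list)
    for drug, outcome in best.items():
        classes[drug_class[drug]].append(outcome)
    # Within each class, list the best-supported drugs first.
    return {
        name: sorted(items, key=lambda o: o.strength, reverse=True)
        for name, items in classes.items()
    }
```

The design choice here is to deduplicate per drug before grouping, so that a class is represented by one supporting outcome per drug rather than by every sentence extracted for it.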
7. Clinical NLP: direct applications of NLP in healthcare
In addition to processing text pertaining to patients and generated by clinicians and researchers, NLP methods have been applied directly to patients’ narratives for diagnostic and prognostic purposes.
The Linguistic Inquiry and Word Count (LIWC) tool was used to explore personality expressed through a person’s linguistic style [124]. The LIWC tool (which calculates the percentage of words in written text that match up to 82 language dimensions) was evaluated in predicting post-bereavement improvements in mental and physical health [125], predicting adjustment to cancer [126], differentiating between the Internet message board entries and homepages of pro-anorexics or recovering anorexics [127], and recognizing suicidal and non-suicidal individuals [128]. Pestian et al. demonstrated that the sequential minimal optimization algorithm can classify completer and simulated suicide notes as well as mental health professionals [129].
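The word-count approach can be illustrated with a minimal sketch that computes, for each category, the percentage of word tokens found in that category's word list. The two toy categories and their words are hypothetical placeholders; the actual LIWC dictionary is much larger and is not reproduced here.

```python
import re
from collections import Counter

# Hypothetical toy categories (illustration only; not the LIWC dictionary).
TOY_DICTIONARY = {
    "negative_emotion": {"sad", "hurt", "alone", "afraid"},
    "first_person_singular": {"i", "me", "my", "mine"},
}


def category_percentages(text, dictionary=TOY_DICTIONARY):
    """Return, per category, the percentage of word tokens listed in that category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {name: 0.0 for name in dictionary}
    counts = Counter()
    for token in tokens:
        for name, words in dictionary.items():
            if token in words:
                counts[name] += 1
    return {name: 100.0 * counts[name] / len(tokens) for name in dictionary}
```

The resulting percentages form a fixed-length feature vector per text, which is the kind of input a downstream classifier or regression model could use.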
Another potential clinical NLP application is assessment of neurodegenerative impairments. Roark et al. studied automation of NLP methods for diagnosis of mild cognitive impairment (MCI). Automatic psychometric evaluation included syntactic annotation and analysis of spoken language samples elicited during neuropsychological exams of elderly subjects. Evaluation of syntactic complexity of the narrative was based on analysis of dependency structures and deviations from the standard (for English) right-branching trees in parse trees of subjects’ utterances. Measures derived from automatic parses correlated highly with manually derived measures, indicating that automatically derived measures may be useful for discriminating between healthy and MCI subjects [130].
Clinical NLP is also used for medication compliance and drug abuse monitoring. Butler et al. explored usefulness of content analysis of Internet message board postings for detection of potentially abusable opioid analgesics [131]. In this study, attractiveness for abuse of OxyContin, Vicodin, and Kadian determined automatically (using the total number of posts by product, total number of mentions by product (including synonyms and misspellings), total number of posts containing at least one mention of each product, total number of unique authors, and the number of unique authors of posts referencing any of the 3 target products) was compared to the known attractiveness of the products. The numbers of mentions of the products were significantly different and corresponded to the product attractiveness. Based on this and other metrics, the authors conclude that a systematic approach to post-marketing surveillance of Internet chatter related to pharmaceutical products is feasible [131]. Understanding patient compliance issues could help in clinical decisions. This understanding could be gained through processing of informal textual communications found in publicly available blog postings and e-mail archives. For example, Malouf et al. analyzed 316,373 posts to 19 Internet discussion groups and other websites from 8731 distinct users and found associations (such as cognitive side effects, risks, and dosage-related issues) that epilepsy patients and their caregivers have for different medications [132].
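The counting metrics listed for the message board study can be sketched as follows. This is an illustration only: the synonym and misspelling lists, the `(author, text)` post format, and the function name are hypothetical and do not come from [131].

```python
import re
from collections import defaultdict

# Hypothetical term lists (product name plus illustrative variants).
PRODUCT_TERMS = {
    "OxyContin": ["oxycontin", "oxycotin"],
    "Vicodin": ["vicodin", "vicoden"],
    "Kadian": ["kadian"],
}


def chatter_metrics(posts, product_terms=PRODUCT_TERMS):
    """Compute per-product mention metrics from (author, text) pairs.

    Returns (per_product, any_product_authors), where per_product maps each
    product to its mention count, number of posts with at least one mention,
    and number of unique authors, and any_product_authors is the number of
    unique authors who mentioned any target product.
    """
    patterns = {
        product: re.compile(
            r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
            re.IGNORECASE,
        )
        for product, terms in product_terms.items()
    }
    mentions = defaultdict(int)
    posts_with_mention = defaultdict(int)
    authors = defaultdict(set)
    any_authors = set()
    for author, text in posts:
        for product, pattern in patterns.items():
            hits = len(pattern.findall(text))
            if hits:
                mentions[product] += hits
                posts_with_mention[product] += 1
                authors[product].add(author)
                any_authors.add(author)
    per_product = {
        product: {
            "mentions": mentions[product],
            "posts": posts_with_mention[product],
            "unique_authors": len(authors[product]),
        }
        for product in product_terms
    }
    return per_product, len(any_authors)
```

Comparing these counts across products gives a simple quantitative ranking that could then be set against an external measure of each product's attractiveness for abuse.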
To the best of our knowledge, the applications described in this section are experimental rather than deployed and regularly used in clinical settings. The difficulties in translating clinical NLP research into clinical practice and the obstacles in determining the level of practical engagement of NLP systems are discussed in the next section.
Most of the above presented methods and systems were developed for specific users, document types and CDS goals. Future research might indicate if such systems could be easily retargeted for new users and goals and whether the retargeted systems can compete with those designed for specific tasks and clinical systems. Evaluation methods for measuring the impact of NLP methods on healthcare in addition to reliable standardized evaluation of NLP systems need to be developed.
For several issues very important to the future development of NLP for CDS, there is currently only anecdotal evidence and sparse publications. For example, with few exceptions, we do not know which of the reviewed NLP–CDS systems are actually implemented or deployed, and what makes these systems worthwhile. We might speculate that, for example, MedLEE is successfully integrated with a clinical information system because it was developed and adapted, as needed, for specific users and CDS goals, but the reason for its success could also be its sophisticated NLP. We could better judge which features determine whether NLP–CDS systems are applied outside of the experimental setting if we had more data points. We believe it would be valuable to have a special venue for presenting case studies and analysis of applied NLP systems in the near future.
Priorities in NLP development will be determined by the readiness of intended users to adopt NLP. The early successes in NLP and CDS led to high user expectations that were not always met. NLP researchers need to regain clinicians’ trust, which is achievable through clinicians’ better understanding of the strengths and weaknesses of NLP, as well as significant progress in biomedical NLP. Reacquainting clinicians with NLP can be facilitated by NLP training, well-planned NLP experiments, careful and thoughtful evaluation of the results, high-quality implementation of NLP modules, semi-automated and easier methods for adapting NLP for other domains, and evaluations of NLP–CDS adequacy in satisfying user needs.
We believe NLP can contribute to decision support for all groups involved in the clinical process, but the development will probably focus on the areas for which there is higher demand. For example, if researchers are more eager consumers of NLP than clinicians, NLP research into text mining and literature summarization will continue dominating the field.
The NLP–CDS tasks are so numerous and complex that this area of research will succeed in making practical impact only as a result of a coordinated community-wide effort.