
6 Tips for choosing a natural language processing solution

Anand Shroff, Co-founder and Chief Development Officer, Health Fidelity

The digitization of health care has resulted in an abundance of data, and many organizations are grappling with how best to manage and use that data to their advantage. An added challenge in health care is that much of the most valuable information is unstructured, primarily in physicians’ clinical notes inside patients’ medical records.

Historically, analyzing this data has required a human touch: individuals reading through medical records and manually extracting information. This is incredibly resource-intensive and error-prone, and mistakes can easily go undetected among the sheer volume of records.

Natural language processing (NLP) has emerged as a way to streamline the analysis of that unstructured data, dramatically improving both the speed and accuracy with which health care organizations can turn their big data into smart data. Here’s how leveraging NLP technology can help improve efficiency in data analysis and what to look for when choosing a solution.

NLP 101

A subset of artificial intelligence, NLP is a technology for interpreting human language in an automated way. Human language embodies an enormous amount of expressiveness, variety, ambiguity and vagueness. NLP works by processing text against its library of words, concepts and relationships to piece together an understanding of the narrative. It interprets not only individual words, but also the expressions, phrases and sentences that convey meaning. NLP is much more than just a text search; its value lies in its ability to account for grammar, syntax, context, and intent.

For health care organizations, this capability is game-changing when it comes to analyzing clinical narratives. By interpreting both the words and the context, NLP enables organizations to systematically analyze large volumes of patient data with unprecedented efficiency and discover insights that might be missed by human analysts.
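
To make the difference between text search and context-aware reading concrete, here is a minimal Python sketch. It is a toy illustration only, not how any particular NLP engine works: the `NEGATION_CUES` list, the function names, and the sample note are all assumptions made up for this example, and real clinical NLP relies on far richer rules or models.

```python
import re

# Hypothetical cue list for illustration; production engines use much richer logic.
NEGATION_CUES = ("no", "denies", "without", "negative for")


def keyword_hit(note: str, term: str) -> bool:
    """Plain text search: true whenever the term appears anywhere in the note."""
    return term.lower() in note.lower()


def affirmed_mention(note: str, term: str) -> bool:
    """True only if some sentence mentions the term with no negation cue before it.

    Only the first occurrence of the term in each sentence is checked.
    """
    term = term.lower()
    for sentence in re.split(r"[.;\n]+", note.lower()):
        idx = sentence.find(term)
        if idx == -1:
            continue  # term not mentioned in this sentence
        preceding = sentence[:idx]
        negated = any(
            re.search(rf"\b{re.escape(cue)}\b", preceding) for cue in NEGATION_CUES
        )
        if not negated:
            return True
    return False


note = "Patient denies chest pain. Reports shortness of breath on exertion."
print(keyword_hit(note, "chest pain"))                # True
print(affirmed_mention(note, "chest pain"))           # False
print(affirmed_mention(note, "shortness of breath"))  # True
```

A keyword search would flag this patient for chest pain even though the note says the opposite; even a crude check of the surrounding words avoids that mistake.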

Choosing an NLP Solution: 6 Key Factors

As with most other health IT (HIT) solutions, not all NLP engines are created equal. Understanding the features and differentiators that matter is critical to choosing a platform that will serve your needs both today and in the future.

If you’re ready to invest in NLP technology, consider these critical factors when choosing the right provider and platform.

  • Clinical language. Health care speaks an entirely different language from other fields—medical language is very different from legal language, for example. For maximum effectiveness, it’s critical to have a clinically oriented NLP, one whose library includes biomedical terms and phrases.
  • Contextual analysis. Understanding and analyzing individual words is only half the battle. A robust NLP engine should also be able to interpret the context in which those words are used. For example, if a physician documents “AF,” he or she could be referring to atrial fibrillation, or it could be someone’s initials. An NLP engine with sophisticated contextual analysis capabilities will make that determination based on other evidence inside the note (for example, mentions of cardiac-related medications or terms describing heart rhythm) and make the appropriate association. Other types of NLP intelligence include negation, certainty, temporality, and disambiguation; all contribute to extracting the most accurate data from natural language.
  • Accuracy. The accuracy of an NLP system is typically measured in terms of its recall: the proportion of the targeted information actually present in the records (e.g., a medical condition, procedure, or diagnosis) that the system detects. Recall is best read alongside precision, the share of the system’s detections that are correct, because a system can raise recall simply by flagging more. When choosing a solution, look for one with recall consistently above 95 percent that also includes a mechanism for continuous improvement; in other words, a system that gets smarter over time.
  • Use cases. Reimbursement, population health, quality improvement, and clinical research are just some of the possibilities for NLP use cases. Do the outputs of the NLP engine support your particular use case? Remember that the NLP’s role is to turn unstructured data into structured data, and the format of the output needed will vary depending on the intended use case. Look for a solution that is flexible in its output taxonomy and supports SNOMED, ICD-10, LOINC, RxNorm, and other code systems.
  • Throughput capacity. NLP-enabled analysis is faster than human analysis, but that doesn’t mean a given solution will have the capacity to serve your business down the road. As data volumes increase, look for solutions with high throughput and real-time performance. The most sophisticated NLP platforms are capable of handling millions of transactions per hour, which will enable your business to keep pace with the avalanche of data that seems to be generated daily.
  • Continuous improvement. The best NLP systems improve their results over time. NLP developers should continue to analyze data and create feedback loops to improve the accuracy of outputs on an ongoing basis. The system might be tweaked with new rules to better reflect the grammatical patterns physicians use in their notes. At the same time, embedded machine learning technology can enable the system to continuously learn and improve its output.
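
When comparing vendors on the accuracy factor above, it can help to see how recall is computed from a validation sample. The sketch below is an illustration under stated assumptions: the counts are invented, the function names are hypothetical, and precision is included only as a companion measure that the recall figure alone does not capture.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of the conditions actually present in the records that the system found."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


def precision(true_positives: int, false_positives: int) -> float:
    """Share of the system's detections that were actually correct."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


# Hypothetical validation sample: 200 conditions are truly documented in the notes.
tp = 190  # documented conditions the system detected
fn = 10   # documented conditions the system missed
fp = 38   # detections that were not actually documented

print(f"recall    = {recall(tp, fn):.2%}")     # recall    = 95.00%
print(f"precision = {precision(tp, fp):.2%}")  # precision = 83.33%
```

In this made-up sample the system meets a 95 percent recall bar, yet roughly one in six of its detections is wrong, which is why it is worth asking a vendor for both figures.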

While there’s certainly no substitute for humans in the interpretation of natural language, NLP technologies can streamline the extraction of insights from narratives, freeing up human resources to use their expertise where it matters most, rather than combing through gigabytes of data that may be of no use.

By choosing the right solution, health care organizations can gain a more comprehensive, accurate and efficient system for analyzing the growing volume of clinical data to help deliver better quality care at a lower cost.