93.1% of doctors’ reports created with a non-commercial LLM could be used with only minimal adjustments, according to a recent study conducted by the University Medical Center Freiburg [JMIR Med Inform 2024;12]. When listening to conversations between doctors and patients, many people expect even more from language models. At medicalvalues, we’re frequently asked, “Can’t something like ChatGPT handle that too?”
Let’s look at three real-life examples from medical practice.
Not a Case for LLMs like ChatGPT: Dusty Folders
A 30-year-old woman arrives at the emergency room. Just in case, she brought a folder of documents. Could these contain critical clues for her acute problem?
Data from private collections or medical archives can only be unlocked once it has been transformed into standardized, machine-processable formats. LLMs like ChatGPT are not the best tools for this job. In our view at medicalvalues, OCR (Optical Character Recognition) technologies offer many advantages here:
- Speed and efficiency: specifically designed to convert printed or handwritten text into machine-readable text.
- Accuracy with structured documents: particularly good at extracting text from structured documents like forms.
- Cost-effective and easy to implement: especially when large volumes of documents need to be digitized.
AI in the Emergency Room: No Room for Scattershot Approaches or Experiments
The patient complains of worsening abdominal pain over the past day. If you ask an LLM like ChatGPT about the next appropriate diagnostic steps, you’ll receive a lengthy response: general tips on medical history and physical examination, a variety of lab tests, imaging, or gynecological assessments. Initiating all these at once would be inefficient, to say the least, and not economically viable.
Moreover, LLMs belong to the category of generative AI technologies, meaning they are designed to create new content autonomously. This also means they might give different answers to the same question, often with a dash of creativity. In many medical situations, this is undesirable.
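The variability described above can be illustrated with a toy sketch. It assumes a simplified picture of how generative models choose output: candidates receive scores, and one is drawn at random in proportion to its probability. The option names are taken from the example above, while the scores and helper functions are made up for illustration and do not come from any real model.

```python
import math
import random


def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(options, scores):
    """Deterministic: always return the highest-scoring option."""
    return options[scores.index(max(scores))]


def pick_sampled(options, scores, temperature, rng):
    """Stochastic: draw one option in proportion to its probability."""
    probs = softmax(scores, temperature)
    return rng.choices(options, weights=probs, k=1)[0]


# Hypothetical candidates with made-up scores (not output of a real model).
OPTIONS = ["lab tests", "imaging", "gynecological assessment"]
SCORES = [2.0, 1.8, 1.5]

rng = random.Random(0)
sampled = {pick_sampled(OPTIONS, SCORES, 1.0, rng) for _ in range(200)}
greedy = {pick_greedy(OPTIONS, SCORES) for _ in range(200)}

print("answers seen with sampling:", sorted(sampled))
print("answers seen with greedy choice:", sorted(greedy))
```

Asking the same "question" 200 times yields several different answers with sampling, but only one with the deterministic choice. Real systems are far more complex, yet the principle behind the differing answers is the same.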
In such cases, we have found that AI technologies that support guideline-based, personalized step-by-step diagnostics and decision trees, integrating your own SOPs (Standard Operating Procedures), are much more effective. Are such rule-based systems considered AI? Yes, because they can model the enormous complexity of existing knowledge and learn new insights in real time. Such process-mining technologies, already widely used in other industries, can also help optimize processes in the healthcare system, from prevention to diagnosis and treatment.
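As a rough sketch of what "step-by-step" means here, a decision pathway can be modeled as a small tree in which each node asks one yes/no question and each leaf names exactly one next step. The findings, actions, and structure below are placeholders invented for illustration. They are not a clinical guideline and not the medicalvalues implementation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Step:
    """A leaf: the single next action the pathway recommends."""
    action: str


@dataclass(frozen=True)
class Question:
    """An inner node: one yes/no finding and the branch for each answer."""
    finding: str
    if_yes: Union["Question", Step]
    if_no: Union["Question", Step]


# Hypothetical pathway with placeholder findings and actions.
PATHWAY = Question(
    finding="red_flag_present",
    if_yes=Step("escalate to senior physician"),
    if_no=Question(
        finding="lab_abnormal",
        if_yes=Step("order targeted imaging"),
        if_no=Step("observe and re-evaluate"),
    ),
)


def next_step(node, findings):
    """Walk the tree and return the action plus the path that led to it."""
    trace = []
    while isinstance(node, Question):
        answer = findings[node.finding]
        trace.append(f"{node.finding}={answer}")
        node = node.if_yes if answer else node.if_no
    return node.action, trace


action, trace = next_step(
    PATHWAY, {"red_flag_present": False, "lab_abnormal": True}
)
print(action)
print(" -> ".join(trace))
```

Unlike a generative model, the same findings always lead to the same single next step, and the trace shows which rules produced it.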
Artificial Intelligence for Gazing into the Medical Crystal Ball
Our patient has a conservatively treatable gallbladder inflammation. However, an elevated creatinine level remains concerning before her discharge, with no clear cause. What will happen to the patient’s kidney function?
LLMs like ChatGPT provide only general knowledge on such questions. Textbooks, guidelines, and studies are often not well represented or reliably included in what these models draw on. Non-generative machine learning algorithms or deep-learning technologies can make better predictions about how an individual patient’s condition will develop, because they can detect complex relationships in the data.
Putting It into Words: This Is Where LLMs Like ChatGPT Shine
Tomorrow, our patient can go home—if her discharge summary is ready. Fortunately, an LLM has already drafted it, needing only the doctor’s final touch and approval: one version for the general practitioner, one for the nephrologist, and a layman-friendly version in the patient’s native French.
LLMs are specifically developed to analyze, understand, and generate human-like responses based on text. In the medical field, they are ideally suited for:
- Text analysis, e.g., searching through literature, guidelines, or medical history for relevant information and combining it with specific inquiries.
- Text generation, e.g., medical documentation, informational materials, or support with scientific papers.
- Audience-appropriate communication, e.g., translation into simple language or foreign languages, and tailoring content in terms of style and depth for different audiences.
Leveraging the Full Spectrum of AI “Drugs”
These few examples show that one AI solution alone, even one as versatile as an LLM, is not enough. Just as in medicine, it’s important to first diagnose what exactly AI needs to achieve in each specific case. Only then can the optimal combination of AI “drugs,” in the right doses, lead to successful outcomes at economical prices.