Clinician Language in Discharge Requests

Using Large Language Models to Predict Transitional Care Breakdown

When a patient leaves the hospital, the discharge summary becomes the bridge between secondary and primary care. But how often are any follow-up actions actually carried out, and can the language of the request itself influence whether they are?

We explored this question using large language models (LLMs). The result? We found that while politeness doesn’t make a difference, language precision does, and that an AI model can predict which follow-up requests are likely to be missed.

The Problem

Hospital discharge summaries often include requests for primary care, e.g. “check bloods in 1 week,” “review medication,” and so on. But studies show that these actions are inconsistently followed.

Sometimes the request seems unnecessary, sometimes the patient doesn’t attend, and sometimes the workload in primary care just makes it hard to keep up.

We wanted to know:

  • Does the way a request is written affect whether it’s completed?
  • And can AI help identify which requests are at risk of being missed?

Some Data

We gathered thousands of discharge summaries from Salford Royal Hospital and used a language model to extract the requests from the text:

Diagram showing how requests can be extracted from free text: a discharge summary is turned into structured data capturing both the request name and the timeframe requested for completion.
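As a rough illustration of that extraction step, here is a minimal sketch using the OpenAI Python client; the prompt, model name, and output schema are placeholder assumptions rather than the exact pipeline we used.

    # Hypothetical sketch: extract follow-up requests from a discharge
    # summary with an LLM. Prompt, model name, and schema are illustrative.
    import json
    from openai import OpenAI

    client = OpenAI()

    PROMPT = (
        "Extract every follow-up request for primary care from this "
        "discharge summary. Return a JSON object with a key 'requests' "
        "containing a list of objects with keys 'request' (e.g. 'urea and "
        "electrolytes') and 'timeframe_days' (integer, or null if no "
        "timeframe is given).\n\nDischarge summary:\n"
    )

    def extract_requests(summary_text: str) -> list[dict]:
        """Return structured follow-up requests found in the free text."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",                      # placeholder model
            messages=[{"role": "user", "content": PROMPT + summary_text}],
            response_format={"type": "json_object"},  # force JSON output
        )
        return json.loads(response.choices[0].message.content)["requests"]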

We can then compare the requests to primary care records to see if and when any blood tests were actually done:

Diagram showing how extracted requests are matched against primary care records, linking each request to the date the blood test was actually performed.
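A minimal sketch of that matching step, assuming the primary care record arrives as a per-patient table with "test_name" and "date_performed" columns (the column names and the crude substring match are illustrative assumptions):

    # Hypothetical sketch: link an extracted request to the primary care
    # record of the same patient. A real pipeline would map request text
    # to canonical test codes rather than matching on substrings.
    import pandas as pd

    def find_matching_test(request_name: str,
                           discharge_date: pd.Timestamp,
                           gp_tests: pd.DataFrame):
        """Return the date the requested test was first performed after
        discharge, or None if no matching record exists."""
        after_discharge = gp_tests[gp_tests["date_performed"] >= discharge_date]
        matches = after_discharge[
            after_discharge["test_name"].str.contains(
                request_name, case=False, regex=False, na=False)
        ]
        return None if matches.empty else matches["date_performed"].min()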

We label any requests where the blood test is done within 7 days of the requested timeframe as "concordant":

Diagram showing how requests are labelled as concordant when the blood test is completed within 7 days of the requested timeframe.
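The label itself then reduces to a date comparison. A minimal sketch, assuming "within 7 days" means performed no later than 7 days after the requested completion date (the exact windowing rule is an assumption here):

    from datetime import date, timedelta

    def is_concordant(requested_by: date, performed_on: date | None,
                      grace_days: int = 7) -> bool:
        """Label a request concordant if the test was performed no later
        than `grace_days` after the requested completion date."""
        if performed_on is None:
            return False
        return performed_on <= requested_by + timedelta(days=grace_days)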

Predicting Concordance

Now that we have a dataset of requests and their concordance labels, we can train a model.

Diagram showing how an LLM can predict concordance.
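For a sense of the shape of the task, here is a minimal baseline that learns the same mapping from request text to concordance label; it swaps the LLM for TF-IDF features and logistic regression, so it is an illustrative stand-in rather than the model reported below.

    # Hypothetical baseline: request text in, concordance label out.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    def train_concordance_model(request_texts, labels):
        """Fit a text classifier; labels are 1 = concordant, 0 = not."""
        X_train, X_test, y_train, y_test = train_test_split(
            request_texts, labels, test_size=0.2, random_state=42,
            stratify=labels)
        model = make_pipeline(
            TfidfVectorizer(ngram_range=(1, 2), min_df=2),
            LogisticRegression(max_iter=1000),
        )
        model.fit(X_train, y_train)
        print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
        return model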

The model predicts concordance directly from request text with 83% accuracy.

On internal validation, the model, working from linguistic features alone, was both accurate and robust.

Investigating the patterns the model learned, we found that:

  • Precise, clinical language (words like “renal,” “potassium,” “kinase”) predicted higher follow-through.
  • Conditional or vague terms (“if possible,” “consider checking”) predicted non-concordance.
  • Surprisingly, words of gratitude (“thanks,” “thank you”) often appeared in requests less likely to be completed, possibly because they softened uncertain or tentative instructions.
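One way to surface patterns like these, assuming a linear baseline such as the sketch above (an LLM-based model needs different interpretability tooling), is to rank vocabulary terms by their learned weights:

    import numpy as np

    def top_terms(model, k: int = 10):
        """Print the terms most strongly associated with concordance
        (largest positive weights) and non-concordance (most negative)."""
        terms = np.array(
            model.named_steps["tfidfvectorizer"].get_feature_names_out())
        weights = model.named_steps["logisticregression"].coef_[0]
        order = np.argsort(weights)
        print("Predicts non-concordance:", ", ".join(terms[order[:k]]))
        print("Predicts concordance:    ", ", ".join(terms[order[-k:]]))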

Why it matters

The fact that we can accurately predict concordance suggests that how we write discharge summaries affects how care is delivered.

More importantly, an AI model can flag which requests are unlikely to be actioned, allowing clinicians to write better discharge summaries.

Imagine an EPR system that quietly warns:

“This request is phrased in a way that historically leads to low follow-through.”