The Amazon tool no doctors know about — but could change everything

Amazon has come a long way since it started with book sales out of Jeff Bezos’ garage. The tech giant is investing heavily in its online capabilities, and Amazon Web Services is by far the biggest provider of cloud computing services. This is of little interest to most doctors. However, the company’s recent foray into the field of healthcare, Amazon Comprehend Medical, is a tool that few doctors have heard of but that could change every aspect of our practice.

Matthew Stubbs

30th Jun 2019

Amazon Comprehend Medical is a natural language processing service (you don’t need to worry about the details) designed to help medical practitioners extract useful medical information from unstructured (prose) text. Basically, it can read free text and pull out the clinically relevant parts.

What it does

[Image: Amazon Comprehend Medical]

Amazon Comprehend Medical will take any unstructured text and return its contents in a useful structured format. The service extracts medications (with their doses), medical conditions, tests, treatments, procedures, anatomy and protected health information, and returns them in a format that can be analysed, as shown below.

[Image: Amazon Comprehend Medical 3]

This is not new, but the approach that Amazon has taken means anybody can use this service. You need no prior computer science knowledge, and there is not a line of code in sight (at least for simpler implementations of the service).
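
For readers who do want to call the service from code, the following is a minimal sketch rather than a definitive implementation. It assumes the boto3 library and its `detect_entities_v2` call on a Comprehend Medical client; the function name `extract_entities` and the region shown in the comments are placeholders of mine, not something taken from the service documentation for this article.

```python
from collections import defaultdict


def extract_entities(text, client):
    """Send text to Comprehend Medical and group detected entity texts by category.

    `client` is assumed to be a boto3 Comprehend Medical client (see the
    commented usage below). It is passed in so the function can also be
    exercised with a stand-in object during testing.
    """
    response = client.detect_entities_v2(Text=text)
    grouped = defaultdict(list)
    for entity in response["Entities"]:
        # Each entity carries a category such as MEDICATION or MEDICAL_CONDITION.
        grouped[entity["Category"]].append(entity["Text"])
    return dict(grouped)


# Usage (requires AWS credentials; the region is an assumption):
#   import boto3
#   client = boto3.client("comprehendmedical", region_name="us-east-1")
#   print(extract_entities("Started Propranolol 10mg TDS.", client))
```

Passing the client in as an argument keeps the sketch independent of any particular AWS account setup.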

What can you use it for?

Extracting usable information from unstructured text has always been a challenge, and because this is how most medical information is stored, it has hampered research and deep learning within this field. Until now!

Using Amazon’s service any medical practitioner can extract important information from text and analyse it for relationships between content. I have listed possible use cases below, but the list is far from exhaustive.

  • Clinical decision making: by extracting usable content from medical text we will be able to link together signs and symptoms to produce differentials with a certain probability. Alongside clinical judgement, this is likely to increase the accuracy of diagnosis and result in improved patient outcomes.
  • Clinical coding: hospitals get paid by their coding (at least in the UK). They will get paid a certain amount for a patient with pneumonia, a little more if the patient is septic, as well as money for procedures performed, etc. Being able to programmatically extract this information from inpatient notes will save hours of time in a process that is often done manually. It is also likely to increase accuracy, and ultimately ensure that hospitals are paid appropriately for the work they do.
  • Research: data science and deep learning show tremendous promise when combined with Amazon’s Comprehend Medical service. As well as using currently known links between signs and symptoms to improve diagnostic accuracy, by accumulating large amounts of data with known outcomes/diagnoses we will be able to find new correlations between diseases and treatments to improve diagnosis and treatment of poorly understood conditions.
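
As a rough illustration of the research use case, the sketch below counts how often pairs of extracted terms appear in the same document. This is only a first step towards finding correlations, not a statistical analysis, and the term lists are made up for illustration; the function name `cooccurrence_counts` is my own.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how often each pair of terms appears together in one document."""
    counts = Counter()
    for terms in documents:
        # De-duplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts


# Made-up extracted terms for three notes (illustrative only).
notes = [
    ["anxiety", "goitre", "tachycardia"],
    ["goitre", "tachycardia"],
    ["anxiety", "insomnia"],
]
print(cooccurrence_counts(notes).most_common(1))
```

In practice the term lists would come from the entities extracted from each note, and any pair that looks interesting would still need proper statistical testing.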

Does it work?

I wanted to challenge the service by writing in pure prose using some terms that may be difficult to detect. I settled on using a clinic letter and wrote the following mock letter to feed into the service:

Thank you for your referral for Ms Jane Dow, a 42 year old lady, to my Endocrinology clinic today. You referred her to my clinic due to increased anxiety and a swelling within her neck. You found an increased T4 level with depressed TSH.

Ms Jane Dow has been suffering with ongoing anxiety and difficulty sleeping. In addition to this, she has noted an increased sensitivity to heat with excessive perspiration. On examination of Ms Dow I noted an increased heart rate at approximately 105 bpm. There is a large nodular goitre anterolaterally to the left of her neck. It is not hot to touch and does not move on swallow. There is no exophthalmos.

Ms Dow reports that her Mother suffered from thyroid dysfunction, though she is unsure of the exact diagnosis and whether this was a hyper- or hypothyroid condition.

I will arrange for Ms Dow to have further blood tests including thyroid stimulating immunoglobulin, stimulating hormone receptor antibody and anti-thyroid peroxidase antibody. I will also arrange an ultrasound thyroid scan.

I have started Ms Dow on Propranolol 10mg TDS to control her current symptoms until a definitive diagnosis is found.

Yours Sincerely, Dr Smith

And here is what Amazon Comprehend Medical returned:

[Image: Amazon Comprehend Medical results]

So yes, it definitely works, and to an impressive degree of accuracy. I was impressed with the system’s ability to detect phrases such as ‘increased sensitivity to heat’ as a symptom, and with its ability to link a finding to its measured value. For example, increased heart rate is detected as a symptom, and the service has also linked it to 105 bpm within the same phrase. I deliberately used abbreviations such as T4 instead of thyroxine, and again the service had no problem detecting them.
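
In the API response, links of this kind appear as attributes nested under the entity they belong to. The sketch below walks a hand-written fragment shaped like a `detect_entities_v2` response for the Propranolol line of the letter. The fragment is illustrative, not real service output, and the exact labels the service would assign (for example whether 10mg is tagged as a dosage or a strength) are assumptions.

```python
def linked_attributes(response):
    """Return (entity text, attribute type, attribute text) triples."""
    triples = []
    for entity in response.get("Entities", []):
        # Entities without linked attributes simply contribute nothing.
        for attribute in entity.get("Attributes", []):
            triples.append((entity["Text"], attribute["Type"], attribute["Text"]))
    return triples


# Hand-written fragment in the shape of a response (not real output).
sample_response = {
    "Entities": [
        {
            "Text": "Propranolol",
            "Category": "MEDICATION",
            "Type": "GENERIC_NAME",
            "Attributes": [
                {"Type": "DOSAGE", "Text": "10mg"},
                {"Type": "FREQUENCY", "Text": "TDS"},
            ],
        },
        {"Text": "anxiety", "Category": "MEDICAL_CONDITION", "Type": "DX_NAME"},
    ]
}

for triple in linked_attributes(sample_response):
    print(triple)
```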

One criticism I have, which you may have already spotted, is its inability to attribute entities to someone other than the patient, as in a family history. In the letter I stated that the patient’s mother had a thyroid condition; however, you can see that the service has only detected the diagnosis and not its relation to the patient’s mother.

Despite this, the analysis is impressive and, in my view, is at a point where it can be used in industry-level projects and research.

How can you use it?

Amazon Comprehend Medical is part of the Amazon Web Services toolkit. To start using the service today you simply need to sign up for an account, which you can do here.


Once signed up, you can begin using all of Amazon’s web services straight away. There is no monthly fee for Amazon Comprehend Medical; you pay only for what you use. More information on costings for the service can be found here (they vary depending on which API you use). Put simply, it is charged per unit of 100 characters at a rate of $0.01 per unit, so analysing a 10,000-character document (100 units) would cost $1.00.

It is worth noting that in the first 3 months of your subscription there is a free tier of 25k units of text (2.5M characters), meaning you can analyse 2.5 million characters in that period completely free. To put that in context, the average novel is approximately 200,000 characters, so you’re unlikely to run out quickly unless you have a very large dataset.
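
The pricing arithmetic can be sketched in a few lines. The rates below are the ones quoted in this article and may differ between APIs or change over time; rounding partial units up is my assumption about how billing works, not a confirmed rule.

```python
import math

CHARS_PER_UNIT = 100      # unit size quoted in the article
CENTS_PER_UNIT = 1        # $0.01 per unit, as quoted in the article
FREE_TIER_UNITS = 25_000  # free units in the first 3 months, as quoted


def units_for(char_count):
    """Number of billable units, assuming partial units are rounded up."""
    return math.ceil(char_count / CHARS_PER_UNIT)


def estimated_cost_dollars(char_count, free_units_left=0):
    """Estimated cost in dollars after using any remaining free-tier units."""
    billable = max(0, units_for(char_count) - free_units_left)
    return billable * CENTS_PER_UNIT / 100


print(estimated_cost_dollars(10_000))                    # 100 units, prints 1.0
print(estimated_cost_dollars(200_000, FREE_TIER_UNITS))  # inside the free tier, prints 0.0
```

At these rates a 10,000-character text is 100 units, and a 200,000-character novel uses 2,000 of the 25,000 free units.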

Analysing audio and images

It is often the case that your data is not in a format you can directly analyse, especially in healthcare. Perhaps you have letters that have been scanned into a database as images, so the text is no longer machine-readable. Or maybe you have hours of audio content you want to analyse. By combining Comprehend Medical with other Amazon services you can solve most of these issues with minimum effort.

  1. Scanned Documents: Combine with the Amazon Textract service to extract text from scanned documents and then feed the text into Amazon Comprehend Medical.
  2. Audio Data: Combine with Amazon Transcribe to convert your audio data into text, which can then be analysed by Amazon Comprehend Medical.
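
The scanned-document route can be sketched as below. This is a minimal sketch, assuming boto3 clients for Textract (`detect_document_text`) and Comprehend Medical (`detect_entities_v2`) are passed in; the function names are my own. Audio would follow the same pattern with Amazon Transcribe, but transcription runs as a separate job, so it is not shown here.

```python
def text_from_textract(textract_response):
    """Join the LINE blocks of a Textract response into plain text."""
    lines = [
        block["Text"]
        for block in textract_response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
    return "\n".join(lines)


def analyse_scanned_letter(image_bytes, textract_client, medical_client):
    """OCR one scanned page with Textract, then send the text to Comprehend Medical."""
    ocr = textract_client.detect_document_text(Document={"Bytes": image_bytes})
    text = text_from_textract(ocr)
    return medical_client.detect_entities_v2(Text=text)


# Usage (requires AWS credentials; region and file name are assumptions):
#   import boto3
#   textract = boto3.client("textract", region_name="us-east-1")
#   medical = boto3.client("comprehendmedical", region_name="us-east-1")
#   with open("letter.png", "rb") as f:
#       result = analyse_scanned_letter(f.read(), textract, medical)
```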

In Summary

The Amazon Comprehend Medical service is likely to revolutionise the way we interact with data, and has implications that span all aspects of healthcare.

There is no better time to start experimenting with this service, as its implementation is relatively new and there will be many use cases for such a tool that have yet to be discovered.

I will post an example application in the near future for those interested in integrating this tool into their applications. Make sure you follow Doctors in Tech to get notified of all our content.