Two people with identical cancers – same type, same stage, same tumor size, same location – walk into different hospitals. This cancer has been studied extensively through clinical trials, case studies, and meta-analyses. There are experts worldwide who have dedicated decades to studying and treating this type of cancer. If you gathered all these experts in one room, they would agree on an evidence-based course of treatment (or further tests) for either patient.
In an ideal world, both patients would receive this optimal treatment at any hospital they visit. But the reality of medicine is complex – doctors work under time pressure, have different training backgrounds, and may not have immediate access to the latest research. This is where clinical guidelines come in. Guidelines distill the vast body of medical evidence into structured recommendations, helping standardize care across different hospitals and clinicians. While individual responses to treatment vary, following evidence-based guidelines gives patients the best chance for positive outcomes.
Every year, top oncologists who specialize in a particular type of cancer lay out and enumerate the different types and situations they might see in care.
-
What if there’s early stage breast cancer, with a previous partial mastectomy, and the patient is currently pregnant?
-
What if there’s invasive cancer in the breast ducts, and the tumor is between 0.6 and 1cm long?
For each situation, the panel reviews existing clinical evidence and recommends an action. That could be ordering certain tests, prescribing a course of treatment, or scheduling future check-ins to see how the situation is changing. The panel meets regularly to review new evidence and update the guidelines. There are several groups around the world who all draft different guidelines. Some courses of treatment aren’t available in some countries, so guidelines are regionally applied.
The National Comprehensive Cancer Network (NCCN) is an alliance of 32 leading cancer centers in the United States. Each year, they bring together panels of multidisciplinary experts for each cancer type. These panels review the latest research, debate the evidence, and work to reach consensus on the best approaches to screening, diagnosis, treatment, and supportive care. The resulting guidelines—“The NCCN Clinical Practice Guidelines in Oncology”—are the most comprehensive and widely-used standards for cancer care globally.
This is a page from the NCCN breast cancer guidelines:
Each of those hyperlinks (like BINV-6) is its own page with its own flow chart, pointing to more pages. As a doctor trying to interpret guidelines for a patient, your job is to apply the patient’s situation to the flow chart, continuing until you hit some information that you don’t know (like the HER2 status), in which case you order tests to find it out. Or you hit a recommendation for a course of treatment.
It’s not perfectly standardized. Sometimes there’s multiple options for treatment, each with similar levels of support. Sometimes there’s a patient situation that is relevant to treatment but isn’t enumerated in the guidelines. Doctors still have to make judgment calls, even when they work with guidelines. But the space of possible judgment calls is smaller. And the chance that a patient receives a treatment even though another treatment has stronger clinical support is lower.
Centers of excellence like Mayo Clinic and MD Anderson achieve better cancer outcomes for several reasons: They’re well funded, their oncologists are highly specialized, and they have broader access to potentially life-saving clinical trials. But one other important factor is their systematic approach to care, including robust processes for implementing clinical guidelines. Many of these institutions also play a crucial role in developing the NCCN guidelines, contributing their expertise and research.
At any hospital, care can deviate. Criteria can be missed. Maybe a postmenopausal patient is given a treatment that is better supported for premenopausal patients because it was missed in her charts. Doctors see hundreds of patients, and even with access to guidelines, it’s inevitable that something will be overlooked.
There’s so many different ways to improve cancer outcomes. You can discover new drugs. You can improve surgical techniques. But an immediate, universal, and attainable approach is to reduce the gaps in evidence-based care.
Part of the issue with guidelines is their format. The guidelines represent hundreds of thousands of hours of expert oncologists’ time. But the end result is a dense and hard to navigate PDF.
Imagine an oncologist faced with a rare subtype of breast cancer they see perhaps once every few years. They need to first locate the correct PDF version, then identify the right section based on the cancer’s characteristics, then follow multiple hyperlinks across different pages while keeping track of various patient factors – all while managing their heavy patient load. The guidelines are also constantly being revised, with new versions releasing as separate PDFs. It’s easy for doctors to miss updates or continue referencing outdated versions that have gotten buried in their files among other clinical documents.
At their core, guidelines are decision trees. The teams that have drafted them have spent an considerable effort wading through evidence and corralling it into a decision tree representation. But the guideline documents that describe these trees are difficult to follow.
With properly structured data, machines should be able to interpret the guidelines. Charting systems could automatically suggesting diagnostic tests for a patient. Alarm bells and “Are you sure?” modals could pop up when a course of treatment diverges from the guidelines. And when a doctor needs to review the guidelines, there should be a much faster and more natural way than finding PDFs.
The organizations drafting guidelines should release them in structured, machine-interpretable formats in addition to the downloadable PDFs. Ideally, all guideline organizations would agree on the same format, and all patient data systems would be able to interpret that format. But that sort of data approach isn’t their specialty.
Recently, I had the chance to work with some talented oncologists as a software engineer. It gave me a window into this world of cancer care and guidelines that I never expected to see. The problem of these guidelines being trapped in PDFs stuck with me, even after I moved on to other projects. I couldn’t stop thinking about these crucial decision trees sitting in these dense documents.
So I built a small proof-of-concept tool to see if I could extract the guidelines into something more structured that machines could understand. If we could get the guidelines into a proper format, we could build better interfaces for doctors to navigate them, and maybe even get computers to help follow them.
I started by defining a schema that could represent most of the information in the NCCN guideline. The guidelines look like trees of information. Because some references can loop around, there can be cycles. So the whole set of guidelines for a type of cancer breaks down into a few disjointed directed graphs.
Each point in the graph is represented as a node, whether it is a condition, qualifying characteristic, treatment option, or something else. The nodes also store the associated references and footnotes along with them.
Each arrow in the flow chart or reference from one page to another is represented as a relation between nodes.
To construct the full guideline tree, I perform a first pass over the document with an LLM to create references for each page in the guideline. These page references are the parent nodes for the top nodes on each page. When another node points to a certain page, it points to that page’s node.
Then, I use an LLM to read through the guidelines in small chunks of pages, and extract the guidelines in a JSON representation of the schema, passing in the page references:
I ran this extraction on the NCCN Breast Cancer Guidelines and saved the nodes and relations to a database. I ended up with 271 nodes in total.
The next step was building a graph viewer. I used the React Flow library, and built an interface that displayed all the root nodes (the pages that didn’t have incoming references). Clicking on a node reveals its children, and you can follow a train of nodes down the line.
I wanted to understand how this structured representation would make it easier to go from a patient case to guideline recommendations. I built an agent that could take a patient history and traverse the graph. First, it finds the top-level node that most closely matches the patient’s situation. Then, it looks at the children node and chooses the one that matches the patient most closely. It repeats this until it arrives at a leaf node, or it does not have enough information to continue.
You can try the tool for yourself here.
The data in the demo has not been reviewed and contains errors and omissions. Please don’t use it for any clinical decisions.
I spent a few hours working on the extraction process (and even less on the agent), and you could achieve more accurate results with more effort. I sit around 70–80% accuracy right now. LLM extraction is an easy way to prototype this, and with human review it would be reliable, but ideally these guidelines would be drafted in a structured way from the beginning.
Other people are working on this. I purposefully didn’t look around much before starting this, because that usually kills my motivation to build anything. If you’re working on this, or work with guidelines, or are building a product you think this could be useful for, please reach out. I’d love an excuse to spend more time on this and collaborate.
The data is still only semi-structured. Information like causal relationships and how nodes should be evaluated still requires natural language understanding of each node. Future work could define an even more structured schema that would be easier to evaluate.