Using Natural Language Processing to Monitor Quality in Colorectal Cancer Screening

Artificial Intelligence in Medicine Inc.

By Jack Golabek, March 22, 2017

In January 2007, the Ontario Ministry of Health and Long-Term Care announced $193 million in funding over five years to establish a colorectal cancer screening program in the province of Ontario, Canada. With the assistance of Cancer Care Ontario and a Clinical Advisory Committee composed of experts in the field, the “ColonCancerCheck” program was launched on April 1, 2008. Colorectal cancer is the second most deadly cancer in Canada, and Ontario has one of the highest incidence rates in the world: age-standardized incidence rates per 100,000 people were 80.2 in males and 57.7 in females in 2016[i]. The goal of the screening program is to reduce mortality through early detection of colorectal cancer or precursor lesions. Early detection affords a 90% chance of a cure[ii].

[i] “Ontario Cancer Statistics 2016”, Cancer Care Ontario, Toronto 2016

[ii] Last modified May 16, 2016

The provincial colorectal cancer screening program consists of two parts: a laboratory test to determine if there is blood in a person’s stool, the “Fecal Occult Blood Test” (FOBT for short), and a colonoscopy follow-up in case of a positive FOBT result.

When the screening program was launched, research had shown that the chances of an incomplete colonoscopy procedure were more than three times greater in a private clinic than in an academic hospital.[i] As a result, funding was originally provided only to hospitals, to increase capacity to meet the expected growth in demand for colonoscopy procedures. In January 2012, after the original funding had been expended, colonoscopy for colorectal cancer screening became a generally insured service for Ontario residents with a positive FOBT result, and for those with risk factors for developing colorectal cancer.[ii] Over the ensuing years, colonoscopies were increasingly offered at privately owned but publicly funded clinics, driven largely by the desire to reduce wait times. Today, there are more than fifty private endoscopy clinics throughout Ontario.

[i] “Factors associated with incomplete colonoscopy: a population-based study.” Shah HA, Paszat LF, Saskin R, Stukel TA, Rabeneck L., Gastroenterology. 2007 Jun;132(7):2297-303. Epub 2007 Mar 21.

[ii] “Implementation of 2012 Physician Services Agreement – Amendments to the Schedule of Benefits for Physician Services” – OHIP Info Bulletin #4585 Effective January 1, 2012

To ensure high-quality services across the public and private sectors, the Ontario government’s advisory agency, Cancer Care Ontario, published guidelines for colonoscopy quality assurance in 2007, with an update in 2013.[i] The guidelines set out recommendations, based on the best available evidence, on training and maintaining competency for endoscopists, institutional quality assurance, and measurable quality indicators for colonoscopy exams. These quality indicators include cecal intubation rate, adenoma detection rate, polypectomy rate and post-colonoscopy colorectal cancer rate.

[i] “A Quality Initiative of the Program in Evidence-Based Care (PEBC), Cancer Care Ontario (CCO) – Guideline for Colonoscopy Quality Assurance in Ontario”, J. Tinmouth, E. Kennedy, D. Baron, M. Burke, S. Feinberg, M. Gould, N. Baxter, N. Lewis and the Colonoscopy Quality Assurance Guideline Expert Panel, October 9, 2007, updated September 9, 2013,

Cecal intubation is defined as the passage of the scope beyond the ileocecal valve into the cecal pole or terminal ileum, and the cecal intubation rate (CIR) is the proportion of colonoscopies in which this is achieved. Low CIRs have been associated with increased incidence of post-colonoscopy colorectal cancer. A CIR of 95% is desirable in patients with adequate bowel preparation and no obstructive lesions.

The adenoma detection rate (ADR) is the proportion of patients who have at least one adenoma detected and removed during colonoscopy. ADR is a direct indicator of quality since adenomas are known precursors to cancer. The expected ADR depends on the characteristics of the screened population, such as age, sex, and family history of cancer. The guidelines recommend that endoscopists monitor their individual ADR.

The polypectomy rate (PR) is the proportion of patients who have at least one polyp detected and removed during colonoscopy. Studies have shown that PR can be used as a proxy for ADR, and more recently, that sessile serrated polyps may be important cancer precursors as well. The guidelines recommend tracking PR although a specific target rate has not been identified.

The post-colonoscopy colorectal cancer rate (PCCRC) is the proportion of patients diagnosed with colorectal cancer who had a colonoscopy performed 6 to 36 months prior to diagnosis. The guidelines recommend that PCCRC should be monitored at the facility level and at the province level.
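The PCCRC definition above can be illustrated with a minimal Python sketch. The function names, the input layout, and the choice to measure the 6-to-36-month window in whole calendar months (boundaries inclusive) are assumptions made for illustration; the guideline itself should be consulted for the exact counting rules.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed from `earlier` to `later` (truncated)."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the final month is not yet complete
    return months


def pccrc_rate(cases):
    """Proportion of diagnosed patients with a colonoscopy 6 to 36 months earlier.

    `cases` is a list of (diagnosis_date, [colonoscopy_dates]) tuples, one per
    patient diagnosed with colorectal cancer (hypothetical layout).
    Returns None when there are no diagnosed patients.
    """
    if not cases:
        return None
    flagged = sum(
        1
        for diagnosis, scopes in cases
        if any(6 <= months_between(s, diagnosis) <= 36 for s in scopes)
    )
    return flagged / len(cases)
```

A patient whose only colonoscopy was three months before diagnosis, or more than three years before, is not counted in the numerator under these assumptions.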

Any tissue removed during a colonoscopy is forwarded to a pathology laboratory for examination. The ensuing pathology report describes the nature of the specimens and histologic findings as shown in the examples below.

As can be seen, each pathology report identifies the institution where the colonoscopy was performed, the date, and the individual who performed the procedure. Each report also describes the specimens examined and associated diagnostic findings. As such, analysis of these reports can inform two of the colonoscopy quality indicators. Together with information on the number of patients undergoing the procedure, these reports permit determination of adenoma detection rate (ADR) and polypectomy rate (PR), by physician, for a given institution. Additionally, the types of adenomas detected and the types of polyps removed can be categorized in more detail if desired.
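As a rough illustration of the calculation described above, the following Python sketch derives ADR and PR per physician from one record per patient procedure. The record fields (`physician`, `polyp_removed`, `adenoma_found`) are hypothetical names chosen for this example, not fields of any actual report format.

```python
from collections import defaultdict


def rates_by_physician(procedures):
    """Compute ADR and PR per physician.

    `procedures` is an iterable of dicts, one per patient colonoscopy, with
    the hypothetical keys 'physician', 'polyp_removed' and 'adenoma_found'.
    """
    totals = defaultdict(lambda: {"patients": 0, "polyp": 0, "adenoma": 0})
    for p in procedures:
        t = totals[p["physician"]]
        t["patients"] += 1
        t["polyp"] += bool(p["polyp_removed"])    # at least one polyp removed
        t["adenoma"] += bool(p["adenoma_found"])  # at least one adenoma found
    return {
        physician: {
            "ADR": t["adenoma"] / t["patients"],
            "PR": t["polyp"] / t["patients"],
        }
        for physician, t in totals.items()
    }
```

The denominator is the number of patients undergoing the procedure, which is why the pathology findings alone are not sufficient: procedures with no tissue removed must also be counted.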

In almost all cases, the information in the pathology reports is expressed in narrative text. Tabulating statistics from these reports requires reading through every report for a given time span and calculating the incidence of polyps, adenomas, and colorectal cancer for each physician and institution by hand. This approach quickly becomes inefficient as the volume of reports grows. In Ontario, approximately 10,000 screening colonoscopies are performed each month. Moreover, manual review, transcription and tabulation of the results are prone to human error.

Compiling polyp, adenoma and colorectal cancer counts from large numbers of pathology reports was a challenge faced by one of our laboratory customers. The laboratory provides pathology services to a large number of colonoscopy clinics, and wanted to provide data for calculating colonoscopy quality indicators to its client physicians on an ongoing basis. The challenge was how to process the large volume of colonoscopy-related pathology reports in order to produce accurate statistics.

AIM offered a solution using its artificial intelligence (AI) technology to perform natural language processing (NLP) of the pathology reports, extracting discrete data from the text in real time as the reports were released to customers. The concepts to be isolated were as follows:

The solution, AIM’s Rapid Case Ascertainment (RCA) software, takes colonoscopy pathology reports in HL7 V2.x format as input. These reports are produced by the lab’s pathology information system (LIS) and are sent to RCA in real time as they are released from the LIS. The data flow is shown schematically in Figure 4.
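To make the input format concrete, here is a minimal Python sketch of pulling narrative text out of an HL7 V2.x message. It relies only on the standard delimiters (segments separated by carriage returns, fields by `|`) and assumes the report text is carried in the observation value field (OBX-5), which is common but not guaranteed. The function name and the sample message are illustrative and do not reflect how RCA itself parses messages.

```python
def report_text_from_hl7(message: str) -> str:
    """Join the OBX-5 (observation value) fields of an HL7 V2.x message.

    Assumption: the narrative pathology text is sent in OBX segments.
    """
    # HL7 V2 segments end with a carriage return; tolerate newlines as well.
    segments = [s for s in message.replace("\n", "\r").split("\r") if s]
    lines = []
    for segment in segments:
        fields = segment.split("|")
        # fields[0] is the segment ID, so fields[5] is OBX-5.
        if fields[0] == "OBX" and len(fields) > 5:
            lines.append(fields[5])
    return "\n".join(lines)


# Made-up sample message for illustration only.
SAMPLE = "\r".join([
    "MSH|^~\\&|SENDER|LAB|RECEIVER|SITE|201703220830||ORU^R01|0001|P|2.3",
    "PID|1||000000",
    "OBX|1|TX|DIAG^Diagnosis||Colon, ascending, polypectomy:",
    "OBX|2|TX|DIAG^Diagnosis||Tubular adenoma.",
])
```

Calling `report_text_from_hl7(SAMPLE)` returns the two diagnosis lines as plain text, ready to be passed to a language-processing step.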

The RCA software reads each report using natural language processing and determines if the relevant concepts are expressed in a report, taking into account thesauri, equivalent forms of expressions, and positive or negative context. For example, in contrast to simple word searches, the expressions “absence of tubular adenoma” or “no evidence of carcinoid tumour” would be interpreted correctly and not count as instances of adenoma or colorectal cancer. The RCA software stores the identified concepts with each report processed.
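The difference between a simple word search and context-aware matching can be shown with a small Python sketch. This is a deliberately simplified, rule-based illustration and not AIM's engine: the term lists, the negation cues, and the rule that a cue anywhere earlier in the same sentence negates the term are all assumptions made for the example.

```python
import re

# Hypothetical mini-thesaurus: concept -> equivalent expressions.
CONCEPTS = {
    "adenoma": ["tubular adenoma", "villous adenoma", "adenoma"],
    "colorectal cancer": [
        "adenocarcinoma", "carcinoma", "carcinoid tumour", "carcinoid tumor",
    ],
}
# Hypothetical negation cues; "no" also covers "no evidence of".
NEGATION_CUES = ["no", "absence of", "negative for", "without"]


def _has_cue(text: str) -> bool:
    return any(re.search(r"\b" + re.escape(c) + r"\b", text) for c in NEGATION_CUES)


def find_concepts(report_text: str) -> set:
    """Return the concepts asserted (not negated) in the report text."""
    found = set()
    for sentence in re.split(r"[.;\n]+", report_text.lower()):
        for concept, terms in CONCEPTS.items():
            for term in terms:
                pattern = r"\b" + re.escape(term) + r"\b"
                for match in re.finditer(pattern, sentence):
                    # Count the term only if no cue precedes it in the sentence.
                    if not _has_cue(sentence[:match.start()]):
                        found.add(concept)
    return found
```

With these rules, "Tubular adenoma. No evidence of carcinoid tumour." yields only the adenoma concept, whereas a plain keyword search would also report cancer. A rule this coarse will misfire on sentences such as "No high-grade dysplasia in tubular adenoma", which is one reason production systems need richer linguistic context.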

The RCA software outputs a file of counts of the above concepts for a specified period of time. These data are loaded into the laboratory’s web-based physician services portal, where an online report is generated on a monthly basis for each colonoscopy clinic that the lab services. An example colorectal cancer screening report is shown below.
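A file of concept counts for a given period could be produced along the following lines. This Python sketch assumes each processed report has already been reduced to a clinic, a physician, a report date and a set of identified concepts; the CSV layout and all names are hypothetical and are not the actual RCA output format.

```python
import csv
import io
from collections import Counter
from datetime import date


def concept_counts_csv(reports, start: date, end: date) -> str:
    """Count concepts per clinic and physician for reports dated start..end.

    `reports` is an iterable of (clinic, physician, report_date, concepts)
    tuples, where `concepts` is a set of concept names (hypothetical layout).
    """
    counts = Counter()
    for clinic, physician, report_date, concepts in reports:
        if start <= report_date <= end:  # both bounds inclusive
            for concept in concepts:
                counts[(clinic, physician, concept)] += 1
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["clinic", "physician", "concept", "count"])
    for (clinic, physician, concept), n in sorted(counts.items()):
        writer.writerow([clinic, physician, concept, n])
    return out.getvalue()
```

Each concept is counted at most once per report, so the counts represent reports containing the concept rather than the number of specimens.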

The Colorectal Cancer Screening Report allows endoscopy clinics to easily calculate the polypectomy, adenoma, and colorectal cancer rates for each physician on a monthly basis. This information allows clinics to compare performance to provincial rates, and to report quality indicators to the ColonCancerCheck program accurately, rapidly and efficiently.

Natural language processing is at the heart of several software solutions offered by Artificial Intelligence in Medicine Inc. Our E-Path Reporter software is the industry-leading automated cancer reporting system for hospitals and laboratories. It uses NLP to interpret pathology reports, identify reportable cancers and report these automatically, via secure networks, to hospital and state cancer registries. The E-Path Reporter system is faster and more accurate than alternative methods and ensures that hospitals and laboratories meet regulatory reporting requirements with complete, high-quality information at a significantly reduced cost.

Abrevio, the successor software to RCA, is a cancer document consolidation system that converts information embedded in various clinical documents, including pathology reports, into usable data. It uses AIM’s next-generation AI engine for NLP and inference. Abrevio can be used to speed up the process of cancer tumor abstracting, monitor quality indicators, identify cases for studies and clinical trials, and provide researchers with access to richly annotated cancer datasets.