Purdue pioneers AI application and database to advance cancer genetics research

Photo courtesy of the National Cancer Institute’s Integrated Canine Data Commons.

WEST LAFAYETTE, Ind. – Purdue University’s novel use of an artificial intelligence model has revealed that biological pathways leading to cancer in dogs and humans are more similar than previously known. The study demonstrates enhanced value in studying naturally occurring cancer in dogs to learn more about how to defeat cancer in humans.

The findings, recently published in the journal Frontiers in Oncology, also demonstrate the value of large, accessible databases such as the National Cancer Institute’s Integrated Canine Data Commons (ICDC), a new database that Purdue is helping to populate.

In the study led by Nadia Lanman, research associate professor of comparative pathobiology in Purdue’s College of Veterinary Medicine, a model developed at the Frederick National Laboratory was trained using data from The Cancer Genome Atlas, a large study of many different types of human tumors, including bladder and brain tumors (gliomas). The model was then used to predict the presence of the same types of cancers in dogs by analyzing cancer sequencing data from dogs with brain or bladder cancer. The results showed that human and canine tumors are similar at the genetic level and that the difference between bladder tumors and brain tumors can be discerned by protein-producing genes. The findings provide added evidence that genetic studies of dog cancer can help us learn more about human cancer.
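The train-on-human, predict-on-dog workflow can be illustrated with a minimal sketch. This is not the study's method: it swaps in a simple nearest-centroid classifier for the deep-learning model, and the function names and the three-gene expression values are invented purely for illustration.

```python
import math

def centroid(rows):
    """Average each gene (column) across a list of expression vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def train(samples):
    """Build one average expression profile per tumor type.

    samples: dict mapping a tumor label to a list of expression vectors.
    """
    return {label: centroid(rows) for label, rows in samples.items()}

def predict(model, vector):
    """Return the tumor label whose average profile is closest to the sample."""
    return min(model, key=lambda label: math.dist(model[label], vector))

# Invented toy values standing in for human training data (not real expression data).
human_training = {
    "bladder": [[9.0, 1.0, 2.0], [8.0, 2.0, 1.0]],
    "glioma": [[1.0, 8.0, 9.0], [2.0, 9.0, 8.0]],
}
model = train(human_training)

# Invented toy values standing in for canine samples measured on the same genes.
canine_samples = [[8.5, 1.5, 2.5], [1.5, 7.5, 8.5]]
predictions = [predict(model, sample) for sample in canine_samples]
print(predictions)
```

The sketch works only because both species are described by the same set of genes, so a profile learned from one can be compared directly with samples from the other.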

“We took an AI approach, specifically a deep-learning approach, to probe gene expression profiles of cancers that occur in both humans and dogs,” says Lanman, who also is a member of the Purdue Institute for Cancer Research (PICR). “We built two primary tumor classification tools across species. We tested a number of different machine-learning methods and a convolutional neural network called TULIP ended up being the most powerful and accurate approach we tried.”

A convolutional neural network is an algorithm inspired by the way the human brain processes visual information. It is effective at finding and analyzing patterns in data such as images. Trained on a large dataset of labeled examples, it learns to associate certain patterns or features with specific labels or categories.
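The core operation behind such a network can be sketched in a few lines. This is an illustrative toy, not the TULIP model: the input values are made up, the function names are hypothetical, and the filter is fixed by hand, whereas a real network learns its filter weights from the labeled training data.

```python
def conv1d(signal, kernel):
    """Slide the kernel along the signal, taking a weighted sum at each position."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def relu(values):
    """Keep positive responses and set negative ones to zero."""
    return [max(0.0, v) for v in values]

def max_pool(values, size):
    """Summarize each non-overlapping window by its strongest response."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]

# Made-up input: a flat signal with one raised region.
signal = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]

# Hand-picked filter that responds where values increase (an assumption for
# illustration; a trained network would learn these weights).
edge_kernel = [-1.0, 1.0]

features = max_pool(relu(conv1d(signal, edge_kernel)), size=2)
print(features)
```

The filter fires only where the signal steps upward, which is the sense in which a convolutional layer detects a local pattern wherever it occurs in the input.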

The effectiveness of such technology underlines the importance of databases, because models trained on larger datasets tend to produce more robust and reliable results. The ICDC database was established to be an ever-expanding dataset capable of advancing research on human cancers through comparative genetic analysis with canine cancer.

“ICDC is a big deal because it’s a place where scientists all over the world can deposit and access data on canine cancer,” says Deborah Knapp, the Dolores L. McCall Professor of Comparative Oncology and Distinguished Professor of Comparative Oncology, and one of the study’s co-authors. “Researchers can use ICDC to pull in genetic data from dogs and genetic data from humans and analyze them simultaneously.”

Knapp, a PICR member who also chairs a steering committee for the ICDC, says the database holds great promise for advancing canine and human cancer research.

“It will be serving an even bigger purpose in the future — to group some cancers by their genetic makeup more so than by the organ in which they started. We are not there yet, but it is definitely a goal we can reach in the future. It’s advancing cancer genetics, which is the most important aspect of this.”

Writer/Media contact: Amy Raley, araley@purdue.edu

Sources: Nadia Lanman, natallah@purdue.edu

Deborah Knapp, knappd@purdue.edu